Permissive Learning Mode 1/6 audit skeleton #584
Wide-spread plumbing + Windows-only plm crate skeleton:
- wxc-exec --audit flag, console-control handler, named-mutex singleton
- AppContainer release-build gate rejecting permissiveLearningMode
- `audit` field in wire model and config parser
- plm crate: clap dispatch for start/stop/log, WPR start/stop, embedded WPR profile materializer, singleton coordination, wpr.exe path resolution
- CI/build infra (Azure Pipelines + GitHub Actions) to build plm.exe and stage it next to wxc-exec.exe via build.bat

Event parsing, capability extraction, filesystem/UI merging, and the adjusted-config writer arrive in subsequent PRs.

Builds + tests pass: cargo fmt --check, cargo clippy -p plm, cargo test -p plm (25 tests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR introduces the initial (“skeleton”) plumbing for Permissive Learning Mode (PLM) audit tracing on Windows: a new plm tool (WPR start/stop + coordination helpers) and a wxc-exec --audit flag that injects permissiveLearningMode and brackets the workload with trace start/stop plus best-effort cleanup to avoid leaking the host’s NT Kernel Logger session.
Changes:
- Add `wxc-exec --audit` (Windows-only) that injects `permissiveLearningMode`, spawns `plm.exe start`/`stop`, and adds Ctrl-C / panic / `process::exit` cleanup for WPR sessions and a host-wide singleton mutex.
- Add new `core/plm` crate (Windows-only binary + shared library helpers) implementing WPR lifecycle, safe absolute `wpr.exe` resolution, embedded WPRP profile staging, and coordination primitives.
- Update build scripts and CI (GitHub Actions + ADO) to build/test `plm` and run a Windows PLM integration harness; adjust workspace membership/default-members accordingly.
Reviewed changes
Copilot reviewed 22 out of 24 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/core/wxc/src/main.rs | Adds --audit, plm spawning, WPR cancel cleanup, and singleton coordination. |
| src/core/wxc/Cargo.toml | Adds plm dependency for shared coordination constants/helpers. |
| src/core/wxc_common/src/models.rs | Adds ExecutionRequest.audit to allow release override behavior. |
| src/core/wxc_common/src/config_parser.rs | Adds audit-related warnings and initializes audit: false. |
| src/core/plm/src/wpr_path.rs | Adds safe absolute resolution of wpr.exe (System32 via API) + tests. |
| src/core/plm/src/stop.rs | Adds plm stop skeleton: wpr -stop, log dir creation, testable seams. |
| src/core/plm/src/start.rs | Adds plm start skeleton: start trace, cancel-on-conflict retry, profile sanity tests. |
| src/core/plm/src/profile_gen.rs | Adds embedded WPRP + atomic “write if missing” materializer with tests. |
| src/core/plm/src/main.rs | Adds Windows-only plm.exe CLI + ctrl handler + singleton handling. |
| src/core/plm/src/log.rs | Adds interactive plm log skeleton: prompt/start/prompt/stop into temp ETL. |
| src/core/plm/src/lib.rs | Exposes the PLM library modules with Windows-only gating where needed. |
| src/core/plm/src/coordination.rs | Shared singleton env-var name + ctrl-handler bounded-wait helper + tests. |
| src/core/plm/readme.md | Documents PLM skeleton behavior, layout, CLI, and limitations. |
| src/core/plm/Cargo.toml | Introduces plm crate deps and Windows-target-gated windows crate usage. |
| src/core/plm/build.rs | Embeds Windows VersionInfo for plm.exe. |
| src/Cargo.toml | Adds core/plm to workspace and defines default-members excluding PLM. |
| src/Cargo.lock | Adds plm and new transitive dependencies. |
| src/backends/appcontainer/common/src/appcontainer_runner.rs | Allows permissiveLearningMode in release when request.audit is true. |
| sdk/tests/integration/package-lock.json | Updates local SDK version and a few dev dependency lock entries. |
| README.md | Adds end-user docs for --audit and PLM tool usage. |
| build.bat | Builds/stages plm.exe, adds --with-bfs, and refreshes local SDK file: dep. |
| .github/workflows/Build.Windows.Job.yml | Builds/tests plm and runs PLM integration tests + artifact upload on failure. |
| .github/workflows/Build.Linux.Job.yml | Builds/tests plm library on Linux to enforce cfg-gating hygiene. |
| .azure-pipelines/templates/Rust.Build.Job.yml | Builds/tests plm in ADO Windows jobs to keep it green. |
Files not reviewed (1)
- sdk/tests/integration/package-lock.json: Generated file
/// invoke `wpr.exe` by absolute path
/// (`%SystemRoot%\System32\wpr.exe`) rather than as a bare name so
/// `CreateProcessW`'s implicit CWD-first search order can't be abused
/// to substitute a planted binary. PLM runs as administrator, so a
Unfortunately, it is. It's something that Brian is going to be addressing for the next release. WPR has to run as admin in order to gather the logs we need.
Will wxc-exec have to run as admin? Or will wxc-exec run as non-admin but when plm.exe is started the user will get a UAC prompt?
Currently it needs to be run as admin, but I like the prompt-for-elevation suggestion and will add it.
Behavior is now: it prompts for elevation and spawns an elevated prompt that runs for that invocation.
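For readers following the absolute-path discussion in this thread, here is a minimal sketch of the idea in the quoted doc comment: never hand a bare `wpr.exe` to the process spawner, only a path built from an already-resolved absolute system directory. The helper name `wpr_path_in` is hypothetical and is not taken from the PR; the real resolver reportedly obtains the directory from `GetSystemDirectoryW`, which this sketch leaves to the caller.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the PR's API): build the full path to `wpr.exe`
/// from a system directory that the caller has already resolved.
/// Returns `None` when the directory is relative, because a relative path
/// would reintroduce the search-order ambiguity the absolute path avoids.
fn wpr_path_in(system_dir: &Path) -> Option<PathBuf> {
    if !system_dir.is_absolute() {
        return None;
    }
    Some(system_dir.join("wpr.exe"))
}
```

On Windows the caller would pass something like `C:\Windows\System32`; the sketch only enforces that whatever is passed is absolute.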
Replace the workspace's bespoke default-members = members - plm list with the implicit default (all members), and remove the crate-level #![cfg(target_os = "windows")] on plm's main.rs. The Windows-only body is gated per-item; a tiny non-Windows fn main() prints and exits 1 so the binary still links on Linux/macOS. Drops the special-cased -p plm/--lib steps in the GitHub Actions and Azure DevOps templates that existed solely because plm was excluded from default-members; the regular workspace build now covers it, and the Linux gate keeps a -p plm build/test step so the cross-platform contract is still CI-enforced. Also adds core/mxc-sdk to the implicit default (Copilot review #12). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/// The 2s budget is chosen so the combined budget of the wxc-exec
/// handler (`2 * CTRL_HANDLER_DRAIN_TIMEOUT`) stays under the
/// ~5s OS-imposed kill budget for `CTRL_CLOSE_EVENT` /
/// `CTRL_LOGOFF_EVENT` / `CTRL_SHUTDOWN_EVENT`, with ~500ms of
/// slack for the actual `wpr -cancel` spawn. Pinned by
/// `tests::ctrl_handler_drain_timeout_respects_os_budget`.
thought: these times might be machine-specific, so they could differ from one machine to another.
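The budget arithmetic in the quoted doc comment can be written out as a small check. This is a sketch under stated assumptions: the constant values (2s, ~5s, ~500ms) are taken from the comment, `CTRL_HANDLER_DRAIN_TIMEOUT` is the name used there, and the other names are hypothetical. It reads the ~500ms as time set aside for the `wpr -cancel` spawn on top of the two drain waits.

```rust
use std::time::Duration;

/// Per-wait drain timeout quoted in the doc comment (2s).
const CTRL_HANDLER_DRAIN_TIMEOUT: Duration = Duration::from_secs(2);
/// Approximate OS kill budget for close/logoff/shutdown events (~5s per the comment).
const OS_KILL_BUDGET: Duration = Duration::from_secs(5);
/// Time reserved for spawning `wpr -cancel` (~500ms per the comment).
const CANCEL_SPAWN_SLACK: Duration = Duration::from_millis(500);

/// Worst-case time the handler spends: two drain waits plus the cancel spawn.
fn worst_case_handler_time() -> Duration {
    CTRL_HANDLER_DRAIN_TIMEOUT * 2 + CANCEL_SPAWN_SLACK
}
```

Under these numbers the worst case is 4.5s, which leaves roughly half a second before the OS budget. As the review comment notes, the real timings may vary by machine, so the margin is a design target rather than a guarantee.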
Addresses MGudgin / Copilot review feedback on PR #584:
- #2 --audit lifecycle chatter (spawn banner, non-zero-exit / spawn-fail lines, capability-injection warning) now writes only to the log buffer by default; add --audit-verbose to also surface them on stderr for pipeline-debugging.
- #3 Replace the encode_utf16+PCWSTR mutex-name literal with the windows::core::w!() macro in both wxc-exec (Global\\Mxc_Plm_Audit) and plm.exe.
- #4 Scope chrono to default-features = false, features = std+clock so it stops dragging js-sys / wasm-bindgen / iana-time-zone into the dep tree.
- #5 Clarify the "PLM runs as administrator" comment: the elevation lives in "launched --audit from an elevated context" (needed for WPR kernel session), not a PLM-internal auto-elevate.
- #6 Drop stale build.rs comment referencing the pre-embedded WPR profile staging.
- #8 run_plm_command doc no longer claims plm failures never abort; documents that --audit's "plm start" caller does abort while "plm stop" falls through to wpr -cancel cleanup.
- #9 --audit's "is capability already present?" check is now case-sensitive (matching the downstream AppContainer runner) so an operator's mis-cased spelling doesn't silently disable the injection.
- #10 Move mark_plm_trace_active() out of plm's main.rs into a callback that log::run fires only AFTER wpr -start engages. Prevents a Ctrl+C during stdin-prompt or spawn-fail from issuing wpr -cancel against an unrelated host WPR session.
- #11 WprExeStopper::stop no longer builds wpr_command() twice; it reuses the single builder instead of constructing a throwaway just to read get_program().
- #13 Remove the extra "// " in dacl_ctrl_handler's comment header.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
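Item #9 above is easy to get subtly wrong, so here is a minimal sketch of a case-sensitive presence check followed by injection. The function name `inject_if_missing` and the `Vec<String>` representation of the capability list are assumptions for illustration; only the capability name `permissiveLearningMode` comes from this PR.

```rust
/// Capability name injected by `--audit` (from the PR description).
const PLM_CAPABILITY: &str = "permissiveLearningMode";

/// Hypothetical helper: append the capability unless an exactly matching
/// (case-sensitive) entry is already present. Returns true when it injected.
fn inject_if_missing(capabilities: &mut Vec<String>) -> bool {
    // Exact comparison on purpose: a mis-cased entry such as
    // "PermissiveLearningMode" does not count as present.
    if capabilities.iter().any(|c| c == PLM_CAPABILITY) {
        return false;
    }
    capabilities.push(PLM_CAPABILITY.to_string());
    true
}
```

With a case-insensitive comparison, a mis-cased entry would suppress the injection even though the downstream runner, which the commit message says matches case-sensitively, would not recognize it.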
… deps, extract wxc audit module
- Drop future-PR/past-state phrasing from plm module doc-comments
- Prepend MIT copyright headers to coordination.rs, profile_gen.rs, wpr_path.rs
- Simplify plm log trace filename (drop PID and parse-preview note)
- Move serde_json/chrono/roxmltree to workspace [workspace.dependencies]
- Use wxc_common::string_util::from_wide in plm::wpr_path
- Add plm::coordination::singleton with shared try_acquire/release helpers; both plm::main and wxc::audit now delegate to it
- Extract PLM audit lifecycle (~230 lines) from wxc/src/main.rs into wxc/src/audit.rs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per review: --with-bfs / tier2_bfs plumbing belongs with the filesystem extraction PR, not the audit skeleton.
Per review: now that --audit is the intended enablement mechanism, the debug-build carve-out is redundant.
Per review: keep the root README short and defer PLM detail to src/core/plm/readme.md.
Per review: detect filter drivers, quota clamps, and AV rewrites that would otherwise let a corrupted plm.wprp be adopted on every subsequent run via the early-return existence check.
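A minimal sketch of what such a verified materializer could look like, using only the standard library. It assumes the profile bytes are embedded in the binary and that a mismatching file on disk should be rewritten; the function name, the `.tmp` suffix, and the rewrite-on-mismatch policy are assumptions, not the PR's actual implementation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch: make sure `target` holds exactly `embedded`.
/// An existing file is adopted only when it is byte-identical, so a
/// truncated or rewritten profile is not reused on later runs.
fn materialize_profile(target: &Path, embedded: &[u8]) -> io::Result<()> {
    if let Ok(existing) = fs::read(target) {
        if existing == embedded {
            return Ok(());
        }
    }
    // Write to a sibling temp file, then rename over the target so readers
    // never observe a half-written profile.
    let tmp = target.with_extension("wprp.tmp");
    fs::write(&tmp, embedded)?;
    fs::rename(&tmp, target)?;
    // Read back to catch anything that altered the bytes on the way to disk.
    let written = fs::read(target)?;
    if written != embedded {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "profile on disk does not match the embedded copy",
        ));
    }
    Ok(())
}
```

The read-back comparison is the part that addresses the review point: an existence check alone cannot distinguish a good profile from one that was clamped or rewritten.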
wpr.exe runs elevated (SeSystemProfile for NT Kernel Logger); an attacker-planted binary would execute with full admin rights. Cache the verify result so back-to-back plm log + plm stop only pays the cert-chain walk once.
The step references run_plm_test.ps1 which does not exist on this branch (arrives in the testing PR). A dedicated PLM integration test job belongs on the branch that ships the script.
Removes the runtime feature that reordered every serde_json Map — which was the sole cause of the dev schema on this branch differing from main. The feature was consumed only by a dead type alias in plm/src/config.rs. Regenerated dev schema now matches main byte-for-byte.
Substitute neutral placeholder paths in the plm readme (C:\Tessera\mxc -> C:\src\mxc).
…doc-comment ctrl-handler globals
- start.rs / stop.rs: switch wpr.exe spawns from .status() to .output() so successful runs stay silent; replay captured stdout+stderr on non-zero exit via shared start::replay_wpr_output. Fixes the wxc-exec --audit stdio pollution called out in comments r3514558858 / r3514562221 / r3514564478.
- wxc/src/audit.rs: capture plm.exe stdio via .output() so the audit-mode background trace never leaks plm chatter into the wrapped workload's console. On failure (and in --verbose) the captured streams are replayed via a shared replay_child_output helper. Fixes r3502009755 / r3502010101 / r3502010465 / r3514551150.
- coordination.rs: switch singleton acquire from CreateMutexW(bInitialOwner=true) + ERROR_ALREADY_EXISTS to CreateMutexW + WaitForSingleObject(0) so a previous owner that crashed (WAIT_ABANDONED) no longer blocks the singleton forever and forces a reboot. Addresses r3503721112.
- main.rs: expand PLM_TRACE_ACTIVE doc-comment to explain why it's a process-wide static (Windows ctrl-handler callback context has no captured self, so it must reach state via globals). Expand plm_ctrl_handler doc to explain why polling is used instead of a wait-object (wpr's kernel-session engagement isn't OS-signalled; bounded 40-poll worst case stays well under the ~5s ctrl-handler kill budget). Addresses r3503762141 / r3503729778.
- stop.rs: drop redundant _pid<PID> suffix from log-dir stamp; sub-second (%H%M%S%.3f) is already collision-safe. Addresses r3503525035.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
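The abandoned-mutex point in the coordination.rs item can be sketched as a pure classification of the `WaitForSingleObject` return value. The numeric constants are the documented Win32 values; the enum and function names are hypothetical, and the actual `CreateMutexW` / `WaitForSingleObject` calls are left out so the sketch stays platform-independent.

```rust
// Documented Win32 wait results.
const WAIT_OBJECT_0: u32 = 0x0000_0000;
const WAIT_ABANDONED: u32 = 0x0000_0080;
const WAIT_TIMEOUT: u32 = 0x0000_0102;

/// Hypothetical outcome of a zero-timeout wait on the singleton mutex.
#[derive(Debug, PartialEq, Eq)]
enum Acquire {
    /// We own the mutex.
    Owned,
    /// We own the mutex; the previous owner exited without releasing it.
    OwnedAfterCrash,
    /// Another live process holds the mutex.
    HeldElsewhere,
    /// Any other result (for example WAIT_FAILED) is reported as-is.
    Failed(u32),
}

/// Map a `WaitForSingleObject(handle, 0)` result to an acquisition outcome.
fn classify_wait(result: u32) -> Acquire {
    match result {
        WAIT_OBJECT_0 => Acquire::Owned,
        WAIT_ABANDONED => Acquire::OwnedAfterCrash,
        WAIT_TIMEOUT => Acquire::HeldElsewhere,
        other => Acquire::Failed(other),
    }
}
```

Treating `WAIT_ABANDONED` as a successful acquisition is what lets a later run recover after a crashed owner, instead of seeing the name as permanently taken.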
/azp run MXC-Update-Feed-Dependencies
No pipelines are associated with this pull request.
The atomic itself has to stay a process-wide static because the Windows console-control handler (plm_ctrl_handler) is an OS-owned extern "system" callback with no captured self/environment; it can only reach state via process globals. But all non-signal-context mutation now goes through &AcquiredSingleton methods:
- mark_trace_active(&self) (was free fn mark_plm_trace_active)
- clear_trace_active(&self) (was free fn clear_plm_trace_active)
- cancel_active_trace(&self) (was free fn cancel_active_plm_trace)

The compile-time invariant is that trace-active can only be flipped while we hold the host-wide Global\\Mxc_Plm_Audit singleton mutex: you can't call these methods without first producing an AcquiredSingleton. The single remaining free-fn escape hatch, cancel_active_plm_trace_from_signal(), is documented as callable only from the ctrl handler (where no &self exists).

Also: AcquiredSingleton::drop now cancels any leftover trace before releasing the mutex, so an error mid-flow can't leak the kernel session past exit.

The Cmd::Log flow uses the singleton bypass path (when wxc-exec --audit already holds the mutex outer-process); that branch falls back to touching PLM_TRACE_ACTIVE directly because no local AcquiredSingleton exists in the bypass case.

Addresses PR584 review comment r3503762141 more thoroughly than the prior doc-comment-only fix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
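The shape described in this commit message can be sketched as follows. The names `PLM_TRACE_ACTIVE`, `AcquiredSingleton`, and the method names come from the message; everything else is an assumption. In particular, `acquire()` here takes no real mutex, and "cancel" only flips the flag rather than running `wpr -cancel`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Process-wide flag, reachable from an OS callback that has no `self`.
static PLM_TRACE_ACTIVE: AtomicBool = AtomicBool::new(false);

/// Token proving the singleton is held. Assumption: the real type wraps a
/// mutex handle; this sketch carries no handle at all.
struct AcquiredSingleton {
    _private: (),
}

impl AcquiredSingleton {
    /// Stand-in for the real acquisition path.
    fn acquire() -> Self {
        AcquiredSingleton { _private: () }
    }

    fn mark_trace_active(&self) {
        PLM_TRACE_ACTIVE.store(true, Ordering::SeqCst);
    }

    fn clear_trace_active(&self) {
        PLM_TRACE_ACTIVE.store(false, Ordering::SeqCst);
    }

    /// Returns true if a trace was active and has now been marked cancelled.
    fn cancel_active_trace(&self) -> bool {
        PLM_TRACE_ACTIVE.swap(false, Ordering::SeqCst)
    }
}

impl Drop for AcquiredSingleton {
    /// Cancel any leftover trace before the singleton is released.
    fn drop(&mut self) {
        self.cancel_active_trace();
    }
}

/// Escape hatch for the ctrl handler, where no `&AcquiredSingleton` exists.
fn cancel_active_plm_trace_from_signal() -> bool {
    PLM_TRACE_ACTIVE.swap(false, Ordering::SeqCst)
}
```

Because the mutating methods take `&self`, ordinary code cannot flip the flag without first holding the token, which is the compile-time invariant the message describes.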
wpr.exe is catalog-signed, not embedded-signed, so WinVerifyTrust with WINTRUST_ACTION_GENERIC_VERIFY_V2 returns TRUST_E_NOSIGNATURE (0x800B0100) on stock Windows installs. Correctly verifying a catalog-signed binary requires a CryptCATAdmin fallback dance. Since we resolve the path via GetSystemDirectoryW (not env-spoofable) and %SystemDirectory% write access requires TrustedInstaller, path resolution already is the trust boundary. Reduce verify_wpr_signed() to a sanity file-existence check that surfaces a clear message when WPT isn't installed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
`wpr -stop` streams a `100% [>>>>>>>]` progress bar to stdout as it merges the ETL, which pollutes any wrapping tool's console. `stop::WprExeStopper` already captures via `.output()` and replays via `replay_wpr_output` only on failure; apply the same pattern to `log::stop_wpr_trace`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
wpr.exe renders its `100% [>>>>>>]` merge progress via `WriteConsoleW`, which writes to the console handle directly and bypasses any stdio pipe redirection (`.stdout(Stdio::piped())` / `.output()`). So even though `log.rs` / `stop.rs` / `start.rs` all already capture wpr's stdout+stderr via `.output()`, the progress bar was still leaking onto the wrapping tool's terminal. Add CREATE_NO_WINDOW (0x08000000) as a creation flag on the Command returned by wpr_command(), so every wpr.exe child is spawned without an attached console and its WriteConsoleW calls silently no-op. Normal stdout/stderr traffic still flows through the pipes and is replayed on failure via replay_wpr_output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
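A minimal sketch of the builder change described above, assuming a `wpr_command`-style function that takes the already-resolved path as a parameter (the PR's real signature is not shown here). `CREATE_NO_WINDOW` is the documented Win32 process-creation flag; the `cfg(windows)` gate keeps the sketch compiling on other platforms.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Win32 process-creation flag: run the child without a console window.
const CREATE_NO_WINDOW: u32 = 0x0800_0000;

/// Hypothetical builder: every `wpr.exe` child gets piped stdio and, on
/// Windows, no attached console, so direct console writes have nowhere to go.
fn wpr_command(wpr_exe: &Path) -> Command {
    let mut cmd = Command::new(wpr_exe);
    cmd.stdout(Stdio::piped()).stderr(Stdio::piped());
    #[cfg(windows)]
    {
        use std::os::windows::process::CommandExt;
        cmd.creation_flags(CREATE_NO_WINDOW);
    }
    cmd
}
```

Callers would still use `.output()` on the returned `Command` and replay the captured streams only on failure, as the earlier commits describe.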
Addresses PR#584 reviewer feedback (bbonaby comments r3503498362, r3503511023, r3503528016, r3503563795, r3503578675, r3503583878, r3507611305). Rewrites the pr1 subcommand summary, `Cmd::Stop` doc, `resolve_bin_path` doc, the `Trace captured at` print, the trailing `verbose logging` message, and the pass-through-options comment so they describe what the code does now rather than deferring to `later PRs`. The extract-caps subcommand doc entry is removed because Cmd::ExtractCaps isn't wired in this PR (it lands in the split-out PR#587). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Detects a non-elevated token at startup (before COM init) and, when --audit was requested, re-launches the same argv with ShellExecuteExW + `runas` so the user sees the standard UAC consent dialog. The child's exit code is propagated so the outer invoker sees the normal return contract. UAC decline (ERROR_CANCELLED = 1223) surfaces a clear `--audit requires elevation` message. Addresses PR#584 review thread 3501401682. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
📖 Description
PR 1 of 6: stacked vertical split of the `PermissiveLearningMode` work for readability. Merged together, this PR plus PRs 2–6 produce exactly the same tree as the source branch; the split is per-area, not per-self-contained-feature.

This PR introduces the cross-cutting plumbing needed for permissive learning mode (PLM):
- `wxc-exec --audit` plumbing: new audit/log-capture flag and the surrounding wiring in `wxc-exec` that invokes `plm.exe` before/after the workload.
- `plm` crate skeleton: a Windows-only binary + library at `src/core/plm/` with:
  - `start`: `wpr -cancel` + `wpr -start <plm.wprp>!AccessFailureProfile -filemode`
  - `stop`: trace-lifecycle scaffolding (parse + merge land in PR2+)
  - `log`: interactive Enter-to-start/Enter-to-stop trace lifecycle skeleton
  - Embedded WPR profile (`plm.wprp`) auto-materialized next to the exe on first use
  - Singleton coordination so `plm log` and `wxc-exec --audit` don't double-start traces
  - `wpr.exe` PATH-spoof-safe resolver
- CI/build infra (`Cargo.toml` wiring).

Subsequent PRs layer the parser, config merge, capability extraction, UI policy, and test artifacts on top.
🔗 References
main
🔍 Validation
- `cargo build -p plm --target x86_64-pc-windows-msvc`: clean
- `cargo fmt --all -- --check`: clean
- `cargo clippy -p plm --target x86_64-pc-windows-msvc --all-targets -- -D warnings`: clean
- `cargo test -p plm --target x86_64-pc-windows-msvc`: 25 passed
✅ Checklist
📋 Issue Type
GitHub Actions runs the PR validation build automatically. The ADO pipeline (`MXC-PR-Build`) is the official build pipeline that signs the binaries; it runs on merge to `main` and nightly, and Microsoft reviewers can trigger it on a PR with `/azp run`. See docs/pull-requests.md.