From e4446b4cbbc2809862711c8a0f7c1d04a1f4eeac Mon Sep 17 00:00:00 2001
From: Annie Liang
Date: Fri, 26 Jun 2026 12:58:02 -0700
Subject: [PATCH 1/2] Add cosmos integration-test skill under azure-cosmos-tests for agent discovery

Place the cosmos-run-integration-tests skill at
sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md.

The find-package-skill discovery utility inspects only the .github/skills/
directory of the exact package being changed; it does not walk up to the
service root or scan sibling packages. azure-cosmos and azure-cosmos-tests are
sibling Maven modules, and the skill documents how to run the integration
tests that live entirely in azure-cosmos-tests (profiles, testng suites, env
vars, and the -pl azure-cosmos-tests verify commands). In the typical workflow
(change azure-cosmos, add tests in azure-cosmos-tests, then run them) the
agent is in azure-cosmos-tests exactly when it needs this knowledge, so this
is the most discoverable home.

The directory name matches the frontmatter 'name' field as required by vally lint.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../cosmos-run-integration-tests/SKILL.md | 91 +++++++++++++++++++
 1 file changed, 91 insertions(+)
 create mode 100644 sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md

diff --git a/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md b/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md
new file mode 100644
index 000000000000..beed2c50c084
--- /dev/null
+++ b/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md
@@ -0,0 +1,91 @@
+---
+name: cosmos-run-integration-tests
+description: >
+  Run azure-cosmos integration/customer-workflow tests locally exactly like the CI
+  pipeline does (build+install on JDK 21, then failsafe `verify` with a test profile).
+  USE WHEN: asked to run cosmos integration tests, customer-workflow tests
+  (fi-customer-workflows / fi-sm-customer-workflows), reproduce a CI test failure
+  locally, or run a specific cosmos test profile via Maven. Covers the JDK-21
+  requirement, the two-step build/test split, profile→group→file mapping, and the
+  required account env vars.
+  NOT FOR: unit tests only, Spark/Kafka connector tests, or non-cosmos modules.
+---
+
+# Running azure-cosmos integration tests locally (CI-equivalent)
+
+## TL;DR
+CI runs tests in **two separate Maven invocations**, both on **JDK 21**:
+1. **Build + install** (`-DskipTests ... install`) — compiles main + test classes, installs jars.
+2. **Test** (`verify -P -DskipCompile=true -DskipTestCompile=true`) — failsafe runs the TestNG suite.
+
+There is no separate "surefire command" — integration tests run through **failsafe** via `mvn verify -P`.
+
+## ⚠️ Critical: use JDK 21
+The `azure-cosmos` / `azure-cosmos-test` jars in `~/.m2` are compiled with **JDK 21** (class file v65).
+If your shell `JAVA_HOME` is JDK 17 (v61), `javac --release 17` rejects them with **misleading**
+errors like:
+
+```
+cannot access com.azure.cosmos.implementation.TestConfigurations
+  bad class file: ...azure-cosmos-*.jar(.../TestConfigurations.class)
+  class file has wrong version 65.0, should be 61.0
+```
+
+This is **NOT** a JPMS / module-path / `javaModulesSurefireArgLine` problem. The parent already sets
+`useModulePath=false`; `azure-cosmos` lands on `-classpath` correctly. The only cause is the JDK mismatch.
+
+Local JDK 21: `C:\Program Files\OpenLogic\jdk-21.0.10.7-hotspot`
+
+Set it at the top of every command:
+```powershell
+$env:JAVA_HOME='C:\Program Files\OpenLogic\jdk-21.0.10.7-hotspot'
+$env:PATH="$env:JAVA_HOME\bin;$env:PATH"
+```
+
+## Step 1 — Build + install (JDK 21)
+Quote **every** `-D` flag in PowerShell (unquoted `.skip` args get mis-parsed as lifecycle phases).
+```powershell
+mvn --batch-mode --fail-at-end '-DskipTests' '-Dgpg.skip=true' '-Dmaven.javadoc.skip=true' `
+  '-Dcodesnippet.skip=true' '-Dspotbugs.skip=true' '-Dcheckstyle.skip=true' '-Drevapi.skip=true' `
+  '-Dspotless.apply.skip=true' '-Dspotless.check.skip=true' '-Djacoco.skip=true' '-Denforcer.skip=true' `
+  '-T' '2C' '-pl' 'com.azure:azure-cosmos,com.azure:azure-cosmos-tests' '-am' 'install'
+```
+
+## Step 2 — Run a test profile via failsafe (skip compile like CI)
+```powershell
+mvn '-pl' 'azure-cosmos-tests' 'verify' '-Pfi-sm-customer-workflows' `
+  '-DskipCompile=true' '-DskipTestCompile=true' '-DcreateSourcesJar=false' `
+  "-DACCOUNT_HOST=$env:ACCOUNT_HOST" "-DACCOUNT_KEY=$env:ACCOUNT_KEY" `
+  '-DACCOUNT_CONSISTENCY=Session' '-DCOSMOS.CLIENT_LEAK_DETECTION_ENABLED=true' `
+  '-Dgpg.skip=true' '-Dspotbugs.skip=true' '-Dcheckstyle.skip=true' '-Drevapi.skip=true' `
+  '-Dspotless.apply.skip=true' '-Dspotless.check.skip=true' '-Djacoco.skip=true' '-Denforcer.skip=true' `
+  2>&1 | Tee-Object -FilePath fi-sm-run1.log
+```
+- Failsafe report: `azure-cosmos-tests/target/failsafe-reports/TestSuite.txt`
+- Summary line to look for: `Tests run: N, Failures: F, Errors: E, Skipped: S`
+
+## Profile → group → file mapping (customer workflows)
+| Profile | Test group | Account shape | Files |
+|---|---|---|---|
+| `fi-customer-workflows` | `fi-customer-workflows` | multi-master | 9 test files |
+| `fi-sm-customer-workflows` | `fi-sm-customer-workflows` | single-master, multi-region | 1 file: `CustomerWorkflowSingleMasterAvailabilityTest` |
+
+Each profile sets a `suiteXmlFile` (e.g. `src/test/resources/fi-sm-customer-workflows-testng.xml`)
+in `azure-cosmos-tests/pom.xml`. Other profiles (`direct`, `multi-master`, `fi-multi-master`,
+`thinclient`, etc.) follow the same `-P` pattern.
+
+## Required env vars
+| Var | Notes |
+|---|---|
+| `ACCOUNT_HOST` | e.g. `https://.documents.azure.com:443/` |
+| `ACCOUNT_KEY` | primary key (88 chars) |
+| `ACCOUNT_CONSISTENCY` | CI passes `Session` for these profiles; defaults to `Strong` if unset |
+
+## Notes
+- `javaModulesSurefireArgLine` (azure-cosmos-tests/pom.xml) is the **runtime** `--add-opens` block
+  for reflective test access; the parent injects it into the surefire/failsafe argLine. It does not
+  affect compilation.
+- CI source of truth: `sdk/cosmos/tests.yml` (`TestGoals: verify`,
+  `TestOptions: $(ProfileFlag) -DskipCompile=true -DskipTestCompile=true -DcreateSourcesJar=false`).
+- To run a single test, add `'-Dit.test=CustomerWorkflowSingleMasterAvailabilityTest#'`
+  (failsafe uses `it.test`, not `test`).
\ No newline at end of file

From 76f846a0a2254e5027cffe56e62b954a76ddb016 Mon Sep 17 00:00:00 2001
From: Annie Liang
Date: Fri, 26 Jun 2026 14:01:00 -0700
Subject: [PATCH 2/2] Address review feedback: use consistent JDK for build+test, fix module paths

- Reframe the 'use JDK 21' requirement to 'use the same JDK for build
  (step 1) and test (step 2)'. Step 2 skips compilation, so it runs the
  classes step 1 installed; mixing JDKs is what causes the class-file
  version mismatch. Don't hard-code 21 (CI's JavaTestVersion is now 1.25)
  and make JAVA_HOME illustrative.
- Reference JavaTestVersion (globals.yml) instead of a fixed JDK version.
- Fix step 2 -pl to the com.azure:azure-cosmos-tests coordinate so it works
  from the repo root (matches step 1).
- Prefix report/pom.xml references with sdk/cosmos/ so paths resolve from root.
- Keep the local-convenience flags (e.g. -Denforcer.skip=true) but note they
  are intentional local deviations, not a byte-for-byte CI build.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../cosmos-run-integration-tests/SKILL.md | 55 ++++++++++++-------
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md b/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md
index beed2c50c084..776448a138da 100644
--- a/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md
+++ b/sdk/cosmos/azure-cosmos-tests/.github/skills/cosmos-run-integration-tests/SKILL.md
@@ -1,12 +1,13 @@
 ---
 name: cosmos-run-integration-tests
 description: >
-  Run azure-cosmos integration/customer-workflow tests locally exactly like the CI
-  pipeline does (build+install on JDK 21, then failsafe `verify` with a test profile).
+  Run azure-cosmos integration/customer-workflow tests locally, closely following the CI
+  pipeline (build+install, then failsafe `verify` with a test profile) using one consistent
+  JDK for both steps.
   USE WHEN: asked to run cosmos integration tests, customer-workflow tests
   (fi-customer-workflows / fi-sm-customer-workflows), reproduce a CI test failure
-  locally, or run a specific cosmos test profile via Maven. Covers the JDK-21
-  requirement, the two-step build/test split, profile→group→file mapping, and the
+  locally, or run a specific cosmos test profile via Maven. Covers the same-JDK
+  build/test requirement, the two-step build/test split, profile→group→file mapping, and the
   required account env vars.
   NOT FOR: unit tests only, Spark/Kafka connector tests, or non-cosmos modules.
 ---
@@ -14,16 +15,27 @@ description: >
 # Running azure-cosmos integration tests locally (CI-equivalent)
 
 ## TL;DR
-CI runs tests in **two separate Maven invocations**, both on **JDK 21**:
+CI runs tests in **two separate Maven invocations**, both on the **same JDK** (the version CI pins
+via `JavaTestVersion` in `eng/pipelines/templates/variables/globals.yml`):
 1. **Build + install** (`-DskipTests ... install`) — compiles main + test classes, installs jars.
 2. **Test** (`verify -P -DskipCompile=true -DskipTestCompile=true`) — failsafe runs the TestNG suite.
 
+Because step 2 skips compilation, it runs the classes/jars step 1 produced — so **both steps must use
+the same JDK** (see below).
+
 There is no separate "surefire command" — integration tests run through **failsafe** via `mvn verify -P`.
 
-## ⚠️ Critical: use JDK 21
-The `azure-cosmos` / `azure-cosmos-test` jars in `~/.m2` are compiled with **JDK 21** (class file v65).
-If your shell `JAVA_HOME` is JDK 17 (v61), `javac --release 17` rejects them with **misleading**
-errors like:
+## ⚠️ Critical: use the *same* JDK for build (step 1) and test (step 2)
+Step 1 compiles main + test classes and installs the `azure-cosmos` / `azure-cosmos-tests` jars into
+`~/.m2` using whatever `JAVA_HOME` you build with. A local incremental `install` packages those classes
+at the **build JDK's class-file version** (the `java9plus` profile's `default-compile` /
+`default-testCompile` use `${java.vm.specification.version}`, i.e. the build VM's
+version). Step 2 then **skips compilation** (`-DskipCompile=true -DskipTestCompile=true`) and runs those
+already-built classes.
+
+So if step 1 and step 2 use **different** JDKs you get class-file version mismatches. Example: build on a
+newer JDK (class file v65/v69), then run step 2 on JDK 17 (v61) and `javac` / the runtime rejects the
+jars with **misleading** errors like:
 
 ```
 cannot access com.azure.cosmos.implementation.TestConfigurations
@@ -32,17 +44,18 @@ cannot access com.azure.cosmos.implementation.TestConfigurations
 ```
 
 This is **NOT** a JPMS / module-path / `javaModulesSurefireArgLine` problem. The parent already sets
-`useModulePath=false`; `azure-cosmos` lands on `-classpath` correctly. The only cause is the JDK mismatch.
-
-Local JDK 21: `C:\Program Files\OpenLogic\jdk-21.0.10.7-hotspot`
+`useModulePath=false`; `azure-cosmos` lands on `-classpath` correctly. The only cause is the JDK mismatch
+between the two steps.
 
-Set it at the top of every command:
+**Fix:** pick one JDK and use it for *both* steps — ideally the major version CI uses for tests
+(`JavaTestVersion` in `eng/pipelines/templates/variables/globals.yml`). Point `JAVA_HOME` at that JDK and
+prepend it to `PATH` at the top of every command (adjust the path to your local install):
 ```powershell
-$env:JAVA_HOME='C:\Program Files\OpenLogic\jdk-21.0.10.7-hotspot'
+$env:JAVA_HOME='' # e.g. C:\Program Files\OpenLogic\jdk-21.0.10.7-hotspot
 $env:PATH="$env:JAVA_HOME\bin;$env:PATH"
 ```
 
-## Step 1 — Build + install (JDK 21)
+## Step 1 — Build + install (use the same JDK as step 2)
 Quote **every** `-D` flag in PowerShell (unquoted `.skip` args get mis-parsed as lifecycle phases).
 ```powershell
 mvn --batch-mode --fail-at-end '-DskipTests' '-Dgpg.skip=true' '-Dmaven.javadoc.skip=true' `
@@ -53,7 +66,7 @@ mvn --batch-mode --fail-at-end '-DskipTests' '-Dgpg.skip=true' '-Dmaven.javadoc.
 
 ## Step 2 — Run a test profile via failsafe (skip compile like CI)
 ```powershell
-mvn '-pl' 'azure-cosmos-tests' 'verify' '-Pfi-sm-customer-workflows' `
+mvn '-pl' 'com.azure:azure-cosmos-tests' 'verify' '-Pfi-sm-customer-workflows' `
   '-DskipCompile=true' '-DskipTestCompile=true' '-DcreateSourcesJar=false' `
   "-DACCOUNT_HOST=$env:ACCOUNT_HOST" "-DACCOUNT_KEY=$env:ACCOUNT_KEY" `
   '-DACCOUNT_CONSISTENCY=Session' '-DCOSMOS.CLIENT_LEAK_DETECTION_ENABLED=true' `
@@ -61,7 +74,7 @@ mvn '-pl' 'azure-cosmos-tests' 'verify' '-Pfi-sm-customer-workflows' `
   '-Dspotless.apply.skip=true' '-Dspotless.check.skip=true' '-Djacoco.skip=true' '-Denforcer.skip=true' `
   2>&1 | Tee-Object -FilePath fi-sm-run1.log
 ```
-- Failsafe report: `azure-cosmos-tests/target/failsafe-reports/TestSuite.txt`
+- Failsafe report: `sdk/cosmos/azure-cosmos-tests/target/failsafe-reports/TestSuite.txt`
 - Summary line to look for: `Tests run: N, Failures: F, Errors: E, Skipped: S`
 
 ## Profile → group → file mapping (customer workflows)
@@ -71,7 +84,7 @@ mvn '-pl' 'azure-cosmos-tests' 'verify' '-Pfi-sm-customer-workflows' `
 | `fi-sm-customer-workflows` | `fi-sm-customer-workflows` | single-master, multi-region | 1 file: `CustomerWorkflowSingleMasterAvailabilityTest` |
 
 Each profile sets a `suiteXmlFile` (e.g. `src/test/resources/fi-sm-customer-workflows-testng.xml`)
-in `azure-cosmos-tests/pom.xml`. Other profiles (`direct`, `multi-master`, `fi-multi-master`,
+in `sdk/cosmos/azure-cosmos-tests/pom.xml`. Other profiles (`direct`, `multi-master`, `fi-multi-master`,
 `thinclient`, etc.) follow the same `-P` pattern.
 
 ## Required env vars
@@ -82,10 +95,14 @@ in `azure-cosmos-tests/pom.xml`. Other profiles (`direct`, `multi-master`, `fi-m
 | `ACCOUNT_CONSISTENCY` | CI passes `Session` for these profiles; defaults to `Strong` if unset |
 
 ## Notes
-- `javaModulesSurefireArgLine` (azure-cosmos-tests/pom.xml) is the **runtime** `--add-opens` block
+- `javaModulesSurefireArgLine` (sdk/cosmos/azure-cosmos-tests/pom.xml) is the **runtime** `--add-opens` block
   for reflective test access; the parent injects it into the surefire/failsafe argLine. It does not
   affect compilation.
 - CI source of truth: `sdk/cosmos/tests.yml` (`TestGoals: verify`,
   `TestOptions: $(ProfileFlag) -DskipCompile=true -DskipTestCompile=true -DcreateSourcesJar=false`).
+- The commands above add a few **local-convenience flags CI does not use** (e.g. `-Denforcer.skip=true`
+  plus the various `*.skip` flags) to speed up local runs. They are intentional — this skill reproduces
+  the test run, not a byte-for-byte CI build. Drop `-Denforcer.skip=true` if you also want CI's enforcer
+  checks locally.
 - To run a single test, add `'-Dit.test=CustomerWorkflowSingleMasterAvailabilityTest#'`
   (failsafe uses `it.test`, not `test`).
\ No newline at end of file
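
The class-file version numbers quoted in the skill (v61 for JDK 17, v65 for JDK 21) can be checked directly on the jars that step 1 installs. The following is a minimal sketch in Python, not part of the patch or the SDK: the function names are hypothetical, and it assumes only the standard class-file layout (a 4-byte `0xCAFEBABE` magic, then a 2-byte minor and a 2-byte major version, big-endian) and the usual rule that the Java feature release equals the major version minus 44.

```python
import struct
import zipfile

CLASS_MAGIC = 0xCAFEBABE  # first four bytes of every Java class file


def class_major_version(data: bytes) -> int:
    """Return the class-file major version from the first 8 bytes of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Layout: u4 magic, u2 minor_version, u2 major_version (all big-endian).
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a Java class file (bad magic)")
    return major


def java_release(major: int) -> int:
    """Map a class-file major version to a Java feature release (61 -> 17, 65 -> 21)."""
    return major - 44


def jar_class_releases(jar_path: str) -> set[int]:
    """Collect the Java releases targeted by the .class entries of a jar.

    Multi-release jars can legitimately contain more than one value
    (entries under META-INF/versions/), so a set is returned.
    """
    releases = set()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.endswith(".class"):
                with jar.open(name) as entry:
                    releases.add(java_release(class_major_version(entry.read(8))))
    return releases
```

Calling `jar_class_releases` on an installed `azure-cosmos` jar under `~/.m2` (the exact path depends on the local version) and comparing the result with the major version reported by `java -version` for the step 2 `JAVA_HOME` would show whether the two steps ran on different JDKs.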