advanced-security · Copilot · Apr 22, 2026
@@ -0,0 +1,155 @@
+---
+name: create-model-pack
+description: Create or update a CodeQL model pack of `.model.yml` data extension files for an unmodeled (or under-modeled) library or framework, including local repo-scoped extensions under `.github/codeql/extensions/` and reusable model packs under `languages/<language>/custom/src/`. Use when a user asks to "model a library", "add a data extension", "add sources/sinks/summaries/barriers/barrier-guards for <library>", "create a model pack", or wants CodeQL to recognize calls in a third-party package that currently produce no findings.
+---
+
+# Create a CodeQL Model Pack
+
+This skill describes the end-to-end procedure for authoring a CodeQL data extension (a `.model.yml` file) and packaging it either as a repo-local extension or as a reusable model pack. It complements the reference documentation in [`.github/prompts/data_extensions_development.prompt.md`](../../prompts/data_extensions_development.prompt.md) and the language-specific data extension prompts (e.g. [`python_data_extension_development.prompt.md`](../../prompts/python_data_extension_development.prompt.md), [`java_data_extension_development.prompt.md`](../../prompts/java_data_extension_development.prompt.md)).
+
+Once the model pack is ready to ship to other repositories or to org-wide Default Setup, follow up with the [`publish-model-pack`](../publish-model-pack/SKILL.md) skill.
+
+## When to use this skill
+
+Trigger this skill when the user wants to:
+
+- Add CodeQL coverage for a library/framework that produces no findings today.
+- Add or correct sources, sinks, summaries, barriers (sanitizers), or barrier guards (validators) for a specific package.
+- Bootstrap a new `.model.yml` file under `.github/codeql/extensions/` (single-repo) or under `languages/<language>/custom/src/` (reusable pack).
+
+If the user instead wants to write a custom CodeQL `.ql` query, use the query development prompts rather than this skill.
+
+## Prerequisites
+
+- The `codeql` CLI is available (preinstalled in this template's environment via [`.github/workflows/copilot-setup-steps.yml`](../../workflows/copilot-setup-steps.yml)).
+- A CodeQL database for the target language is available, or sample code from which one can be built with `codeql database create`.
+- Familiarity with the two tuple formats:
+  - **API Graph format** — Python, Ruby, JavaScript/TypeScript (3–5 columns).
+  - **MaD format** — Java/Kotlin, C#, Go, C/C++ (9–10 columns; includes `subtypes` and `provenance`).
+
+See the "Two Model Formats" and "Quick reference" tables in [`data_extensions_development.prompt.md`](../../prompts/data_extensions_development.prompt.md) for the canonical column layouts and examples.
+
+## Procedure
+
+### 1. Identify the target library and language
+
+- Confirm the library name, version, and the CodeQL language it targets (`python`, `ruby`, `javascript`, `java`, `csharp`, `go`, `cpp`, `actions`).
+- Confirm whether the language uses **API Graph** or **MaD** tuples — pick the wrong format and the extension will silently fail to load.
+- Skim the library's public API surface (docs, type stubs, or source) so you can classify methods in the next step.
+
+### 2. Classify the API surface
+
+For each public method, function, or class in the library, ask:
+
+1. Does it return data from outside the program (network, file, env, stdin)? → **sourceModel** (pick a `kind` in the appropriate threat model — usually `remote`).
+2. Does it consume data in a security-sensitive operation (SQL, exec, path, redirect, eval, deserialize)? → **sinkModel** (pick a `kind` matching the vulnerability class, e.g. `sql-injection`, `command-injection`, `path-injection`).
+3. Does it pass data through opaque library code (encode, decode, wrap, copy, iterate)? → **summaryModel** with `kind: taint` (derived) or `kind: value` (identity).
+4. Does it sanitize data so its output is safe for a specific sink kind? → **barrierModel** (`kind` must match the sink kind it neutralizes).
+5. Does it return a boolean indicating whether data is safe? → **barrierGuardModel** with the appropriate `acceptingValue` (`"true"` or `"false"`) and matching `kind`.
+6. Is the type a subclass of something already modeled? → **typeModel** (API Graph languages only) or set the `subtypes` column to `true` in the MaD tuple.
+7. Did the auto-generated model assign a wrong summary? → **neutralModel** to suppress it.
+
+A complete chain of source → (summary\*) → sink is required for end-to-end findings; missing a single hop will cause false negatives.
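+
+As an illustration of such a chain, a minimal API Graph sketch for Python might look like the following. The library name `mylib` and its `fetch`, `decode`, and `connect` members are hypothetical; check the column layouts against the reference tables before reusing it.
+
+```yaml
+extensions:
+  # Hypothetical: mylib.fetch() returns data from the network.
+  - addsTo:
+      pack: codeql/python-all
+      extensible: sourceModel
+    data:
+      # [type, path, kind]
+      - ['mylib', 'Member[fetch].ReturnValue', 'remote']
+  # Hypothetical: mylib.decode(x) propagates taint from x to its result.
+  - addsTo:
+      pack: codeql/python-all
+      extensible: summaryModel
+    data:
+      # [type, path, input, output, kind]
+      - ['mylib', 'Member[decode]', 'Argument[0]', 'ReturnValue', 'taint']
+  # Hypothetical: mylib.connect().execute(q) runs q as SQL.
+  - addsTo:
+      pack: codeql/python-all
+      extensible: sinkModel
+    data:
+      # [type, path, kind]
+      - ['mylib', 'Member[connect].ReturnValue.Member[execute].Argument[0]', 'sql-injection']
+```
+
+With only the source and sink rows, a flow that passes through `mylib.decode` would likely be lost; the summary row is the hop that connects the two.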
+
+### 3. Choose the deployment scope
+
+Decide between two paths; the directory layout follows from the choice:
+
+- **Single-repo shortcut** — drop `.model.yml` files directly under `.github/codeql/extensions/<pack-name>/` in the consuming repo. **No `qlpack.yml` is required**; Code Scanning auto-loads extensions from this directory. Use this when the models only need to apply to one repo and you do not want to version-publish them.
+- **Reusable model pack** — create the files under a pack directory in this template (e.g. `languages/<language>/custom/src/models/`) with a `qlpack.yml` declaring `extensionTargets` and `dataExtensions`. Use this when the models will be consumed by multiple repos or by org-wide Default Setup. Publishing is handled by the [`publish-model-pack`](../publish-model-pack/SKILL.md) skill.
+
+### 4. Author the `.model.yml` file(s)
+
+- Use the naming convention `<library>-<module>.model.yml` (lowercase, hyphen-separated). Split per logical module rather than putting an entire ecosystem in one file — e.g. `databricks-sql.model.yml`, `databricks-sdk.model.yml`.
+- Begin each file with the standard header and the extensible predicates that apply, for example:
+
+```yaml
+extensions:
+  - addsTo:
+      pack: codeql/<language>-all
+      extensible: sinkModel
+    data:
+      # API Graph (Python/Ruby/JS): [type, path, kind]
+      - ['mylib', 'Member[connect].ReturnValue.Member[execute].Argument[0]', 'sql-injection']
+      # MaD (Java/C#/Go/C++): [package, type, subtypes, name, signature, ext, input, kind, provenance]
+      # - ['java.sql', 'Statement', true, 'execute', '(String)', '', 'Argument[0]', 'sql-injection', 'manual']
+  - addsTo:
+      pack: codeql/<language>-all
+      extensible: summaryModel
+    data: []
+```
+
+- Every row must have the exact column count for that extensible predicate; see the "Two Model Formats" tables in [`data_extensions_development.prompt.md`](../../prompts/data_extensions_development.prompt.md). An invalid row is a hard error: the engine fails on it rather than skipping the row.
+- Set the `provenance` column to `'manual'` for hand-written MaD rows; reserve `'df-generated'` for output of the model generator.
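+
+For comparison with the API Graph row above, a MaD sketch for Java might look like the following. The package `com.example.mylib` and its `Client` and `Codec` types are hypothetical, chosen only to show the 9-column source layout next to the 10-column summary layout.
+
+```yaml
+extensions:
+  - addsTo:
+      pack: codeql/java-all
+      extensible: sourceModel
+    data:
+      # [package, type, subtypes, name, signature, ext, output, kind, provenance]
+      - ['com.example.mylib', 'Client', true, 'fetch', '()', '', 'ReturnValue', 'remote', 'manual']
+  - addsTo:
+      pack: codeql/java-all
+      extensible: summaryModel
+    data:
+      # [package, type, subtypes, name, signature, ext, input, output, kind, provenance]
+      - ['com.example.mylib', 'Codec', false, 'decode', '(String)', '', 'Argument[0]', 'ReturnValue', 'taint', 'manual']
+```
+
+The summary row carries one more column than the source row because it names both an `input` and an `output`.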
+
+### 5. Configure `qlpack.yml` (model-pack path only)
+
+Skip this step if you chose the `.github/codeql/extensions/` shortcut in step 3.
+
+For a reusable pack (e.g. `languages/<language>/custom/src/qlpack.yml`), add or confirm:
+
+```yaml
+name: <org>/<language>-<pack-name>
+version: 0.0.1
+library: true
+extensionTargets:
+  codeql/<language>-all: '*'
+dataExtensions:
+  - models/**/*.yml
+```
+
+- `library: true` — model packs are always libraries, never queries.
+- `extensionTargets` — names the upstream pack (and version range) the extensions extend.
+- `dataExtensions` — a glob that picks up every `.model.yml` you author in step 4.
+
+### 6. Test locally with `codeql query run`
+
+Validate the model pack against a real database before relying on it:
+
+```bash
+codeql query run \
+    --database=/path/to/db \
+    --additional-packs=<path-to-pack-dir> \
+    --output=/tmp/results.bqrs \
+    -- <path-to-relevant-query>.ql
+
+codeql bqrs decode --format=text /tmp/results.bqrs
+```
+
+- For published packs, swap `--additional-packs=<dir>` for `--model-packs=<org>/<pack>@<range>`.
+- Pick a query whose sink kind matches what you modeled (e.g. a `sql-injection` query when adding SQL sinks). See [`codeql query run`](../../../resources/cli/codeql/codeql_query_run.prompt.md).
+
+### 7. Run unit tests with `codeql test run`
+
+`codeql test run` does **not** accept `--model-packs`; data extensions are wired in via `qlpack.yml`. Make the test pack depend on the model pack, then run:
+
+```bash
+codeql test run \
+    --additional-packs=<path-to-model-pack-dir> \
+    --keep-databases \
+    --show-extractor-output \
+    -- languages/<language>/<pack-basename>/test/<QueryBasename>/
+```
+
+Add a small test case under `languages/<language>/custom/test/` (or your project's equivalent) that exercises the new source/sink/summary chain end-to-end and accept its `.expected` output once you have confirmed it is correct. See [`codeql test run`](../../../resources/cli/codeql/codeql_test_run.prompt.md).
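+
+One way to express that dependency is in the test pack's own `qlpack.yml`. This is a sketch under assumptions: the `-tests` pack name and the `codeql/<language>-queries` dependency are illustrative and not taken from this template.
+
+```yaml
+name: <org>/<language>-<pack-name>-tests
+version: 0.0.1
+dependencies:
+  # Queries under test (assumed here to come from the standard query pack)
+  codeql/<language>-queries: '*'
+  # The model pack from step 5, resolved locally via --additional-packs
+  <org>/<language>-<pack-name>: '*'
+extractor: <language>
+tests: .
+```
+
+If the expected findings do not appear, first confirm that the model pack name in `dependencies` matches the `name` in the model pack's `qlpack.yml` exactly.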
+
+### 8. Decide on next steps
+
+- If the `.model.yml` lives under `.github/codeql/extensions/` of the consuming repo, you are done — Code Scanning will load it on the next analysis.
+- If you authored a reusable model pack and want it to apply across an organization, continue with the [`publish-model-pack`](../publish-model-pack/SKILL.md) skill.
+
+## Validation checklist
+
+- [ ] Correct tuple format for the language (API Graph vs MaD).
+- [ ] Every row has the exact column count for its extensible predicate.
+- [ ] Sink/barrier `kind` values match across the chain (e.g. a `sql-injection` barrier must guard a `sql-injection` sink).
+- [ ] At least one end-to-end test exercises the new model and produces the expected finding.
+- [ ] `qlpack.yml` `dataExtensions` glob actually matches the new files (verify with `codeql resolve extensions`, which reports the data extension files the CLI will load).
+- [ ] No regressions in pre-existing tests under the same pack.
+
+## Related resources
+
+- [`data_extensions_development.prompt.md`](../../prompts/data_extensions_development.prompt.md) — reference for tuple formats, threat models, and access path syntax.
+- Language-specific data extension prompts in [`.github/prompts/`](../../prompts/) (one per supported language).
+- [`publish-model-pack`](../publish-model-pack/SKILL.md) — follow-up skill for shipping the pack to GHCR and Default Setup.
+- [`codeql query run`](../../../resources/cli/codeql/codeql_query_run.prompt.md) and [`codeql test run`](../../../resources/cli/codeql/codeql_test_run.prompt.md) — CLI references used in steps 6 and 7.
@@ -0,0 +1,131 @@
+---
+name: publish-model-pack
+description: Publish an existing CodeQL model pack to GitHub Container Registry (GHCR) with `codeql pack create` / `codeql pack publish`, and configure it for org-wide use under Code Scanning Default Setup. Use when a user asks to "publish a model pack", "push a model pack to GHCR", "release a new version of <pack>", "add a model pack to Default Setup", or "make my custom data extensions apply across the organization".
+---
+
+# Publish a CodeQL Model Pack
+
+This skill describes the procedure for shipping an existing CodeQL model pack — built with the [`create-model-pack`](../create-model-pack/SKILL.md) skill or already present under `languages/<language>/custom/src/` — to GHCR and wiring it into org-wide Code Scanning Default Setup.
+
+Use this skill **only when the consumers include other repositories** in your organization. If the data extensions are needed by just one repository, prefer the `.github/codeql/extensions/` shortcut described in the [`create-model-pack`](../create-model-pack/SKILL.md) skill, which needs no publish step.
+
+## When to use this skill
+
+Trigger this skill when the user wants to:
+
+- Push a new or updated model pack to GHCR.
+- Release a new semver version of an existing model pack.
+- Configure an org so Default Setup automatically picks up a custom model pack.
+- Diagnose why a published model pack is not being applied during Code Scanning analyses.
+
+## Prerequisites
+
+- The model pack already exists locally and has at least one valid `.model.yml`. If not, run the [`create-model-pack`](../create-model-pack/SKILL.md) skill first.
+- The `codeql` CLI is available and authenticated to GHCR. On agent runners, the standard `GITHUB_TOKEN` (with `packages: write`) is sufficient; locally you may need `gh auth login` or a PAT exported as `CODEQL_REGISTRIES_AUTH` / `GITHUB_TOKEN`.
+- You have write access (`packages: write`) to the GHCR namespace named in the pack's `name` field (e.g. `<org>/<language>-<pack-name>`).
+- For the org-wide configuration step, you must have organization-owner or "Manage Code Security settings" permission for the target org.
+
+## Procedure
+
+### 1. Verify `qlpack.yml` is publish-ready
+
+Open the pack's `qlpack.yml` (typically `languages/<language>/custom/src/qlpack.yml`) and confirm:
+
+```yaml
+name: <org>/<language>-<pack-name> # must match the GHCR org/repo namespace you can publish to
+version: 1.0.0 # semver — see step 5 for version bumps
+library: true # model packs are always libraries
+extensionTargets:
+  codeql/<language>-all: '*' # or a tighter range like ^1.0.0
+dataExtensions:
+  - models/**/*.yml # glob must actually match your .model.yml files
+```
+
+Sanity checks:
+
+- `name` is fully qualified (`<scope>/<pack>`); the scope must be a GHCR namespace you can push to.
+- `version` is a valid semver string and is **strictly greater** than the latest version already on GHCR (publishing the same version will fail).
+- `extensionTargets` references the upstream pack the extensions extend (`codeql/<language>-all`). The version range determines which CodeQL releases the pack is compatible with.
+- `dataExtensions` glob resolves to the expected file list — confirm with:
+
+```bash
+# bash expands `**` recursively only with `shopt -s globstar`, so use find instead
+find "$(dirname <path-to-qlpack.yml>)/models" -type f -name '*.yml'
+```
+
+### 2. Build the pack with `codeql pack create`
+
+From the directory containing `qlpack.yml`:
+
+```bash
+codeql pack create \
+    --output=/tmp/codeql-pack-out \
+    .
+```
+
+- The output directory will contain a versioned subtree (`<scope>/<pack>/<version>/`) ready for upload.
+- `codeql pack create` will fail fast on malformed `.model.yml` rows or unresolved `extensionTargets`. Fix any reported errors before proceeding. Run `codeql pack create -h -vv` for full help.
+
+### 3. Publish to GHCR with `codeql pack publish`
+
+```bash
+codeql pack publish .
+```
+
+- `codeql pack publish` re-runs the build then pushes the resulting OCI artifact to `ghcr.io/<scope>/<pack>:<version>` (and updates the `latest` tag).
+- Authentication: ensure `GITHUB_TOKEN` (or a PAT with `write:packages`) is exported. On a workflow runner, set `permissions: { packages: write }` on the job. Run `codeql pack publish -h -vv` for full help.
+- Confirm the push by either checking the package under `https://github.com/orgs/<scope>/packages` or running:
+
+```bash
+codeql pack download <scope>/<pack>@<version>
+```
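+
+A minimal sketch of a publishing workflow along these lines is shown below. The workflow name, the manual trigger, and the pack path are assumptions, and it presumes the `codeql` CLI is already on the runner's `PATH` (the installation step is omitted).
+
+```yaml
+name: Publish model pack
+on:
+  workflow_dispatch: # manual trigger; an assumption, adjust as needed
+permissions:
+  contents: read
+  packages: write # required to push to GHCR
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Publish model pack to GHCR
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: codeql pack publish languages/<language>/custom/src
+```
+
+Keeping `packages: write` at the workflow level here is only for brevity; scoping it to the single publishing job is the tighter choice.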
+
+### 4. Configure org-wide Default Setup
+
+To apply the published model pack to every Default Setup analysis in the org:
+
+1. Navigate to the org settings: **Code security → Global settings → CodeQL analysis** (also accessible via **Security → Advanced Security → Global settings → Expand CodeQL analysis** depending on the UI version).
+2. Under **Model packs**, click **Add model pack** and enter `<scope>/<pack>` (optionally pinned to a version range, e.g. `<scope>/<pack>@^1.0.0`).
+3. Save. Default Setup will pick up the pack on the next scheduled or push-triggered analysis for repos that target the relevant language.
+
+References:
+
+- [Configure organization-level CodeQL model packs](https://github.blog/changelog/2024-04-16-configure-organization-level-codeql-model-packs-for-github-code-scanning/)
+- [Extending CodeQL coverage with model packs in Default Setup](https://docs.github.com/en/code-security/how-tos/find-and-fix-code-vulnerabilities/manage-your-configuration/editing-your-configuration-of-default-setup#extending-codeql-coverage-with-codeql-model-packs-in-default-setup)
+- [Configuring Default Setup at scale](https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/configure-specific-tools/configuring-default-setup-for-code-scanning-at-scale)
+
+### 5. Version management
+
+- Use **semver** for `version`. Bump `patch` for additive rows that don't change semantics, `minor` for new model categories or substantial new coverage, `major` for breaking changes (renames, removals, format changes).
+- For each release, bump `version` in `qlpack.yml` **before** running `codeql pack publish` — re-publishing the same version fails.
+- If the org-level configuration uses a range (e.g. `@^1.0.0` or no pin at all), Default Setup automatically resolves the **latest matching** version on every run; consumers do not need to take any action to receive a new minor/patch release.
+- If the org-level configuration is pinned to an exact version, you must update it after each release.
+
+### 6. Validate the published pack is being applied
+
+Pick a repository covered by Default Setup that contains code exercising the new models, then:
+
+1. Trigger a Code Scanning run (push to the default branch or click **Re-run all jobs** on the latest CodeQL workflow).
+2. Open the workflow logs for the CodeQL Analyze job and look for log lines confirming the pack was downloaded and its data extensions were loaded — typically lines containing `<scope>/<pack>` and the resolved version, alongside extension counts.
+3. Confirm that new alerts attributable to the new sources/sinks/summaries appear in the Code Scanning alerts view (or, if you intentionally added barriers/neutrals, that previously-flagged false-positive alerts are now suppressed).
+
+If the pack does not appear in the logs:
+
+- Re-check that `name` in `qlpack.yml` matches exactly what is configured in the org settings.
+- Verify the version range in org settings (or `extensionTargets` in the pack) is satisfiable by what's published.
+- Confirm the consumer repo's language is included in the pack's `extensionTargets` (e.g. a `codeql/python-all` extension only fires for Python repos).
+- Pull the pack manually with `codeql pack download <scope>/<pack>@<version>` to rule out access/visibility problems.
+
+## Validation checklist
+
+- [ ] `qlpack.yml` `version` is strictly greater than the previously published version.
+- [ ] `codeql pack create` succeeds with no errors or warnings about unknown rows.
+- [ ] `codeql pack publish` reports a successful push and the package is visible under the org's GHCR packages.
+- [ ] The pack is listed under the org's Default Setup model packs configuration.
+- [ ] A subsequent CodeQL workflow run logs the pack as loaded and surfaces the expected new alerts (or suppressions).
+
+## Related resources
+
+- [`create-model-pack`](../create-model-pack/SKILL.md) — upstream skill that produces the model pack consumed here.
+- [`data_extensions_development.prompt.md`](../../prompts/data_extensions_development.prompt.md) — reference for `qlpack.yml` shape (`extensionTargets`, `dataExtensions`) and the workflow context.
+- [`codeql pack install`](../../../resources/cli/codeql/codeql_pack_install.prompt.md) — companion CLI reference; for `pack create`, `pack publish`, and `pack download` use `codeql <subcommand> -h -vv`.
+- [CodeQL now supports sanitizers and validators in models-as-data](https://github.blog/changelog/2026-04-21-codeql-now-supports-sanitizers-and-validators-in-models-as-data/) — recent capability that may motivate a pack version bump.
@@ -0,0 +1,32 @@
+---
+lockVersion: 1.0.0
+dependencies:
+  codeql/actions-all:
+    version: 0.4.33
+  codeql/concepts:
+    version: 0.0.21
+  codeql/controlflow:
+    version: 2.0.31
+  codeql/dataflow:
+    version: 2.1.3
+  codeql/javascript-all:
+    version: 2.6.27
+  codeql/mad:
+    version: 1.0.47
+  codeql/regex:
+    version: 1.0.47
+  codeql/ssa:
+    version: 2.0.23
+  codeql/threat-models:
+    version: 1.0.47
+  codeql/tutorial:
+    version: 1.0.47
+  codeql/typetracking:
+    version: 2.0.31
+  codeql/util:
+    version: 2.0.34
+  codeql/xml:
+    version: 1.0.47
+  codeql/yaml:
+    version: 1.0.47
+compiled: false