A Go CLI tool for performing audit-related tasks in the MongoDB documentation monorepo.
This CLI tool helps with maintenance and audit-related tasks across MongoDB's documentation by:
- Extracting code examples or procedures from RST files into individual, testable files
- Searching files for specific patterns or substrings
- Analyzing reference relationships to understand file dependencies
- Comparing file contents across documentation versions to identify differences
- Following include directives to process entire documentation trees
- Counting documentation pages or tested code examples to track coverage and quality metrics
This CLI provides built-in handling for MongoDB-specific conventions like steps files, extracts, versioned directory structures, and template variables.
cd audit-cli/bin
go build ../

This creates an audit-cli executable in the bin directory.

To run the CLI without building, use go run from the audit-cli directory:
cd audit-cli
go run main.go [command] [flags]

Some commands require a monorepo path (e.g., analyze composables, count tested-examples, count pages). You can configure the monorepo path in three ways, listed in order of priority:
Pass the path directly to the command:
./audit-cli analyze composables /path/to/docs-monorepo
./audit-cli count tested-examples /path/to/docs-monorepo
./audit-cli count pages /path/to/docs-monorepo

Set the AUDIT_CLI_MONOREPO_PATH environment variable:
export AUDIT_CLI_MONOREPO_PATH=/path/to/docs-monorepo
./audit-cli analyze composables
./audit-cli count tested-examples
./audit-cli count pages

Create a .audit-cli.yaml file in either:
- Current directory: ./.audit-cli.yaml
- Home directory: ~/.audit-cli.yaml
Config file format:
monorepo_path: /path/to/docs-monorepo

Example:
# Create config file
cat > .audit-cli.yaml << EOF
monorepo_path: /Users/username/mongodb/docs-monorepo
EOF
# Now you can run commands without specifying the path
./audit-cli analyze composables
./audit-cli count tested-examples --for-product pymongo
./audit-cli count pages --count-by-project

Priority Example:
If you have all three configured, the command-line argument takes precedence:
# Config file has: monorepo_path: /config/path
# Environment has: AUDIT_CLI_MONOREPO_PATH=/env/path
# Command-line argument: /cmd/path
./audit-cli analyze composables /cmd/path # Uses /cmd/path
./audit-cli analyze composables # Uses /env/path (env overrides config)

File-based commands (e.g., extract code-examples, analyze usage, compare file-contents) support flexible path resolution. Paths can be specified in three ways:
1. Absolute Path
./audit-cli extract code-examples /full/path/to/file.rst
./audit-cli analyze usage /full/path/to/includes/fact.rst

2. Relative to Monorepo Root (if monorepo is configured)
If you have a monorepo path configured (via config file or environment variable), you can use paths relative to the monorepo root:
# With monorepo_path configured as /Users/username/mongodb/docs-monorepo
./audit-cli extract code-examples manual/manual/source/tutorial.rst
./audit-cli analyze usage manual/manual/source/includes/fact.rst
./audit-cli compare file-contents manual/manual/source/file.rst

3. Relative to Current Directory (fallback)
If the path doesn't exist relative to the monorepo, it falls back to the current directory:
./audit-cli extract code-examples ./local-file.rst
./audit-cli analyze includes ../other-dir/file.rst

Priority Order:
- If path is absolute → use as-is
- If monorepo is configured and path exists relative to monorepo → use monorepo-relative path
- Otherwise → resolve relative to current directory
This makes it convenient to work with files in the monorepo without typing full paths every time!
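For reference, the priority order above amounts to logic along these lines (a minimal Go sketch; resolveInputPath and its behavior are illustrative, not the CLI's actual config code):

package sketch

import (
    "os"
    "path/filepath"
)

// resolveInputPath mirrors the priority described above: absolute paths win,
// then monorepo-relative paths (when a monorepo root is configured), then
// paths relative to the current working directory.
func resolveInputPath(arg, monorepoRoot string) string {
    if filepath.IsAbs(arg) {
        return arg
    }
    if monorepoRoot != "" {
        candidate := filepath.Join(monorepoRoot, arg)
        if _, err := os.Stat(candidate); err == nil {
            return candidate
        }
    }
    cwd, err := os.Getwd()
    if err != nil {
        return arg
    }
    return filepath.Join(cwd, arg)
}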
The CLI is organized into parent commands with subcommands:
audit-cli
├── extract # Extract content from RST files
│ ├── code-examples
│ └── procedures
├── search # Search through extracted content or source files
│ └── find-string
├── analyze # Analyze RST file structures
│ ├── includes
│ ├── usage
│ ├── procedures
│ └── composables
├── compare # Compare files across versions
│ └── file-contents
└── count # Count code examples and documentation pages
├── tested-examples
└── pages
Extract code examples from reStructuredText files into individual files. For details about what code example directives are supported and how, refer to the Supported rST Directives - Code Example Extraction section below.
Use Cases:
This command helps writers:
- Examine all the code examples that make up a specific page or section
- Split out code examples into individual files for migration to test infrastructure
- Report on the number of code examples by language
- Report on the number of code examples by directive type
- Use additional commands, such as search, to find strings within specific code examples
Basic Usage:
# Extract from a single file
./audit-cli extract code-examples path/to/file.rst -o ./output
# Extract from a directory (non-recursive)
./audit-cli extract code-examples path/to/docs -o ./output
# Extract recursively from all subdirectories
./audit-cli extract code-examples path/to/docs -o ./output -r
# Extract recursively and preserve directory structure
./audit-cli extract code-examples path/to/docs -o ./output -r --preserve-dirs
# Follow include directives
./audit-cli extract code-examples path/to/file.rst -o ./output -f
# Combine recursive scanning and include following
./audit-cli extract code-examples path/to/docs -o ./output -r -f
# Dry run (show what would be extracted without writing files)
./audit-cli extract code-examples path/to/file.rst -o ./output --dry-run
# Verbose output
./audit-cli extract code-examples path/to/file.rst -o ./output -v

Flags:
- -o, --output <dir> - Output directory for extracted files (default: ./output)
- -r, --recursive - Recursively scan directories for RST files. Without this flag, the tool only extracts code examples from the top-level RST file (or the top-level files in a directory). With it, the tool recursively scans all subdirectories for RST files and extracts code examples from every file found.
- --preserve-dirs - Preserve directory structure in output (use with --recursive). By default, all extracted files are written to a flat structure in the output directory. When this flag is enabled with --recursive, the tool preserves the directory structure relative to the input directory. For example, if extracting from docs/source/ and a file is located at docs/source/includes/example.rst, the output is written to output/includes/example.*.ext instead of output/example.*.ext.
- -f, --follow-includes - Follow .. include:: directives in RST files. Without this flag, the tool only extracts code examples from the top-level RST file. With it, the tool follows any .. include:: directives and extracts code examples from all included files. When combined with -r, the tool recursively scans all subdirectories for RST files and follows .. include:: directives in all of them. If an include path points outside the input directory, -r alone would not reach it, but -f follows the include directive and parses the included file. This effectively lets you parse all the files that make up a single page, if you start from the page's root .txt file.
- --dry-run - Show what would be extracted without writing files
- -v, --verbose - Show detailed processing information
Output Format:
Extracted files are named: {source-base}.{directive-type}.{index}.{ext}
Examples:
- my-doc.code-block.1.js - First code-block from my-doc.rst
- my-doc.literalinclude.2.py - Second literalinclude from my-doc.rst
- my-doc.io-code-block.1.input.js - Input from first io-code-block
- my-doc.io-code-block.1.output.json - Output from first io-code-block
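A minimal sketch of how names in this format can be assembled (illustrative only; buildOutputName is not a function in this repository):

package sketch

import (
    "fmt"
    "path/filepath"
    "strings"
)

// buildOutputName produces {source-base}.{directive-type}.{index}.{ext},
// e.g. my-doc.code-block.1.js for the first code-block in my-doc.rst.
func buildOutputName(sourceFile, directiveType string, index int, ext string) string {
    base := strings.TrimSuffix(filepath.Base(sourceFile), filepath.Ext(sourceFile))
    return fmt.Sprintf("%s.%s.%d.%s", base, directiveType, index, ext)
}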
Report:
After extraction, the code extraction report shows:
- Number of files traversed
- Number of output files written
- Code examples by language
- Code examples by directive type
Extract unique procedures from reStructuredText files into individual files. This command parses procedures and creates one file per unique procedure (grouped by heading and content). Each procedure file represents a distinct piece of content, even if it appears in multiple selections or variations.
Use Cases:
This command helps writers:
- Extract all unique procedures from a page for testing or migration
- Generate individual procedure files for each distinct procedure
- Understand how many different procedures exist in a document
- Create standalone procedure files for reuse or testing
- See which selections each procedure appears in
Basic Usage:
# Extract all unique procedures from a file
./audit-cli extract procedures path/to/file.rst -o ./output
# Extract only procedures that appear in a specific selection
./audit-cli extract procedures path/to/file.rst -o ./output --selection "driver, nodejs"
# Dry run (show what would be extracted without writing files)
./audit-cli extract procedures path/to/file.rst -o ./output --dry-run
# Verbose output (shows all selections each procedure appears in)
./audit-cli extract procedures path/to/file.rst -o ./output -v
# Expand include directives inline
./audit-cli extract procedures path/to/file.rst -o ./output --expand-includes

Flags:
- -o, --output <dir> - Output directory for extracted procedure files (default: ./output)
- --selection <value> - Extract only procedures that appear in a specific selection (e.g., "python" or "driver, nodejs")
- --expand-includes - Expand include directives inline instead of preserving them
- --dry-run - Show what would be extracted without writing files
- -v, --verbose - Show detailed processing information, including all selections each procedure appears in
Output Format:
Extracted files are named: {heading}_{first-step-title}_{hash}.rst
The filename includes:
- Heading: The section heading above the procedure
- First step title: The title of the first step (for readability)
- Hash: A short 6-character hash of the content (for uniqueness)
Examples:
- before-you-begin_pull-the-mongodb-docker-image_e8eeec.rst
- install-mongodb-community-edition_download-the-tarball_44c437.rst
- configuration_create-the-data-and-log-directories_f1d35b.rst
Verbose Output:
With the -v flag, the command shows detailed information about each procedure:
Found 36 unique procedures:
1. Before You Begin
Output file: before-you-begin-pull-the-mongodb-docker-image-e8eeec.rst
Steps: 5
Appears in 2 selections:
- docker, None, None, None, None, None, without-search-docker
- docker, None, None, None, None, None, with-search-docker
2. Install MongoDB Community Edition
Output file: install-mongodb-community-edition-download-the-tarball-44c437.rst
Steps: 4
Appears in 1 selections:
- macos, None, None, tarball, None, None, None
Supported Procedure Types:
The command recognizes and extracts:
- .. procedure:: directives with .. step:: directives
- Ordered lists (numbered or lettered) as procedures
- .. tabs:: directives with :tabid: options for variations
- .. composable-tutorial:: directives with .. selected-content:: blocks
- Sub-procedures (ordered lists within steps)
- YAML steps files (automatically converted to RST format)
How Uniqueness is Determined:
Procedures are grouped by:
- Heading: The section heading above the procedure
- Content hash: A hash of the procedure's steps and content
This means:
- Procedures with the same heading but different content are treated as separate unique procedures
- Procedures with identical content that appear in multiple selections are extracted once
- The output file shows all selections where that procedure appears (visible with the -v flag)
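Conceptually, the grouping key is a (heading, content hash) pair, roughly as in this sketch (illustrative; the real grouping lives in the internal rst package and may differ in detail):

package sketch

import (
    "crypto/sha256"
    "encoding/hex"
    "strings"
)

// procedureKey groups procedures by heading plus a short hash of their step
// content, so identical procedures collapse to a single entry while
// same-heading/different-content procedures stay distinct.
func procedureKey(heading string, steps []string) string {
    sum := sha256.Sum256([]byte(strings.Join(steps, "\n")))
    return heading + "|" + hex.EncodeToString(sum[:])[:6]
}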
Report:
After extraction, the report shows:
- Number of unique procedures extracted
- Number of files written
- Detailed list of procedures with step counts and selections (with the -v flag)
Search through files for a specific substring. Can search through extracted code example or procedure files or RST source files.
Default Behavior:
- Case-insensitive search (matches "curl", "CURL", "Curl", etc.)
- Exact word matching (excludes partial matches like "curl" in "libcurl")
Use --case-sensitive to make the search case-sensitive, or --partial-match to allow matching the substring as part
of larger words.
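The default behavior is equivalent to a case-insensitive, word-boundary match, roughly like this sketch (illustrative; the command's actual matching code may differ):

package sketch

import "regexp"

// matches reports whether line contains substr under the default search rules:
// case-insensitive and exact-word matching, so "curl" matches "CURL" but not
// "libcurl". The two flags mirror --case-sensitive and --partial-match.
func matches(line, substr string, caseSensitive, partialMatch bool) bool {
    pattern := regexp.QuoteMeta(substr)
    if !partialMatch {
        pattern = `\b` + pattern + `\b`
    }
    if !caseSensitive {
        pattern = `(?i)` + pattern
    }
    return regexp.MustCompile(pattern).MatchString(line)
}

Under these defaults, matches("Use libcurl here", "curl", false, false) is false, while matches("Run CURL now", "curl", false, false) is true.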
Use Cases:
This command helps writers:
- Find specific strings across documentation files or pages
- Search for product names, command names, API methods, or other strings that may need to be updated
- Understand the number of references and impact of changes across documentation files or pages
- Identify files that need to be updated when a string needs to be changed
- Scope work related to specific changes
Basic Usage:
# Search in a single file (case-insensitive, exact word match)
./audit-cli search find-string path/to/file.js "curl"
# Search in a directory (non-recursive)
./audit-cli search find-string path/to/output "substring"
# Search recursively
./audit-cli search find-string path/to/output "substring" -r
# Search an RST file and all files it includes
./audit-cli search find-string path/to/source.rst "substring" -f
# Search a directory recursively and follow includes in RST files
./audit-cli search find-string path/to/source "substring" -r -f
# Verbose output (show file paths and language breakdown)
./audit-cli search find-string path/to/output "substring" -r -v
# Case-sensitive search (only matches exact case)
./audit-cli search find-string path/to/output "CURL" --case-sensitive
# Partial match (includes "curl" in "libcurl")
./audit-cli search find-string path/to/output "curl" --partial-match
# Combine flags for case-sensitive partial matching
./audit-cli search find-string path/to/output "curl" --case-sensitive --partial-match

Flags:
- -r, --recursive - Recursively scan directories for RST files. Without this flag, the tool only searches the specified file or the top-level files in the specified directory. With it, the tool recursively scans all subdirectories and searches every file found.
- -f, --follow-includes - Follow .. include:: directives in RST files. Without this flag, the tool searches only the top-level RST file or directory. With it, the tool follows any .. include:: directives in any RST file in the input path and searches all included files. When combined with -r, the tool recursively scans all subdirectories and follows .. include:: directives in all of them. If an include path points outside the input directory, -r alone would not reach it, but -f follows the include directive and searches the included file. This effectively lets you search all the files that make up a single page, if you start from the page's root .txt file.
- -v, --verbose - Show file paths and language breakdown
- --case-sensitive - Make the search case-sensitive (default: case-insensitive)
- --partial-match - Allow partial matches within words (default: exact word matching)
Report:
The search report shows:
- Number of files scanned
- Number of files containing the substring (each file counted once)
With -v flag, also shows:
- List of file paths where substring appears
- Count broken down by language (file extension)
Analyze include directive relationships in RST files to understand file dependencies.
This command recursively follows .. include:: directives to show all files that are referenced from a starting file.
This helps you understand which content is transcluded into a page.
Use Cases:
This command helps writers:
- Understand the impact of changes to widely-included files
- Identify files included multiple times
- Document file relationships for maintenance
- Plan refactoring of complex include structures
- See what content is actually pulled into a page
Basic Usage:
# Analyze a single file (shows summary)
./audit-cli analyze includes path/to/file.rst
# Show hierarchical tree structure
./audit-cli analyze includes path/to/file.rst --tree
# Show flat list of all included files
./audit-cli analyze includes path/to/file.rst --list
# Show both tree and list
./audit-cli analyze includes path/to/file.rst --tree --list
# Verbose output (show processing details)
./audit-cli analyze includes path/to/file.rst --tree -v

Flags:
- --tree - Display results as a hierarchical tree structure
- --list - Display results as a flat list of all files
- -v, --verbose - Show detailed processing information
Output Formats:
Summary (default - no flags):
============================================================
INCLUDE ANALYSIS SUMMARY
============================================================
Root File: /path/to/file.rst
Unique Files: 18
Include Directives: 56
Max Depth: 2
============================================================
Use --tree to see the hierarchical structure
Use --list to see a flat list of all files
- Root file path
- Number of unique files discovered
- Total number of include directive instances (counting duplicates)
- Maximum depth of include nesting
- Hints to use --tree or --list for more details
Tree (--tree flag):
- Hierarchical tree structure showing include relationships
- Uses box-drawing characters for visual clarity
- Shows which files include which other files
- Displays directory paths to help disambiguate files with the same name:
  - Files in includes directories: includes/filename.rst
  - Files outside includes: path/from/source/filename.rst
List (--list flag):
- Flat numbered list of all unique files
- Files listed in depth-first traversal order
- Shows absolute paths to all files
Verbose (-v flag):
- Shows complete dependency tree with all nodes (including duplicates)
- Each file displays the number of include directives it contains
- Uses visual indicators to show duplicate includes:
  - • (filled bullet) - First occurrence of a file
  - ◦ (hollow bullet) - Subsequent occurrences (duplicates)
- Example output:
• get-started.txt (24 include directives)
• get-started/node/language-connection-steps.rst (3 include directives)
• includes/load-sample-data.rst
• includes/connection-string-note.rst
• includes/application-output.rst
• includes/next-steps.rst
• get-started/python/language-connection-steps.rst (3 include directives)
◦ includes/load-sample-data.rst
◦ includes/connection-string-note.rst
◦ includes/application-output.rst
◦ includes/next-steps.rst
Note on File Counting:
The command reports two distinct metrics:
- Unique Files: Number of distinct files discovered through include directives. If a file is included multiple times (e.g., file A includes file C, and file B also includes file C), the file is counted only once.
- Include Directives: Total number of include directive instances across all files. This counts every occurrence, including duplicates. For example, if load-sample-data.rst is included 12 times across different files, it contributes 12 to this count.
In verbose mode, the tree view shows files in all locations where they appear. Duplicate occurrences are marked with
a hollow bullet (◦) to help you identify files that are included multiple times.
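Both metrics fall out of a single depth-first walk over include directives, roughly as in this sketch (illustrative; includesOf stands in for whatever resolves the .. include:: targets of a file):

package sketch

// countIncludes walks the include graph depth-first starting from root.
// Each distinct file is visited once; every include directive found in a
// visited file adds one to directives, so a file included from 12 places
// contributes 12 directive instances but only 1 unique file.
func countIncludes(root string, includesOf func(string) []string) (uniqueFiles, directives int) {
    seen := map[string]bool{}
    var walk func(file string)
    walk = func(file string) {
        if seen[file] {
            return
        }
        seen[file] = true
        for _, inc := range includesOf(file) {
            directives++
            walk(inc)
        }
    }
    walk(root)
    return len(seen), directives
}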
Note on Toctree:
This command does not follow .. toctree:: entries. Toctree entries are navigation links to other pages, not content
that's transcluded into the page. If you need to find which files reference a target file through toctree entries, use
the analyze usage command with the --include-toctree flag.
Find all files that use a target file through RST directives. This performs reverse dependency analysis, showing which
files reference the target file through include, literalinclude, io-code-block, or toctree directives.
The command searches all RST files (.rst and .txt extensions) and YAML files (.yaml and .yml extensions) in the
source directory tree. YAML files are included because extract and release files contain RST directives within their
content blocks.
By default, this command searches for content inclusion directives (include, literalinclude, io-code-block) that transclude content into pages. Use --include-toctree to also search for toctree entries, which are navigation links rather than content transclusion.

Use Cases:

This command helps writers:
- Understand the impact of changes to a file (what pages will be affected)
- Find all usages of an include file across the documentation
- Track where code examples are referenced
- Plan refactoring by understanding file dependencies
Basic Usage:
# Find what uses an include file (content inclusion only)
./audit-cli analyze usage path/to/includes/fact.rst
# Find what uses a code example
./audit-cli analyze usage path/to/code-examples/example.js
# Include toctree references (navigation links)
./audit-cli analyze usage path/to/file.rst --include-toctree
# Get JSON output for automation
./audit-cli analyze usage path/to/file.rst --format json
# Show detailed information with line numbers
./audit-cli analyze usage path/to/file.rst --verbose

Flags:
- --format <format> - Output format: text (default) or json
- -v, --verbose - Show detailed information including line numbers and reference paths
- -c, --count-only - Only show the count of usages (useful for quick checks and scripting)
- --paths-only - Only show the file paths, one per line (useful for piping to other commands)
- --summary - Only show summary statistics (total files and usages by type, without the file list)
- -t, --directive-type <type> - Filter by directive type: include, literalinclude, io-code-block, or toctree
- --include-toctree - Include toctree entries (navigation links) in addition to content inclusion directives
- --exclude <pattern> - Exclude paths matching this glob pattern (e.g., */archive/* or */deprecated/*)
Understanding the Counts:
The command shows two metrics:
- Total Files: Number of unique files that use the target (deduplicated)
- Total Usages: Total number of directive occurrences (includes duplicates)
When a file includes the target multiple times, it counts as:
- 1 file (in Total Files)
- Multiple usages (in Total Usages)
This helps identify both the impact scope (how many files) and duplicate includes (when usages > files).
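In other words, usages are tallied per directive occurrence while files are deduplicated, roughly like this sketch (the usage type and summarize function are illustrative, not the command's actual types):

package sketch

// usage records one directive occurrence that references the target file.
type usage struct {
    FilePath   string
    LineNumber int
}

// summarize returns the deduplicated file count and the raw usage count.
// A file that includes the target three times contributes 1 to files and 3 to total.
func summarize(usages []usage) (files, total int) {
    seen := map[string]bool{}
    for _, u := range usages {
        if !seen[u.FilePath] {
            seen[u.FilePath] = true
            files++
        }
        total++
    }
    return files, total
}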
Supported Directive Types:
By default, the command tracks content inclusion directives:
- .. include:: - RST content includes (transcluded)

  .. include:: /includes/intro.rst

- .. literalinclude:: - Code file references (transcluded)

  .. literalinclude:: /code-examples/example.py
     :language: python

- .. io-code-block:: - Input/output examples with file arguments (transcluded)

  .. io-code-block::

     .. input:: /code-examples/query.js
        :language: javascript

     .. output:: /code-examples/result.json
        :language: json

With --include-toctree, also tracks:

- .. toctree:: - Table of contents entries (navigation links, not transcluded)

  .. toctree::
     :maxdepth: 2

     intro
     getting-started
Note: Only file-based references are tracked. Inline content (e.g., .. input:: with :language: but no file path)
is not tracked since it doesn't reference external files.
Output Formats:
Text (default):
============================================================
USAGE ANALYSIS
============================================================
Target File: /path/to/includes/intro.rst
Total Files: 3
Total Usages: 4
============================================================
include : 3 files, 4 usages
1. [include] duplicate-include-test.rst (2 usages)
2. [include] include-test.rst
3. [include] page.rst
Text with --verbose:
============================================================
USAGE ANALYSIS
============================================================
Target File: /path/to/includes/intro.rst
Total Files: 3
Total Usages: 4
============================================================
include : 3 files, 4 usages
1. [include] duplicate-include-test.rst (2 usages)
Line 6: /includes/intro.rst
Line 13: /includes/intro.rst
2. [include] include-test.rst
Line 6: /includes/intro.rst
3. [include] page.rst
Line 12: /includes/intro.rst
JSON (--format json):
{
"target_file": "/path/to/includes/intro.rst",
"source_dir": "/path/to/source",
"total_files": 3,
"total_usages": 4,
"using_files": [
{
"file_path": "/path/to/duplicate-include-test.rst",
"directive_type": "include",
"usage_path": "/includes/intro.rst",
"line_number": 6
},
{
"file_path": "/path/to/duplicate-include-test.rst",
"directive_type": "include",
"usage_path": "/includes/intro.rst",
"line_number": 13
},
{
"file_path": "/path/to/include-test.rst",
"directive_type": "include",
"usage_path": "/includes/intro.rst",
"line_number": 6
}
]
}

Examples:
# Check if an include file is being used
./audit-cli analyze usage ~/docs/source/includes/fact-atlas.rst
# Find all pages that use a specific code example
./audit-cli analyze usage ~/docs/source/code-examples/connect.py
# Get machine-readable output for scripting
./audit-cli analyze usage ~/docs/source/includes/fact.rst --format json | jq '.total_usages'
# See exactly where a file is referenced (with line numbers)
./audit-cli analyze usage ~/docs/source/includes/intro.rst --verbose
# Quick check: just show the count
./audit-cli analyze usage ~/docs/source/includes/fact.rst --count-only
# Output: 5
# Show summary statistics only
./audit-cli analyze usage ~/docs/source/includes/fact.rst --summary
# Output:
# Total Files: 3
# Total Usages: 5
#
# By Type:
# include : 3 files, 5 usages
# Get list of files for piping to other commands
./audit-cli analyze usage ~/docs/source/includes/fact.rst --paths-only
# Output:
# page1.rst
# page2.rst
# page3.rst
# Filter to only show include directives (not literalinclude or io-code-block)
./audit-cli analyze usage ~/docs/source/includes/fact.rst --directive-type include
# Filter to only show literalinclude usages
./audit-cli analyze usage ~/docs/source/code-examples/example.py --directive-type literalinclude
# Combine filters: count only literalinclude usages
./audit-cli analyze usage ~/docs/source/code-examples/example.py -t literalinclude -c
# Combine filters: list files that use this as an io-code-block
./audit-cli analyze usage ~/docs/source/code-examples/query.js -t io-code-block --paths-only
# Exclude archived or deprecated files from search
./audit-cli analyze usage ~/docs/source/includes/fact.rst --exclude "*/archive/*"
./audit-cli analyze usage ~/docs/source/includes/fact.rst --exclude "*/deprecated/*"

Analyze procedures in reStructuredText files to understand procedure complexity, uniqueness, and how they appear across different selections.
This command parses procedures from RST files and provides statistics about:
- Total number of unique procedures (grouped by heading and content)
- Total number of procedure appearances across all selections
- Implementation types (procedure directive vs ordered list)
- Step counts for each procedure
- Detection of sub-procedures (ordered lists within steps)
- All selections where each procedure appears
Use Cases:
This command helps writers:
- Understand the complexity of procedures in a document
- Count how many unique procedures exist vs. how many times they appear
- Identify procedures that use different implementation approaches
- See which selections each procedure appears in
- Plan testing coverage for procedure variations
- Scope work related to procedure updates
Basic Usage:
# Get summary count of unique procedures and total appearances
./audit-cli analyze procedures path/to/file.rst
# Show summary with incremental reporting flags
./audit-cli analyze procedures path/to/file.rst --list-summary
# List all unique procedures with full details
./audit-cli analyze procedures path/to/file.rst --list-all
# Expand include directives inline before analyzing
./audit-cli analyze procedures path/to/file.rst --expand-includes

Flags:
- --list-summary - Show summary statistics plus a list of procedure headings
- --list-all - Show full details for each procedure including steps, selections, and implementation
- --expand-includes - Expand include directives inline instead of preserving them
Output:
Default output (summary only):
File: path/to/file.rst
Total unique procedures: 36
Total procedure appearances: 93
With --list-summary:
File: path/to/file.rst
Total unique procedures: 36
Total procedure appearances: 93
Unique Procedures:
1. Before You Begin
2. Install MongoDB Community Edition
3. Configuration
4. Run MongoDB Community Edition
...
With --list-all:
File: path/to/file.rst
Total unique procedures: 36
Total procedure appearances: 93
================================================================================
Procedure Details
================================================================================
1. Before You Begin
Line: 45
Implementation: procedure-directive
Steps: 5
Contains sub-procedures: no
Appears in 2 selections:
- docker, None, None, None, None, None, without-search-docker
- docker, None, None, None, None, None, with-search-docker
Steps:
1. Pull the MongoDB Docker Image
2. Run the MongoDB Docker Container
3. Verify MongoDB is Running
4. Connect to MongoDB
5. Stop the MongoDB Docker Container
2. Install MongoDB Community Edition
Line: 123
Implementation: ordered-list
Steps: 4
Contains sub-procedures: yes
Appears in 10 selections:
- linux, None, None, tarball, None, None, with-search
- linux, None, None, tarball, None, None, without-search
...
Steps:
1. Download the tarball
2. Extract the files from the tarball
3. Ensure the binaries are in a directory listed in your PATH
4. Run MongoDB Community Edition
Understanding the Counts:
The command reports two key metrics:
- Total unique procedures: Number of distinct procedures (grouped by heading and content hash)
  - Procedures with the same heading but different content are counted separately
  - Procedures with identical content are counted once, even if they appear in multiple selections
- Total procedure appearances: Total number of times procedures appear across all selections
  - If a procedure appears in 5 different selections, it contributes 5 to this count
  - This represents the total number of procedure instances a user might encounter
Example:
- A file might have 36 unique procedures that appear a total of 93 times across different selections
- This means some procedures appear in multiple selections (e.g., a "Before You Begin" procedure that's the same for Docker with and without search)
Supported Procedure Types:
The command recognizes:
- .. procedure:: directives with .. step:: directives
- Ordered lists (numbered or lettered) as procedures
- .. tabs:: directives with :tabid: options for variations
- .. composable-tutorial:: directives with .. selected-content:: blocks
- Sub-procedures (ordered lists within steps)
- YAML steps files (automatically converted to RST format)
Deterministic Parsing:
The parser ensures deterministic results by:
- Sorting all map iterations to ensure consistent ordering
- Sorting procedures by line number
- Computing content hashes in a consistent manner
This guarantees that the same file always produces the same counts and groupings.
For more details about procedure parsing logic, refer to docs/PROCEDURE_PARSING.md.
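The sorted-iteration point above amounts to never ranging over a Go map directly when output order matters, for example (a generic sketch, not code from this repository):

package sketch

import (
    "fmt"
    "sort"
)

// printDeterministically sorts map keys before iterating, so output order
// does not depend on Go's randomized map iteration.
func printDeterministically(procedureSteps map[string]int) {
    keys := make([]string, 0, len(procedureSteps))
    for k := range procedureSteps {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    for _, k := range keys {
        fmt.Printf("%s: %d steps\n", k, procedureSteps[k])
    }
}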
Analyze composable definitions in snooty.toml files across the MongoDB documentation monorepo. This command helps identify consolidation opportunities and track composable usage.
Composables are configuration elements in snooty.toml that define content variations for different contexts (e.g., different programming languages, deployment types, or interfaces). They're used in .. composable-tutorial:: directives to create context-specific documentation.
Use Cases:
This command helps writers:
- Inventory all composables across projects and versions
- Identify identical composables that could be consolidated across projects
- Find similar composables with different IDs but overlapping options (potential consolidation candidates)
- Track where composables are used in RST files
- Identify unused composables that may be candidates for removal
- Understand the scope of changes when updating a composable
Basic Usage:
# Analyze all composables in the monorepo
./audit-cli analyze composables /path/to/docs-monorepo
# Use configured monorepo path (from config file or environment variable)
./audit-cli analyze composables
# Analyze composables for a specific project
./audit-cli analyze composables --for-project atlas
# Analyze only current versions
./audit-cli analyze composables --current-only
# Show full option details with titles
./audit-cli analyze composables --verbose
# Find consolidation candidates
./audit-cli analyze composables --find-similar
# Find where composables are used
./audit-cli analyze composables --find-usages
# Include canonical rstspec.toml composables
./audit-cli analyze composables --with-rstspec --find-similar
# Combine flags for comprehensive analysis
./audit-cli analyze composables --for-project atlas --find-similar --find-usages --verbose

Flags:
- --for-project <project> - Only analyze composables for a specific project
- --current-only - Only analyze composables in current versions (skips versioned directories)
- -v, --verbose - Show full option details with titles instead of just IDs
- --find-similar - Show identical and similar composables for consolidation
- --find-usages - Show where each composable is used in RST files
- --with-rstspec - Include composables from the canonical rstspec.toml file in the snooty-parser repository
Output:
Default output (summary and table):
Composables Analysis
====================
Total composables found: 24
Composables by ID:
- deployment-type: 1
- interface: 1
- language: 1
...
All Composables
===============
Project Version ID Title Options
------------------------------------------------------------------------------------------------------------------------
atlas (none) deployment-type Deployment Type atlas, local, self, local-onprem
atlas (none) interface Interface compass, mongosh, atlas-ui, driver
atlas (none) language Language c, csharp, cpp, go, java-async, ...
With --find-similar:
Shows two types of consolidation opportunities:
1. Identical Composables - Same ID, title, and options across different projects/versions

   Identical Composables (Consolidation Candidates)
   ================================================

   ID: connection-mechanism
   Occurrences: 15
   Title: Connection Mechanism
   Default: connection-string
   Options: connection-string, mongocred
   Found in:
     - java/current
     - java/v5.1
     - kotlin/current
     ...

2. Similar Composables - Different IDs but similar option sets (60%+ overlap)

   Similar Composables (Review Recommended)
   ========================================

   Similar Composables (100.0% similarity)
   Composables: 2

   Composables in this group:
     1. ID: interface-atlas-only
        Location: atlas
        Title: Interface
        Default: driver
        Options: atlas-ui, driver, mongosh
     2. ID: interface-local-only
        Location: atlas
        Title: Interface
        Default: driver
        Options: atlas-ui, driver, mongosh
With --find-usages:
Shows where each composable is used in .. composable-tutorial:: directives:
Composable Usages
=================
Composable ID: deployment-type
Total usages: 28
atlas: 28 usages
Composable ID: interface
Total usages: 35
atlas: 35 usages
Unused Composables
------------------
connection-type:
- atlas
With --verbose and --find-usages:
Shows file paths where each composable is used:
Composable ID: interface-atlas-only
Total usages: 1
atlas: 1 usages
- content/atlas/source/atlas-vector-search/tutorials/vector-search-quick-start.txt
Understanding Composables:
Composables are defined in snooty.toml files:
[[composables]]
id = "language"
title = "Language"
default = "nodejs"
[[composables.options]]
id = "python"
title = "Python"
[[composables.options]]
id = "nodejs"
title = "Node.js"They're used in RST files with .. composable-tutorial:: directives:
.. composable-tutorial::
   :options: language, interface
   :defaults: nodejs, driver

   .. procedure::

      .. step:: Install dependencies

         .. selected-content::
            :selections: language=nodejs

            npm install mongodb

         .. selected-content::
            :selections: language=python

            pip install pymongo

Consolidation Analysis:
The command uses Jaccard similarity (intersection / union) to compare option sets between composables with different IDs. A 60% similarity threshold is used to identify potential consolidation candidates.
For example, if you have:
- language with 15 options
- language-atlas-only with 14 options (13 in common with language)
- language-local-only with 14 options (13 in common with language)
These would be flagged as similar composables (93.3% similarity) and potential consolidation candidates.
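Jaccard similarity of two option sets is the size of their intersection divided by the size of their union; a minimal sketch of that calculation (illustrative only):

package sketch

// jaccard returns (intersection size) / (union size) for two option-ID sets.
// Composables with different IDs are flagged as similar when this value
// meets the 60% threshold.
func jaccard(a, b []string) float64 {
    setA := map[string]bool{}
    for _, id := range a {
        setA[id] = true
    }
    union := len(setA)
    intersection := 0
    seenB := map[string]bool{}
    for _, id := range b {
        if seenB[id] {
            continue
        }
        seenB[id] = true
        if setA[id] {
            intersection++
        } else {
            union++
        }
    }
    if union == 0 {
        return 0
    }
    return float64(intersection) / float64(union)
}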
Compare file contents to identify differences between files. Supports two modes:
- Direct comparison - Compare two specific files
- Version comparison - Compare the same file across multiple documentation versions
Use Cases:
This command helps writers:
- Identify content drift across documentation versions
- Verify that updates have been applied consistently
- Scope maintenance work when updating shared content
- Understand how files have diverged over time
Basic Usage:
# Direct comparison of two files
./audit-cli compare file-contents file1.rst file2.rst
# Compare with diff output
./audit-cli compare file-contents file1.rst file2.rst --show-diff
# Version comparison - auto-discovers all versions
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst
# Version comparison - specific versions only
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst \
--versions manual,upcoming,v8.0,v7.0
# Show which files differ
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst \
--show-paths
# Show detailed diffs
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst \
--show-diff
# Verbose output (show processing details and auto-discovered versions)
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst \
-v

Flags:
- -V, --versions <list> - Comma-separated list of versions (optional; auto-discovers all versions if not specified)
- --show-paths - Display file paths grouped by status (matching, differing, not found)
- -d, --show-diff - Display unified diff output (implies --show-paths)
- -v, --verbose - Show detailed processing information (including auto-discovered versions and product directory)
Comparison Modes:
1. Direct Comparison (Two Files)
Provide two file paths as arguments:
./audit-cli compare file-contents path/to/file1.rst path/to/file2.rst

This mode:
- Compares exactly two files
- Reports whether they are identical or different
- Can show a unified diff with --show-diff
2. Version Comparison (Product Directory)
Provide one file path. The product directory and versions are automatically detected from the file path:
# Auto-discover all versions
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst
# Or specify specific versions
./audit-cli compare file-contents \
/path/to/manual/manual/source/includes/example.rst \
--versions manual,upcoming,v8.0

This mode:
- Automatically detects the product directory from the file path
- Auto-discovers all available versions (unless --versions is specified)
- Resolves the same relative path in each version directory
- Compares all versions against the reference file
- Reports matching, differing, and missing files
Version Directory Structure:
The tool expects MongoDB documentation to be organized as:
product-dir/
├── manual/
│ └── source/
│ └── includes/
│ └── example.rst
├── upcoming/
│ └── source/
│ └── includes/
│ └── example.rst
└── v8.0/
└── source/
└── includes/
└── example.rst
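Given that layout, resolving the same file across versions is mostly path manipulation, roughly as sketched below (illustrative; the actual logic lives in the compare command's version resolver):

package sketch

import "path/filepath"

// versionPaths rebuilds the reference file's relative path (everything after
// the version directory, e.g. source/includes/example.rst) under each
// sibling version directory of the product directory.
func versionPaths(productDir, relPath string, versions []string) []string {
    paths := make([]string, 0, len(versions))
    for _, v := range versions {
        paths = append(paths, filepath.Join(productDir, v, relPath))
    }
    return paths
}

For example, versionPaths("/path/to/product-dir", "source/includes/example.rst", []string{"manual", "upcoming", "v8.0"}) yields the three candidate paths shown in the tree above.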
Output Formats:
Summary (default - no flags):
- Total number of versions compared
- Count of matching, differing, and missing files
- Hints to use --show-paths or --show-diff for more details
With --show-paths:
- Summary (as above)
- List of files that match (with ✓)
- List of files that differ (with ✗)
- List of files not found (with -)
With --show-diff:
- Summary and paths (as above)
- Unified diff output for each differing file
- Shows added lines (prefixed with +)
- Shows removed lines (prefixed with -)
- Shows context lines around changes
Examples:
# Check if a file is consistent across all versions (auto-discovered)
./audit-cli compare file-contents \
~/workspace/docs-mongodb-internal/content/manual/manual/source/includes/fact-atlas-search.rst
# Find differences and see what changed (all versions)
./audit-cli compare file-contents \
~/workspace/docs-mongodb-internal/content/manual/manual/source/includes/fact-atlas-search.rst \
--show-diff
# Compare across specific versions only
./audit-cli compare file-contents \
~/workspace/docs-mongodb-internal/content/manual/manual/source/includes/fact-atlas-search.rst \
--versions manual,upcoming,v8.0,v7.0,v6.0
# Compare two specific versions of a file directly
./audit-cli compare file-contents \
~/workspace/docs-mongodb-internal/content/manual/manual/source/includes/example.rst \
~/workspace/docs-mongodb-internal/content/manual/v8.0/source/includes/example.rst \
--show-diff

Exit Codes:
- 0 - Success (files compared successfully, regardless of whether they match)
- 1 - Error (invalid arguments, file not found, read error, etc.)
Note on Missing Files:
Files that don't exist in certain versions are reported separately and do not cause errors. This is expected behavior since features may be added or removed across versions.
Count tested code examples in the MongoDB documentation monorepo.
This command navigates to the content/code-examples/tested directory from the monorepo root and counts all files recursively. The tested directory has a two-level structure: L1 (language directories) and L2 (product directories).
Use Cases:
This command helps writers and maintainers:
- Track the total number of tested code examples
- Monitor code example coverage by product
- Identify products with few or many examples
- Count only source files (excluding output files)
Basic Usage:
# Get total count of all tested code examples
./audit-cli count tested-examples /path/to/docs-monorepo
# Use configured monorepo path (from config file or environment variable)
./audit-cli count tested-examples
# Count examples for a specific product
./audit-cli count tested-examples --for-product pymongo
# Show counts broken down by product
./audit-cli count tested-examples --count-by-product
# Count only source files (exclude .txt and .sh output files)
./audit-cli count tested-examples --exclude-output

Flags:
- --for-product <product> - Only count code examples for a specific product
- --count-by-product - Display counts for each product
- --exclude-output - Only count source files (exclude .txt and .sh files)
Current Valid Products:
- mongosh - MongoDB Shell
- csharp/driver - C#/.NET Driver
- go/driver - Go Driver
- go/atlas-sdk - Atlas Go SDK
- java/driver-sync - Java Sync Driver
- javascript/driver - Node.js Driver
- pymongo - PyMongo Driver
Output:
By default, prints a single integer (total count) for use in CI or scripting. With --count-by-product, displays a formatted table with product names and counts.
Count documentation pages (.txt files) in the MongoDB documentation monorepo.
This command navigates to the content directory and recursively counts all .txt files, which represent documentation pages that resolve to unique URLs. The command automatically excludes certain directories and file types that don't represent actual documentation pages.
Use Cases:
This command helps writers and maintainers:
- Track the total number of documentation pages across the monorepo
- Monitor documentation coverage by product/project
- Identify projects with extensive or minimal documentation
- Exclude auto-generated or deprecated content from counts
- Count only current versions of versioned documentation
- Compare page counts across different documentation versions
Automatic Exclusions:
The command automatically excludes:
- Files in code-examples directories at the root of content or source (these contain plain text examples, not pages)
- Files in the following directories at the root of content:
  - 404 - Error pages
  - docs-platform - Documentation for the MongoDB website and meta content
  - meta - MongoDB Meta Documentation (style guide, tools, etc.)
  - table-of-contents - Navigation files
- All non-.txt files (configuration files, YAML, etc.)
Basic Usage:
# Get total count of all documentation pages
./audit-cli count pages /path/to/docs-monorepo
# Use configured monorepo path (from config file or environment variable)
./audit-cli count pages
# Count pages for a specific project
./audit-cli count pages --for-project manual
# Show counts broken down by project
./audit-cli count pages --count-by-project
# Exclude specific directories from counting
./audit-cli count pages --exclude-dirs api-reference,generated
# Count only current versions (for versioned projects)
./audit-cli count pages --current-only
# Show counts by project and version
./audit-cli count pages --by-version
# Combine flags: count pages for a specific project, excluding certain directories
./audit-cli count pages /path/to/docs-monorepo --for-project atlas --exclude-dirs deprecated

Flags:
- --for-project <project> - Only count pages for a specific project (directory name under content/)
- --count-by-project - Display counts for each project in a formatted table
- --exclude-dirs <dirs> - Comma-separated list of directory names to exclude from counting (e.g., deprecated,archive)
- --current-only - Only count pages in the current version (for versioned projects, counts only the current or manual version directory; for non-versioned projects, counts all pages)
- --by-version - Display counts grouped by project and version (shows a version breakdown for versioned projects; non-versioned projects show as "(no version)")
Output:
By default, prints a single integer (total count) for use in CI or scripting. With --count-by-project, displays a formatted table with project names and counts. With --by-version, displays a hierarchical breakdown by project and version.
Versioned Documentation:
Some MongoDB documentation projects contain multiple versions, represented as distinct directories between the project directory and the source directory:
- Versioned project structure: content/{project}/{version}/source/...
- Non-versioned project structure: content/{project}/source/...

Version directory names follow these patterns:
- current or manual - The current/latest version
- upcoming - Pre-release version
- v{number} - Specific version (e.g., v8.0, v7.0)
The --current-only flag counts only files in the current version directory (current or manual) for versioned projects, while counting all files for non-versioned projects.
The --by-version flag shows a breakdown of page counts for each version within each project.
Note: The --current-only and --by-version flags are mutually exclusive.
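Version directory detection can be approximated with a small classifier like this sketch (illustrative; the actual detection lives in the internal projectinfo package):

package sketch

import "regexp"

var versionDirPattern = regexp.MustCompile(`^v\d+(\.\d+)*$`)

// isVersionDir reports whether a directory name looks like a version
// directory (current, manual, upcoming, or v{number} such as v8.0).
func isVersionDir(name string) bool {
    switch name {
    case "current", "manual", "upcoming":
        return true
    }
    return versionDirPattern.MatchString(name)
}

// isCurrentVersionDir reports whether the directory holds the current/latest
// version, which is what --current-only counts for versioned projects.
func isCurrentVersionDir(name string) bool {
    return name == "current" || name == "manual"
}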
Examples:
# Quick count for CI/CD
TOTAL_PAGES=$(./audit-cli count pages ~/docs-monorepo)
echo "Total documentation pages: $TOTAL_PAGES"
# Detailed breakdown by project
./audit-cli count pages ~/docs-monorepo --count-by-project
# Output:
# Page Counts by Project:
#
# app-services 245
# atlas 512
# manual 1024
# ...
#
# Total: 2891
# Count only Atlas pages
./audit-cli count pages ~/docs-monorepo --for-project atlas
# Output: 512
# Exclude deprecated content
./audit-cli count pages ~/docs-monorepo --exclude-dirs deprecated,archive --count-by-project
# Count only current versions
./audit-cli count pages ~/docs-monorepo --current-only
# Output: 1245 (only counts current/manual versions)
# Show breakdown by version
./audit-cli count pages ~/docs-monorepo --by-version
# Output:
# Project: drivers
# manual 150
# upcoming 145
# v8.0 140
# v7.0 135
#
# Project: atlas
# (no version) 200
#
# Total: 770
# Count current version for a specific project
./audit-cli count pages ~/docs-monorepo --for-project drivers --current-only
# Output: 150

Project structure:

audit-cli/
├── main.go # CLI entry point
├── commands/ # Command implementations
│ ├── extract/ # Extract parent command
│ │ ├── extract.go # Parent command definition
│ │ ├── code-examples/ # Code examples subcommand
│ │ │ ├── code_examples.go # Command logic
│ │ │ ├── code_examples_test.go # Tests
│ │ │ ├── parser.go # RST directive parsing
│ │ │ ├── writer.go # File writing logic
│ │ │ ├── report.go # Report generation
│ │ │ ├── types.go # Type definitions
│ │ │ └── language.go # Language normalization
│ │ └── procedures/ # Procedures extraction subcommand
│ │ ├── procedures.go # Command logic
│ │ ├── procedures_test.go # Tests
│ │ ├── parser.go # Filename generation and filtering
│ │ ├── writer.go # RST file writing
│ │ └── types.go # Type definitions
│ ├── search/ # Search parent command
│ │ ├── search.go # Parent command definition
│ │ └── find-string/ # Find string subcommand
│ │ ├── find_string.go # Command logic
│ │ ├── types.go # Type definitions
│ │ └── report.go # Report generation
│ ├── analyze/ # Analyze parent command
│ │ ├── analyze.go # Parent command definition
│ │ ├── composables/ # Composables analysis subcommand
│ │ │ ├── composables.go # Command logic
│ │ │ ├── composables_test.go # Tests
│ │ │ ├── analyzer.go # Composable analysis logic
│ │ │ ├── parser.go # Snooty.toml parsing
│ │ │ ├── rstspec_adapter.go # Rstspec.toml adapter
│ │ │ ├── rstspec_adapter_test.go # Rstspec adapter tests
│ │ │ ├── usage_finder.go # Usage finding logic
│ │ │ ├── output.go # Output formatting
│ │ │ └── types.go # Type definitions
│ │ ├── includes/ # Includes analysis subcommand
│ │ │ ├── includes.go # Command logic
│ │ │ ├── analyzer.go # Include tree building
│ │ │ ├── output.go # Output formatting
│ │ │ └── types.go # Type definitions
│ │ ├── procedures/ # Procedures analysis subcommand
│ │ │ ├── procedures.go # Command logic
│ │ │ ├── procedures_test.go # Tests
│ │ │ ├── analyzer.go # Procedure analysis logic
│ │ │ ├── output.go # Output formatting
│ │ │ └── types.go # Type definitions
│ │ └── usage/ # Usage analysis subcommand
│ │ ├── usage.go # Command logic
│ │ ├── usage_test.go # Tests
│ │ ├── analyzer.go # Reference finding logic
│ │ ├── output.go # Output formatting
│ │ └── types.go # Type definitions
│ ├── compare/ # Compare parent command
│ │ ├── compare.go # Parent command definition
│ │ └── file-contents/ # File contents comparison subcommand
│ │ ├── file_contents.go # Command logic
│ │ ├── file_contents_test.go # Tests
│ │ ├── comparer.go # Comparison logic
│ │ ├── differ.go # Diff generation
│ │ ├── output.go # Output formatting
│ │ ├── types.go # Type definitions
│ │ └── version_resolver.go # Version path resolution
│ └── count/ # Count parent command
│ ├── count.go # Parent command definition
│ ├── tested-examples/ # Tested examples counting subcommand
│ │ ├── tested_examples.go # Command logic
│ │ ├── tested_examples_test.go # Tests
│ │ ├── counter.go # Counting logic
│ │ ├── output.go # Output formatting
│ │ └── types.go # Type definitions
│ └── pages/ # Pages counting subcommand
│ ├── pages.go # Command logic
│ ├── pages_test.go # Tests
│ ├── counter.go # Counting logic
│ ├── output.go # Output formatting
│ └── types.go # Type definitions
├── internal/ # Internal packages
│ ├── config/ # Configuration management
│ │ ├── config.go # Config loading and path resolution
│ │ └── config_test.go # Config tests
│ ├── projectinfo/ # Project structure and info utilities
│ │ ├── pathresolver.go # Core path resolution
│ │ ├── pathresolver_test.go # Tests
│ │ ├── source_finder.go # Source directory detection
│ │ ├── version_resolver.go # Version path resolution
│ │ └── types.go # Type definitions
│ └── rst/ # RST parsing utilities
│ ├── parser.go # Generic parsing with includes
│ ├── include_resolver.go # Include directive resolution
│ ├── directive_parser.go # Directive parsing
│ ├── directive_regex.go # Directive regex patterns
│ ├── parse_procedures.go # Procedure parsing (core logic)
│ ├── parse_procedures_test.go # Procedure parsing tests
│ ├── get_procedure_variations.go # Variation extraction logic
│ ├── get_procedure_variations_test.go # Variation tests
│ ├── procedure_types.go # Procedure type definitions
│ ├── rstspec.go # Rstspec.toml fetching and parsing
│ ├── rstspec_test.go # Rstspec tests
│ └── file_utils.go # File utilities
└── testdata/ # Test fixtures
├── input-files/ # Test RST files
│ └── source/ # Source directory (required)
│ ├── *.rst # Test files
│ ├── includes/ # Included RST files
│ └── code-examples/ # Code files for literalinclude
├── expected-output/ # Expected extraction results
├── composables-test/ # Composables analysis test data
│ └── content/ # Test monorepo structure
├── compare/ # Compare command test data
│ ├── product/ # Version structure tests
│ │ ├── manual/ # Manual version
│ │ ├── upcoming/ # Upcoming version
│ │ └── v8.0/ # v8.0 version
│ └── *.txt # Direct comparison tests
├── count-test-monorepo/ # Count command test data
│ └── content/code-examples/tested/ # Tested examples structure
└── search-test-files/ # Search command test data
Example: Adding extract tables subcommand
1. Create the subcommand directory:

   mkdir -p commands/extract/tables

2. Create the command file (commands/extract/tables/tables.go):

   package tables

   import (
       "github.com/spf13/cobra"
   )

   func NewTablesCommand() *cobra.Command {
       cmd := &cobra.Command{
           Use:   "tables [filepath]",
           Short: "Extract tables from RST files",
           Args:  cobra.ExactArgs(1),
           RunE: func(cmd *cobra.Command, args []string) error {
               // Implementation here
               return nil
           },
       }

       // Add flags
       cmd.Flags().StringP("output", "o", "./output", "Output directory")

       return cmd
   }

3. Register the subcommand in commands/extract/extract.go:

   import (
       "github.com/grove-platform/audit-cli/commands/extract/tables"
   )

   func NewExtractCommand() *cobra.Command {
       cmd := &cobra.Command{...}
       cmd.AddCommand(codeexamples.NewCodeExamplesCommand())
       cmd.AddCommand(tables.NewTablesCommand()) // Add this line
       return cmd
   }
Example: Adding analyze parent command
1. Create the parent directory:

   mkdir -p commands/analyze

2. Create the parent command (commands/analyze/analyze.go):

   package analyze

   import (
       "github.com/spf13/cobra"
   )

   func NewAnalyzeCommand() *cobra.Command {
       cmd := &cobra.Command{
           Use:   "analyze",
           Short: "Analyze extracted content",
       }

       // Add subcommands here

       return cmd
   }

3. Register in main.go:

   import (
       "github.com/grove-platform/audit-cli/commands/analyze"
   )

   func main() {
       rootCmd.AddCommand(extract.NewExtractCommand())
       rootCmd.AddCommand(search.NewSearchCommand())
       rootCmd.AddCommand(analyze.NewAnalyzeCommand()) // Add this line
   }
# Run all tests
cd audit-cli
go test ./...
# Run tests for a specific package
go test ./commands/extract/code-examples -v
# Run a specific test
go test ./commands/extract/code-examples -run TestRecursiveDirectoryScanning -v
# Run tests with coverage
go test ./... -cover

Tests use a table-driven approach with test fixtures in the testdata/ directory:
- Input files: testdata/input-files/source/ - RST files and referenced code
- Expected output: testdata/expected-output/ - Expected extracted files
- Test pattern: Compare actual extraction output against expected files
Note: The testdata directory name is special in Go - it's automatically ignored during builds, which is important
since it contains non-Go files (.cpp, .rst, etc.).
1. Create test input files in testdata/input-files/source/:

   # Create a new test RST file
   cat > testdata/input-files/source/my-test.rst << 'EOF'
   .. code-block:: javascript

      console.log("Hello, World!");
   EOF

2. Generate expected output:

   ./audit-cli extract code-examples testdata/input-files/source/my-test.rst \
     -o testdata/expected-output

3. Verify the output is correct before committing

4. Add a test case in the appropriate *_test.go file:

   func TestMyNewFeature(t *testing.T) {
       testDataDir := filepath.Join("..", "..", "..", "testdata")
       inputFile := filepath.Join(testDataDir, "input-files", "source", "my-test.rst")
       expectedDir := filepath.Join(testDataDir, "expected-output")

       tempDir, err := os.MkdirTemp("", "test-*")
       if err != nil {
           t.Fatalf("Failed to create temp directory: %v", err)
       }
       defer os.RemoveAll(tempDir)

       report, err := RunExtract(inputFile, tempDir, false, false, false, false)
       if err != nil {
           t.Fatalf("RunExtract failed: %v", err)
       }

       // Add assertions here: compare the files written to tempDir (and the
       // report counts) against the expected files in expectedDir.
       _ = expectedDir
       _ = report
   }
- Relative paths: Tests use filepath.Join("..", "..", "..", "testdata") to reference test data (three levels up from commands/extract/code-examples/)
- Temporary directories: Use os.MkdirTemp() for test output and clean up with defer os.RemoveAll()
os.MkdirTemp()for test output, clean up withdefer os.RemoveAll() - Exact content matching: Tests compare byte-for-byte content
- No trailing newlines: Expected output files should not have trailing blank lines
If you've changed the parsing logic and need to regenerate expected output:
cd audit-cli
# Update all expected outputs
./audit-cli extract code-examples testdata/input-files/source/literalinclude-test.rst \
-o testdata/expected-output
./audit-cli extract code-examples testdata/input-files/source/code-block-test.rst \
-o testdata/expected-output
./audit-cli extract code-examples testdata/input-files/source/nested-code-block-test.rst \
-o testdata/expected-output
./audit-cli extract code-examples testdata/input-files/source/io-code-block-test.rst \
-o testdata/expected-output
./audit-cli extract code-examples testdata/input-files/source/include-test.rst \
-o testdata/expected-output -f

Important: Always verify the new output is correct before committing!
All commands follow this pattern:
package mycommand
import "github.com/spf13/cobra"
func NewMyCommand() *cobra.Command {
    var flagVar string

    cmd := &cobra.Command{
        Use:   "my-command [args]",
        Short: "Brief description",
        Long:  "Detailed description",
        Args:  cobra.ExactArgs(1), // Or MinimumNArgs, etc.
        RunE: func(cmd *cobra.Command, args []string) error {
            // Get flag values
            flagValue, _ := cmd.Flags().GetString("flag-name")

            // Call the main logic function
            return RunMyCommand(args[0], flagValue)
        },
    }

    // Define flags
    cmd.Flags().StringVarP(&flagVar, "flag-name", "f", "default", "Description")

    return cmd
}

// Separate logic function for testability
func RunMyCommand(arg string, flagValue string) error {
    // Implementation here
    return nil
}

Why this pattern?
- Separates command definition from logic
- Makes logic testable without Cobra
- Consistent across all commands
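Because the logic lives in a plain function, a unit test can call it directly without constructing a Cobra command. A minimal sketch (the argument values are placeholders):

```go
func TestRunMyCommand(t *testing.T) {
    // Exercise the logic function directly; no Cobra wiring is needed.
    if err := RunMyCommand("some-arg", "some-flag-value"); err != nil {
        t.Fatalf("RunMyCommand failed: %v", err)
    }
}
```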
Use descriptive error wrapping:
import "fmt"
// Wrap errors with context
file, err := os.Open(filePath)
if err != nil {
return fmt.Errorf("failed to open file %s: %w", filePath, err)
}
// Check for specific conditions
if !fileInfo.IsDir() {
return fmt.Errorf("path %s is not a directory", path)
}Use the scanner pattern for line-by-line processing:
```go
import (
    "bufio"
    "fmt"
    "os"
)

func processFile(filePath string) error {
    file, err := os.Open(filePath)
    if err != nil {
        return fmt.Errorf("failed to open file: %w", err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    lineNum := 0
    for scanner.Scan() {
        lineNum++
        line := scanner.Text()
        // Process line
    }

    if err := scanner.Err(); err != nil {
        return fmt.Errorf("error reading file: %w", err)
    }
    return nil
}
```

Use `filepath.Walk` for recursive traversal:
```go
import (
    "os"
    "path/filepath"
)

func traverseDirectory(rootPath string, recursive bool) ([]string, error) {
    var files []string

    err := filepath.Walk(rootPath, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }

        // Skip subdirectories if not recursive
        if !recursive && info.IsDir() && path != rootPath {
            return filepath.SkipDir
        }

        // Collect files
        if !info.IsDir() {
            files = append(files, path)
        }
        return nil
    })

    return files, err
}
```

Path Resolution for File-Based Commands:
Commands that accept file paths should use `config.ResolveFilePath()` to support flexible path resolution:

```go
import "github.com/grove-platform/audit-cli/internal/config"

RunE: func(cmd *cobra.Command, args []string) error {
    // Resolve file path (supports absolute, monorepo-relative, or cwd-relative)
    filePath, err := config.ResolveFilePath(args[0])
    if err != nil {
        return err
    }

    // Use the resolved absolute path
    return processFile(filePath)
}
```

This allows users to specify paths as:
- Absolute: `/full/path/to/file.rst`
- Monorepo-relative: `manual/manual/source/file.rst` (if monorepo configured)
- Current directory-relative: `./file.rst`
Use table-driven tests where appropriate:
```go
func TestLanguageNormalization(t *testing.T) {
    tests := []struct {
        name     string
        input    string
        expected string
    }{
        {"TypeScript", "ts", "typescript"},
        {"C++", "c++", "cpp"},
        {"Golang", "golang", "go"},
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            result := NormalizeLanguage(tt.input)
            if result != tt.expected {
                t.Errorf("NormalizeLanguage(%q) = %q, want %q",
                    tt.input, result, tt.expected)
            }
        })
    }
}
```

Use a consistent pattern for verbose logging:
```go
func processWithVerbose(filePath string, verbose bool) error {
    if verbose {
        fmt.Printf("Processing: %s\n", filePath)
    }

    // Do work

    if verbose {
        fmt.Printf("Completed: %s\n", filePath)
    }
    return nil
}
```

The tool extracts code examples from the following reStructuredText directives:
The `literalinclude` directive extracts code from external files, with support for partial extraction and dedenting.
Syntax:
```rst
.. literalinclude:: /path/to/file.py
   :language: python
   :start-after: start-tag
   :end-before: end-tag
   :dedent:
```

Supported Options:

- `:language:` - Specifies the programming language (normalized: `ts` → `typescript`, `c++` → `cpp`, `golang` → `go`)
- `:start-after:` - Extract content after this tag (skips the entire line containing the tag; see the sketch after the example below)
- `:end-before:` - Extract content before this tag (cuts before the entire line containing the tag)
- `:dedent:` - Remove common leading whitespace from the extracted content
Example:
Given `code-examples/example.py`:

```python
def main():
    # start-example
    result = calculate(42)
    print(result)
    # end-example
```

And RST:

```rst
.. literalinclude:: /code-examples/example.py
   :language: python
   :start-after: start-example
   :end-before: end-example
   :dedent:
```

Extracts:

```python
result = calculate(42)
print(result)
```
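As a rough illustration of how `:start-after:` and `:end-before:` behave, the tag lines themselves are never included in the output. This is a sketch, not the tool's actual parser, and `extractBetweenTags` is a hypothetical helper:

```go
import "strings"

// extractBetweenTags keeps the lines after the line containing startAfter and
// before the line containing endBefore. Both tag lines are dropped entirely.
// :dedent: would then strip the common leading whitespace from the result.
func extractBetweenTags(lines []string, startAfter, endBefore string) []string {
    var out []string
    collecting := startAfter == "" // with no start tag, collect from the top
    for _, line := range lines {
        if !collecting {
            if strings.Contains(line, startAfter) {
                collecting = true // skip the entire line containing the start tag
            }
            continue
        }
        if endBefore != "" && strings.Contains(line, endBefore) {
            break // cut before the entire line containing the end tag
        }
        out = append(out, line)
    }
    return out
}
```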
The `code-block` directive captures inline code blocks, with automatic dedenting based on the first line's indentation.

Syntax:
```rst
.. code-block:: javascript
   :copyable: false
   :emphasize-lines: 2,3

   const greeting = "Hello, World!";
   console.log(greeting);
```

Supported Options:

- Language argument - `.. code-block:: javascript` (optional, defaults to `txt`)
- `:language:` - Alternative way to specify language
- `:copyable:` - Parsed but not used for extraction
- `:emphasize-lines:` - Parsed but not used for extraction
Automatic Dedenting:
The content is automatically dedented based on the indentation of the first content line. For example:
```rst
.. note::

   .. code-block:: python

      def hello():
          print("Hello")
```

The code has 6 spaces of indentation (3 from note, 3 from code-block). The tool automatically removes these 6 spaces, resulting in:

```python
def hello():
    print("Hello")
```
The `io-code-block` directive captures input/output code blocks for interactive examples, using nested sub-directives.

Syntax:
```rst
.. io-code-block::
   :copyable: true

   .. input::
      :language: javascript

      db.restaurants.aggregate([
         { $match: { category: "cafe" } }
      ])

   .. output::
      :language: json

      [
         { _id: 1, category: 'café', status: 'Open' }
      ]
```

Supported Options:

- `:copyable:` - Parsed but not used for extraction
- Nested `.. input::` sub-directive (required)
  - Can have a filepath argument: `.. input:: /path/to/file.js`
  - Or inline content with a `:language:` option
- Nested `.. output::` sub-directive (optional)
  - Can have a filepath argument: `.. output:: /path/to/output.txt`
  - Or inline content with a `:language:` option
File-based Content:
```rst
.. io-code-block::

   .. input:: /code-examples/query.js
      :language: javascript

   .. output:: /code-examples/result.json
      :language: json
```

Output Files:

Generates two files:

- `{source}.io-code-block.{index}.input.{ext}` - The input code
- `{source}.io-code-block.{index}.output.{ext}` - The output (if present)

Example: `my-doc.io-code-block.1.input.js` and `my-doc.io-code-block.1.output.json`
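For illustration, the naming scheme can be expressed as a small formatting helper. This is hypothetical; the actual construction lives in the extraction code:

```go
import "fmt"

// ioCodeBlockFileName builds a name like "my-doc.io-code-block.1.input.js".
// source is the RST file's base name, index is the 1-based position of the
// io-code-block on the page, role is "input" or "output", and ext comes from
// the normalized language.
func ioCodeBlockFileName(source string, index int, role, ext string) string {
    return fmt.Sprintf("%s.io-code-block.%d.%s.%s", source, index, role, ext)
}
```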
Include directives: the tool follows `.. include::` directives to process entire documentation trees (when the `-f` flag is used).

Syntax:

```rst
.. include:: /includes/intro.rst
```

Special MongoDB Conventions:
The tool handles several MongoDB-specific include patterns:
Steps files - Converts directory-based paths to filename-based paths:

- Input: `/includes/steps/run-mongodb-on-linux.rst`
- Resolves to: `/includes/steps-run-mongodb-on-linux.yaml` (see the sketch below)
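A sketch of the path rewrite (hypothetical helper; the real resolution also handles cases not shown here):

```go
import (
    "path"
    "strings"
)

// stepsIncludePath rewrites a directory-based steps include to the
// filename-based YAML source, e.g.
// "/includes/steps/run-mongodb-on-linux.rst" -> "/includes/steps-run-mongodb-on-linux.yaml".
func stepsIncludePath(includePath string) string {
    name := strings.TrimSuffix(path.Base(includePath), ".rst")
    return path.Join(path.Dir(path.Dir(includePath)), "steps-"+name+".yaml")
}
```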
Extracts files - Resolves ref-based includes by searching YAML files:

- Input: `/includes/extracts/install-mongodb.rst`
- Searches: `/includes/extracts-*.yaml` for `ref: install-mongodb`
- Resolves to: The YAML file containing that ref
Template variables - Resolves template variables from YAML replacement sections:

```yaml
replacement:
  release_specification_default: "/includes/release/install-windows-default.rst"
```

- Input: `{{release_specification_default}}`
- Resolves to: `/includes/release/install-windows-default.rst` (see the sketch below)
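A sketch of the lookup, assuming the YAML `replacement` section has already been unmarshalled into a Go map (the names are illustrative, not the tool's actual identifiers):

```go
import "strings"

// resolveTemplateVariable turns "{{release_specification_default}}" into the
// path stored under that key in the replacement map.
func resolveTemplateVariable(ref string, replacements map[string]string) (string, bool) {
    name := strings.TrimSuffix(strings.TrimPrefix(ref, "{{"), "}}")
    value, ok := replacements[name]
    return value, ok
}
```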
Source Directory Resolution:
The tool walks up the directory tree to find a directory named "source" or containing a "source" subdirectory. This is used as the base for resolving relative include paths.
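A sketch of that walk-up behavior (hypothetical helper, not the exported API):

```go
import (
    "fmt"
    "os"
    "path/filepath"
)

// findSourceDirectory walks from the file's directory toward the filesystem
// root until it finds a directory named "source" or one that contains a
// "source" subdirectory.
func findSourceDirectory(filePath string) (string, error) {
    dir := filepath.Dir(filePath)
    for {
        if filepath.Base(dir) == "source" {
            return dir, nil
        }
        if info, err := os.Stat(filepath.Join(dir, "source")); err == nil && info.IsDir() {
            return filepath.Join(dir, "source"), nil
        }
        parent := filepath.Dir(dir)
        if parent == dir {
            return "", fmt.Errorf("no source directory found above %s", filePath)
        }
        dir = parent
    }
}
```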
The `internal/config` package provides configuration management for the CLI tool:
- Config file loading - Loads `.audit-cli.yaml` from the current or home directory
- Environment variable support - Reads the `AUDIT_CLI_MONOREPO_PATH` environment variable
- Monorepo path resolution - Resolves the monorepo path with priority: CLI arg > env var > config file
- File path resolution - Resolves file paths as absolute, monorepo-relative, or cwd-relative
Key Functions:
- `LoadConfig()` - Loads configuration from file or environment
- `GetMonorepoPath(cmdLineArg string)` - Resolves the monorepo path with priority order
- `ResolveFilePath(pathArg string)` - Resolves file paths with flexible resolution
Priority Order for Monorepo Path:
- Command-line argument (highest priority)
- Environment variable `AUDIT_CLI_MONOREPO_PATH`
- Config file `.audit-cli.yaml` (lowest priority); see the sketch below
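A sketch of that priority order. This mirrors the documented behavior of `GetMonorepoPath` but is not its actual signature; `configValue` stands in for whatever was read from `.audit-cli.yaml`:

```go
import (
    "fmt"
    "os"
)

func monorepoPath(cmdLineArg, configValue string) (string, error) {
    if cmdLineArg != "" {
        return cmdLineArg, nil // command-line argument wins
    }
    if env := os.Getenv("AUDIT_CLI_MONOREPO_PATH"); env != "" {
        return env, nil // then the environment variable
    }
    if configValue != "" {
        return configValue, nil // then the config file
    }
    return "", fmt.Errorf("monorepo path not configured")
}
```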
Priority Order for File Paths:
- Absolute path (used as-is)
- Relative to monorepo root (if monorepo configured and file exists there)
- Relative to current directory (fallback)
See the code in internal/config/ for implementation details.
The `internal/projectinfo` package provides centralized utilities for understanding MongoDB documentation project structure:
- Source directory detection - Finds the documentation root by walking up the directory tree
- Project info detection - Identifies product directory, version, and whether a project is versioned
- Version discovery - Automatically discovers all available versions in a product directory
- Version path resolution - Resolves file paths across multiple documentation versions
- Relative path resolution - Resolves paths relative to the source directory
Key Functions:
- `FindSourceDirectory(filePath string)` - Finds the source directory for a given file
- `DetectProjectInfo(filePath string)` - Detects project structure information
- `DiscoverAllVersions(productDir string)` - Discovers all available versions in a product
- `ResolveVersionPaths(referenceFile, productDir string, versions []string)` - Resolves paths across versions
- `ResolveRelativeToSource(sourceDir, relativePath string)` - Resolves relative paths
See the code in internal/projectinfo/ for implementation details.
The `internal/rst` package provides reusable utilities for parsing and processing RST files:
- Include resolution - Handles all include directive patterns
- Directory traversal - Recursive file scanning
- Directive parsing - Extracts structured data from RST directives
- Procedure parsing - Parses procedure directives, ordered lists, and variations
- Procedure variations - Extracts variations from composable tutorials and tabs
- Rstspec.toml fetching - Fetches and parses canonical composable definitions from snooty-parser
- Template variable resolution - Resolves YAML-based template variables
- Source directory detection - Finds the documentation root
Key Functions:
- `ParseFileWithIncludes(filePath string)` - Parses an RST file with include expansion
- `ParseDirectives(content string)` - Extracts directive information from RST content
- `ParseProcedures(filePath string, expandIncludes bool)` - Parses procedures from an RST file
- `GetProcedureVariations(filePath string)` - Extracts procedure variations
- `FetchRstspec()` - Fetches and parses the canonical rstspec.toml from the snooty-parser repository
Rstspec.toml Support:
The `FetchRstspec()` function retrieves the canonical composable definitions from the snooty-parser repository. This provides:

- Standard composable IDs (e.g., `interface`, `language`, `deployment-type`)
- Composable titles and descriptions
- Default values for each composable
- Available options for each composable
This is used by the analyze composables command to show canonical definitions alongside project-specific ones.
See the code in internal/rst/ for implementation details.
The tool normalizes language identifiers to standard file extensions:
| Input | Normalized | Extension |
|---|---|---|
| `bash` | `bash` | `.sh` |
| `c` | `c` | `.c` |
| `c++` | `cpp` | `.cpp` |
| `c#` | `csharp` | `.cs` |
| `console` | `console` | `.sh` |
| `cpp` | `cpp` | `.cpp` |
| `cs` | `csharp` | `.cs` |
| `csharp` | `csharp` | `.cs` |
| `go` | `go` | `.go` |
| `golang` | `go` | `.go` |
| `java` | `java` | `.java` |
| `javascript` | `javascript` | `.js` |
| `js` | `javascript` | `.js` |
| `kotlin` | `kotlin` | `.kt` |
| `kt` | `kotlin` | `.kt` |
| `php` | `php` | `.php` |
| `powershell` | `powershell` | `.ps1` |
| `ps1` | `powershell` | `.ps1` |
| `ps5` | `ps5` | `.ps1` |
| `py` | `python` | `.py` |
| `python` | `python` | `.py` |
| `rb` | `ruby` | `.rb` |
| `rs` | `rust` | `.rs` |
| `ruby` | `ruby` | `.rb` |
| `rust` | `rust` | `.rs` |
| `scala` | `scala` | `.scala` |
| `sh` | `shell` | `.sh` |
| `shell` | `shell` | `.sh` |
| `swift` | `swift` | `.swift` |
| `text` | `text` | `.txt` |
| `ts` | `typescript` | `.ts` |
| `txt` | `text` | `.txt` |
| `typescript` | `typescript` | `.ts` |
| (empty string) | undefined | `.txt` |
| `none` | undefined | `.txt` |
| (unknown) | (unchanged) | `.txt` |
Notes:
- Language identifiers are case-insensitive
- Unknown languages are returned unchanged by `NormalizeLanguage()` but map to the `.txt` extension
- The normalization handles common aliases (e.g., `ts` → `typescript`, `golang` → `go`, `c++` → `cpp`)
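A condensed sketch of the behavior in the table above (the maps are abbreviated and the names are illustrative, not the tool's actual tables):

```go
import "strings"

var languageAliases = map[string]string{
    "ts": "typescript", "js": "javascript", "c++": "cpp", "c#": "csharp", "cs": "csharp",
    "golang": "go", "kt": "kotlin", "py": "python", "rb": "ruby", "rs": "rust",
    "sh": "shell", "txt": "text", "ps1": "powershell",
}

var extensions = map[string]string{
    "bash": ".sh", "shell": ".sh", "console": ".sh", "cpp": ".cpp", "csharp": ".cs",
    "go": ".go", "javascript": ".js", "kotlin": ".kt", "powershell": ".ps1",
    "python": ".py", "ruby": ".rb", "rust": ".rs", "typescript": ".ts", "text": ".txt",
}

// normalize lower-cases the identifier, maps known aliases to their canonical
// name, and falls back to ".txt" for unknown or empty languages.
func normalize(language string) (string, string) {
    name := strings.ToLower(language)
    if canonical, ok := languageAliases[name]; ok {
        name = canonical
    }
    ext, ok := extensions[name]
    if !ok {
        ext = ".txt"
    }
    return name, ext
}
```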
When contributing to this project:
- Follow the established patterns - Use the command structure, error handling, and testing patterns described above
- Write tests - All new functionality should have corresponding tests
- Update documentation - Keep this README up to date with new features
- Run tests before committing - Ensure `go test ./...` passes
- Use meaningful commit messages - Describe what changed and why