📦 Epic: Support Bundle Generation - Automated Diagnostics Collection
Goal
Implement a support bundle generation feature that allows administrators and users to collect comprehensive diagnostic information for troubleshooting MCP Gateway issues. The feature should automatically sanitize sensitive data (passwords, tokens, API keys, secrets) while providing all necessary technical details for support teams.
Why Now?
As MCP Gateway deployments grow in complexity and scale, troubleshooting production issues becomes increasingly challenging:
- Manual Diagnostic Collection is Error-Prone: Support teams currently ask users to manually collect logs, configuration files, and system information, leading to incomplete or inconsistent data
- Sensitive Data Exposure Risk: Users may accidentally share passwords, API keys, or tokens when sending logs or configuration files
- Time-Consuming Troubleshooting: Without standardized diagnostics, support teams spend significant time requesting additional information
- Missing Context: System metrics, platform details, and service status are often missing from user-provided diagnostics
- Operational Efficiency: A standardized support bundle accelerates issue resolution and improves user experience
By implementing automated support bundle generation with built-in sanitization, we enable:
- One-click diagnostic collection
- Automatic redaction of known sensitive-data patterns
- Comprehensive troubleshooting context
- Faster issue resolution
- Better user experience
📖 User Stories
US-1: Platform Admin - Generate Support Bundle via CLI
As a Platform Administrator
I want to generate a support bundle from the command line
So that I can collect diagnostics for troubleshooting without accessing the UI
Acceptance Criteria:
Given I have CLI access to the MCP Gateway server
When I run the command:
mcpgateway --support-bundle --output-dir /tmp --log-lines 1000
Then a ZIP file should be created at /tmp/mcpgateway-support-YYYY-MM-DD-HHMMSS.zip
And the bundle should contain:
- Version information
- System diagnostics
- Configuration (sanitized)
- Last 1000 lines of logs (sanitized)
- Platform details
- Service status
And I should see a success message with the bundle path
And I should see a security notice about reviewing before sharing
Technical Requirements:
- CLI flag: `--support-bundle`
- Optional parameters: `--output-dir`, `--log-lines`, `--no-logs`, `--no-env`, `--no-system`
- Exit with status 0 on success, 1 on failure
- Timestamped filename format: `mcpgateway-support-YYYY-MM-DD-HHMMSS.zip`
- Display bundle size after generation
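The filename format and size display listed above could be sketched as follows. This is a minimal illustration, not the gateway's actual code: `bundle_filename` and `format_size` are hypothetical helper names, and the two-decimal B/KB/MB rendering is an assumption.

```python
from datetime import datetime


def bundle_filename(now: datetime) -> str:
    """Build a name in the mcpgateway-support-YYYY-MM-DD-HHMMSS.zip format."""
    return f"mcpgateway-support-{now.strftime('%Y-%m-%d-%H%M%S')}.zip"


def format_size(num_bytes: int) -> str:
    """Render a byte count with two decimals (unit thresholds are assumed)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} GB"
```

Whether the timestamp should use local time or UTC is not specified in this issue, so the caller passes the `datetime` explicitly.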
US-2: Support Engineer - Download Bundle via Admin UI
As a Support Engineer
I want to download a support bundle from the Admin UI
So that I can provide easy instructions to users without CLI access
Acceptance Criteria:
Given I am logged into the Admin UI
When I navigate to the Diagnostics tab
Then I should see a "Support Bundle" card with:
- Description of bundle contents
- Security notice about data sanitization
- "Download Support Bundle" button
When I click the "Download Support Bundle" button
Then a ZIP file should download to my browser
And the filename should be mcpgateway-support-YYYY-MM-DD-HHMMSS.zip
And the download should complete within 10 seconds for typical deployments
Technical Requirements:
- Located in Diagnostics/Version tab of Admin UI
- Visual design consistent with existing UI components
- Download button with loading state
- Shows CLI alternative command for reference
- Displays bundle contents checklist with green checkmarks
US-3: API Consumer - Generate Bundle via REST API
As an API Consumer
I want to generate support bundles programmatically via API
So that I can integrate diagnostics collection into monitoring/alerting systems
Acceptance Criteria:
Given I have valid authentication credentials
When I make a GET request to /admin/support-bundle/generate
Then I should receive:
- HTTP 200 status code
- Content-Type: application/zip
- Content-Disposition header with filename
- ZIP file containing sanitized diagnostics
And the request should require authentication (valid JWT or Basic Auth)
Technical Requirements:
- Endpoint: `GET /admin/support-bundle/generate`
- Query parameters: `log_lines`, `include_logs`, `include_env`, `include_system`
- Authentication required (JWT bearer token or Basic Auth)
- Response headers:
  - `Content-Type: application/zip`
  - `Content-Disposition: attachment; filename="..."`
  - `Content-Length: <size>`
  - `X-Content-Type-Options: nosniff`
US-4: Security Officer - Verify Data Sanitization
As a Security Officer
I want to ensure all sensitive data is automatically redacted
So that users can safely share support bundles without exposing credentials
Acceptance Criteria:
Given a support bundle has been generated
When I extract and review the bundle contents
Then the following should be redacted with "*****":
- Passwords (DATABASE_PASSWORD, BASIC_AUTH_PASSWORD, etc.)
- API keys (API_KEY, OPENAI_API_KEY, etc.)
- Tokens (JWT_SECRET_KEY, BEARER tokens, etc.)
- Secrets (AUTH_ENCRYPTION_SECRET, etc.)
- Database URLs with passwords (postgresql://user:PASS@host → postgresql://user:*****@host)
- Redis URLs with passwords
- JWT tokens (eyJ... patterns)
- Authorization headers
And public configuration values should remain visible:
- HOST, PORT, LOG_LEVEL
- Feature flags (UI_ENABLED, etc.)
- Transport settings
Technical Requirements:
- Regex-based pattern matching for sensitive data
- URL credential sanitization (preserve username, remove password)
- Environment variable filtering based on naming patterns
- Log line-by-line sanitization
- Configuration field exclusion (sensitive Pydantic fields)
- Test coverage for all sanitization patterns
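The sanitization requirements above could be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the shipped `SENSITIVE_PATTERNS`: the hint list, the regexes, and the function names `is_secret`, `sanitize_url`, and `sanitize_line` are all hypothetical, and passwords containing `/` or `@` are not handled by the in-line URL pattern.

```python
import re
from urllib.parse import urlsplit, urlunsplit

REDACTED = "*****"

# Name fragments that mark an environment variable as secret (assumed list).
SECRET_NAME_HINTS = ("PASSWORD", "SECRET", "TOKEN", "API_KEY", "PRIVATE_KEY")

# (pattern, replacement) pairs, applied in order to every log line.
SENSITIVE_PATTERNS = [
    # key=value / key: "value" pairs whose key contains a secret keyword
    (re.compile(r"""((?:password|passwd|token|api_key|secret)[\w-]*["']?\s*[:=]\s*["']?)([^\s"',;]+)""",
                re.IGNORECASE), r"\g<1>" + REDACTED),
    # Authorization headers carrying a bearer token
    (re.compile(r"(Bearer\s+)[A-Za-z0-9._~+/=-]+", re.IGNORECASE), r"\g<1>" + REDACTED),
    # Bare JWTs (three dot-separated base64url segments starting with eyJ)
    (re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*"), REDACTED),
    # scheme://user:password@host embedded in free text
    (re.compile(r"(://[^\s:/@]*:)[^\s@/]+(?=@)"), r"\g<1>" + REDACTED),
]


def is_secret(key: str) -> bool:
    """Return True when an environment variable name looks sensitive."""
    upper = key.upper()
    return any(hint in upper for hint in SECRET_NAME_HINTS)


def sanitize_url(url: str) -> str:
    """Replace the password in a URL, keeping username, host, and port."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:{REDACTED}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def sanitize_line(line: str) -> str:
    """Apply every pattern in turn to a single log line."""
    for pattern, replacement in SENSITIVE_PATTERNS:
        line = pattern.sub(replacement, line)
    return line
```

Matching on name fragments errs toward over-redaction (for example, a variable such as `TOKEN_EXPIRY` would also be masked), which seems the safer failure mode for a bundle that leaves the host.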
US-5: DevOps - Automate Bundle Collection on Errors
As a DevOps Engineer
I want to programmatically collect support bundles when errors occur
So that I can attach diagnostics to incident reports automatically
Acceptance Criteria:
Given I have monitoring/alerting configured
When an error threshold is exceeded
Then I can call the API to generate a support bundle
And store it in an incident management system
And attach it to the alert/ticket
Technical Requirements:
- Scriptable API endpoint
- Configurable bundle contents
- Fast generation time (< 10 seconds typical)
- Predictable error handling
- Machine-readable success/failure responses
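A monitoring hook that calls the endpoint might look like the sketch below. It uses only the standard library; `bundle_url` and `download_bundle` are hypothetical names, the base URL and token are supplied by the caller, and it assumes the endpoint accepts lowercase `true`/`false` for its boolean query parameters.

```python
import shutil
import urllib.request
from urllib.parse import urlencode


def bundle_url(base_url: str, log_lines: int = 1000, include_logs: bool = True,
               include_env: bool = True, include_system: bool = True) -> str:
    """Build the request URL with the documented query parameters."""
    query = urlencode({
        "log_lines": log_lines,
        "include_logs": str(include_logs).lower(),
        "include_env": str(include_env).lower(),
        "include_system": str(include_system).lower(),
    })
    return f"{base_url.rstrip('/')}/admin/support-bundle/generate?{query}"


def download_bundle(base_url: str, token: str, dest: str,
                    timeout: float = 30.0, **options) -> str:
    """Stream the ZIP to dest; urlopen raises HTTPError on 401/500 responses."""
    request = urllib.request.Request(
        bundle_url(base_url, **options),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

An alert handler could call `download_bundle(...)` inside a `try`/`except urllib.error.HTTPError` block and attach the resulting file to the ticket.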
🏗 Architecture
Component Overview
graph TB
subgraph "Entry Points"
A[CLI Command]
B[Admin UI Button]
C[REST API Endpoint]
end
subgraph "Support Bundle Service"
D[SupportBundleService]
E[Data Collection]
F[Sanitization Engine]
G[ZIP Generator]
end
subgraph "Data Sources"
H[Version Info]
I[System Metrics]
J[Configuration]
K[Logs]
L[Services Status]
end
A --> D
B --> C
C --> D
D --> E
E --> H
E --> I
E --> J
E --> K
E --> L
E --> F
F --> G
G --> M[ZIP File Output]
Data Flow
sequenceDiagram
participant User
participant UI/CLI/API
participant Service as SupportBundleService
participant Sanitizer as Sanitization Engine
participant FS as File System
User->>UI/CLI/API: Request support bundle
UI/CLI/API->>Service: generate_bundle(config)
Service->>Service: Collect version info
Service->>Service: Collect system metrics
Service->>Service: Collect configuration
Service->>Service: Collect logs
Service->>Sanitizer: Sanitize environment vars
Sanitizer-->>Service: Redacted env vars
Service->>Sanitizer: Sanitize log lines
Sanitizer-->>Service: Redacted logs
Service->>Sanitizer: Sanitize URLs
Sanitizer-->>Service: Redacted URLs
Service->>Service: Create manifest
Service->>Service: Generate ZIP
Service->>FS: Write ZIP file
Service-->>UI/CLI/API: Return bundle path
UI/CLI/API-->>User: Download/display bundle
📋 Implementation Tasks
Phase 1: Core Service Implementation ✅
- Create Service Module
  - Create `mcpgateway/services/support_bundle_service.py`
  - Define `SupportBundleService` class
  - Define `SupportBundleConfig` Pydantic model
  - Implement service initialization with timestamp and hostname
- Data Collection Methods
  - `_collect_version_info()`: app name, version, MCP protocol, Python version, platform
  - `_collect_system_info()`: CPU, memory, disk (using psutil if available)
  - `_collect_env_config()`: environment variables with secret filtering
  - `_collect_settings()`: Pydantic settings with sensitive field exclusion
  - `_collect_logs()`: log file reading with size limits and line tailing
- Sanitization Engine
  - Define `SENSITIVE_PATTERNS` regex list:
    - Password patterns: `password[:=]"value"`
    - Token patterns: `token[:=]"value"`
    - API key patterns: `api_key[:=]"value"`
    - Secret patterns: `secret[:=]"value"`
    - Bearer token patterns: `Bearer <token>`
    - Authorization headers
    - Database URL patterns: `postgresql://user:pass@host`
    - JWT token patterns: `eyJ...`
  - Implement `_is_secret(key)`: detect secret env var names
  - Implement `_sanitize_url(url)`: remove passwords from URLs
  - Implement `_sanitize_line(line)`: apply regex patterns to log lines
- Bundle Generation
  - Implement `_create_manifest()`: bundle metadata and warnings
  - Implement `generate_bundle()`: orchestrate collection and ZIP creation
  - Create ZIP file with timestamped filename
  - Add all collected data as JSON files
  - Add logs/ directory with sanitized logs
  - Add README.md with bundle description
  - Return Path to generated ZIP file
- Configuration
  - `SupportBundleConfig` fields:
    - `include_logs: bool = True`
    - `include_env: bool = True`
    - `include_system_info: bool = True`
    - `max_log_size_mb: float = 10.0`
    - `log_tail_lines: int = 1000` (0 = all)
    - `output_dir: Optional[Path] = None` (default: /tmp)
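The bundle generation step of Phase 1 could be approximated with the standard `zipfile` module as shown below. This is a sketch under assumptions: `create_manifest` and `write_bundle` are hypothetical stand-ins for `_create_manifest()` and `generate_bundle()`, the README text is a placeholder, and the already-sanitized data is passed in by the caller.

```python
import json
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def create_manifest(now: datetime, hostname: str, app_version: str, config: dict) -> dict:
    """Assemble bundle metadata in the shape of the MANIFEST.json example."""
    return {
        "bundle_version": "1.0",
        "generated_at": now.isoformat(),
        "hostname": hostname,
        "app_version": app_version,
        "configuration": config,
        "warning": "This bundle may contain sensitive information. Review before sharing.",
    }


def write_bundle(output_dir: Path, json_files: dict, log_lines: list, now: datetime = None) -> Path:
    """Write JSON sections, sanitized logs, and a README into a timestamped ZIP."""
    now = now or datetime.now(timezone.utc)
    path = output_dir / f"mcpgateway-support-{now:%Y-%m-%d-%H%M%S}.zip"
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in json_files.items():
            zf.writestr(name, json.dumps(data, indent=2))
        if log_lines:
            zf.writestr("logs/mcpgateway.log", "\n".join(log_lines))
        # Placeholder README content; the real text is generated by the service.
        zf.writestr("README.md", "Support bundle. Review before sharing.\n")
    return path
```

Keeping collection and sanitization outside `write_bundle` means the ZIP writer never sees unsanitized data, which makes the "no secrets in the bundle" property easier to test in isolation.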
Phase 2: CLI Integration ✅
- CLI Command Implementation
  - Add `--support-bundle` flag to `mcpgateway/cli.py`
  - Implement `_handle_support_bundle()` function
  - Parse command-line options:
    - `--output-dir <path>`
    - `--log-lines <n>`
    - `--no-logs`
    - `--no-env`
    - `--no-system`
  - Call `SupportBundleService.generate_bundle()`
  - Display success message with bundle path and size
  - Display security notice
  - Exit with proper status codes (0=success, 1=failure)
- Error Handling
  - Catch and display user-friendly error messages
  - Handle permission errors (output directory)
  - Handle disk space errors
  - Handle log file access errors
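The option parsing and exit-code behaviour described in Phase 2 could be sketched with `argparse` as follows. Whether `mcpgateway/cli.py` actually uses `argparse` is an assumption, and `build_parser`, `options_to_config`, and `main` are hypothetical names; the `generate` callable stands in for `SupportBundleService.generate_bundle()`.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented support-bundle flags."""
    parser = argparse.ArgumentParser(prog="mcpgateway")
    parser.add_argument("--support-bundle", action="store_true")
    parser.add_argument("--output-dir", default=None)
    parser.add_argument("--log-lines", type=int, default=1000)
    parser.add_argument("--no-logs", action="store_true")
    parser.add_argument("--no-env", action="store_true")
    parser.add_argument("--no-system", action="store_true")
    return parser


def options_to_config(args: argparse.Namespace) -> dict:
    """Map the negative CLI flags onto the positive config fields."""
    return {
        "output_dir": args.output_dir,
        "log_tail_lines": args.log_lines,
        "include_logs": not args.no_logs,
        "include_env": not args.no_env,
        "include_system_info": not args.no_system,
    }


def main(argv: list, generate) -> int:
    """Return 0 on success and 1 on failure, as the requirements ask."""
    args = build_parser().parse_args(argv)
    if not args.support_bundle:
        return 0
    try:
        path = generate(options_to_config(args))
    except OSError as exc:  # covers permission, disk space, and log access errors
        print(f"Failed to create support bundle: {exc}", file=sys.stderr)
        return 1
    print(f"Support bundle created: {path}")
    print("Security notice: review the bundle before sharing.")
    return 0
```

Passing `generate` in as a parameter keeps the sketch testable without the real service; a real entry point would finish with `sys.exit(main(sys.argv[1:], ...))`.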
Phase 3: API Endpoint ✅
- REST API Endpoint
  - Add route: `GET /admin/support-bundle/generate`
  - Implement `admin_generate_support_bundle()` handler in `mcpgateway/admin.py`
  - Query parameters:
    - `log_lines: int = 1000`
    - `include_logs: bool = True`
    - `include_env: bool = True`
    - `include_system: bool = True`
  - Require authentication: `user=Depends(get_current_user_with_permissions)`
  - Generate bundle in temporary directory
  - Read ZIP file contents
  - Return Response with:
    - `content: bytes` (ZIP file)
    - `media_type: "application/zip"`
    - `headers`: Content-Disposition, Content-Length, X-Content-Type-Options
  - Clean up temporary file after response
- Error Responses
  - HTTP 401 if not authenticated
  - HTTP 500 if generation fails
  - Include error message in JSON response
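The "generate in a temporary directory, read, respond, clean up" sequence from Phase 3 can be illustrated without any web framework. In this sketch `bundle_response_parts` is a hypothetical helper and `generate` stands in for the service call; a FastAPI handler would pass the returned bytes and headers to its `Response`.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, Tuple


def bundle_response_parts(generate: Callable[[Path], Path]) -> Tuple[bytes, Dict[str, str]]:
    """Generate a bundle in a temp dir and return (body, headers) for the response."""
    # The directory and the ZIP inside it are removed when the block exits,
    # so the bytes must be read before leaving the context manager.
    with tempfile.TemporaryDirectory() as tmp:
        bundle_path = generate(Path(tmp))
        content = bundle_path.read_bytes()
        filename = bundle_path.name
    headers = {
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(len(content)),
        "X-Content-Type-Options": "nosniff",
    }
    return content, headers
```

Reading the whole ZIP into memory is simple but ties peak memory to bundle size; a streamed response would be an alternative if bundles grow large.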
Phase 4: Admin UI Integration ✅
- UI Component
  - Add support bundle card to `mcpgateway/templates/version_info_partial.html`
  - Create card with sections:
    - Header: "Troubleshooting Support"
    - Description of bundle contents
    - Bundle contents checklist (6 items with checkmarks)
    - Security notice (yellow warning box)
    - Download button (prominent, centered)
    - CLI alternative command (code block)
  - Style with Tailwind CSS classes (consistent with existing UI)
  - Download button links to: `/admin/support-bundle/generate?log_lines=1000`
  - Add SVG icons (download icon, checkmarks, warning icon)
- Dark Mode Support
  - Use Tailwind dark mode classes: `dark:bg-gray-800`, `dark:text-gray-200`
  - Test in both light and dark themes
Phase 5: Testing ✅
- Unit Tests (`tests/unit/mcpgateway/services/test_support_bundle_service.py`)
  - Test service initialization
  - Test `_is_secret()` detection (10+ cases)
  - Test `_sanitize_url()` with various URL formats
  - Test `_sanitize_line()` with various patterns
  - Test `_collect_version_info()` structure
  - Test `_collect_system_info()` structure
  - Test `_collect_env_config()` sanitization
  - Test `_collect_settings()` field exclusion
  - Test `_collect_logs()` with missing files
  - Test `_create_manifest()` structure
  - Test `generate_bundle()` creates valid ZIP
  - Test ZIP contents (files present)
  - Test custom configuration (exclusions)
  - Test convenience function `create_support_bundle()`
  - Test end-to-end sanitization in bundle
  - Target: 15+ tests, 90%+ coverage
- Integration Tests
  - Test CLI command execution
  - Test API endpoint (authenticated request)
  - Test API endpoint (unauthenticated request → 401)
  - Test bundle download via browser simulation
- Edge Cases
  - Test with log file > max_log_size_mb (should warn/skip)
  - Test with missing log directory
  - Test with read-only output directory (should fail gracefully)
  - Test with 0 log lines (all logs)
  - Test with all features disabled (minimal bundle)
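Several of the edge cases above concern log tailing. A minimal sketch of that behaviour, assuming a hypothetical `tail_log` helper and placeholder messages that are not taken from the real service, could look like this:

```python
from collections import deque
from pathlib import Path


def tail_log(path: Path, tail_lines: int = 1000, max_size_mb: float = 10.0) -> list:
    """Return the last tail_lines lines of a log file (0 means all lines)."""
    if not path.is_file():
        return [f"[log file not found: {path.name}]"]
    size_mb = path.stat().st_size / (1024 * 1024)
    if size_mb > max_size_mb:
        # Oversized files are skipped with a note instead of being read.
        return [f"[log file skipped: {size_mb:.1f} MB exceeds {max_size_mb} MB limit]"]
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        # deque(maxlen=None) is unbounded, so tail_lines=0 keeps every line.
        window = deque(handle, maxlen=tail_lines or None)
    return [line.rstrip("\n") for line in window]
```

Because the `deque` discards older lines as it reads, memory stays proportional to `tail_lines` rather than to the file size, although the whole file is still scanned once.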
Phase 6: Documentation ✅
- User Documentation (`CLAUDE.md`)
  - Add "Generating Support Bundles" section
  - Document CLI usage with examples
  - Document API endpoint with curl examples
  - Document Admin UI location
  - List bundle contents
  - Highlight security features (automatic sanitization)
- Code Documentation
  - Comprehensive docstrings with examples
  - Doctests in service methods
  - Type hints for all functions
  - README.md in bundle (generated content)
- API Documentation
  - OpenAPI/Swagger docs for `/admin/support-bundle/generate` endpoint
  - Parameter descriptions
  - Response schema
Phase 7: Quality Assurance ✅
- Code Quality
  - Run `make autoflake isort black` (formatting)
  - Run `make flake8` (linting, pass with 0 errors)
  - Run `make pylint` (static analysis)
  - Run `make doctest` (verify docstring examples)
  - Pass `make verify` (comprehensive checks)
- Security Review
  - Verify all SENSITIVE_PATTERNS catch real-world patterns
  - Test with actual production .env files (redacted)
  - Verify no sensitive data leaks in any scenario
  - Review for path traversal vulnerabilities
  - Review for zip bomb vulnerabilities
- Performance Testing
  - Measure bundle generation time (target: < 10 seconds)
  - Test with large log files (100MB+)
  - Test with many environment variables (1000+)
  - Verify memory usage is bounded
⚙️ CLI Usage Examples
Basic Usage
# Generate with default settings
mcpgateway --support-bundle
# Custom output directory
mcpgateway --support-bundle --output-dir /var/tmp
# Limit log lines
mcpgateway --support-bundle --log-lines 500
# Exclude components
mcpgateway --support-bundle --no-logs
mcpgateway --support-bundle --no-env --no-system
# Get all logs (no limit)
mcpgateway --support-bundle --log-lines 0
API Usage
# Using curl with JWT token
export TOKEN="your-jwt-token"
curl -H "Authorization: Bearer $TOKEN" \
"http://localhost:4444/admin/support-bundle/generate?log_lines=1000" \
-o support-bundle.zip
# Using curl with Basic Auth
curl -u admin:password \
"http://localhost:4444/admin/support-bundle/generate" \
-o support-bundle.zip
# Customized bundle
curl -H "Authorization: Bearer $TOKEN" \
"http://localhost:4444/admin/support-bundle/generate?log_lines=500&include_system=true" \
-o support-bundle.zip
Python Usage
from pathlib import Path

from mcpgateway.services.support_bundle_service import (
    SupportBundleService,
    SupportBundleConfig,
    create_support_bundle,
)

# Using convenience function
bundle_path = create_support_bundle()
print(f"Bundle created: {bundle_path}")

# Using service with custom config
config = SupportBundleConfig(
    output_dir=Path("/tmp"),
    log_tail_lines=500,
    include_logs=True,
    include_env=True,
    include_system_info=True,
    max_log_size_mb=20.0,
)
service = SupportBundleService()
bundle_path = service.generate_bundle(config)
print(f"Bundle created: {bundle_path}")
📦 Bundle Structure
mcpgateway-support-2025-01-09-120000.zip
├── MANIFEST.json          # Bundle metadata and warnings
├── README.md              # Usage instructions
├── version.json           # App version, Python, FastAPI versions
├── system_info.json       # CPU, memory, disk, platform details
├── settings.json          # Application settings (sanitized)
├── environment.json       # Environment variables (secrets redacted)
└── logs/
    └── mcpgateway.log     # Application logs (sanitized)
MANIFEST.json
{
  "bundle_version": "1.0",
  "generated_at": "2025-01-09T12:00:00+00:00",
  "hostname": "mcp-gateway-prod-01",
  "app_version": "0.8.0",
  "configuration": {
    "include_logs": true,
    "include_env": true,
    "include_system_info": true,
    "log_tail_lines": 1000
  },
  "warning": "This bundle may contain sensitive information. Review before sharing."
}
✅ Success Criteria
- Functionality
  - CLI command generates valid ZIP bundles
  - API endpoint returns downloadable ZIP files
  - Admin UI button initiates bundle download
  - All three methods produce identical bundle structure
- Security
  - 100% of tested secret patterns are redacted
  - No actual passwords, tokens, or API keys in generated bundles
  - Security notice displayed/included in all interfaces
  - Bundle README includes security warning
- Performance
  - Bundle generation completes in < 10 seconds for typical deployments
  - Memory usage bounded (< 100MB for generation)
  - Works with large log files (100MB+)
- Usability
  - One-command CLI usage
  - One-click UI download
  - Clear error messages on failures
  - Timestamped filenames for easy identification
- Quality
  - 15+ unit tests with 90%+ coverage
  - Pass all linting/formatting checks
  - Comprehensive documentation
  - Zero security vulnerabilities
🏁 Definition of Done
- `SupportBundleService` implemented with all collection methods
- Sanitization engine with 8+ regex patterns
- CLI command `--support-bundle` with optional parameters
- API endpoint `GET /admin/support-bundle/generate`
- Admin UI card in Diagnostics tab with download button
- 15+ unit tests with 90%+ coverage
- All tests passing (pytest)
- Code passes `make verify` checks
- Documentation updated (CLAUDE.md)
- Security review completed (no sensitive data leaks)
- Performance benchmarked (< 10 seconds)
- Manual testing on dev/staging environment
- Team review and approval
📝 Additional Notes
🔹 Security Considerations
Automatic Sanitization:
- Regex patterns cover common secret naming conventions
- URL parsing removes passwords while preserving connection info
- Environment variable filtering based on naming patterns
- Log line-by-line sanitization for in-line credentials
Residual Risk:
- Custom/unusual secret naming may not be caught
- Secrets in freeform text (comments, descriptions) may leak
- Users should still review bundles before sharing publicly
Mitigation:
- Clear warnings in UI, CLI, and bundle README
- Documentation emphasizes review before sharing
- Consider allow-list approach for production deployments
🔹 Performance Optimization
Bundle Generation Time:
- Version info: < 1ms (in-memory)
- System metrics: < 100ms (psutil calls)
- Configuration: < 10ms (Pydantic serialization)
- Environment: < 10ms (dictionary filtering)
- Logs: < 5 seconds (depends on file size, tail optimization)
- ZIP creation: < 1 second (compression)
- Total: 3-10 seconds typical
Memory Usage:
- Stream log files instead of loading entirely (when possible)
- Use iterative ZIP writing (no double-buffering)
- Bound log tail lines (default 1000)
- Peak memory: 50-100MB typical
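The streaming and iterative ZIP writing mentioned above can be done with `ZipFile.open(..., "w")`, which exposes a writable handle for a single archive member. The sketch below assumes a hypothetical `add_log_streaming` helper and takes the sanitizer as a parameter; it is not the service's actual implementation.

```python
import zipfile
from pathlib import Path
from typing import Callable


def add_log_streaming(zf: zipfile.ZipFile, log_path: Path, arcname: str,
                      sanitize: Callable[[str], str]) -> int:
    """Copy a log into the archive line by line, sanitizing as it goes."""
    count = 0
    with log_path.open("r", encoding="utf-8", errors="replace") as src, \
            zf.open(arcname, "w") as dst:
        for line in src:
            # Only one line is held in memory at a time.
            dst.write(sanitize(line).encode("utf-8"))
            count += 1
    return count
```

For members that may exceed 2 GiB, `zf.open(arcname, "w", force_zip64=True)` would be needed; that is unlikely for logs capped by `max_log_size_mb`.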
🔹 Future Enhancements
- Selective Bundling: UI checkboxes to include/exclude components
- Bundle History: Store last N bundles, allow re-download
- Scheduled Bundles: Cron/timer to generate bundles periodically
- Incident Integration: Auto-attach bundles to incident tickets
- Bundle Analysis: Built-in diagnostics scanner (log error patterns, config issues)
- Multi-Server Bundles: Aggregate bundles from federated gateways
- Custom Sanitization Rules: User-defined regex patterns via config
- Compression Levels: Configurable ZIP compression (speed vs size)
🔹 Testing Strategy
Unit Tests (15+):
- Individual method testing (collection, sanitization)
- Edge case coverage (missing files, permissions, etc.)
- Configuration validation
Integration Tests:
- CLI end-to-end (run command, verify ZIP)
- API endpoint (auth, download, error responses)
- UI interaction (Playwright/Selenium)
Manual Testing Checklist:
- Generate bundle in development environment
- Extract and review all files
- Verify no actual secrets present
- Test with large log files
- Test with missing log directory
- Test with read-only filesystem
- Test CLI from different directories
- Test API with invalid auth
- Test UI download in Chrome, Firefox, Safari
🔗 Related Issues
- #1191: [Feature]: Content Limit Plugin - Resource Exhaustion Protection (pattern for epic structure)
📊 Implementation Progress
Estimated Effort
- Phase 1 (Core Service): 4-6 hours
- Phase 2 (CLI): 1-2 hours
- Phase 3 (API): 1-2 hours
- Phase 4 (UI): 2-3 hours
- Phase 5 (Testing): 3-4 hours
- Phase 6 (Documentation): 1-2 hours
- Phase 7 (QA): 2-3 hours
Total: 14-22 hours (2-3 days)
Risks & Mitigation
| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| Sensitive data leaks | High | Low | Comprehensive test cases, security review |
| Performance issues with large logs | Medium | Medium | Log tailing, size limits, streaming |
| Platform-specific errors | Medium | Medium | Cross-platform testing, psutil fallbacks |
| User confusion | Low | Medium | Clear documentation, in-app guidance |
🎯 Acceptance Testing
Test Scenario 1: Happy Path
Given I am a platform admin with CLI access
When I run: mcpgateway --support-bundle --output-dir /tmp
Then I see: "✅ Support bundle created: /tmp/mcpgateway-support-2025-01-09-120000.zip"
And I see: "📦 Bundle size: 9.27 KB"
And I see: "⚠️ Security Notice: The bundle has been sanitized..."
And the ZIP file contains: MANIFEST.json, version.json, system_info.json, settings.json, environment.json, logs/mcpgateway.log, README.md
And no sensitive data is present in any file
Test Scenario 2: API Download
Given I have a valid JWT token
When I make a GET request to /admin/support-bundle/generate
Then I receive HTTP 200 with Content-Type: application/zip
And the response body is a valid ZIP file
And the Content-Disposition header includes a timestamped filename
Test Scenario 3: Sanitization Verification
Given my .env file contains DATABASE_PASSWORD=secret123
And my logs contain "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
When I generate a support bundle
Then environment.json shows DATABASE_PASSWORD: "*****"
And logs/mcpgateway.log shows "Bearer *****"
And no occurrence of "secret123" or "eyJhbGciOi" exists in any file
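The last step of Test Scenario 3 could be automated with a small scanner like the one below. `find_leaks` is a hypothetical helper, not part of the gateway; it checks every archive member for known test secrets as raw bytes.

```python
import zipfile


def find_leaks(bundle, forbidden: list) -> list:
    """Return (member name, secret) pairs for every forbidden string found."""
    leaks = []
    with zipfile.ZipFile(bundle) as zf:
        for name in zf.namelist():
            data = zf.read(name)
            for needle in forbidden:
                if needle.encode("utf-8") in data:
                    leaks.append((name, needle))
    return leaks
```

A test would then assert that `find_leaks(bundle_path, ["secret123", "eyJhbGciOi"])` returns an empty list. This only catches the specific strings it is given, so it complements rather than replaces a manual review.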