Skip to content

Add gh devlake query command with extensible query engine #62

@ewega

Description

@ewega

Problem

Users can only view DevLake's aggregated data through Grafana dashboards. There is no CLI-native way to retrieve DORA metrics, Copilot usage data, or pipeline status in a structured format. This blocks:

  1. Agents ΓÇö coding agents cannot access DevLake data without scraping Grafana
  2. Scripting ΓÇö CI/CD pipelines and automation cannot pull metrics without the Grafana API
  3. AI features ΓÇö the planned Copilot SDK integration (Add gh devlake query command with extensible query engine #62) needs a programmatic data layer to feed custom tools

Proposed Solution

Add a gh devlake query command with pre-canned metric subcommands. Design the underlying data layer as an extensible query engine so future subcommands (or custom queries) can be added without rewriting the foundation.

Command surface

gh devlake query dora    --project my-team [--timeframe 30d] [--json]
gh devlake query copilot --project my-team [--timeframe 30d] [--json]
gh devlake query pipelines --project my-team [--limit 20] [--json]

Default output is JSON (since this is a data retrieval command). --format table for human-readable output.

Architecture ΓÇö extensible query layer

Rather than hardcoding SQL in each subcommand, build an internal query abstraction that pre-canned commands use and that can be extended later:

internal/
  query/
    engine.go       # QueryEngine: holds DB connection, executes QueryDef
    types.go        # QueryDef: name, description, SQL template, parameters, output schema
    registry.go     # Built-in query registry (DORA, Copilot, pipelines)
    dora.go         # DORA metric queries
    copilot.go      # Copilot usage queries
    pipelines.go    # Pipeline status queries

QueryDef type

// QueryDef describes a reusable, parameterized query against DevLake's domain layer.
type QueryDef struct {
    Name        string            // e.g. "dora_metrics"
    Description string            // human-readable description
    SQL         string            // parameterized SQL template
    Params      []QueryParam      // declared parameters with types and defaults
    OutputCols  []string          // column names in result set
}

type QueryParam struct {
    Name     string
    Type     string // "string", "int", "duration"
    Required bool
    Default  string
}

Registry

var registry = map[string]*QueryDef{
    "dora":      doraQuery,
    "copilot":   copilotQuery,
    "pipelines": pipelinesQuery,
}

// Future: load custom QueryDefs from a YAML/JSON file
// Future: gh devlake query custom --file my-query.yaml

This design ensures:

  • Pre-canned queries are the starting point ΓÇö users get value immediately
  • New queries are additive ΓÇö add a QueryDef to the registry, wire a subcommand
  • Custom queries are a natural extension ΓÇö load QueryDef from user-supplied files in a future version
  • No spaghetti ΓÇö SQL lives in typed structs, not scattered across command handlers

Data source: direct DB connection

The query engine connects directly to DevLake's MySQL database using connection parameters from:

  1. --db-url flag (explicit)
  2. .devlake-local.json or .devlake-azure.json state file (auto-detected)
  3. DB_URL environment variable

Queries run read-only against DevLake's domain layer tables (not tool-layer or raw-layer). The domain layer is the standardized, cross-plugin data model ΓÇö see apache/incubator-devlake/backend/core/models/domainlayer/.

DORA metrics query (example)

Source: the DORA Grafana dashboards in apache/incubator-devlake/grafana/dashboards/ contain the SQL queries that power the dashboard panels. These are the reference implementation for what the pre-canned dora query should return.

Output:

{
  "project": "my-team",
  "timeframe": "30d",
  "metrics": {
    "deploymentFrequency": { "value": 4.2, "unit": "per_week", "rating": "high" },
    "leadTimeForChanges": { "value": 2.3, "unit": "hours", "rating": "elite" },
    "changeFailureRate": { "value": 0.08, "unit": "ratio", "rating": "high" },
    "meanTimeToRestore": { "value": 1.5, "unit": "hours", "rating": "elite" }
  }
}

Copilot usage query (example)

Source: the Copilot plugin tables in the DevExpGBB fork ΓÇö _tool_gh_copilot_* tables store seat counts, acceptance rates, language breakdowns, and editor usage. The Grafana dashboards for Copilot adoption (in the fork's grafana/dashboards/ directory) contain reference SQL.

Output:

{
  "project": "my-team",
  "timeframe": "30d",
  "metrics": {
    "totalSeats": 45,
    "activeUsers": 38,
    "acceptanceRate": 0.34,
    "topLanguages": [
      { "language": "TypeScript", "acceptances": 1200, "suggestions": 3500 },
      { "language": "Python", "acceptances": 800, "suggestions": 2100 }
    ],
    "topEditors": [
      { "editor": "vscode", "users": 30 },
      { "editor": "jetbrains", "users": 8 }
    ]
  }
}

Files to create/modify

File Change
internal/query/engine.go NEW ΓÇö QueryEngine with DB connection and execute method
internal/query/types.go NEW ΓÇö QueryDef, QueryParam, QueryResult types
internal/query/registry.go NEW ΓÇö built-in query registry
internal/query/dora.go NEW ΓÇö DORA metric query definitions
internal/query/copilot.go NEW ΓÇö Copilot usage query definitions
internal/query/pipelines.go NEW ΓÇö Pipeline status query definitions
cmd/query.go NEW ΓÇö parent query command
cmd/query_dora.go NEW ΓÇö query dora subcommand
cmd/query_copilot.go NEW ΓÇö query copilot subcommand
cmd/query_pipelines.go NEW ΓÇö query pipelines subcommand
go.mod Add MySQL driver dependency

Acceptance Criteria

  • gh devlake query dora --project my-team returns DORA metrics as JSON
  • gh devlake query copilot --project my-team returns Copilot usage as JSON
  • gh devlake query pipelines --project my-team returns recent pipeline runs
  • --format table renders human-readable output
  • --timeframe flag controls the date range (default: 30d)
  • DB connection is auto-detected from state files
  • Queries run read-only against domain layer tables
  • QueryDef abstraction is in place ΓÇö adding a new query type = add a struct + subcommand
  • No hardcoded SQL in command handlers ΓÇö all SQL lives in internal/query/
  • go build ./... and go test ./... pass
  • README updated with query commands

Future Extensions (not in scope for this issue)

  • gh devlake query custom --file my-query.yaml ΓÇö load user-defined QueryDefs
  • gh devlake query --sql "SELECT ..." ΓÇö raw SQL for power users (with safety boundaries)
  • Query result caching for expensive aggregations
  • Time-series output format for trend visualization

Target Version

v0.5.0 ΓÇö Foundation for AI features and data-driven CLI.

Dependencies

References

  • DevLake domain layer models: apache/incubator-devlake/backend/core/models/domainlayer/ ΓÇö the standardized tables (cicd_deployments, pull_requests, issues, etc.) that queries target
  • DevLake three-layer data model: AGENTS.md ΓÇö raw ΓåÆ tool ΓåÆ domain layer architecture. Queries should only read from the domain layer.
  • DORA dashboard SQL: apache/incubator-devlake/grafana/dashboards/ ΓÇö Grafana panel SQL is the reference implementation for DORA metric calculations
  • Copilot plugin: DevExpGBB/incubator-devlake fork ΓÇö _tool_gh_copilot_* tables and associated Grafana dashboards contain the SQL for Copilot usage metrics
  • internal/devlake/client.go ΓÇö existing generic helpers (doGet, doPost) for API-based alternatives
  • internal/devlake/state.go ΓÇö state file parsing for DB URL auto-detection

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions