REF_READ_ONLY_DATABASE: provider.configure() still writes to the DB at startup #31

@lewisjared

Split-off from #29.

Summary

When REF_READ_ONLY_DATABASE=true the API opens the DB via sqlite:///file:<path>?mode=ro&immutable=1&uri=true, but startup crashes before the first request because the provider registry still tries to register diagnostics.

DEBUG | climate_ref_core.providers:configure:82 - Configuring provider esmvaltool ...
...
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) attempt to write a readonly database
[SQL: INSERT INTO diagnostic (slug, name, provider_id, enabled) VALUES (?, ?, ?, ?) RETURNING id, created_at, updated_at]
[parameters: ('ozone-annual-cycle', 'Ozone Diagnostics', 1, 1)]

So today REF_READ_ONLY_DATABASE only changes the connection string; the startup code path still mutates the DB. Until this is fixed, deployments can't actually mount /ref read-only, which was the whole point of the flag.
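
For anyone reproducing this outside the API, the failure mode can be shown with the standard-library sqlite3 module alone. This is a minimal sketch, not the backend's code: the single-column diagnostic table and the temp-file path are assumptions made for illustration, and only the mode=ro&immutable=1 URI parameters come from the connection string above.

```python
import os
import sqlite3
import tempfile

# Create a small writable database first (a stand-in for the real REF database).
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
rw = sqlite3.connect(db_path)
rw.execute("CREATE TABLE diagnostic (id INTEGER PRIMARY KEY, slug TEXT)")
rw.commit()
rw.close()

# Reopen it with the same URI parameters the read-only flag uses.
ro = sqlite3.connect(f"file:{db_path}?mode=ro&immutable=1", uri=True)

# Reads work as usual.
rows = ro.execute("SELECT slug FROM diagnostic").fetchall()

# Any write is rejected by SQLite itself, whatever the caller intended.
try:
    ro.execute("INSERT INTO diagnostic (slug) VALUES ('ozone-annual-cycle')")
    write_error = None
except sqlite3.OperationalError as exc:
    write_error = str(exc)
finally:
    ro.close()

print(rows, write_error)
```

The connection string does its job; the problem is only that startup still issues an INSERT against it.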

Where

backend/src/ref_backend/core/ref.py::get_provider_registry:

def get_provider_registry(ref_config: Config) -> ProviderRegistry:
    database = get_database(ref_config)
    return ProviderRegistry.build_from_config(ref_config, database)

ProviderRegistry.build_from_config calls provider.configure(config) per provider (in climate_ref_core.providers), which upserts diagnostic rows.

Proposed fix

Pick one, not all:

  1. Skip registration in read-only mode. If ref_config.db.read_only (or the equivalent flag) is set, construct the registry by loading existing diagnostics out of the DB instead of calling provider.configure(). The API only needs to read what the CLI/workers registered; the writable path stays for ref providers setup and the workers.
  2. Split provider.configure() into two phases: a read path (hydrate the registry from the DB) and a write path (register/update diagnostic rows). The API uses the read path; workers + CLI continue to use the write path.
  3. Idempotent upsert that tolerates a read-only session. This is the weakest option, because it still writes on a fresh DB. It is only viable combined with an "assume-already-registered" short-circuit when the session is read-only.

(1) is the cleanest: registration becomes an explicit operator action (ref providers setup), and the API stays a pure reader.
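
A rough sketch of what (1) could look like, using plain sqlite3 so it runs standalone. Every name here (DiagnosticRecord, register_diagnostics, load_diagnostics, build_registry) is hypothetical, and the simplified table schema is an assumption; the real change would live in ProviderRegistry.build_from_config and use the existing SQLAlchemy models.

```python
import os
import sqlite3
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticRecord:
    """Hypothetical in-memory view of one row in the diagnostic table."""

    slug: str
    name: str
    provider_id: int
    enabled: bool


def register_diagnostics(conn, diagnostics):
    """Write path: stand-in for what provider.configure() does today."""
    conn.executemany(
        "INSERT OR REPLACE INTO diagnostic (slug, name, provider_id, enabled) "
        "VALUES (?, ?, ?, ?)",
        [(d.slug, d.name, d.provider_id, int(d.enabled)) for d in diagnostics],
    )
    conn.commit()


def load_diagnostics(conn):
    """Read path: hydrate the registry from rows that already exist."""
    rows = conn.execute(
        "SELECT slug, name, provider_id, enabled FROM diagnostic ORDER BY slug"
    ).fetchall()
    return [DiagnosticRecord(s, n, p, bool(e)) for s, n, p, e in rows]


def build_registry(conn, declared, read_only):
    """Only register when the database is writable; always read back from the DB."""
    if not read_only:
        register_diagnostics(conn, declared)
    return load_diagnostics(conn)


# Operator step (writable): roughly what `ref providers setup` would do.
db_path = os.path.join(tempfile.mkdtemp(), "ref.db")
declared = [DiagnosticRecord("ozone-annual-cycle", "Ozone Diagnostics", 1, True)]
rw = sqlite3.connect(db_path)
rw.execute(
    "CREATE TABLE diagnostic ("
    "slug TEXT PRIMARY KEY, name TEXT, provider_id INTEGER, enabled INTEGER)"
)
writable_registry = build_registry(rw, declared, read_only=False)
rw.close()

# API startup (read-only): no INSERT is issued, the registry comes from the DB.
ro = sqlite3.connect(f"file:{db_path}?mode=ro&immutable=1", uri=True)
api_registry = build_registry(ro, declared, read_only=True)
ro.close()
```

One consequence of this shape: on a DB where ref providers setup has not been run, the API would start with an empty registry rather than crash, so a clear log message or startup check for that case may be worth adding.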

Acceptance

  • With a DB already populated by ref db migrate + ref providers setup, the API starts cleanly when:
    • REF_READ_ONLY_DATABASE=true
    • the /ref volume is mounted readOnly: true
  • Existing writable-mode behavior unchanged: ref providers setup still registers diagnostics, workers still register on first start.
  • Ideally: a test similar to tests/test_core/test_ref.py::test_get_database_read_only_rejects_writes that starts the full provider registry against a read-only DB and asserts no write is attempted.
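
One possible shape for that test, sketched with sqlite3 only. hydrate_registry is a hypothetical stand-in for the real registry startup path, and the one-column schema is an assumption; a real test would call get_provider_registry against a read-only Config and would likely hook SQLAlchemy events rather than the sqlite3 trace callback used here.

```python
import sqlite3
from pathlib import Path

# Statement prefixes treated as writes for the purpose of this check.
WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "DROP", "ALTER")


def hydrate_registry(conn):
    """Hypothetical stand-in for the read-only registry startup path."""
    return [row[0] for row in conn.execute("SELECT slug FROM diagnostic ORDER BY slug")]


def test_registry_startup_attempts_no_writes(tmp_path: Path) -> None:
    # Seed a database the way the CLI would, using a writable connection.
    db_file = tmp_path / "ref.db"
    seed = sqlite3.connect(db_file)
    seed.execute("CREATE TABLE diagnostic (slug TEXT PRIMARY KEY)")
    seed.execute("INSERT INTO diagnostic VALUES ('ozone-annual-cycle')")
    seed.commit()
    seed.close()

    # Reopen read-only and record every statement SQLite is asked to run.
    conn = sqlite3.connect(f"file:{db_file}?mode=ro&immutable=1", uri=True)
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        slugs = hydrate_registry(conn)
    finally:
        conn.close()

    # The registry sees the seeded row, and nothing resembling a write was issued.
    assert slugs == ["ozone-annual-cycle"]
    writes = [s for s in statements if s.lstrip().upper().startswith(WRITE_PREFIXES)]
    assert writes == []
```

Under pytest, tmp_path is the built-in temporary-directory fixture. Checking the recorded statements, rather than only checking that startup does not raise, also catches writes that are attempted but swallowed by an exception handler.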

Context

Deployed chart: climate-ref-aft 0.1.0 (PR Climate-REF/climate-ref-aft#7), image ghcr.io/climate-ref/climate-ref-frontend:v0.3.0. Full trail in #29.
