Quantum Clifford Circuit Synthesis Environment#506

Open
y-richie-y wants to merge 2 commits into PufferAI:3.0 from y-richie-y:clifford-ocean-pr

This PR adds a native Clifford synthesis env to pufferlib.ocean, plus a small reference env used for correctness tests and native/reference parity checks.

Clifford synthesis is a task in quantum computing: given a target Clifford tableau, find a sequence of Clifford gates that implements it.

This can be framed as a reinforcement learning problem where the agent learns to transform a tableau to the identity tableau.

         Initial tableau                          Identity tableau
        x0 x1 x2 | z0 z1 z2                     x0 x1 x2 | z0 z1 z2
      +-------------------+                    +-------------------+
  r0  | 1  0  0 | 0  0  1 |                    | 1  0  0 | 0  0  0 |
  r1  | 0  0  0 | 0  1  0 |                    | 0  1  0 | 0  0  0 |
  r2  | 0  0  1 | 1  0  1 |  --CZ(0,2),        | 0  0  1 | 0  0  0 |
      |---------+---------|      S(2), H(1)->  |---------+---------|
  r3  | 0  0  0 | 1  0  0 |                    | 0  0  0 | 1  0  0 |
  r4  | 0  1  0 | 0  0  0 |                    | 0  0  0 | 0  1  0 |
  r5  | 0  0  0 | 0  0  1 |                    | 0  0  0 | 0  0  1 |
      +-------------------+                    +-------------------+
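
The reduction in the diagram can be reproduced with three column operations. This is a minimal NumPy sketch, not the PR's implementation: it assumes the column layout [x0..x(n-1) | z0..z(n-1)] shown above, ignores phases, and the helper names `apply_h`, `apply_s`, and `apply_cz` are illustrative only.

```python
import numpy as np

def apply_h(t, q):
    """H on qubit q: swap the x_q and z_q columns."""
    n = t.shape[1] // 2
    t[:, [q, n + q]] = t[:, [n + q, q]]

def apply_s(t, q):
    """S on qubit q: add the x_q column into the z_q column (mod 2)."""
    n = t.shape[1] // 2
    t[:, n + q] ^= t[:, q]

def apply_cz(t, a, b):
    """CZ on qubits a, b: z_a picks up x_b and z_b picks up x_a (mod 2)."""
    n = t.shape[1] // 2
    t[:, n + a] ^= t[:, b]
    t[:, n + b] ^= t[:, a]

# Initial tableau from the diagram (rows r0..r5, columns x0 x1 x2 | z0 z1 z2).
initial = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
], dtype=np.uint8)

tableau = initial.copy()
apply_cz(tableau, 0, 2)
apply_s(tableau, 2)
apply_h(tableau, 1)

# True when the residual has been reduced to the identity tableau.
solved = np.array_equal(tableau, np.eye(6, dtype=np.uint8))
```

Under these update rules, the three gates from the diagram take the initial tableau to the identity, so `solved` is `True`.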

For related RL-based Clifford synthesis work, see Kremer et al., Practical and efficient quantum circuit synthesis and transpiling with Reinforcement Learning (arXiv:2405.13196, 2024): https://arxiv.org/abs/2405.13196

The env models synthesis over binary symplectic residuals:

  • observation: flattened 2n x 2n binary residual matrix
  • action space: H, S, V, HS, HV on each qubit, plus CZ(i, j) for each unordered qubit pair
  • reward: -single_qubit_cost for single-qubit gates, -1 for CZ, and an optional goal_bonus on reaching the identity. There is also a Hamming-weight penalty, which works well in practice
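
The action and reward structure above can be sketched as follows. The ordering produced by `build_actions` and both function names are assumptions for illustration; the PR's actual deterministic ordering may differ, and the Hamming-weight penalty is left out because its exact form is not given here.

```python
from itertools import combinations

SINGLE_QUBIT_GATES = ("H", "S", "V", "HS", "HV")

def build_actions(n_qubits):
    """List every (gate, qubits) action: 5 per qubit plus one CZ per unordered pair."""
    actions = [(gate, (q,)) for gate in SINGLE_QUBIT_GATES for q in range(n_qubits)]
    actions += [("CZ", pair) for pair in combinations(range(n_qubits), 2)]
    return actions

def step_reward(gate, solved, single_qubit_cost=0.01, goal_bonus=0.0):
    """Gate cost as a negative reward, plus the optional bonus on reaching identity."""
    cost = 1.0 if gate == "CZ" else single_qubit_cost
    return -cost + (goal_bonus if solved else 0.0)

actions = build_actions(6)
n_actions = len(actions)  # 5 * 6 single-qubit actions + 15 CZ pairs
```

For n qubits this gives 5n + n(n-1)/2 actions, which is 45 at the default n_qubits=6.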

API

Native env:

  • Clifford(num_envs=1, n_qubits=6, difficulty=10, max_steps=200, single_qubit_cost=0.01, goal_bonus=0.0, use_reset_pool=True, log_interval=128, buf=None, seed=0, render_mode=None)

Runtime controls kept on the native env:

  • set_difficulty(...)
  • set_max_steps(...)
  • set_matrix(...)
  • flush_logs()

difficulty is part of the public interface because curriculum learning is useful for this task, and adjusting scramble depth during training/evaluation is a practical control point.
Fractional difficulties are supported by interpolating between two integer difficulty levels.
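
One plausible reading of the fractional-difficulty interpolation is stochastic rounding: each reset uses the lower or upper integer depth, with the fractional part as the probability of the upper one. The sketch below illustrates that reading only; `sample_scramble_depth` is a hypothetical helper and the native env may interpolate differently.

```python
import math
import random

def sample_scramble_depth(difficulty, rng):
    """Pick floor(d) or floor(d) + 1 so that the expected depth equals d."""
    low = math.floor(difficulty)
    frac = difficulty - low
    return low + (1 if rng.random() < frac else 0)

rng = random.Random(0)
depths = [sample_scramble_depth(2.25, rng) for _ in range(10_000)]
mean_depth = sum(depths) / len(depths)  # close to 2.25 on average
```

A curriculum can then raise difficulty in small increments through set_difficulty(...) without a jump in average scramble depth.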

Implementation note:

  • the native path stores tableau columns in packed uint64_t form, so the current limit is 2 * n_qubits <= 64, i.e. n_qubits <= 32
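
To illustrate why packing columns makes gates cheap, here is a small Python sketch in which each column is one integer and bit r of that integer holds the entry in row r. The bit layout and the names `pack_columns`, `cz_packed`, and `is_identity` are assumptions, not taken from the native code; Python integers stand in for uint64_t.

```python
def pack_columns(rows):
    """Pack a square 0/1 matrix into one integer per column (bit r = row r)."""
    size = len(rows)
    if size > 64:
        raise ValueError("2 * n_qubits must be <= 64 to fit a 64-bit word")
    cols = [0] * size
    for r, row in enumerate(rows):
        for c, bit in enumerate(row):
            cols[c] |= (bit & 1) << r
    return cols

def cz_packed(cols, a, b):
    """CZ(a, b) as two word-wide XORs, updating all 2n rows at once."""
    n = len(cols) // 2
    cols[n + a] ^= cols[b]
    cols[n + b] ^= cols[a]

def is_identity(cols):
    """Identity tableau: column c has exactly bit c set."""
    return all(word == (1 << c) for c, word in enumerate(cols))

# Two-qubit example: start from the 4 x 4 identity tableau.
identity_rows = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
cols = pack_columns(identity_rows)

cz_packed(cols, 0, 1)
after_one_cz = list(cols)      # z0 and z1 columns now carry an extra bit each
cz_packed(cols, 0, 1)          # CZ is its own inverse
back_to_identity = is_identity(cols)
```

With this layout a gate touches at most two words regardless of the number of rows, which is where the 64-row ceiling comes from.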

Validation

The added tests cover:

  • action count and deterministic ordering
  • reset behavior at zero and nonzero difficulty
  • matrix injection validation
  • native/reference parity for H, S, V, HS, HV, and CZ
  • native auto-reset behavior after terminal transitions

Performance

Raw throughput on my machine at 6 qubits is about 8.06M steps per second (SPS) with 128 envs.

Reset states are produced by random walks, so higher difficulty means more work per reset. To keep this cheap, the native
implementation uses a reset pool: resets reuse a common prefix of the walk and only diverge in the final steps. That preserves
diversity while cutting most of the reset cost. Measured with num_envs=2048, n_qubits=6, comparing use_reset_pool=False -> True:

  • pure reset throughput at difficulty=1000: 33.1 -> 482.3 vec resets/sec with use_reset_pool=True (14.6x)
  • rollout SPS at difficulty=1000, max_steps=200: 5.84M -> 9.73M (1.66x)
  • rollout SPS at difficulty=1000, max_steps=16: 0.98M -> 6.36M (6.47x)
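
The reset-pool idea can be sketched in a few lines. This is an illustration under stated assumptions rather than the native implementation: the `ResetPool` class, its `pool_size` and `tail_steps` parameters, and the reduced H/S/CZ gate set used for the walk are all hypothetical.

```python
import numpy as np

def random_gate(t, rng):
    """Apply one random H, S, or CZ as a column operation on the tableau."""
    n = t.shape[1] // 2
    kind = rng.integers(3)
    if kind == 0:    # H: swap x_q and z_q
        q = rng.integers(n)
        t[:, [q, n + q]] = t[:, [n + q, q]]
    elif kind == 1:  # S: z_q ^= x_q
        q = rng.integers(n)
        t[:, n + q] ^= t[:, q]
    else:            # CZ: z_a ^= x_b, z_b ^= x_a
        a, b = rng.choice(n, size=2, replace=False)
        t[:, n + a] ^= t[:, b]
        t[:, n + b] ^= t[:, a]

def scramble(t, steps, rng):
    """Random walk of the given length, applied in place."""
    for _ in range(steps):
        random_gate(t, rng)
    return t

class ResetPool:
    """Precompute long walk prefixes once; each reset only pays for a short tail."""

    def __init__(self, n_qubits, difficulty, pool_size=32, tail_steps=16, seed=0):
        self.rng = np.random.default_rng(seed)
        self.tail_steps = min(tail_steps, difficulty)
        prefix_steps = difficulty - self.tail_steps
        identity = np.eye(2 * n_qubits, dtype=np.uint8)
        self.pool = [scramble(identity.copy(), prefix_steps, self.rng)
                     for _ in range(pool_size)]

    def sample(self):
        # Copy a pooled prefix state, then diverge with a few fresh steps.
        base = self.pool[self.rng.integers(len(self.pool))]
        return scramble(base.copy(), self.tail_steps, self.rng)

pool = ResetPool(n_qubits=6, difficulty=200)
state = pool.sample()  # costs tail_steps gate applications, not 200
```

In this sketch the per-reset cost no longer grows with difficulty, which is consistent with the gain being largest when episodes are short and resets are frequent.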

Possible Extensions

  • Custom coupling graphs in addition to the current fully connected action set
  • Richer objective variants beyond the current gate-cost formulation
  • Visualisation + interactive demo
