Quantum random number generation has a latency problem. Submitting a circuit to IBM Quantum hardware, waiting for execution in the job queue, and retrieving results takes seconds to minutes - far too slow for an API that needs to respond in milliseconds. TrueEntropy solves this with a pre-generated entropy pool: quantum random bytes generated in advance, stored on disk, and served to API requests at disk-read speed.
The Architecture
The entropy pool system has two sides: a Python producer that generates quantum entropy and writes it to a binary file, and a PHP consumer that reads from that file to serve API requests. They communicate through the filesystem, using file locking for concurrency safety.
The Producer: pool.py
The Python pool manager (qrng/pool.py) is responsible for keeping the entropy pool full. It operates in three modes:
- pool.py fill - One-shot fill: generates quantum entropy and writes it to the pool file until the target size is reached
- pool.py status - Reports the current pool size in bytes and the percentage of capacity
- pool.py daemon - Continuous monitoring mode: watches the pool file and automatically triggers a refill when the level drops below a threshold
When generating entropy, the pool manager invokes the full TrueEntropy pipeline:
- Read the QuBitLang QRNG circuit (qrng_hadamard.ql)
- Compile it through the QuBitLang compiler pipeline (Lexer → Parser → Semantic Analyser → IR Builder → Optimiser)
- Execute the resulting Qiskit circuit on IBM Quantum hardware (or the local simulator)
- Extract measurement bitstrings from the results
- Shuffle expanded bitstrings to eliminate ordering artifacts
- Run the NIST SP 800-22 test suite (7 tests, α = 0.01)
- Convert verified bitstrings to raw bytes
- Append to the pool file
Only entropy that passes all 7 NIST tests is written to the pool. Failed batches are discarded and regenerated.
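The fill loop above can be sketched in a few lines of Python. Everything quantum here is stubbed out: generate_batch and passes_nist are placeholders (the real pipeline compiles qrng_hadamard.ql, runs on IBM Quantum, and applies the NIST suite), and POOL_PATH and TARGET_SIZE are hypothetical names, not the actual pool.py internals:

```python
import os
import secrets

POOL_PATH = "entropy_demo.bin"   # stand-in for qrng/pool_data/entropy.bin
TARGET_SIZE = 1024               # hypothetical target pool size, in bytes

def pool_size() -> int:
    """Current pool size in bytes; 0 if the file doesn't exist yet."""
    return os.path.getsize(POOL_PATH) if os.path.exists(POOL_PATH) else 0

def generate_batch(n: int) -> bytes:
    # Placeholder for the real pipeline (compile, execute, shuffle);
    # secrets.token_bytes just keeps the sketch runnable.
    return secrets.token_bytes(n)

def passes_nist(batch: bytes) -> bool:
    # Placeholder for the 7-test NIST SP 800-22 suite at alpha = 0.01.
    return True

def fill(target: int = TARGET_SIZE, batch_size: int = 256) -> None:
    """One-shot fill: append NIST-verified batches until the target is reached."""
    while pool_size() < target:
        batch = generate_batch(batch_size)
        if not passes_nist(batch):
            continue          # failed batches are discarded and regenerated
        with open(POOL_PATH, "ab") as f:
            f.write(batch)
```

The key property is the gate before the append: unverified bytes never touch the pool file.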
The Pool File: entropy.bin
The pool is a simple binary file at qrng/pool_data/entropy.bin. Each byte is a quantum random value (0–255), packed sequentially with no headers or metadata. This format was chosen for simplicity and performance - PHP can read bytes directly without parsing.
A companion metadata file (entropy.meta.json) tracks generation timestamps, source backend, and NIST verification status for auditing purposes.
The Consumer: quantum_pool.php
On the PHP side, the quantum_pool.php module provides four core functions:
- quantum_random_bytes($count) - Read N raw bytes from the pool
- quantum_random_int($min, $max) - Generate a random integer in a range using rejection sampling (powers the /v1/integers endpoint)
- quantum_random_float() - Generate a random float with 53-bit mantissa precision
- quantum_random_uuid() - Generate a v4 UUID from quantum bytes (powers the /v1/uuid endpoint)
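Of these, the UUID generator is the simplest to illustrate: an RFC 4122 v4 UUID is just 16 random bytes with six bits pinned for the version and variant. A Python sketch of the same logic, with secrets.token_bytes standing in for a pool read:

```python
import secrets

def quantum_random_uuid(rand_bytes=secrets.token_bytes) -> str:
    """v4 UUID from 16 random bytes (pool bytes in the real API)."""
    b = bytearray(rand_bytes(16))
    b[6] = (b[6] & 0x0F) | 0x40    # version nibble = 4
    b[8] = (b[8] & 0x3F) | 0x80    # RFC 4122 variant bits = 10
    h = b.hex()
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"
```

Because 6 of the 128 bits are fixed, a v4 UUID carries 122 bits of the pool's entropy.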
Every read operation follows the same pattern:
- Open the pool file in read-write mode (c+b)
- Acquire an exclusive lock (LOCK_EX) to prevent race conditions
- Check the pool has enough bytes
- Read the requested bytes from the start of the file
- Read the remaining bytes
- Truncate and rewrite the file with only the remaining bytes
- Release the lock
This atomic read-and-consume pattern ensures that no two concurrent API requests ever receive the same entropy bytes. Each byte is consumed exactly once.
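For readers who don't speak PHP, the same consume pattern looks like this in Python. fcntl.flock is the POSIX primitive behind PHP's flock; note this sketch opens with r+b and therefore assumes the pool file exists, whereas the real code's c+b mode creates it if missing:

```python
import fcntl   # POSIX file locking; PHP's flock() maps to the same primitive
import os
from typing import Optional

POOL_PATH = "entropy_demo.bin"   # stand-in for qrng/pool_data/entropy.bin

def consume(count: int) -> Optional[bytes]:
    """Atomically take `count` bytes off the head of the pool file."""
    with open(POOL_PATH, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)       # exclusive lock, like PHP LOCK_EX
        try:
            data = f.read()
            if len(data) < count:
                return None                 # not enough entropy; caller falls back
            f.seek(0)
            f.truncate()
            f.write(data[count:])           # rewrite file with the leftover bytes
            return data[:count]
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)   # release the lock
```

Because the lock spans the read, the truncate, and the rewrite, two concurrent callers can never be handed the same head bytes.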
The CSPRNG Fallback
If the pool is empty - because traffic spiked faster than the daemon could refill, or because the pool file doesn't exist - TrueEntropy falls back to PHP's built-in CSPRNG (random_bytes()). The API response still succeeds, but the "source" metadata field changes from "quantum" to "fallback", and a warning header is included.
This fallback is a conscious design choice. We believe it's better to serve cryptographically strong random numbers with a transparency warning than to return an error and break your application. PHP's CSPRNG sources entropy from the operating system's entropy pool (/dev/urandom on Linux), which is suitable for most purposes - just not verifiably quantum.
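The fallback decision itself is a few lines. A sketch in Python, with secrets.token_bytes as the analogue of PHP's random_bytes() and a deliberately simplified pool read (no locking or consumption, unlike the real reader):

```python
import os
import secrets
from typing import Optional, Tuple

POOL_PATH = "entropy_demo.bin"   # stand-in for the real pool file

def read_pool(count: int) -> Optional[bytes]:
    """Simplified pool read; None when the pool can't satisfy the request.
    (The real reader also locks the file and consumes the bytes it returns.)"""
    if not os.path.exists(POOL_PATH) or os.path.getsize(POOL_PATH) < count:
        return None
    with open(POOL_PATH, "rb") as f:
        return f.read(count)

def random_bytes_with_source(count: int) -> Tuple[bytes, str]:
    """Return (bytes, source): 'quantum' from the pool, else CSPRNG 'fallback'."""
    data = read_pool(count)
    if data is not None:
        return data, "quantum"
    # Pool empty or missing: fall back to the OS CSPRNG, the Python
    # analogue of PHP's random_bytes(). The caller surfaces the source.
    return secrets.token_bytes(count), "fallback"
```

Returning the source alongside the bytes is what lets the API layer set the metadata field and warning header honestly.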
In practice, the daemon mode ensures the pool stays well above the consumption rate. Fallback events are logged for monitoring, and the pool status can be checked at any time.
Why Not Generate On-Demand?
The obvious question: why not generate quantum entropy for each API request in real time?
The answer is latency. Submitting a job to IBM Quantum involves:
- Circuit transpilation - Converting to native gates for the target QPU (seconds)
- Queue time - Waiting for availability on shared quantum hardware (seconds to minutes)
- Execution - Running the circuit with 4,096 shots (seconds)
- Result retrieval - Downloading measurement data (seconds)
Total round-trip time for a single IBM Quantum job is typically 30 seconds to several minutes. For an API that promises fast responses, that's unacceptable.
The entropy pool decouples generation from consumption. Quantum jobs run asynchronously in the background, filling the pool. API requests read from the pool synchronously, at the speed of a local file read. The quantum hardware operates at its own pace; your API responses are never blocked by it.
Consumption Rates
Different API endpoints consume different amounts of entropy:
- Integer - Variable; depends on range. A random integer from 1–100 requires ~7 bits plus rejection sampling overhead
- Float - 7 bytes (53-bit mantissa precision)
- UUID - 16 bytes per UUID
- Byte - 1 byte per byte requested
- Shuffle - Variable; depends on array size. A 52-card deck shuffle requires approximately 200 bytes
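The rejection-sampling overhead mentioned for integers comes from discarded draws: draw the minimum number of bits that covers the range, mask, and retry anything that lands outside it. For 1–100 that is 7 bits per draw, with roughly 22% of draws rejected (100 of 128 masked values accepted). A Python sketch, with secrets standing in for pool bytes:

```python
import secrets

def quantum_random_int(min_val: int, max_val: int,
                       rand_bytes=secrets.token_bytes) -> int:
    """Unbiased integer in [min_val, max_val] via rejection sampling."""
    span = max_val - min_val + 1
    nbits = (span - 1).bit_length() or 1   # bits needed to cover the range
    nbytes = (nbits + 7) // 8              # entropy consumed per attempt
    mask = (1 << nbits) - 1
    while True:
        candidate = int.from_bytes(rand_bytes(nbytes), "big") & mask
        if candidate < span:               # reject out-of-range draws
            return min_val + candidate
```

Rejection (rather than a modulo) is what keeps the result unbiased: every value in the range is hit by exactly one accepted bit pattern.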
The daemon monitors the pool file size and triggers refill when it drops below the configured threshold. Each refill batch produces 4,096 bytes (4,096 shots × 8 qubits = 32,768 bits = 4,096 bytes), which is enough for hundreds of typical API requests.
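The threshold check is the whole of the daemon's decision logic. A minimal version, where THRESHOLD and the function names are hypothetical stand-ins for the real daemon's configuration:

```python
import os

POOL_PATH = "entropy_demo.bin"   # stand-in for qrng/pool_data/entropy.bin
THRESHOLD = 2048                 # hypothetical refill trigger, in bytes
BATCH_BYTES = 4096               # 4,096 shots x 8 qubits = 32,768 bits

def pool_size() -> int:
    return os.path.getsize(POOL_PATH) if os.path.exists(POOL_PATH) else 0

def daemon_tick(refill) -> bool:
    """One polling pass: trigger a refill batch when the pool runs low."""
    if pool_size() < THRESHOLD:
        refill(BATCH_BYTES)      # in the real daemon, this kicks off a fill
        return True
    return False
```

The real daemon wraps this in a sleep-and-poll loop; each tick either does nothing or queues one batch, so a traffic spike triggers at most one quantum job per poll interval.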
A Simple System
The entropy pool is deliberately simple. No Redis. No PostgreSQL. No distributed cache. Just a binary file, a file lock, and a daemon. This simplicity is a feature: fewer moving parts means fewer failure modes, easier debugging, and a system that's straightforward to reason about.
As TrueEntropy scales, the architecture can evolve - but we believe in starting with the simplest system that works correctly and adding complexity only when the evidence demands it. You can see the pool in action on our live generator, or explore the full feature set.