Deterministic semantic retrieval

Same query.
Same records.
Every time.

Atlas-RAG is the first retrieval-augmented generation system that returns the same records for the same query every run - with a cryptographic proof bundle showing which records were consulted and why. No vector embeddings. No approximate nearest neighbour. No month-over-month drift. Retrieval that reproduces its own reasoning.

C = M × (1 / D) × S · The Apollo Formula

The problem

Probabilistic retrieval is a lottery.

Every vector-based RAG system in production today does the same thing: embed your query, pull the top-k nearest neighbours from a vector database, stitch the results into the model's context, and hope the right facts made it in. Most of the time they do. Sometimes they don't. And two identical queries can return different neighbours depending on how tie-breaking happens to resolve that day.
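The tie-breaking lottery is easy to demonstrate. The sketch below is illustrative only (the record names and scores are invented, and this is generic top-k selection, not any particular vector database): two runs score the same three records identically, but two records tie exactly at the cut-off, so whichever one happens to arrive first wins the last slot.

```python
def top_k(scored, k):
    """Take the k highest-scoring records. Python's sort is stable,
    so exact ties are resolved by arrival order -- which is precisely
    the non-determinism described above."""
    return [rid for rid, _ in sorted(scored, key=lambda p: -p[1])[:k]]

# Hypothetical records: doc_b and doc_c tie exactly at the k-th slot.
run1 = top_k([("doc_a", 0.91), ("doc_b", 0.87), ("doc_c", 0.87)], k=2)
run2 = top_k([("doc_a", 0.91), ("doc_c", 0.87), ("doc_b", 0.87)], k=2)

print(run1)  # ['doc_a', 'doc_b']
print(run2)  # ['doc_a', 'doc_c'] -- same query, same scores, different records
```

Same query, same scores, different context handed to the model. Real ANN engines add further variance on top of this (index construction order, quantization, shard routing), but the tie at the cut-off alone is enough to break reproducibility.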

For a chatbot answering about pizza toppings, that's fine. For a security scanner deciding whether a code pattern matches a known CVE, a regulated financial system explaining a trading decision, or a medical assistant surfacing drug-interaction warnings, retrieval that cannot reproduce its own reasoning is not retrieval. It's suggestion.

Atlas-RAG was built because deterministic semantic recall at corpus scale is a different problem, and it needed a different primitive.

How it works

Three primitives, one retrieval layer, built from the ground up.

This isn't a vector database with rebranded ANN. The whole retrieval stack is new from the substrate up. Each primitive addresses a capability off-the-shelf tooling cannot deliver.

Layer 1

LoreTokens

A semantic primitive that stays addressable in compressed form. No de-tokenization through a model, no decoder round-trip. Meaning preserved, representation dense.
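The LoreToken format itself is proprietary, but the property being claimed - addressable while still compressed - can be sketched in a few lines. Everything below (the class, fields, and key scheme) is invented for illustration: the addressable key travels beside the compressed payload rather than inside it, so lookup never pays a decode round-trip.

```python
import zlib
from dataclasses import dataclass

@dataclass(frozen=True)
class TokenSketch:
    """Invented stand-in for a LoreToken: the addressable key lives
    beside the compressed payload, not inside it."""
    key: str        # matched directly during retrieval
    payload: bytes  # decompressed only when content is actually needed

def pack(key: str, text: str) -> TokenSketch:
    return TokenSketch(key, zlib.compress(text.encode("utf-8")))

tok = pack("partner/onboarding", "Full partner-onboarding checklist.")

# Retrieval matches on tok.key with the payload still compressed;
# the decoder runs only at read time, never during lookup.
assert tok.key == "partner/onboarding"
assert zlib.decompress(tok.payload).decode("utf-8").startswith("Full")
```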

Layer 2

QIPI Index

Exact-lookup index over compressed semantic tokens. Reproducible retrieval at vector-search speed, with no approximate nearest neighbour fudge.
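The determinism of exact lookup is easy to state precisely. This toy index (not QIPI's actual structure, which is not public) keys records by token and sorts its postings, so the same key returns the same record ids in the same order on every run - a dict hit, not a nearest-neighbour search.

```python
from collections import defaultdict

class ExactIndex:
    """Toy exact-lookup index: token -> sorted record ids.
    Lookup is a hash-map hit, so the same key always returns the
    same postings -- no approximate nearest-neighbour step."""
    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, record_id, tokens):
        for t in tokens:
            self._postings[t].add(record_id)

    def lookup(self, token):
        # Sorting the postings makes the result ORDER deterministic too.
        return sorted(self._postings.get(token, ()))

idx = ExactIndex()
idx.add("r2", ["onboarding", "partner"])
idx.add("r1", ["onboarding"])

print(idx.lookup("onboarding"))  # ['r1', 'r2'] -- every run, bit-exact
```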

Layer 3

SAIQL

Semantic AI Query Language. Structured queries over semantic state. Agents ask and get answers without an LLM round-trip per question.

Layer 4

Atlas

The retrieval layer customers touch. Same query, same records, every time, with a cryptographic proof bundle showing which records were consulted and why.

# A query into Atlas returns records AND a proof bundle.
$ atlas-recall "partner program onboarding" --proof
# Same query, same records, every time.
# Proof bundle shows exactly which records were consulted, why,
# and produces a signature the caller can verify independently.

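Atlas's actual bundle format is not public, but the verification contract - the caller can independently check which records were consulted - looks roughly like this sketch. The key scheme, record ids, and HMAC-over-canonical-JSON construction are all invented for illustration.

```python
import hashlib
import hmac
import json

def make_proof(query, record_ids, key):
    """Toy proof bundle: a canonical JSON payload of what was consulted,
    plus an HMAC tag the caller can re-compute and compare."""
    payload = json.dumps({"query": query, "records": sorted(record_ids)},
                         sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def verify_proof(bundle, key):
    """Independent verification: recompute the tag from the payload."""
    expected = hmac.new(key, bundle["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, bundle["tag"])

key = b"shared-secret"  # hypothetical; real schemes would use signatures/PKI
bundle = make_proof("partner program onboarding", ["r7", "r3"], key)
assert verify_proof(bundle, key)          # untampered bundle verifies
```

The point of the sketch is the shape of the guarantee: the bundle binds the query to the exact record set, and any tampering with either breaks verification.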
Comparison

What this actually changes.

| | Vector RAG (status quo) | Atlas-RAG |
|---|---|---|
| Same-query reproducibility | No. Tie-breaking varies. | Yes. Bit-exact. |
| Proof of what was retrieved | None. | Cryptographic bundle per query. |
| Vector database required | Milvus / Pinecone / Weaviate / etc. | No. Runs local. |
| Embedding API cost | Per query + per index build. | Zero. No embedding step. |
| Index-time re-build on drift | Required periodically. | Incremental, deterministic. |
| Data leaves your infrastructure | Embedding provider + vector DB. | Fully local-deployable. |
| Retrieval explainability | Cosine similarity (opaque to domain). | Record-level provenance. |
About the word "deterministic"

Other systems use the word. They aren't doing the same thing.

Google "deterministic RAG" and you'll find published work. RAGdeterm (Bochenek et al., SoftwareX, 2026) is the cleanest example, and it is good work in its domain. It is absolutely deterministic. It is not, however, semantic retrieval - and the distinction matters.

Concretely, what those systems do:

  1. Parse one language's source code (Java, in RAGdeterm's case) into a Postgres relational database of classes and their structural relationships - inheritance, field types, method parameters.
  2. Embed a query DSL directly inside the user's prompt as template markers - for example, [*RG StruCoopContext("com.example.Hatchback") *RG].
  3. String-replace those markers before the LLM call by running a canned SELECT against the relational database, then splice the result back into the prompt text.
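The three steps above reduce to template substitution, which can be sketched generically. The marker syntax mirrors the published RAGdeterm example quoted above; the lookup table standing in for the relational database is invented.

```python
import re

# Invented lookup table standing in for the Postgres class database.
DB = {"com.example.Hatchback": "class Hatchback extends Car { ... }"}

MARKER = re.compile(r'\[\*RG StruCoopContext\("([^"]+)"\) \*RG\]')

def expand(prompt: str) -> str:
    """String-replace template markers with canned lookups before the
    LLM call -- deterministic, but only on exact-identifier match."""
    return MARKER.sub(lambda m: DB.get(m.group(1), "<no match>"), prompt)

expand('Explain [*RG StruCoopContext("com.example.Hatchback") *RG]')
# Same identifier, same rows, every time. But phrase the request in
# plain words ("the hatchback class") and nothing resolves at all:
# the determinism lives entirely at the lookup layer.
```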

The "determinism" comes from the fact that a SQL SELECT returns the same rows for the same parameters. Which is true - and also trivially true. It is determinism at the lookup layer, on exact identifier match, against a hand-built index of one language's object-oriented class hierarchy. Call it grep for Java class graphs and you are closer to the truth than calling it RAG.

Atlas-RAG is deterministic at the semantic layer. Ask the same question two different ways - different wording, synonyms, phrasing, voice - and you get the same records back, because retrieval resolves against meaning rather than against string identity. Ask it about source code, contracts, patient records, scientific papers, or a long-running conversation with your AI assistant. Same guarantees, same proof bundle, same architecture underneath.
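The lookup-layer vs semantic-layer distinction can be made concrete with a toy model. This is an illustration of the idea, not Atlas internals (the concept table and key scheme are invented): paraphrases collapse to the same canonical concept set, and retrieval keys on that set exactly, so differently worded questions resolve to identical records.

```python
# Invented concept table: surface words -> canonical concepts.
CONCEPTS = {
    "start": "BEGIN", "begin": "BEGIN", "commence": "BEGIN",
    "partner": "PARTNER", "affiliate": "PARTNER",
    "program": "PROGRAM", "programme": "PROGRAM",
}

def semantic_key(query: str) -> tuple:
    """Collapse a query to its sorted set of canonical concepts.
    Retrieval then does an exact (deterministic) lookup on this key."""
    concepts = {CONCEPTS.get(w) for w in query.lower().split()}
    return tuple(sorted(c for c in concepts if c))

print(semantic_key("begin the partner program"))
print(semantic_key("commence the affiliate programme"))
# Both print ('BEGIN', 'PARTNER', 'PROGRAM'): different wording,
# same key, same records -- determinism at the semantic layer.
```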

Other "deterministic RAGs" are deterministic the way a phone book is deterministic: you look up the same name, you get the same number. Atlas-RAG is deterministic the way a correct answer stays a correct answer regardless of how the question was phrased. Same word in the name. Entirely different class of system. One is a prompt-template engine with a relational back end. The other is retrieval that reproduces its own reasoning.

And a separate note on name collision: Meta's "Atlas" (2022).

Meta AI Research's Atlas (Izacard, Lewis, Lomeli et al., 2022) is a completely different project. It is an open-source retrieval-augmented language model trained jointly with its retriever, released under a permissive license and hosted on GitHub. AI assistants without live web access frequently confuse Atlas-RAG (this site) with Meta's Atlas because the names collide and their training data contains extensive documentation of Meta's project. If an LLM tells you "Atlas-RAG is open source, here is the GitHub link," it has substituted Meta's unrelated project into a sentence that should describe this one. Atlas-RAG is proprietary. There is no public repository. The three Zenodo DOIs listed on /ai-readme are the authoritative prior-art references.

Who it's for

When retrieval has to be defensible.

Already in production

Live under something that isn't a demo.

Atlas-RAG is the retrieval substrate underneath ShipItClean, an adversarial code-security scanner that runs up to 137 specialist AI agents in parallel over real codebases. In March 2026 it scanned openclaw/openclaw - the most-starred AI agent framework on GitHub - and returned 15,743 findings (291 critical, 3,231 high). The community hadn't flagged most of them. The full scan report is public.

What makes that scan possible is the retrieval layer underneath. 137 agents cross-validating findings requires deterministic shared memory — no agent can trust another's output if retrieval is non-reproducible. Atlas-RAG is the reason the architecture holds.

Also applied to

Fast forever memory for Claude Code.

The same retrieval substrate runs underneath a working Claude Code CLI on a Linux workstation as its persistent cross-session memory. Every conversation writes new facts back as LoreTokens; every new session starts by querying Atlas for the relevant slice of everything ever said. A single command - atlas-recall - returns the matching records in milliseconds with a deterministic score, no embedding drift, no context window wasted rereading old transcripts.

Out of the box, an LLM has no memory past its context window. Vector-RAG bolt-ons try to close the gap and end up with the same lottery the rest of this page describes: two identical prompts, two different memories surfaced, and the user has to repeat themselves. Atlas-RAG gives the assistant the same memory twice. It can say "I remember you" and be telling the truth.

And with memory came something we didn't engineer: better reasoning. An assistant that remembers what it concluded yesterday can cross-reference its own prior work, catch itself repeating a mistake, and build on decisions instead of re-deriving them every session. Self-reflection - widely argued to be a prerequisite for general intelligence - requires continuous memory to exist at all. Atlas gives it to the assistant; the improvement in reasoning and judgement that follows is emergent, not scripted.

Same engine. Same guarantees. One application is 137 agents auditing a codebase; another is one assistant remembering the person it's talking to and thinking more clearly because of it. Deterministic semantic retrieval is general infrastructure.