Storage and Metadata
Storage & Metadata
jjj stores all project management metadata in a shadow graph—a separate, orphaned commit history in your Jujutsu repository.
The Shadow Graph
What is it?
A shadow graph is an orphaned commit history that exists in your repository but is completely separate from your project code:
graph LR subgraph "Project History" A[Initial] --> B[Feature A] B --> C[Feature B] C --> D[Feature C] D --> main((main)) end
subgraph "Shadow Graph (jjj)" root[Orphaned Root] --> init[Initialize jjj] init --> task[Add tasks] task --> meta[Update metadata] meta --> jjj_branch((jjj)) endThese histories never merge. They coexist peacefully in the same repository.
Why Use a Shadow Graph?
Traditional approaches to storing metadata have problems:
❌ Polluting Project History
# Bad: Metadata mixed with code◯ Add user authentication│◯ jjj: Update problem status ← Noise!│◯ Fix login bug│◯ jjj: Create solution ← More noise!This clutters git log and makes project history messy.
❌ Separate Git Repository
project/ # Main codeproject-meta/ # Metadata in separate repoProblems:
- Have to sync two repositories
- Lose atomic operations
- Complex deployment
✅ Shadow Graph (jjj’s Approach)
# Same repo, separate historiesjj log -r main # Clean project historyjj log -r jjj # Metadata history
jj git push --all # Push both at onceBenefits:
- ✅ One repository to manage
- ✅ Atomic push/pull of code + metadata
- ✅ Clean project history
- ✅ Easy to reset or delete metadata
File Structure
When you run jjj init, it creates this structure on the jjj branch:
config.toml # Project configurationmilestones/ # Milestone storage├── 01959c4d-e5f6-7a7b-8c9d-0e1f2a3b4c5d.md└── 01959d5e-f6a7-7b8c-9d0e-1f2a3b4c5d6e.mdproblems/ # Problem storage├── 01957d3e-a8b2-7def-8c3a-9f4e5d6c7b8a.md├── 01958a2b-c3d4-7e5f-6a7b-8c9d0e1f2a3b.md└── 01958b3c-d4e5-7f6a-7b8c-9d0e1f2a3b4c.mdsolutions/ # Solution storage├── 01958c4d-e5f6-7a7b-8c9d-0e1f2a3b4c5d.md└── ...critiques/ # Critique storage├── 01958d5e-f6a7-7b8c-9d0e-1f2a3b4c5d6e.md└── 01958e6f-a7b8-7c9d-0e1f-2a3b4c5d6e7f.mdevents.jsonl # Event logEntity files are named with their full UUID7 identifier. UUID7 is time-ordered, so files naturally sort chronologically.
Storage Layer Implementation
MetadataStore
The MetadataStore struct manages all metadata operations:
pub struct MetadataStore { meta_path: PathBuf, // Path to .jj/jjj-meta jj_client: JjClient, // Main repo client meta_client: JjClient, // Metadata workspace client}Initialization
When you run jjj init:
-
Create orphaned root:
Terminal window jj new --no-parent -m "Initialize jjj metadata" -
Create bookmark:
Terminal window jj bookmark create jjj -
Create workspace:
Terminal window jj workspace add .jj/jjj-meta -r jjj -
Initialize directories:
mkdir -p .jjj/{problems,solutions,critiques,milestones} -
Create default config:
.jj/jjj-meta/config.toml [board]columns = ["TODO", "In Progress", "Review", "Done"][tags]allowed = ["backend", "frontend", "docs", "tests"]
File Formats
TOML for Configuration
[board]columns = ["TODO", "In Progress", "Review", "Done"]
[tags]allowed = ["backend", "frontend", "api", "ui"]
# Review is now per-solution via assigned reviewers.# See solution --submit flag and jjj solution submit.Markdown with YAML Frontmatter for Work Items
Problems, solutions, critiques, and milestones use markdown files with YAML frontmatter:
---id: 01957d3e-a8b2-7def-8c3a-9f4e5d6c7b8atitle: Search is slow on large datasetsstatus: openpriority: highassignee: alicemilestone_id: 01959c4d-e5f6-7a7b-8c9d-0e1f2a3b4c5dgithub_issue: 42created_at: 2025-11-23T10:00:00Zupdated_at: 2025-11-23T15:30:00Z---
Users are reporting slow search results when querying datasets with more than 10,000 records.
## Context
- Search takes 5+ seconds- Server logs show full table scansEntity IDs are UUID7 (time-ordered UUIDs). In listings, truncated prefixes like 01957d are shown for readability, with automatic extension for uniqueness.
Why YAML frontmatter + Markdown?
- Human-readable and writable
- Structured metadata in frontmatter
- Free-form description in markdown body
- Native Rust ecosystem support (serde)
- Easy to edit with any text editor
Transaction Model
Atomic Updates
jjj uses a simple transaction model:
store.with_metadata("Create problem", || { // 1. Perform operations let problem = Problem::new(...); store.save_problem(&problem)?;
// 2. All operations succeed or all fail Ok(())})?;// 3. Metadata committed atomicallyThis translates to writing markdown files to the shadow graph and committing.
Conflict Resolution
If two users modify metadata simultaneously:
User A User B────── ──────jjj problem new "Fix login" jjj problem new "Add search" ↓ ↓Creates 01957d... Creates 01958a... ↓ ↓jj git push jj git push ↓ ↓ └──────── CONFLICT! ────────────┘Because each problem gets a unique UUID7, the actual files never conflict (different filenames). Conflicts only occur when editing the same entity.
Resolution:
# Pull and resolvejj git fetchjj bookmark track jjj@origin
# jj automatically merges file-based changes# If both created different files → no conflict!
# If same file modified → manual merge may be neededSync Model
Push
# Push metadata bookmarkjj git push --bookmark jjj
# Or push all bookmarksjj git push --allWhat gets pushed:
- All metadata commits
- Shadow graph history
- Configuration changes
Pull
# Fetch metadatajj git fetch
# Track remote bookmarkjj bookmark track jjj@origin
# Metadata automatically mergedWorking Offline
jjj is designed for offline-first workflows:
# Create problems offlinejjj problem new "Fix login flow" --priority highjjj problem new "Add test coverage"
# Propose solutions (reference by title)jjj solution new "Refactor auth handler" --problem "login flow"
# Later, when onlinejj git push --allAll metadata is local until you push!
Performance
ID Generation
IDs are UUID7, generated locally without coordination:
pub fn generate_id() -> String { uuid::Uuid::now_v7().to_string()}Time complexity: O(1) - constant time, no scanning required.
UUID7 provides:
- No conflicts: UUIDs are globally unique, so distributed teams can create entities without coordination
- Time ordering: UUID7 encodes creation time, so IDs sort chronologically
- Human-friendly prefixes: The first 6+ hex characters are usually unique enough for display
File System Layout
Each work item is a separate file:
✅ Benefits:
- Parallel access
- Minimal conflicts
- Easy to inspect/edit manually
❌ Trade-offs:
- More files = slower directory listing
- Mitigated by using separate directories per type
SQLite Runtime Cache
jjj maintains a local SQLite database at .jj/jjj.db as a runtime index/cache. The canonical data remains in the shadow graph (markdown files on the jjj bookmark); the SQLite database is derived and can be fully rebuilt at any time via jjj db rebuild.
The SQLite layer provides:
- Full-text search (FTS5): All entity titles and bodies are indexed for fast
jjj searchqueries. - Relational indexes: Foreign-key relationships (solutions to problems, critiques to solutions, problems to milestones) enable fast lookups without scanning files.
- Semantic embeddings: Optional vector embeddings stored alongside entities for similarity search.
- Schema versioning: The database self-manages its schema version (currently v5) and automatically rebuilds when the schema changes or if an interrupted sync left it in a dirty state.
The database is populated by reading all markdown files and events.jsonl from the shadow graph, then inserting them into SQLite tables. This happens automatically when the database is missing or outdated.
GitHub Sync Fields
Entities include optional fields for bidirectional GitHub synchronization:
- Problems:
github_issue— linked GitHub issue number - Solutions:
github_pr(pull request number),github_branch(remote branch name) - Critiques:
github_review_id— linked GitHub review ID
These fields are stored in both the YAML frontmatter (canonical) and the SQLite cache (indexed). The jjj github command uses the gh CLI to push and pull state between the shadow graph and GitHub Issues/PRs.
Backup and Recovery
Export Metadata
# Full backupjj git bundle create jjj-backup.bundle -r jjj
# Or use plain gitcd .jj/jjj-metagit bundle create ~/jjj-backup.bundle --allRestore Metadata
# Restore from bundlejj git bundle unbundle jjj-backup.bundlejj bookmark set jjj -r <restored-commit>Reset Metadata
If metadata gets corrupted:
# Option 1: Reset to earlier statejj bookmark set jjj -r <earlier-commit>
# Option 2: Delete and reinitializejj bookmark delete jjjjjj initYour project code is never affected! The shadow graph is completely separate.
Advantages
vs. Git Notes
Git notes have problems:
- Not pushed by default
- Easy to lose
- No history
- Awkward APIs
jjj’s shadow graph:
- ✅ Pushed with
git push --all - ✅ Full commit history
- ✅ Standard jj operations
vs. GitHub Issues / JIRA
External tools require:
- ❌ Internet connection
- ❌ Account/authentication
- ❌ Separate data store
- ❌ API rate limits
jjj:
- ✅ Works offline
- ✅ Lives in your repo
- ✅ No external dependencies
- ✅ Infinite scalability
vs. Text Files in Repo
Storing .md files in project:
- ❌ Pollutes history
- ❌ Merge conflicts with code
- ❌ Clutters working directory
Shadow graph:
- ✅ Clean project history
- ✅ Independent merge conflicts
- ✅ Hidden from code directory
Future Enhancements
Planned Improvements
- Compression: Use zstd for large datasets
- Partial clone: Fetch only recent metadata
- Garbage collection: Prune old review data
Compatibility
The storage format is designed to evolve:
- YAML frontmatter allows schema evolution
- Version field for migration
- Unknown fields ignored
This means old jjj versions can read newer data (graceful degradation).
Summary
jjj’s storage layer uses a shadow graph (canonical markdown files on the jjj bookmark) plus a SQLite runtime cache (.jj/jjj.db) to achieve:
- ✅ Clean separation of metadata and code
- ✅ Atomic operations
- ✅ Offline-first workflow
- ✅ Standard git push/pull
- ✅ Easy backup and recovery
- ✅ Fast full-text search and relational queries via SQLite
- ✅ Optional GitHub sync via
github_issue,github_pr, andgithub_review_idfields
The SQLite database is always derivable from the shadow graph and can be deleted or rebuilt without data loss. This is only possible because of Jujutsu’s flexible commit graph and workspace model!
See Also
- Design Philosophy - Why these choices were made
- Change ID Tracking - How change IDs enable robust metadata
- CLI Reference - Using the storage layer