WrongoDB

Last updated: 2026-04-27

The database that teaches you how databases work by doing everything wrong first, then fixing it.

Philosophy: Learning by Breaking Things

WrongoDB is a journey into the internals of database systems. Instead of trying to build a production-ready database from scratch, we start with the simplest, most naive implementations possible—approaches that are often "wrong" for a real-world system—and evolve them.

By starting with "wrong" (but working) solutions, we can visualize the problems they cause (performance, data integrity, concurrency) and then implement the "right" solutions (WAL, B-trees, MVCC) to solve those specific problems.

It is a learning resource, a playground, and a documentation of the path from "naive JSON file" to "robust storage engine".

Current State

A learning-oriented MongoDB-compatible database in Rust with primary/secondary replication roles, inspired by WiredTiger's architecture.

Storage Engine

B+tree storage: Fixed-size paged files with slotted leaf/internal pages, arbitrary height, splits, range scans
Page cache: LRU eviction, pin/unpin API, dirty tracking, copy-on-write updates
Checkpoints: Atomic root swap via checkpoint slots, dirty page flush, retired block reuse
WAL (Write-Ahead Logging): Global WAL with LSNs, CRC32 checksums, change-vector logging, recovery replay
MVCC/Transactions: Global transaction state, snapshot isolation, version chains, commit/abort
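
To make the snapshot isolation and version chain items above more concrete, here is a minimal sketch of how a reader might pick a visible version from a chain. It is not WrongoDB's actual code: the `Version`, `Snapshot`, and `read_visible` names, the newest-first chain order, and the rule that larger transaction ids started later are all assumptions made for illustration.

```rust
/// Transaction identifier. Larger ids started later (an assumption of this sketch).
type TxnId = u64;

/// One entry in a key's version chain. Chains are stored newest first here.
#[derive(Debug, Clone)]
struct Version {
    writer: TxnId,
    /// `None` models a delete (tombstone).
    value: Option<Vec<u8>>,
}

/// What a reader captured when its transaction began.
#[derive(Debug, Clone)]
struct Snapshot {
    reader: TxnId,
    /// Every id >= `next_txn` started after the snapshot was taken.
    next_txn: TxnId,
    /// Transactions that were still running when the snapshot was taken.
    in_flight: Vec<TxnId>,
}

impl Snapshot {
    /// A writer is visible if it is the reader itself, or if it had already
    /// committed when the snapshot was taken.
    fn sees(&self, writer: TxnId, aborted: &[TxnId]) -> bool {
        if writer == self.reader {
            return true;
        }
        writer < self.next_txn
            && !self.in_flight.contains(&writer)
            && !aborted.contains(&writer)
    }
}

/// Walk the chain from newest to oldest and return the first visible value.
/// A visible tombstone hides every older version, so the result is `None`.
fn read_visible<'a>(
    chain: &'a [Version],
    snap: &Snapshot,
    aborted: &[TxnId],
) -> Option<&'a [u8]> {
    chain
        .iter()
        .find(|v| snap.sees(v.writer, aborted))
        .and_then(|v| v.value.as_deref())
}
```

With a chain written by transactions 7, 5, and 2 (newest first), a reader with id 6 whose snapshot recorded transaction 5 as in flight skips the first two versions and reads the value written by transaction 2, even if transaction 5 commits later.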

API

Connection/Session/TableCursor low-level storage API
Explicit transaction scopes via Session::with_transaction()
MongoDB wire protocol server (works with mongosh)
Internal Mongo-style write stack with write_ops, CollectionWritePath, and a storage-backed logical oplog in the reserved local.oplog.rs collection, plus secondary replication state in local.repl_state

TableCursor is intentionally a local/store-level API. Document semantics, namespace-aware collection catalog state, and index creation live above it in the server/document layer.

Future Work

Background maintenance: space reuse, compaction, compression
Additional query operators and aggregation

Source Layout

The codebase is organized by domain to keep storage, catalog, and server concerns separate:

src/storage/: storage API, metadata store, B+tree, page store, WAL, recovery, and checkpoints
src/catalog/: durable namespace-keyed collection catalog stored above the storage metadata layer
src/api/: server-side context above the storage API
src/core/: shared types/utilities (BSON codec, document helpers, namespace types, errors)
src/index/: secondary index implementation and key encoding
src/txn/: global transaction state, snapshot visibility, and transaction runtime
src/api/ddl_path.rs: top-level createCollection / createIndexes path above storage
src/write_ops/: top-level Mongo-style CRUD executor above the collection mutator
src/collection_write_path.rs: low-level in-transaction collection document mutator
src/document_query.rs: internal document/query helpers used by the server path
src/server/: MongoDB wire-protocol server, command handlers, and role-aware startup reconciliation
src/replication/: server-layer replication state, oplog observer, secondary apply runtime, and local.oplog.rs / local.repl_state collection management above the storage connection
src/bin/: server binary entry point

Integration tests are grouped under tests/:

tests/storage_suite.rs, tests/connection_suite.rs, and tests/server_suite.rs: suite entry points for tests/storage/, tests/connection/, and tests/server/
tests/storage_cursor_write_path.rs: focused storage cursor/write-path coverage
tests/public_api_surface.rs: public API surface checks
tests/bench_wire_ab_smoke.rs: wire protocol benchmark smoke tests

See the examples/ directory for additional standalone programs demonstrating recovery, checkpointing, and WAL behavior.
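
As a rough idea of the WAL and recovery behavior those programs explore, the sketch below frames log records with an LSN and a CRC32 checksum and stops replay at the first torn or corrupted record. The record layout (8-byte LSN, 4-byte length, payload, 4-byte CRC32, all little-endian) and the `append_record` and `replay` names are assumptions for illustration, not WrongoDB's on-disk format, and the payload is treated as opaque bytes rather than a real change vector.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

const HEADER_LEN: usize = 12; // 8-byte LSN + 4-byte payload length
const TRAILER_LEN: usize = 4; // CRC32 over header + payload

fn read_u32(bytes: &[u8]) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[..4]);
    u32::from_le_bytes(buf)
}

fn read_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

/// Append one record to an in-memory log (hypothetical layout, see above).
fn append_record(log: &mut Vec<u8>, lsn: u64, payload: &[u8]) {
    let start = log.len();
    log.extend_from_slice(&lsn.to_le_bytes());
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(payload);
    let checksum = crc32(&log[start..]);
    log.extend_from_slice(&checksum.to_le_bytes());
}

/// Return every record up to the first torn or corrupted one.
fn replay(log: &[u8]) -> Vec<(u64, Vec<u8>)> {
    let mut records = Vec::new();
    let mut pos = 0;
    while log.len() - pos >= HEADER_LEN + TRAILER_LEN {
        let lsn = read_u64(&log[pos..]);
        let len = read_u32(&log[pos + 8..]) as usize;
        let body_end = pos + HEADER_LEN + len;
        if body_end + TRAILER_LEN > log.len() {
            break; // torn write: the record was only partially flushed
        }
        if crc32(&log[pos..body_end]) != read_u32(&log[body_end..]) {
            break; // checksum mismatch: stop before applying bad data
        }
        records.push((lsn, log[pos + HEADER_LEN..body_end].to_vec()));
        pos = body_end + TRAILER_LEN;
    }
    records
}
```

Stopping at the first bad record, rather than skipping it, keeps replay from applying later changes on top of a gap in the log.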

Documentation

See the docs/ directory for detailed architecture documentation:

Documentation Index: entry point for maintained docs
Architecture Overview: current layer map and subsystem boundaries
Storage Engine: storage metadata, B+tree/page/block stack, checkpointing, and recovery
Server Stack: server-layer services, durable catalog, read path, and write path
Command And Query Capabilities: implemented MongoDB commands, query behavior, and current limits
Architecture Guidelines: rules for keeping architecture docs current
Design Decisions: recorded architectural decisions

Install

Download and install the prebuilt server binary from GitHub Releases (Linux/macOS):

curl -sSL https://github.com/gabrielelanaro/wrongodb/releases/latest/download/wrongodb-installer.sh | sh

This installer installs the wrongodb-server binary.

Quickstart

As a Library

use wrongodb::{Connection, ConnectionConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open connection (WAL enabled by default)
    let conn = Connection::open("data/db", ConnectionConfig::default())?;
    let mut session = conn.open_session();
    session.create_table("table:test", Vec::new())?;

    // Execute one transactional write unit
    session.with_transaction(|session| {
        let mut cursor = session.open_table_cursor("table:test")?;
        cursor.insert(b"alice", b"age=30")?;
        cursor.insert(b"bob", b"age=25")?;
        Ok(())
    })?;

    // Read back values
    let mut cursor = session.open_table_cursor("table:test")?;
    println!("{:?}", cursor.get(b"alice")?);
    println!("{:?}", cursor.get(b"bob")?);
    Ok(())
}

Schema objects are URI-based at the low-level API:

session.create_table("table:users", Vec::new())?;

Index creation is handled by the Mongo-style internal write/schema path rather than the WiredTiger-style public Session API.
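
For readers new to URI-style object names, the sketch below shows one way a name such as table:users could be split into a scheme and an object name. The `ObjectUri` type and `parse_object_uri` function are hypothetical and are not part of the wrongodb crate. Only the table: scheme appears in this README, so the sketch rejects everything else instead of guessing at other schemes.

```rust
/// Kinds of schema object this sketch understands. Only `table:` is shown in
/// the README, so no other scheme is modelled here.
#[derive(Debug, PartialEq)]
enum ObjectUri<'a> {
    Table(&'a str),
}

/// Hypothetical helper: split "table:users" into its scheme and object name.
fn parse_object_uri<'a>(uri: &'a str) -> Result<ObjectUri<'a>, String> {
    match uri.split_once(':') {
        Some(("table", name)) if !name.is_empty() => Ok(ObjectUri::Table(name)),
        Some((scheme, _)) => Err(format!(
            "unsupported or incomplete URI with scheme {:?}",
            scheme
        )),
        None => Err(format!("missing ':' separator in {:?}", uri)),
    }
}
```

Under these assumptions, parse_object_uri("table:users") yields ObjectUri::Table("users"), while "users" and "table:" are both rejected.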

As a Server

For server architecture and command-layer structure, see Server Stack.

Run the server directly:

wrongodb-server

You can also set the listen address with --addr or --port, or with the WRONGO_ADDR or WRONGO_PORT environment variables. For benchmark isolation, you can set the data path with --db-path or WRONGO_DB_PATH. Replication mode is configured with --role or WRONGO_ROLE, --node-name or WRONGO_NODE_NAME, and --sync-source or WRONGO_SYNC_SOURCE; secondary mode requires a sync source.
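
The rule that secondary mode requires a sync source can be sketched as a small validation step. This is not the server's actual startup code. The sketch assumes that a command-line flag takes precedence over its environment variable, that the role values are spelled "primary" and "secondary", and that the role defaults to primary when unset. The `Role`, `pick`, and `resolve_role` names are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Role {
    Primary,
    Secondary { sync_source: String },
}

/// Pick the CLI value if present, otherwise fall back to the environment.
/// Flag-over-environment precedence is an assumption of this sketch.
fn pick(
    cli: &HashMap<&str, &str>,
    env: &HashMap<&str, &str>,
    flag: &str,
    var: &str,
) -> Option<String> {
    cli.get(flag).or_else(|| env.get(var)).map(|v| v.to_string())
}

/// Resolve the replication role and enforce the sync-source requirement.
fn resolve_role(
    cli: &HashMap<&str, &str>,
    env: &HashMap<&str, &str>,
) -> Result<Role, String> {
    // Defaulting to "primary" is an assumption, not documented behavior.
    let role = pick(cli, env, "--role", "WRONGO_ROLE")
        .unwrap_or_else(|| "primary".to_string());
    let sync_source = pick(cli, env, "--sync-source", "WRONGO_SYNC_SOURCE");
    match role.as_str() {
        "primary" => Ok(Role::Primary),
        "secondary" => sync_source
            .map(|s| Role::Secondary { sync_source: s })
            .ok_or_else(|| "secondary mode requires a sync source".to_string()),
        other => Err(format!("unknown role: {}", other)),
    }
}
```

Passing the parsed flags and environment in as plain maps keeps the check easy to test without touching the real process environment.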

Run tests with cargo test.

Development

# Build
cargo build
just build

# Run tests
cargo test
just test

# Check compilation (faster than full build)
cargo check
just check

# Lint (clippy with warnings as errors)
cargo clippy -- -D warnings
just clippy

# Format code
cargo fmt
just fmt

# Run all checks (check, test, clippy, fmt)
just all

Benchmarking

Run wire-protocol A/B benchmarks (WrongoDB vs MongoDB via Docker):

just bench-wire-ab

This runs bench_wire_ab with defaults:

warmup: 15s
measure: 60s
concurrency: 1,4,8,16,32,64
mongo image: mongo:7.0

Artifacts are written to:

target/benchmarks/wire_ab/results.csv
target/benchmarks/wire_ab/gate.json
target/benchmarks/wire_ab/summary.md

Release

Create a version tag and push it to trigger the GitHub Actions release workflow (uses cargo-dist):

git tag v0.1.0
git push origin v0.1.0

The workflow builds artifacts (archives, installers) and publishes them to GitHub Releases.

License

MIT
