NameGen
A From-Scratch GPT Implementation in Rust
NameGen is a high-performance character-level language model built from the ground up in Rust. It implements a compact GPT-style architecture designed specifically for training and generating realistic names. Unlike many high-level AI projects, NameGen eschews heavy deep-learning frameworks in favor of a custom-built scalar autograd engine, providing full transparency into the mechanics of backpropagation and transformer dynamics.
Key Technical Achievements
1. Custom Scalar Autograd Engine (value.rs)
At the heart of the project is a bespoke automatic differentiation engine.
- Implementation: Built using a scalar graph approach, similar to micrograd but optimized for Rust’s ownership model.
- Memory Efficiency: Features an arena-style architecture to manage node allocations and reuse capacity during training cycles.
- Deterministic Backprop: Ensures precise gradient calculations across the entire computational graph.
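To make the arena idea concrete, here is a minimal sketch of a scalar autograd tape. It is not the code in value.rs: the names `Tape`, `Op`, and `truncate` are hypothetical, and only three operations are covered. Nodes refer to their operands by index rather than by reference, which sidesteps shared ownership, and truncating the vectors keeps their capacity for the next training step.

```rust
/// Operation that produced a node; operands are indices into the arena.
#[derive(Clone, Copy, Debug)]
enum Op {
    Leaf,
    Add(usize, usize),
    Mul(usize, usize),
    Tanh(usize),
}

/// Hypothetical arena ("tape") holding every scalar node in creation order.
#[derive(Default)]
struct Tape {
    data: Vec<f64>,
    grad: Vec<f64>,
    ops: Vec<Op>,
}

impl Tape {
    fn push(&mut self, value: f64, op: Op) -> usize {
        self.data.push(value);
        self.grad.push(0.0);
        self.ops.push(op);
        self.data.len() - 1
    }

    fn leaf(&mut self, value: f64) -> usize {
        self.push(value, Op::Leaf)
    }

    fn add(&mut self, a: usize, b: usize) -> usize {
        let v = self.data[a] + self.data[b];
        self.push(v, Op::Add(a, b))
    }

    fn mul(&mut self, a: usize, b: usize) -> usize {
        let v = self.data[a] * self.data[b];
        self.push(v, Op::Mul(a, b))
    }

    fn tanh(&mut self, a: usize) -> usize {
        let v = self.data[a].tanh();
        self.push(v, Op::Tanh(a))
    }

    /// Reverse-mode pass. A node is always appended after its operands,
    /// so walking indices downward visits nodes in reverse topological order.
    fn backward(&mut self, root: usize) {
        self.grad.iter_mut().for_each(|g| *g = 0.0);
        self.grad[root] = 1.0;
        for i in (0..=root).rev() {
            let g = self.grad[i];
            let op = self.ops[i];
            match op {
                Op::Leaf => {}
                Op::Add(a, b) => {
                    self.grad[a] += g;
                    self.grad[b] += g;
                }
                Op::Mul(a, b) => {
                    let (da, db) = (self.data[a], self.data[b]);
                    self.grad[a] += g * db;
                    self.grad[b] += g * da;
                }
                Op::Tanh(a) => {
                    let t = self.data[i];
                    self.grad[a] += g * (1.0 - t * t);
                }
            }
        }
    }

    /// Drop nodes past `len` (e.g. everything after the parameters) while
    /// keeping the allocated capacity for the next step.
    fn truncate(&mut self, len: usize) {
        self.data.truncate(len);
        self.grad.truncate(len);
        self.ops.truncate(len);
    }
}
```

In this sketch, parameters would be created first as leaves; after each step, `truncate(param_count)` discards the intermediate nodes without returning memory to the allocator.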
2. Compact GPT Architecture (model.rs)
I implemented a specialized transformer-based model optimized for character-level sequences.
- Multi-Head Attention: Full implementation of the attention mechanism, including key/value caching for efficient autoregressive generation.
- Positional Encoding: Learned embeddings to capture structural patterns in names.
- Training Pipeline: Integrated Adam optimizer with seeded RNGs, so that training and generation are reproducible for a given seed.
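The key/value caching mentioned above can be illustrated with a single attention head over plain `f64` vectors. This is a sketch under simplifying assumptions (one head, no autograd, no projection matrices); `KvCache` and `attend_step` are hypothetical names, not the API in model.rs. Each generation step appends one key and one value, then attends over everything cached so far, so earlier positions are never recomputed.

```rust
/// Hypothetical single-head cache: one key and one value vector per past position.
#[derive(Default)]
struct KvCache {
    keys: Vec<Vec<f64>>,
    values: Vec<Vec<f64>>,
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Numerically stable softmax: subtract the maximum before exponentiating.
fn softmax(scores: &[f64]) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// One autoregressive step: store this position's key/value, then attend over
/// the whole cache. Causal masking is implicit because future positions have
/// not been cached yet.
fn attend_step(cache: &mut KvCache, q: &[f64], k: Vec<f64>, v: Vec<f64>) -> Vec<f64> {
    cache.keys.push(k);
    cache.values.push(v);

    // Scaled dot-product scores against every cached key.
    let scale = (q.len() as f64).sqrt();
    let scores: Vec<f64> = cache.keys.iter().map(|key| dot(q, key) / scale).collect();
    let weights = softmax(&scores);

    // Weighted sum of the cached values.
    let mut out = vec![0.0; cache.values[0].len()];
    for (w, value) in weights.iter().zip(&cache.values) {
        for (o, x) in out.iter_mut().zip(value) {
            *o += w * x;
        }
    }
    out
}
```

With the cache, the per-token cost grows linearly with the number of positions generated so far, instead of recomputing attention for the full prefix at every step.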
3. Advanced Sampling & Inference (sampling.rs)
The generation engine provides fine-grained control over creativity and quality:
- Techniques: Supports Temperature scaling, Top-K, and Top-P (Nucleus) sampling.
- Deterministic Generation: Every generation run can be perfectly reproduced using a fixed seed.
- Quality Filters: A post-processing layer (quality.rs) ensures generated outputs meet length and alphabetic constraints while filtering duplicates.
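The sampling techniques listed above compose in a fixed order: temperature scaling, then Top-K, then Top-P, then a seeded draw. The sketch below shows one way to do that. It is not the code in sampling.rs: the `Rng` type is a small xorshift64* generator standing in for whatever seeded RNG the project uses, and `sample` is a hypothetical function that assumes non-empty logits and a temperature greater than zero.

```rust
/// Tiny xorshift64* generator; a stand-in for the project's seeded RNG.
struct Rng(u64);

impl Rng {
    fn new(seed: u64) -> Self {
        Rng(seed.max(1)) // xorshift state must be non-zero
    }

    /// Uniform value in [0, 1) built from the top 53 bits of the output.
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545F4914F6CDD1D) >> 11;
        bits as f64 / (1u64 << 53) as f64
    }
}

/// Returns the index of the sampled token.
/// Assumes `logits` is non-empty and `temperature > 0`.
fn sample(logits: &[f64], temperature: f64, top_k: usize, top_p: f64, rng: &mut Rng) -> usize {
    // Temperature-scaled softmax.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let mut probs: Vec<(usize, f64)> = logits
        .iter()
        .enumerate()
        .map(|(i, l)| (i, ((l - max) / temperature).exp()))
        .collect();
    let total: f64 = probs.iter().map(|p| p.1).sum();
    probs.iter_mut().for_each(|p| p.1 /= total);

    // Sort by descending probability; break ties by index for determinism.
    probs.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));

    // Top-K: keep at most `top_k` candidates (at least one).
    probs.truncate(top_k.max(1));

    // Top-P: keep the smallest prefix whose cumulative mass reaches `top_p`.
    let mut cumulative = 0.0;
    let mut keep = probs.len();
    for (n, p) in probs.iter().enumerate() {
        cumulative += p.1;
        if cumulative >= top_p {
            keep = n + 1;
            break;
        }
    }
    probs.truncate(keep);

    // Draw from the survivors, scaled by their remaining mass.
    let mass: f64 = probs.iter().map(|p| p.1).sum();
    let mut r = rng.next_f64() * mass;
    for &(i, p) in &probs {
        if r < p {
            return i;
        }
        r -= p;
    }
    probs[probs.len() - 1].0 // fallback for floating-point round-off
}
```

Because the generator's state is the only source of randomness, two runs started from the same seed produce the same sequence of indices.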
Tech Stack
- Language: Rust (2024 Edition)
- CLI: Clap for robust command-line interface design.
- Serialization: Serde & Bincode for high-speed checkpoint saving and loading.
- Concurrency & SIMD: Leverages Rayon for data parallelism and the wide crate for SIMD-accelerated vector operations.
- Memory Optimization: Uses smallvec for stack-allocated collections and a custom arena for the autograd graph to minimize heap fragmentation.
- Observability: Integrated env_logger and Indicatif for real-time training progress and high-fidelity logging.
Lessons Learned
- Memory Safety in AI: Navigating Rust’s borrow checker while implementing a recursive computational graph taught me deep lessons about memory management and reference handling.
- Low-Level AI Internals: Building the autograd engine and transformer layers from scratch demystified the “black box” of modern LLMs.
- Performance Engineering: Optimizing character-level training in a systems language like Rust highlighted the importance of data locality and efficient serialization.
How to Use
Train the model:
    namegen train --dataset names.txt --epochs 5 --save-model assets/model.bin
Generate names:
    namegen gen --count 20 --temperature 0.8 --top-p 0.9 --load-model assets/model.bin
This project was built as part of my journey as a Data & AI Engineer, focusing on the intersection of systems programming and deep learning internals.