NameGen

A From-Scratch GPT Implementation in Rust with a custom scalar autograd engine and transformer architecture.

Tags: Rust, AI, Deep Learning, Systems, Transformer

NameGen is a character-level language model written from scratch in Rust. It implements a compact GPT-style architecture for training on and generating realistic names. Rather than depending on a deep-learning framework, NameGen uses a custom-built scalar autograd engine, which keeps the mechanics of backpropagation and the transformer fully visible.

Key Technical Achievements

1. Custom Scalar Autograd Engine (value.rs)

At the heart of the project is a bespoke automatic differentiation engine.

  • Implementation: Built using a scalar graph approach, similar to micrograd but optimized for Rust’s ownership model.
  • Memory Efficiency: Features an arena-style architecture to manage node allocations and reuse capacity during training cycles.
  • Deterministic Backprop: Gradients are propagated through the entire computational graph in a fixed order, so the same inputs always produce the same gradients.
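
To make the arena idea concrete, here is a minimal sketch of a scalar autograd graph whose nodes live in a single Vec and refer to each other by index. It is an illustration under stated assumptions, not the contents of value.rs: the names Graph, NodeId, Op, and truncate are hypothetical, and only add, mul, and tanh are covered.

```rust
/// Index of a node inside the arena (hypothetical name, not from value.rs).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct NodeId(usize);

#[derive(Clone, Copy, Debug)]
enum Op {
    Leaf,
    Add(NodeId, NodeId),
    Mul(NodeId, NodeId),
    Tanh(NodeId),
}

struct Node {
    data: f64,
    grad: f64,
    op: Op,
}

/// Arena-backed graph: nodes reference their inputs by index, which avoids
/// Rc<RefCell<..>> and keeps the borrow checker out of the way.
#[derive(Default)]
pub struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    pub fn new() -> Self { Self::default() }
    pub fn len(&self) -> usize { self.nodes.len() }
    pub fn data(&self, id: NodeId) -> f64 { self.nodes[id.0].data }
    pub fn grad(&self, id: NodeId) -> f64 { self.nodes[id.0].grad }

    fn push(&mut self, data: f64, op: Op) -> NodeId {
        self.nodes.push(Node { data, grad: 0.0, op });
        NodeId(self.nodes.len() - 1)
    }

    pub fn leaf(&mut self, data: f64) -> NodeId { self.push(data, Op::Leaf) }

    pub fn add(&mut self, a: NodeId, b: NodeId) -> NodeId {
        let v = self.data(a) + self.data(b);
        self.push(v, Op::Add(a, b))
    }

    pub fn mul(&mut self, a: NodeId, b: NodeId) -> NodeId {
        let v = self.data(a) * self.data(b);
        self.push(v, Op::Mul(a, b))
    }

    pub fn tanh(&mut self, a: NodeId) -> NodeId {
        let v = self.data(a).tanh();
        self.push(v, Op::Tanh(a))
    }

    /// Reverse-mode pass. Nodes are appended in creation order, so every
    /// input has a smaller index than its output; walking indices downward
    /// from the root is therefore a valid reverse topological order.
    pub fn backward(&mut self, root: NodeId) {
        for n in &mut self.nodes { n.grad = 0.0; }
        self.nodes[root.0].grad = 1.0;
        for i in (0..=root.0).rev() {
            let (grad, data, op) = {
                let n = &self.nodes[i];
                (n.grad, n.data, n.op)
            };
            match op {
                Op::Leaf => {}
                Op::Add(a, b) => {
                    self.nodes[a.0].grad += grad;
                    self.nodes[b.0].grad += grad;
                }
                Op::Mul(a, b) => {
                    let (da, db) = (self.nodes[a.0].data, self.nodes[b.0].data);
                    self.nodes[a.0].grad += grad * db;
                    self.nodes[b.0].grad += grad * da;
                }
                // d/dx tanh(x) = 1 - tanh(x)^2, and `data` already holds tanh(x).
                Op::Tanh(a) => self.nodes[a.0].grad += grad * (1.0 - data * data),
            }
        }
    }

    /// Drop every node after the first `keep` (for example, the parameters).
    /// Vec::truncate keeps the allocated capacity, so the next training step
    /// can reuse it without reallocating.
    pub fn truncate(&mut self, keep: usize) { self.nodes.truncate(keep); }
}
```

In this sketch, a training step would create the parameter leaves first, build the forward graph, call backward on the loss node, read the gradients, and then call truncate with the parameter count to discard the intermediate nodes.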

2. Compact GPT Architecture (model.rs)

I implemented a specialized transformer-based model optimized for character-level sequences.

  • Multi-Head Attention: Full implementation of the attention mechanism, including key/value caching for efficient autoregressive generation.
  • Positional Encoding: Learned embeddings to capture structural patterns in names.
  • Training Pipeline: Integrated Adam optimizer with seeded RNGs, so that training and generation are fully reproducible for a given seed.
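
The key/value caching mentioned above can be sketched for a single attention head as follows. This is a simplified, inference-only illustration on plain f32 vectors rather than autograd values; KvCache, attend, and softmax are hypothetical names, and the real model.rs may organize heads and projections differently.

```rust
/// Cache of past key and value vectors for one attention head
/// (hypothetical type, not taken from model.rs).
#[derive(Default)]
pub struct KvCache {
    keys: Vec<Vec<f32>>,
    values: Vec<Vec<f32>>,
}

impl KvCache {
    pub fn new() -> Self { Self::default() }
    pub fn len(&self) -> usize { self.keys.len() }

    /// Append the current position's key and value, then attend from `q`
    /// over every cached position. The cache only ever holds positions up
    /// to the current one, so no explicit causal mask is needed.
    pub fn attend(&mut self, q: &[f32], k: Vec<f32>, v: Vec<f32>) -> Vec<f32> {
        assert_eq!(q.len(), k.len(), "query and key must share a dimension");
        self.keys.push(k);
        self.values.push(v);

        // Scaled dot-product scores against all cached keys.
        let scale = 1.0 / (q.len() as f32).sqrt();
        let scores: Vec<f32> = self.keys.iter().map(|key| dot(q, key) * scale).collect();
        let weights = softmax(&scores);

        // Output is the attention-weighted sum of the cached values.
        let mut out = vec![0.0; self.values[0].len()];
        for (w, val) in weights.iter().zip(&self.values) {
            for (o, x) in out.iter_mut().zip(val) {
                *o += w * x;
            }
        }
        out
    }
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Numerically stable softmax: subtract the maximum before exponentiating.
pub fn softmax(x: &[f32]) -> Vec<f32> {
    let max = x.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = x.iter().map(|v| (v - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

During generation, each new character requires only one call to attend with that position's query, key, and value. Keys and values of earlier positions are read from the cache instead of being recomputed, which is what makes autoregressive decoding cheaper per step.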

3. Advanced Sampling & Inference (sampling.rs)

The generation engine provides fine-grained control over creativity and quality:

  • Techniques: Supports Temperature scaling, Top-K, and Top-P (Nucleus) sampling.
  • Deterministic Generation: Every generation run can be perfectly reproduced using a fixed seed.
  • Quality Filters: A post-processing layer (quality.rs) ensures generated outputs meet length and alphabetic constraints while filtering duplicates.
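
The three sampling techniques compose naturally into one function, sketched below. This is an assumption-laden illustration rather than the code in sampling.rs: the sample function and its parameter order are hypothetical, and the tiny xorshift generator stands in for whatever seeded RNG the project actually uses so that the example needs no external crates.

```rust
/// Minimal xorshift64* generator, used here only as a dependency-free
/// stand-in for a seeded RNG (an assumption, not the project's RNG).
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self { Rng(seed.max(1)) } // state must be nonzero

    /// Uniform f32 in [0, 1), built from the top 24 bits of the output.
    pub fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let r = self.0.wrapping_mul(0x2545F4914F6CDD1D);
        (r >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// Draw one token index from `logits`.
/// `top_k == 0` disables top-k; `top_p >= 1.0` effectively disables top-p.
pub fn sample(logits: &[f32], temperature: f32, top_k: usize, top_p: f32, rng: &mut Rng) -> usize {
    assert!(!logits.is_empty() && temperature > 0.0);

    // Temperature: divide logits before the softmax. Values below 1 sharpen
    // the distribution, values above 1 flatten it.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let mut probs: Vec<(usize, f32)> = logits
        .iter()
        .enumerate()
        .map(|(i, &l)| (i, ((l - max) / temperature).exp()))
        .collect();
    let sum: f32 = probs.iter().map(|p| p.1).sum();
    for p in &mut probs {
        p.1 /= sum;
    }

    // Sort by probability, highest first (stable, so ties keep index order).
    probs.sort_by(|a, b| b.1.total_cmp(&a.1));

    // Top-k: keep only the k most likely tokens.
    if top_k > 0 {
        probs.truncate(top_k);
    }

    // Top-p (nucleus): keep the smallest prefix whose cumulative
    // probability reaches top_p.
    let mut cum = 0.0;
    let mut keep = probs.len();
    for (n, p) in probs.iter().enumerate() {
        cum += p.1;
        if cum >= top_p {
            keep = n + 1;
            break;
        }
    }
    probs.truncate(keep);

    // Draw from the surviving tokens, scaling the uniform draw by their
    // total mass instead of renormalizing each probability.
    let total: f32 = probs.iter().map(|p| p.1).sum();
    let mut u = rng.next_f32() * total;
    for &(i, p) in &probs {
        if u < p {
            return i;
        }
        u -= p;
    }
    probs[probs.len() - 1].0 // guard against floating-point rounding
}
```

Because the only source of randomness is the Rng passed in, two generators created with the same seed produce the same sequence of indices, which is the property that makes seeded generation reproducible.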

Tech Stack

  • Language: Rust (2024 Edition)
  • CLI: Clap for robust command-line interface design.
  • Serialization: Serde & Bincode for high-speed checkpoint saving and loading.
  • Concurrency & SIMD: Leverages Rayon for data parallelism and the wide crate for SIMD-accelerated vector operations.
  • Memory Optimization: Uses smallvec for stack-allocated collections and a custom arena for the autograd graph to minimize heap fragmentation.
  • Observability: Uses env_logger for logging and indicatif for real-time training progress bars.

Lessons Learned

  • Memory Safety in AI: Working within Rust’s borrow checker while implementing a recursive computational graph taught me a great deal about memory management and reference handling.
  • Low-Level AI Internals: Building the autograd engine and transformer layers from scratch demystified the “black box” of modern LLMs.
  • Performance Engineering: Optimizing character-level training in a systems language like Rust highlighted the importance of data locality and efficient serialization.

How to Use

Train the model:

namegen train --dataset names.txt --epochs 5 --save-model assets/model.bin

Generate names:

namegen gen --count 20 --temperature 0.8 --top-p 0.9 --load-model assets/model.bin

This project was built as part of my journey as a Data & AI Engineer, focusing on the intersection of systems programming and deep learning internals.