GPT-Simple¶

A clean, readable framework for pretraining language models from scratch. GPT-Simple handles the full LLM pretraining workflow — tokenization, streaming data loading, multi-GPU training, deterministic stop/resume, and inference — through a single YAML config and a small CLI.

Install¶

pip install gpt-simple-lm

The distribution is named gpt-simple-lm; you import it as gpt_simple and run the gpt-simple CLI. For source installs and optional extras, see the project README.

Quick start¶

gpt-simple init -o config.yaml                    # write a commented config template
gpt-simple tokenize --input_dir ./raw_data \
  --output_dir ./data/tokenized --tokenizer_path gpt2
gpt-simple train --config config.yaml             # add --nproc_per_node N for multi-GPU
gpt-simple generate --output-dir ./outputs --prompt "Once upon a time"

New here? Start with the Training guide; every config field is documented in the Configuration reference.

Guides¶

These pages go deeper than the quick start above — each owns a single concern.

Guide	What it covers
Architecture	The built-in model: decoder block, RoPE, normalization, MLP variants, attention backends, weight tying, KV-cache.
Configuration	Every config field — meaning and valid values — for the `model` / `data` / `optimizer` / `training` sections.
Data pipeline	Tokenization, the `.bin/.idx` format, pretokenized vs JSONL, sequence packing, document windowing, curriculum buckets.
Training	Running single- and multi-GPU training, mixed precision, `torch.compile`, gradient checkpointing, W&B.
Checkpointing & resume	On-disk layout, the deterministic stop/resume model, walltime budgets, signals, topology-agnostic data cursors.
Orchestration	Running long, chained jobs under any orchestrator (SLURM, Kubernetes, a local loop).
Inference	`generate` and `batch-generate`, sampling parameters, the batch JSONL schema, dry-run validation.
Hardware tuning	Getting peak throughput from your GPUs: precision, attention backend selection, batch size, dataloader workers.
Performance	Measured throughput on a real 2.8B run, MFU/HFU methodology, how to measure FLOPs, and how it compares to other libraries.

Source of truth¶

These guides describe behavior and intent. When a default value or an exact field list matters, the authoritative source is always the code:

Config fields and defaults: src/gpt_simple/config.py (or run gpt-simple init for a commented template).
Model architecture: src/gpt_simple/model.py.
Public Python API: src/gpt_simple/__init__.py.