NanoEuler: GPT-2 Scale LLM Built in C/CUDA
github.com
Monday, June 29, 2026
- NanoEuler is a GPT-2 scale LLM built from scratch in C and CUDA.
- The project includes custom backpropagation, BPE tokenizer, and FlashAttention implementations.
- The codebase supports full pretraining and Supervised Fine-Tuning workflows for language models.
NanoEuler is a GPT-2-style large language model written from scratch in pure C and CUDA. The project includes hand-written backpropagation, a BPE (byte-pair encoding) tokenizer, which splits text into subword units, and FlashAttention, an exact attention algorithm that computes attention in tiles to reduce GPU memory traffic. The implementation supports full pretraining as well as Supervised Fine-Tuning (SFT). The repository is hosted publicly on GitHub by user JustVugg for community review and use.