wordplay
2024-02-02
A set of simple, scalable, and highly configurable tools for working with LLMs.
What started as some simple modifications to Andrej Karpathy's nanoGPT has now grown into the wordplay project.
While nanoGPT is a great project and an excellent resource, it is, by design, very minimal and limited in its flexibility.
Working through the code, I found myself making minor changes here and there to test new ideas and run variations on different experiments. These changes eventually built up to the point where my {goals, scope, code} for the project had diverged significantly from the original vision. As a result, I figured it made more sense to move things into a new project, wordplay.
I’ve prioritized adding functionality that I have found to be useful or interesting, but I’m absolutely open to input or suggestions for improvement.
Different aspects of this project have been motivated by some of my recent work on LLMs.
- ezpz: Painless distributed training with your favorite {framework, backend} combo.
- Megatron-DeepSpeed: Ongoing research training transformer language models at scale, including: BERT & GPT-2.

The easiest way to get the most recent version is to install it directly from GitHub:
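For example, something along the following lines should work (the repository URL, assumed here to be github.com/saforem2/wordplay, is an assumption and may differ):

```bash
# Install the latest version of wordplay straight from GitHub
# (repository URL assumed; adjust if the project lives elsewhere)
python3 -m pip install "git+https://github.com/saforem2/wordplay.git"
```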
If you’d like to work with the project and run / change things yourself, I’d recommend installing from a local (editable) clone of this repository:
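A sketch of that workflow, again assuming the github.com/saforem2/wordplay repository:

```bash
# Clone the repository and install it in editable ("develop") mode,
# so local changes are picked up without reinstalling
git clone https://github.com/saforem2/wordplay.git
cd wordplay
python3 -m pip install -e .
```

Installing with `-e` keeps the package pointed at your working copy, which is convenient when you plan to modify the code and rerun experiments.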
Last Updated: 02/02/2024 @ 21:53:52