Last Updated: 02/02/2024 @ 21:53:52
wordplay
🎮 💬
A set of simple, scalable and highly configurable tools for working with LLMs.
Background
What started as some simple modifications to Andrej Karpathy's nanoGPT
has now grown into the wordplay
project.
If you're curious…
While nanoGPT
is a great project and an excellent resource, it is, by design, very minimal and limited in its flexibility.
Working through the code, I found myself making minor changes here and there to test new ideas and run variations on different experiments. These changes eventually built up to the point where my {goals, scope, code}
for the project had diverged significantly from the original vision.
As a result, I figured it made more sense to move things to a new project, wordplay.
I've prioritized adding functionality that I have found to be useful or interesting, but am absolutely open to input or suggestions for improvement.
Different aspects of this project have been motivated by some of my recent work on LLMs.
- Projects:
  - ezpz: Painless distributed training with your favorite {framework, backend} combo.
  - Megatron-DeepSpeed: Ongoing research training transformer language models at scale, including: BERT & GPT-2
- Collaboration(s):
- DeepSpeed4Science (2023-09)
- Loooooooong Sequence Lengths
- Project Website
- Preprint Song et al. (2023)
- Blog Post
- Tutorial
- GenSLMs:
- DeepSpeed4Science (2023-09)
- Talks / Workshops:
Completed
In Progress
Install
Grab-n-Go
The easiest way to get the most recent version is to:
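A minimal sketch of the install, assuming the package can be installed straight from the GitHub repository (the repository URL is inferred from the project site in the citation below and may differ):

```shell
# Install the latest version of wordplay directly from GitHub
python3 -m pip install "git+https://github.com/saforem2/wordplay"
```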
Development
If youโd like to work with the project and run / change things yourself, Iโd recommend installing from a local (editable) clone of this repository:
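For example, assuming the repository lives at the GitHub URL inferred from the citation below:

```shell
# Clone a local copy of the repository
git clone https://github.com/saforem2/wordplay
cd wordplay

# Install it in editable ("development") mode so local changes
# take effect without reinstalling
python3 -m pip install -e .
```

With an editable install, edits to the cloned source are picked up immediately by the installed package.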
References
Citation
@online{foreman2024,
author = {Foreman, Sam},
  title = {`Wordplay` 🎮 💬},
date = {2024-02-02},
url = {https://saforem2.github.io/wordplay},
langid = {en}
}