Loooooooooong Sequence Lengths

  • Working with the Microsoft DeepSpeed team to enable longer sequence lengths (context windows) for LLMs

[Figure panels: 25B, 33B]

Figure 1: Maximum (achievable) SEQ_LEN for both 25B and 33B models [WIP]

Ongoing Work & Collaborations

