Introduction to Large Language Models (LLMs)
Adapted from original content written by Archit Vasan, including materials on LLMs by Varuni Sastri and Carlo Graziani (Argonne), and discussion/editorial work by Taylor Childers, Bethany Lusch, and Venkat Vishwanath (Argonne)
Contents
Overview
Inspired by the blog posts “The Illustrated Transformer” and “The Illustrated GPT-2” by Jay Alammar, both highly recommended reading.
This tutorial covers some fundamental concepts necessary for the study of large language models (LLMs).
Topics
- Scientific applications for language models
- General overview of Transformers
- Tokenization
- Model Architecture
- Pipeline using HuggingFace
- Model loading
Natural Language Processing (NLP)
Large Language Models (LLMs) are a subset of Natural Language Processing (NLP) techniques that focus on understanding and generating human language. NLP is a field of linguistics / artificial intelligence that enables computers to interpret, understand, and respond to human language in a way that is both meaningful and useful.
The following is a list of common NLP tasks, with some examples (a short code sketch of several of these tasks follows the list):
- Classifying whole sentences: Getting the sentiment of a review, detecting if an email is spam, determining if a sentence is grammatically correct or whether two sentences are logically related or not.
- Classifying each word in a sentence: Identifying the grammatical components of a sentence (noun, verb, adjective, …), or the named entities (person, location, organization, …).
- Generating text: Completing a prompt with auto-generated text, filling in the blanks in a text with masked words
- Extracting an answer from a text: Given a question and a context, extracting the answer to the question based on the information provided in the context.
- Generating a new sentence from an input text: Translating a text into another language, summarizing a text
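Several of these tasks can be tried in a few lines with the 🤗 HuggingFace pipeline API, which is introduced later in this tutorial. The sketch below is only illustrative: the default models downloaded for each task depend on the installed transformers version, and the exact outputs will vary.

```python
from transformers import pipeline

# Classifying a whole sentence: sentiment of a review
classifier = pipeline("sentiment-analysis")
print(classifier("This tutorial was surprisingly easy to follow!"))

# Classifying each word: named entities (person, location, organization, ...)
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Argonne National Laboratory is located outside Chicago."))

# Extracting an answer from a text, given a question and a context
qa = pipeline("question-answering")
print(qa(question="Where is Argonne located?",
         context="Argonne National Laboratory is located outside Chicago, Illinois."))

# Generating text: completing a prompt with auto-generated text
generator = pipeline("text-generation")
print(generator("Large language models are", max_new_tokens=20))
```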
Large Language Models (LLMs)
A large language model (LLM) is an AI model trained on massive amounts of text data that can understand and generate human-like text, recognize patterns in language, and perform a wide variety of language tasks without task-specific training.
LLMs represent a significant advancement in the field of natural language processing (NLP) (Hugging Face 2022).
Even with the advances in LLMs, many fundamental challenges remain. These include understanding ambiguity, cultural context, sarcasm and humor. LLMs address these challenges through massive training on diverse datasets, but still often fall short of human-level understanding in many complex scenarios.
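As a preview of the model-loading and text-generation steps covered later in this tutorial, the following sketch loads a small pretrained causal language model with 🤗 transformers and generates a continuation of a prompt. GPT-2 is used here only because it is small and freely available; the sampling settings are arbitrary choices, not recommendations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small pretrained language model and its matching tokenizer.
model_name = "gpt2"  # small and freely available; larger models use the same interface
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a prompt, then autoregressively generate a continuation.
inputs = tokenizer("Large language models are trained on", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=25, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```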
References
I strongly recommend reading:
- “The Illustrated Transformer” by Jay Alammar. Alammar also has a useful post dedicated more generally to sequence-to-sequence modeling.
- LLM Course by 🤗 HuggingFace
- “Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)”, which illustrates the attention mechanism in the context of a more generic language translation model.
- GPT in 60 Lines of NumPy
Citation
@online{foreman2025,
  author = {Foreman, Sam},
  title = {Introduction to {Large} {Language} {Models} {(LLMs)}},
  date = {2025-07-15},
  url = {https://saforem2.github.io/hpc-bootcamp-2025/02-llms/},
  langid = {en}
}