Introduction to Large Language Models (LLMs)
Adapted from original content written by Archit Vasan, including materials on LLMs by Varuni Sastri and Carlo Graziani (Argonne), and discussion/editorial work by Taylor Childers, Bethany Lusch, and Venkat Vishwanath (Argonne)
Contents
Overview
Inspired by the blog posts “The Illustrated Transformer” and “The Illustrated GPT-2” by Jay Alammar, both highly recommended reading.
This tutorial covers some fundamental concepts necessary for the study of large language models (LLMs).
Topics
- Scientific applications for language models
- General overview of Transformers
- Tokenization
- Model Architecture
- Pipeline using HuggingFace
- Model loading
Natural Language Processing (NLP)
Large Language Models (LLMs) are a subset of Natural Language Processing (NLP) techniques that focus on understanding and generating human language. NLP is a field of linguistics / artificial intelligence that enables computers to interpret, understand, and respond to human language in a way that is both meaningful and useful.
The following is a list of common NLP tasks, with some examples (a short code sketch of several of these tasks follows the list):
- Classifying whole sentences: Getting the sentiment of a review, detecting if an email is spam, determining if a sentence is grammatically correct or whether two sentences are logically related or not.
- Classifying each word in a sentence: Identifying the grammatical components of a sentence (noun, verb, adjective, …), or the named entities (person, location, organization, …).
- Generating text: Completing a prompt with auto-generated text, filling in the blanks in a text with masked words
- Extracting an answer from a text: Given a question and a context, extracting the answer to the question based on the information provided in the context.
- Generating a new sentence from an input text: Translating a text into another language, summarizing a text
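Several of these tasks can be tried in a few lines with the 🤗 HuggingFace pipeline API, which is introduced later in this tutorial. The sketch below is only illustrative: the default models downloaded for each task depend on the installed transformers version, and the exact outputs will vary.

```python
from transformers import pipeline

# Classifying a whole sentence: sentiment of a review
classifier = pipeline("sentiment-analysis")
print(classifier("This tutorial was surprisingly easy to follow!"))

# Classifying each word: named entities (person, location, organization, ...)
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Argonne National Laboratory is located outside Chicago."))

# Extracting an answer from a text, given a question and a context
qa = pipeline("question-answering")
print(qa(question="Where is Argonne located?",
         context="Argonne National Laboratory is located outside Chicago, Illinois."))

# Generating text: completing a prompt with auto-generated text
generator = pipeline("text-generation")
print(generator("Large language models are", max_new_tokens=20))
```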
Large Language Models (LLMs)
A large language model (LLM) is an AI model trained on massive amounts of text data that can understand and generate human-like text, recognize patterns in language, and perform a wide variety of language tasks without task-specific training.
LLMs represent a significant advancement in the field of natural language processing (NLP) (Hugging Face 2022).
Even with the advances in LLMs, many fundamental challenges remain. These include understanding ambiguity, cultural context, sarcasm and humor. LLMs address these challenges through massive training on diverse datasets, but still often fall short of human-level understanding in many complex scenarios.
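As a preview of the model-loading and text-generation steps covered later in this tutorial, the following sketch loads a small pretrained causal language model with 🤗 transformers and generates a continuation of a prompt. GPT-2 is used here only because it is small and freely available; the sampling settings are arbitrary choices, not recommendations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small pretrained language model and its matching tokenizer.
model_name = "gpt2"  # small and freely available; larger models use the same interface
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a prompt, then autoregressively generate a continuation.
inputs = tokenizer("Large language models are trained on", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=25, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```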
References
I strongly recommend reading:
- “The Illustrated Transformer” by Jay Alammar. Alammar also has a useful post dedicated more generally to sequence-to-sequence modeling.
- LLM Course by 🤗 HuggingFace
- “Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)”, which illustrates the attention mechanism in the context of a more generic language translation model.
- GPT in 60 Lines of NumPy
Citation
@online{foreman2025,
  author = {Foreman, Sam},
  title = {Introduction to {Large} {Language} {Models} {(LLMs)}},
  date = {2025-07-15},
  url = {https://saforem2.github.io/hpc-bootcamp-2025/02-llms/},
  langid = {en}
}