Introduction to Large Language Models (LLMs)

ai
hpc
llm
A beginner’s guide to understanding Large Language Models, their architecture, and how they function.
Author: Sam Foreman

Published: July 15, 2025

Modified: August 5, 2025

Note: Authors

Content adapted from original material written by Archit Vasan, including materials on LLMs by Varuni Sastri and Carlo Graziani at Argonne, with discussion and editorial work by Taylor Childers, Bethany Lusch, and Venkat Vishwanath (Argonne).

Overview

Inspired by the blog posts “The Illustrated Transformer” and “The Illustrated GPT-2” by Jay Alammar, both highly recommended reading.

This tutorial covers some of the fundamental concepts needed to study large language models (LLMs).

Topics

  • Scientific applications for language models
  • General overview of Transformers
  • Tokenization
  • Model Architecture
  • Pipeline using HuggingFace
  • Model loading

Natural Language Processing (NLP)

Large Language Models (LLMs) are a class of Natural Language Processing (NLP) models that focus on understanding and generating human language. NLP is a field at the intersection of linguistics and artificial intelligence that aims to enable computers to interpret, understand, and respond to human language in a way that is both meaningful and useful.

The following is a list of common NLP tasks, with some examples:

  • Classifying whole sentences: Getting the sentiment of a review, detecting whether an email is spam, determining whether a sentence is grammatically correct, or deciding whether two sentences are logically related.
  • Classifying each word in a sentence: Identifying the grammatical components of a sentence (noun, verb, adjective, …) or the named entities (person, location, organization, …).
  • Generating text: Completing a prompt with auto-generated text, or filling in the blanks in a text with masked words.
  • Extracting an answer from a text: Given a question and a context, extracting the answer to the question based on the information provided in the context.
  • Generating a new sentence from an input text: Translating a text into another language, or summarizing a text.
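To make the difference between the first two task types concrete, here is a minimal toy sketch in Python. It is not how an LLM works: the word lists (`POSITIVE`, `NEGATIVE`, `KNOWN_ENTITIES`) and function names are hypothetical and hand-written for this example, whereas real systems learn these decisions from data. The point is only the shape of the output: sentence classification returns one label for the whole input, while token classification returns one label per word.

```python
# Toy illustration only: real NLP systems learn these decisions from data.
# The word lists below are hypothetical and exist just for this example.
POSITIVE = {"great", "good", "excellent", "love"}
NEGATIVE = {"bad", "terrible", "awful", "hate"}
KNOWN_ENTITIES = {"argonne": "organization", "chicago": "location"}


def tokenize(sentence: str) -> list[str]:
    """Split on whitespace and strip surrounding punctuation."""
    return [w.strip(".,!?") for w in sentence.split() if w.strip(".,!?")]


def classify_sentence(sentence: str) -> str:
    """Sentence-level task: one label for the whole input."""
    words = [w.lower() for w in tokenize(sentence)]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_tokens(sentence: str) -> list[tuple[str, str]]:
    """Token-level task: one label per word ("O" means "not an entity")."""
    return [(w, KNOWN_ENTITIES.get(w.lower(), "O")) for w in tokenize(sentence)]


review = "The workshop at Argonne was great!"
print(classify_sentence(review))  # positive
print(classify_tokens(review))    # one (word, label) pair per word
```

The remaining task types (generation, extraction, translation, summarization) differ mainly in that their output is itself text rather than a label.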

Large Language Models (LLMs)

A large language model (LLM) is an AI model trained on massive amounts of text data that can understand and generate human-like text, recognize patterns in language, and perform a wide variety of language tasks without task-specific training.
They represent a significant advancement in the field of natural language processing (NLP) (Face 2022).

🚧 Warning

While LLMs are able to generate (what appears to be) human-like text, they are not sentient and do not understand the world in the way that humans do. They are trained to predict the next word in a sentence based on the context of the words that come before it, and can generate text that is coherent and relevant to the input they receive. However, they do not have a true understanding of the meaning of the words they generate, and can sometimes produce text that is nonsensical or irrelevant to the input.

Even with the advances in LLMs, many fundamental challenges remain. These include handling ambiguity, cultural context, sarcasm, and humor. LLMs mitigate these challenges by training on massive, diverse datasets, but they still often fall short of human-level understanding in complex scenarios.

References

I strongly recommend reading:

Face, Hugging. 2022. “The Hugging Face Course, 2022.” https://huggingface.co/course.

Citation

BibTeX citation:
@online{foreman2025,
  author = {Foreman, Sam},
  title = {Introduction to {Large} {Language} {Models} {(LLMs)}},
  date = {2025-07-15},
  url = {https://saforem2.github.io/hpc-bootcamp-2025/02-llms/},
  langid = {en}
}
For attribution, please cite this work as:
Foreman, Sam. 2025. “Introduction to Large Language Models (LLMs).” July 15, 2025. https://saforem2.github.io/hpc-bootcamp-2025/02-llms/.