June 21, 2025
When I first delved into large language models (LLMs), I often found the explanations perplexing. Many resources were tailored to those with a background in data science or Python—something I, like many Swift developers, didn’t possess. During my presentation at the One More Thing Conference Cupertino 2025, I chose a different approach. Instead of diving into the complexities of the transformer (often the toughest part to grasp), I focused on what happens before and after it. This approach simplifies understanding the entire system. If you’re a Swift developer exploring Apple’s open-source AI framework, MLX, or just curious about how LLMs function, you’re in the right place.
Before diving into the details of LLMs, it’s useful to first understand their basic components. By examining each part separately, you’ll gain a clearer picture of how they work together. Plus, if you’re interested in experimenting yourself, you’ll need to know where to find these files and how to use them.
An LLM typically consists of three core files:

1. The model weights. This is the largest of the three: it stores everything the model learned during training, and it is where the model’s knowledge lives.
2. The inference code. A minimal implementation (for example, run.c) is only about 35 kilobytes, highlighting its compactness. Comparisons often note that it could fit in a vintage computer’s memory, but what really stands out is how few dependencies it has: only standard C libraries like stdio and math.
3. The tokenizer. Named tokenizer.json, this file defines how text is broken into tokens, the basic units the model understands. Tokens might be whole words or fragments, such as prefixes or suffixes. For example, the word “Walkman” could be split into “walk” and “man” (a toy Swift sketch of this idea follows below). The tokenizer also includes special tokens for unknown words and markers for text boundaries. The tokenizer file is part of the model, which you can download here.

It’s important to note that these files are static and read-only. Once trained, an LLM does not learn new information.
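To make that splitting concrete, here is a minimal sketch in Swift of one simple strategy: greedy longest-match against a toy vocabulary. The vocabulary and token IDs are made up for illustration, and real tokenizers such as the one described by tokenizer.json typically use more sophisticated schemes like byte-pair encoding; the point is only to show text becoming a sequence of token IDs.

```swift
// Toy vocabulary mapping subword strings to made-up token IDs.
let vocabulary: [String: Int] = [
    "walk": 1001,   // hypothetical IDs, for illustration only
    "man": 1002,
    "<unk>": 0      // special token for anything we can't match
]

// Greedy longest-match tokenization: repeatedly take the longest
// prefix of the remaining text that appears in the vocabulary.
func tokenize(_ text: String) -> [Int] {
    var tokens: [Int] = []
    var remaining = Substring(text.lowercased())
    while !remaining.isEmpty {
        var matched = false
        for length in stride(from: remaining.count, through: 1, by: -1) {
            let candidate = String(remaining.prefix(length))
            if let id = vocabulary[candidate] {
                tokens.append(id)
                remaining = remaining.dropFirst(length)
                matched = true
                break
            }
        }
        if !matched {
            // Unknown character: emit <unk> and skip one character.
            tokens.append(vocabulary["<unk>"]!)
            remaining = remaining.dropFirst()
        }
    }
    return tokens
}

print(tokenize("Walkman"))  // [1001, 1002]: "walk" + "man"
```

Since “walkman” itself isn’t in this toy vocabulary, the function falls back to the two fragments it does know, exactly the splitting described above; anything it can’t match at all becomes the unknown token.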
An LLM lets you access the knowledge stored in its weights using plain English, or any other language it was trained on. Unlike traditional software, you don’t need a query language like SQL. Understanding these files provides insight into how an LLM functions during inference. Once you’re familiar with them, you’re ready to explore how they interconnect. In the following posts, we’ll delve deeper into how these components collaborate to power an LLM.
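As a preview of how those components collaborate, here is a deliberately simplified sketch in Swift. Tokenizer, Model, sample, and generate are hypothetical stand-ins, not MLX or any real API; the point is the shape of the loop: encode the prompt into tokens, repeatedly ask the model for the next token, append it, and decode the result back into text.

```swift
// A simplified model of the inference loop tying the three files together.
// The types are stubs: Tokenizer stands in for tokenizer.json, Model for
// the weights file, and generate(...) for the loop a file like run.c implements.

struct Tokenizer {
    func encode(_ text: String) -> [Int] { [] }     // text -> token IDs
    func decode(_ tokens: [Int]) -> String { "" }   // token IDs -> text
}

struct Model {
    // A real model returns a probability for every token in the vocabulary.
    func nextTokenProbabilities(for tokens: [Int]) -> [Double] { [] }
}

// Greedy sampler: pick the most likely token. Real samplers often add
// temperature or top-p randomness.
func sample(from probabilities: [Double]) -> Int {
    probabilities.indices.max { probabilities[$0] < probabilities[$1] } ?? 0
}

func generate(prompt: String, tokenizer: Tokenizer, model: Model, maxTokens: Int) -> String {
    var tokens = tokenizer.encode(prompt)
    for _ in 0..<maxTokens {
        let probabilities = model.nextTokenProbabilities(for: tokens)
        tokens.append(sample(from: probabilities))  // feed it back in and repeat
    }
    return tokenizer.decode(tokens)
}
```

Everything interesting happens inside the model’s forward pass; the surrounding plumbing is short enough to fit in a file like run.c.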