I joined the Chai Code GenAI with JS cohort recently and noticed a massive distinction that people outside this space completely miss. Most developers working with AI are not actually building AI models. They are building products using those models. Those are two entirely different jobs.
The analogy that made this click for me was the LCA Tejas program. The people working on the jet engine are doing a completely different kind of engineering involving material science, thermodynamics, and manufacturing tolerances that are probably smaller than my patience while debugging production issues. In the AI world, those are the researchers and data scientists building the actual models. But HAL (Hindustan Aeronautics Limited), the company that builds the aircraft, still needs to understand the engine. Its engineers do not need to machine turbine blades, but they must know the constraints and tradeoffs because that engine eventually has to fit into a fighter platform that actually flies. That feels much closer to where most of us are.
A fighter jet needs a radar system, a missile system, and an avionics package from different places, and someone still has to integrate all of it into a machine that works as a coherent system. Sometimes it happens successfully, and sometimes it happens after fifteen years of delays and committee meetings. That is pretty close to modern AI development. Your model might be GPT or Claude, but your vector database, tool calling, MCP servers, embeddings, and memory systems are all individual subsystems. You have to make them work together if you want an actual product instead of a cool demo video for Twitter.
Before handling all that, it helps to understand the basics under the hood. LLM stands for Large Language Model. The high-level idea is surprisingly simple: give a model an absurd amount of text and train it to predict what comes next, one token at a time. Somewhere along the way, this starts producing behavior that looks suspiciously close to actual understanding. Writing emails, summarizing PDFs, and generating code all come from that exact mechanism. When you type a message into ChatGPT, it is not copying text from a website. It is predicting tokens one by one, feeding each prediction back in as context for the next, until it emits a special stop token or hits a length limit.
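To make the "predict what comes next" loop concrete, here is a minimal sketch. It is a toy that counts which word follows which in a three-sentence corpus, not how a real LLM is trained (those learn probabilities with a neural network over billions of tokens). The names `trainBigrams`, `predictNext`, `generate`, and the `<end>` marker are all made up for illustration.

```typescript
// Toy next-token predictor: count which word follows which, then always
// pick the most frequent follower. Everything here is illustrative.
type FollowerCounts = Map<string, Map<string, number>>;

const END_MARKER = "<end>"; // hypothetical stop token

function trainBigrams(sentences: string[]): FollowerCounts {
  const counts: FollowerCounts = new Map();
  for (const sentence of sentences) {
    const words = sentence.split(" ").concat(END_MARKER);
    for (let i = 0; i < words.length - 1; i++) {
      const followers = counts.get(words[i]) ?? new Map<string, number>();
      followers.set(words[i + 1], (followers.get(words[i + 1]) ?? 0) + 1);
      counts.set(words[i], followers);
    }
  }
  return counts;
}

function predictNext(counts: FollowerCounts, word: string): string {
  const followers = counts.get(word);
  if (!followers) return END_MARKER; // never seen this word: stop
  let best = END_MARKER;
  let bestCount = 0;
  followers.forEach((count, candidate) => {
    if (count > bestCount) {
      best = candidate;
      bestCount = count;
    }
  });
  return best;
}

function generate(counts: FollowerCounts, start: string, maxTokens = 10): string[] {
  const output = [start];
  let current = start;
  // Stop on the end marker or when the length limit is reached.
  for (let i = 0; i < maxTokens; i++) {
    const next = predictNext(counts, current);
    if (next === END_MARKER) break;
    output.push(next);
    current = next;
  }
  return output;
}

const bigramCounts = trainBigrams([
  "the jet flies fast",
  "the jet flies fast",
  "the jet lands",
]);
console.log(generate(bigramCounts, "the").join(" ")); // → the jet flies fast
```

The loop shape is the part worth noticing: predict one token, append it, and repeat until a stop marker or a length limit ends the run.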
Tokens matter because computers only understand numbers. Everything eventually becomes numbers, including this sentence and any spelling mistakes I make. Tokenization is the process of splitting text into smaller pieces and mapping each piece to an integer ID from a fixed vocabulary. A token is not always a clean word; sometimes a word gets split into multiple pieces, and sometimes a punctuation mark becomes its own token. It looks like AI making things complicated for no apparent reason, but there is one: splitting rare words into common pieces lets a limited vocabulary cover almost any text.
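Here is a rough sketch of that splitting. It uses a tiny hand-written vocabulary and a greedy longest-match rule, which is a simplification and not the byte-pair encoding that production tokenizers commonly use. The vocabulary, the `UNKNOWN_ID` value, and the function names are assumptions for illustration only.

```typescript
// Hypothetical six-entry vocabulary. Real tokenizers learn tens of
// thousands of entries from data.
const toyVocab = new Map<string, number>([
  ["debug", 0],
  ["ging", 1],
  [" ", 2],
  ["is", 3],
  ["fun", 4],
  ["!", 5],
]);
const UNKNOWN_ID = -1; // assumed placeholder for pieces missing from the vocabulary

// Split text by repeatedly taking the longest vocabulary entry that
// matches at the current position; fall back to a single character.
function splitIntoTokens(text: string): string[] {
  const pieces: string[] = [];
  let pos = 0;
  while (pos < text.length) {
    let end = text.length;
    while (end > pos + 1 && !toyVocab.has(text.slice(pos, end))) {
      end--;
    }
    pieces.push(text.slice(pos, end));
    pos = end;
  }
  return pieces;
}

// Map each piece to its integer ID, which is what the model actually sees.
function encodeToIds(text: string): number[] {
  return splitIntoTokens(text).map((piece) => toyVocab.get(piece) ?? UNKNOWN_ID);
}

console.log(splitIntoTokens("debugging is fun!"));
// → ["debug", "ging", " ", "is", " ", "fun", "!"]
console.log(encodeToIds("debugging is fun!"));
// → [0, 1, 2, 3, 2, 4, 5]
```

Notice that "debugging" becomes two tokens while "!" gets one of its own, which is the behavior described above.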
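Once those tokens are created, they pass through a Transformer, the neural network architecture introduced in the 2017 paper "Attention Is All You Need". This was the big breakthrough that changed almost everything in modern AI. Nearly every serious language model today uses Transformers because their core mechanism, attention, lets each token weigh how relevant every other token in the context is. That makes them incredibly good at tracking relationships and context across large amounts of text.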
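The relationship-tracking comes from a mechanism called attention, and the core arithmetic is small enough to sketch. What follows is a simplified, single-query version of scaled dot-product attention with hand-picked vectors. Real Transformers learn these vectors and run many attention heads across many layers, so treat the numbers and the function names (`softmax`, `dotProduct`, `attend`) as illustrative assumptions.

```typescript
// Turn raw scores into weights that are positive and sum to 1.
function softmax(scores: number[]): number[] {
  const maxScore = Math.max(...scores); // subtract max for numerical stability
  const exps = scores.map((s) => Math.exp(s - maxScore));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

function dotProduct(a: number[], b: number[]): number {
  return a.reduce((acc, x, i) => acc + x * b[i], 0);
}

// One query attends over all keys: score each key against the query,
// scale by sqrt(dimension), softmax, then blend the values by weight.
function attend(
  query: number[],
  keys: number[][],
  values: number[][]
): { weights: number[]; output: number[] } {
  const scale = Math.sqrt(query.length);
  const weights = softmax(keys.map((k) => dotProduct(query, k) / scale));
  const output = values[0].map((_, dim) =>
    weights.reduce((acc, w, i) => acc + w * values[i][dim], 0)
  );
  return { weights, output };
}

// Hand-picked toy vectors: the first key points the same way as the query,
// the second is unrelated, the third points the opposite way.
const attentionResult = attend(
  [1, 0],
  [[1, 0], [0, 1], [-1, 0]],
  [[10, 0], [0, 10], [5, 5]]
);
console.log(attentionResult.weights);
console.log(attentionResult.output);
```

The key that lines up with the query gets the largest weight, so its value dominates the output. That is the sense in which a token "pays attention" to the parts of the context most relevant to it.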
At the highest level, the whole thing looks like a simple pipeline:
User -> Prompt -> Tokens -> Transformer -> Response
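As a sketch, the same pipeline can be written as a handful of small functions. Everything here is a stand-in: character codes play the role of a tokenizer, and the "model" is a fake that just shouts the prompt back one token at a time. The names `encodePrompt`, `makeEchoModel`, `decodeTokens`, `runPipeline`, and the `STOP_TOKEN` value are hypothetical, and a real product would call an actual model API at that stage.

```typescript
const STOP_TOKEN = -1; // hypothetical stop id

// Prompt -> Tokens: character codes stand in for a real tokenizer.
function encodePrompt(prompt: string): number[] {
  return prompt.split("").map((ch) => ch.charCodeAt(0));
}

// Tokens -> text, used to build the final response.
function decodeTokens(tokens: number[]): string {
  return tokens.map((t) => String.fromCharCode(t)).join("");
}

// A model here is anything that maps the context so far to one next token.
type NextTokenModel = (context: number[]) => number;

// Fake "Transformer": emits the uppercase version of each prompt
// character in order, then the stop token.
function makeEchoModel(promptLength: number): NextTokenModel {
  return (context) => {
    const produced = context.length - promptLength;
    if (produced >= promptLength) return STOP_TOKEN;
    return String.fromCharCode(context[produced]).toUpperCase().charCodeAt(0);
  };
}

// User -> Prompt -> Tokens -> Transformer -> Response
function runPipeline(prompt: string, maxNewTokens = 50): string {
  const promptTokens = encodePrompt(prompt);
  const model = makeEchoModel(promptTokens.length);
  const context = promptTokens.slice();
  const generated: number[] = [];
  while (generated.length < maxNewTokens) {
    const next = model(context);
    if (next === STOP_TOKEN) break;
    context.push(next); // each new token becomes context for the next step
    generated.push(next);
  }
  return decodeTokens(generated);
}

console.log(runPipeline("hi")); // → HI
```

Swap the fake model for a real one and the tokenizer for a real one, and the shape of `runPipeline` stays roughly the same.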
The diagram is straightforward, but the engineering behind it is not. And honestly, that is okay. HAL engineers are still building fighter aircraft even if they are not the ones building the engine from scratch.