"An LLM reads text as tokens, converts them to vectors, runs them through stacked transformer layers,
and predicts the next token — one at a time."
Tokenization → Embedding → Self-attention → Feed-forward → Output
Break it (tokenize) → Understand it (embed + attend) → Answer it (FFN + output)
Tokenization: Breaking text into pieces the model can read
Tokenization is the process of splitting raw text into smaller units called tokens. A token can be a word, part of a word, or a punctuation mark. Each token is then mapped to a unique integer ID from a fixed vocabulary.
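As a rough sketch, the snippet below mimics this with a tiny made-up vocabulary and a greedy longest-match rule. Real tokenizers (BPE, WordPiece, etc.) learn vocabularies of tens of thousands of pieces from data, so the pieces and IDs here are purely illustrative.

```python
# Toy greedy longest-match tokenizer over an invented vocabulary.
# Real models use learned subword vocabularies with ~32k-100k entries;
# everything below is made up purely to show text -> token IDs.
VOCAB = {"un": 0, "break": 1, "able": 2, "token": 3, "ization": 4, " ": 5, "is": 6, "fun": 7}

def tokenize(text: str) -> list[int]:
    ids = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary piece that matches at position i.
        match = None
        for piece, piece_id in VOCAB.items():
            if text.startswith(piece, i) and (match is None or len(piece) > len(match[0])):
                match = (piece, piece_id)
        if match is None:
            raise ValueError(f"no token for {text[i]!r}")  # real tokenizers fall back to raw bytes
        ids.append(match[1])
        i += len(match[0])
    return ids

print(tokenize("tokenization is fun"))  # [3, 4, 5, 6, 5, 7]
print(tokenize("unbreakable"))          # [0, 1, 2]
```

Notice that "unbreakable" becomes three tokens: words the model has never seen whole can still be represented as familiar sub-pieces.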
Embedding: Converting token IDs into vectors of meaning
An embedding is a dense vector (list of ~768–4096 numbers) that represents a token's meaning. Tokens with similar meanings have similar vectors. A positional encoding is added so the model knows word order.
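A minimal NumPy sketch of the idea, assuming a toy vocabulary of 8 tokens and a width of 16 (real models use learned embedding tables and widths in the 768–4096 range). The sinusoidal positional encoding shown is one common choice; many models learn their position vectors instead.

```python
import numpy as np

# Toy sizes; production models use ~32k-100k tokens and widths of ~768-4096.
vocab_size, d_model, seq_len = 8, 16, 3

rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(vocab_size, d_model))  # learned in a real model

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal position signal (one common choice; some models learn it instead)."""
    pos = np.arange(seq_len)[:, None]
    dim = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (dim // 2)) / d_model)
    return np.where(dim % 2 == 0, np.sin(angle), np.cos(angle))

token_ids = [3, 4, 7]                          # e.g. output of the tokenizer sketch above
x = token_embeddings[token_ids]                # look up one vector per token
x = x + positional_encoding(seq_len, d_model)  # inject word-order information
print(x.shape)                                 # (3, 16): one d_model-wide vector per token
```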
Self-attention: Every token looks at every other token
Self-attention is the mechanism that lets each token gather context from all other tokens in the sequence. It computes a weighted sum of all token vectors — the weights (attention scores) determine how much influence each other token has.
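To make the "weighted sum" concrete, here is a minimal single-head version in NumPy. The weight matrices are random stand-ins for learned parameters; multi-head attention and the causal mask used by decoder-only LLMs are omitted for brevity.

```python
import numpy as np

def self_attention(x: np.ndarray, Wq: np.ndarray, Wk: np.ndarray, Wv: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention (a minimal sketch)."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv                # each token gets query/key/value vectors
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # how relevant is every token to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row of attention scores sums to 1
    return weights @ V                              # weighted sum of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 3, 16
x = rng.normal(size=(seq_len, d_model))             # token vectors from the embedding step
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)                                    # (3, 16): each token vector now carries context
```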
Feed-forward Network: Applying stored knowledge to each token
After attention, each token vector passes independently through a feed-forward network — two linear layers with a non-linear activation in between. This is where the model's factual knowledge is stored (in the weights). Attention gathers context; FFN applies what the model knows.
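A sketch of the position-wise FFN, again with random weights standing in for learned ones. The 4× expansion (d_ff = 4 × d_model) and GELU activation shown here are typical choices, but they vary by model.

```python
import numpy as np

def gelu(x: np.ndarray) -> np.ndarray:
    # Tanh approximation of GELU, a common transformer activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def feed_forward(x, W1, b1, W2, b2):
    """Position-wise FFN: expand, apply a non-linearity, project back down."""
    return gelu(x @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 3, 16, 64             # d_ff is typically ~4x d_model
x = rng.normal(size=(seq_len, d_model))        # token vectors after attention
W1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)
print(feed_forward(x, W1, b1, W2, b2).shape)   # (3, 16): same shape, each token updated independently
```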
Output prediction: Picking the next token from probabilities
After all transformer layers, the final vector for the last token is projected to a logit score for every token in the vocabulary. A softmax converts these scores to probabilities. The model samples from them or picks the highest-probability token — this becomes the next word. The new token is appended to the input and the whole process repeats from the tokenized sequence.
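In code, this last step looks roughly like the sketch below. W_out stands in for the model's learned output ("unembedding") matrix, and both greedy picking and sampling are shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size = 16, 8
h_last = rng.normal(size=d_model)                  # final hidden vector of the last token
W_out = rng.normal(size=(d_model, vocab_size))     # output projection ("unembedding") matrix

logits = h_last @ W_out                            # one raw score per vocabulary entry
probs = np.exp(logits - logits.max())
probs /= probs.sum()                               # softmax: probabilities summing to 1

greedy_id = int(np.argmax(probs))                  # greedy decoding: pick the most likely token
sampled_id = int(rng.choice(vocab_size, p=probs))  # or sample, for more varied text
print(greedy_id, sampled_id)
```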
The stages build on each other: tokenization feeds embeddings, embeddings feed the transformer layers (attention + FFN repeated 32–96×), and the final layer feeds the output predictor which loops back to the start for the next token.
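Putting it all together, here is a deliberately tiny end-to-end loop with random weights and a single simplified block (no layer norm, no causal mask, no training) — just to show how the output of one step feeds the next. An untrained toy like this emits gibberish; the point is the shape of the loop.

```python
import numpy as np

# End-to-end toy decoder: one simplified transformer block, random weights,
# a made-up 8-token vocabulary. Every size and weight here is illustrative.
rng = np.random.default_rng(0)
vocab_size, d_model, d_ff = 8, 16, 64

E = rng.normal(size=(vocab_size, d_model))                        # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))
W1, W2 = rng.normal(size=(d_model, d_ff)) * 0.1, rng.normal(size=(d_ff, d_model)) * 0.1
W_out = rng.normal(size=(d_model, vocab_size)) * 0.1              # output projection

def softmax(z):
    z = np.exp(z - z.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def forward(token_ids):
    x = E[token_ids]                                  # embed (positional encoding omitted)
    attn = softmax((x @ Wq) @ (x @ Wk).T / np.sqrt(d_model)) @ (x @ Wv)
    x = x + attn                                      # self-attention + residual connection
    x = x + np.maximum(0, x @ W1) @ W2                # feed-forward (ReLU) + residual connection
    return x[-1] @ W_out                              # logits for the *last* token only

token_ids = [3, 4, 5]                                 # pretend these came from the tokenizer
for _ in range(5):
    next_id = int(np.argmax(softmax(forward(token_ids))))
    token_ids.append(next_id)                         # loop back to the start with the new token
print(token_ids)
```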
Please comment below to give feedback or ask questions.