Software Engineering KB

Decoder-Only Models

Feb 10, 2026 · 1 min read

  • deep-learning
  • transformers
  • decoder-only
  • gpt
  • llm

← Back to Transformers

Transformer models that use only the decoder stack with causal (left-to-right) attention: each token can attend only to itself and earlier tokens. Best suited to generation tasks, and the dominant architecture for modern LLMs: GPT-4, Claude, LLaMA, Gemini, Mistral.
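The causal masking described above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular model's implementation: raw attention scores for future positions are set to negative infinity before the softmax, so each row of the resulting attention matrix assigns zero weight to later tokens.

```python
import numpy as np

def causal_attention(q, k):
    """Scaled dot-product attention weights with a causal mask:
    position i may attend only to positions j <= i."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                           # (T, T) raw scores
    future = np.triu(np.ones_like(scores, dtype=bool), k=1) # strictly upper triangle
    scores[future] = -np.inf                                # block attention to the future
    # row-wise softmax; exp(-inf) becomes exactly 0
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)

# toy example: 4 tokens with 8-dim embeddings (self-attention, q = k)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
attn = causal_attention(x, x)
# no token attends to a later one: the upper triangle is zero
assert np.allclose(np.triu(attn, k=1), 0.0)
```

Note that token 0 can only attend to itself, so its attention row is always `[1, 0, 0, 0]` regardless of the inputs; this is why decoder-only models generate text one token at a time, left to right.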

Related

  • Encoder-Only Models (bidirectional understanding)
  • Autoregressive Models (decoder-only models are autoregressive)
  • Scaling Laws (decoder-only models scale predictably)

