Encoder-Only Models

Feb 10, 2026 · 1 min read

  • deep-learning
  • transformers
  • encoder-only
  • bert


← Back to Transformers

Transformer models that use only the encoder stack, with bidirectional self-attention: each token attends to all tokens in the sequence, drawing on both left and right context. Because every position sees the full input, these models are best suited to understanding tasks: text classification, named entity recognition (NER), and semantic similarity. Key models: BERT, RoBERTa, DeBERTa.
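
A minimal sketch, assuming PyTorch and the Hugging Face `transformers` library are installed (the checkpoint name and example sentence are illustrative): it pulls the bidirectional contextual embeddings out of BERT that the understanding tasks above build on.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load a stock encoder-only checkpoint (assumption: bert-base-uncased).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Encoder-only models read the whole sentence at once.",
                   return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token, each conditioned on the full sequence
# (left AND right context); shape (1, seq_len, 768) for BERT-base.
token_embeddings = outputs.last_hidden_state

# The [CLS] vector (position 0) is the usual sequence-level handle for
# classification and semantic-similarity heads.
cls_embedding = token_embeddings[:, 0]
print(cls_embedding.shape)  # torch.Size([1, 768])
```

For token-level tasks like NER, a per-token classification head would sit on top of `token_embeddings` instead (e.g. via `AutoModelForTokenClassification`).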

Related

  • Decoder-Only Models (autoregressive generation)
  • Encoder-Decoder Architecture (original Transformer)
  • Masked Language Modeling (BERT’s pretraining objective; see the sketch after this list)
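
To see the masked-language-modeling objective in action, a short sketch under the same assumptions (Hugging Face `transformers` installed; the model choice is illustrative): BERT predicts a masked token using context from both sides.

```python
from transformers import pipeline

# fill-mask runs BERT's pretraining task: predict the [MASK] token
# from both the left and right context.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for pred in unmasker("The encoder attends to [MASK] tokens in the sequence."):
    print(f"{pred['token_str']!r}  score={pred['score']:.3f}")
```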


