Embeddings

Back to Text Processing Pipeline

Dense vector representations of tokens that capture semantic meaning: tokens with similar meanings map to nearby points in the vector space. The foundation of how neural models understand text.
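
A minimal sketch of the core idea, using toy NumPy vectors (the words and values here are made up for illustration, not taken from a real model): closeness in meaning is measured as cosine similarity between embedding vectors.

    import numpy as np

    # Toy 4-dimensional embeddings (illustrative values only)
    embeddings = {
        "king":  np.array([0.8, 0.3, 0.1, 0.9]),
        "queen": np.array([0.7, 0.4, 0.2, 0.9]),
        "apple": np.array([0.1, 0.9, 0.8, 0.0]),
    }

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        """Cosine of the angle between two vectors: 1.0 = same direction."""
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
    print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low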

Types

  • Word2Vec — static word embeddings trained with skip-gram or CBOW (see the training sketch after this list)
  • GloVe — global vectors trained on corpus-wide word co-occurrence statistics
  • FastText — subword (character n-gram) embeddings, so it can handle out-of-vocabulary (OOV) words
  • Contextual Embeddings — the same token gets a different vector in each context (BERT, GPT hidden states; see the sketch below)
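
A hedged sketch of training static Word2Vec embeddings with gensim (assumes gensim ≥ 4 is installed; the three-sentence corpus is a stand-in for real data, which would be millions of sentences):

    from gensim.models import Word2Vec

    # Tiny stand-in corpus of pre-tokenized sentences
    sentences = [
        ["the", "king", "rules", "the", "kingdom"],
        ["the", "queen", "rules", "the", "kingdom"],
        ["i", "ate", "an", "apple"],
    ]

    # sg=1 selects skip-gram; sg=0 would select CBOW
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

    vec = model.wv["king"]                # the single static vector for "king"
    print(model.wv.most_similar("king"))  # nearest neighbours in the space

For contrast, a sketch of contextual embeddings via Hugging Face transformers (bert-base-uncased is one common model choice, not the only option): unlike the static case above, the same word receives a different vector in each sentence.

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def embed(sentence: str) -> torch.Tensor:
        # Last hidden state: one vector per token, shaped by its context
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            return model(**inputs).last_hidden_state[0]

    # "bank" gets a different vector in each of these sentences
    a = embed("I sat by the river bank")
    b = embed("I deposited money at the bank")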

nlp embeddings word2vec