Text Features

Text features convert raw text into numeric representations that ML models can take as input.

Techniques

  • TF-IDF — term frequency-inverse document frequency, sparse representation
  • Bag of Words — word count vectors, ignores order
  • N-grams — sequences of n consecutive words
  • Word Embeddings — dense vector representations (Word2Vec, GloVe, FastText)
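
To make the first three techniques concrete, here is a minimal sketch in plain Python. The helper names (tokenize, ngrams, bag_of_words, tf_idf) and the two sample documents are assumptions made for illustration, not part of any library. The TF-IDF weighting shown (raw count times log(N / df)) is only one of several common variants; libraries typically add smoothing and normalization. Word embeddings are left out because they need a trained or pretrained model.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; real pipelines usually do more.
    return text.lower().split()


def ngrams(tokens, n):
    # Every run of n consecutive tokens, joined into a single string.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(docs, n=1):
    # One count vector per document over a shared, sorted vocabulary.
    # Word order inside each document is discarded (for n=1).
    counts = [Counter(ngrams(tokenize(d), n)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = [[c[term] for term in vocab] for c in counts]
    return vocab, vectors


def tf_idf(docs):
    # Assumed variant: tf = raw count, idf = log(N / df), no smoothing.
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = [[tf * w for tf, w in zip(row, idf)] for row in counts]
    return vocab, weights


# Hypothetical sample documents.
docs = ["the cat sat", "the dog sat down"]

vocab, bow = bag_of_words(docs)
print(vocab)  # ['cat', 'dog', 'down', 'sat', 'the']
print(bow)    # [[1, 0, 0, 1, 1], [0, 1, 1, 1, 1]]

print(ngrams(tokenize(docs[0]), 2))  # ['the cat', 'cat sat']

tfidf_vocab, tfidf = tf_idf(docs)
```

Under this variant, terms that occur in every document ("the" and "sat" here) get a weight of zero, while terms unique to one document keep a positive weight. Most entries in each vector are zero once the vocabulary grows, which is why these representations are usually stored as sparse matrices.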

ml feature-engineering text nlp