Foundation Models

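Foundation models are large pretrained models that serve as the base for a wide range of downstream tasks. Trained on massive datasets with enormous compute, they develop general capabilities that can be specialized through prompting (steering the model with instructions or examples in its input) or fine-tuning (further training on task-specific data).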
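
As a rough illustration of specialization through prompting, the Python sketch below assembles a few-shot prompt: the task is conveyed entirely in the input text, and the model's weights are left unchanged. The helper name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the sentiment examples are assumptions made for this illustration, not part of any particular model's API; the resulting string would be passed to whichever model is in use.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt (hypothetical helper, illustrative layout)."""
    lines = [instruction, ""]
    # Each worked example shows the model the expected input/output pattern.
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final item is left open so the model fills in the label.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    [("I loved this film.", "positive"), ("The plot was dull.", "negative")],
    "The acting was superb.",
)
print(prompt)
```

Changing the instruction and examples retargets the same model to a different task. Fine-tuning, by contrast, updates the model's weights on task-specific examples rather than changing the input.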

Key Models

  • GPT-4 — OpenAI, multimodal (text and image input), strong reasoning performance
  • Claude — Anthropic, Constitutional AI, long context
  • LLaMA — Meta, open-weight, widely used for research and fine-tuning
  • Gemini — Google DeepMind, multimodal, long context
  • Mistral — Mistral AI, efficient open-weight models; the Mixtral variants use a sparse mixture-of-experts (MoE) architecture

nlp llm foundation-models