A List of 1 Billion+ Parameter LLMs

Apr 10, 2023. by Matt Rickard

There are already over 50 different 1B+ parameter LLMs accessible via open-source checkpoints or proprietary APIs. That's not counting private models or models described in academic papers with no available API or model weights. There are even more if you count fine-tuned models like Alpaca or InstructGPT. Here's a list of the ones I know about (this is an evolving document).

  1. GPT-J (6B) (EleutherAI)
  2. GPT-Neo / GPT-NeoX (1.3B, 2.7B, 20B) (EleutherAI)
  3. Pythia (1B, 1.4B, 2.8B, 6.9B, 12B) (EleutherAI)
  4. Polyglot (1.3B, 3.8B, 5.8B)
  5. J1 (7.5B, 17B, 178B) (AI21)
  6. LLaMa (7B, 13B, 33B, 65B) (Meta)
  7. OPT (1.3B, 2.7B, 13B, 30B, 66B, 175B) (Meta)
  8. Fairseq (1.3B, 2.7B, 6.7B, 13B) (Meta)
  9. Cerebras-GPT (1.3B, 2.7B, 6.7B, 13B) (Cerebras)
  10. GLM-130B (Tsinghua)
  11. YaLM (100B) (Yandex)
  12. UL2 20B (Google)
  13. PanGu-α (200B) (Huawei)
  14. Cohere (Medium, XLarge)
  15. Claude (instant-v1.0, v1.2) (Anthropic)
  16. CodeGen (2B, 6B, 16B) (Salesforce)
  17. NeMo (1.3B, 5B, 20B) (NVIDIA)
  18. RWKV (14B)
  19. BLOOM (1.1B, 3B, 7.1B, 176B) (BigScience)
  20. GPT-4 (OpenAI)
  21. GPT-3.5 (OpenAI)
  22. GPT-3 (ada, babbage, curie, davinci) (OpenAI)
  23. Codex (cushman, davinci) (OpenAI)
  24. T5 (11B) (Google)
  25. CPM-Bee (10B)
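
Many of the open checkpoints above (GPT-J, GPT-Neo, Pythia, OPT, BLOOM, and others) are published on the Hugging Face Hub and load the same way. Here is a minimal sketch using GPT-J; the checkpoint name "EleutherAI/gpt-j-6B", the half-precision dtype, and the sampling settings are illustrative assumptions, not a recommendation from any of these projects.

```python
# Minimal sketch: load an open 1B+ checkpoint with Hugging Face transformers.
# Assumes `transformers`, `accelerate`, and `torch` are installed and that you
# have enough GPU memory for a 6B model in fp16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"  # any causal-LM checkpoint on the Hub works similarly

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # fp16 to roughly halve memory for the 6B weights
    device_map="auto",          # let accelerate place layers on available devices
)

inputs = tokenizer("Large language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```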

Fine-tuned models

  1. Alpaca (7B) (Stanford)
  2. Convo (6B)
  3. J1-Grande-Instruct (17B) (AI21)
  4. InstructGPT (175B) (OpenAI)
  5. BLOOMZ (176B) (BigScience)
  6. Flan-UL2 (20B) (Google)
  7. Flan-T5 (11B) (Google)
  8. T0 (11B) (BigScience)
  9. Galactica (120B) (Meta)
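
Several of the instruction-tuned checkpoints above are also openly available. A minimal sketch prompting Flan-T5 follows; the checkpoint name "google/flan-t5-xxl" (the 11B variant), the prompt, and the generation settings are illustrative assumptions. Note that Flan-T5 is an encoder-decoder model, so it loads through the seq2seq class rather than the causal-LM class used above.

```python
# Minimal sketch: prompt an instruction-tuned seq2seq checkpoint (Flan-T5).
# Assumes `transformers` and `accelerate` are installed.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-xxl"  # 11B Flan-T5; smaller variants (base, large, xl) also exist

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, device_map="auto")

prompt = "Summarize in one sentence: Large language models are trained on web-scale text."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```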
