A List of 1 Billion+ Parameter LLMs
There are already over 50 different LLMs with 1B+ parameters accessible via open-source checkpoints or proprietary APIs. That count excludes private models and models described in academic papers but without a public API or model weights. There are even more if you count fine-tuned models like Alpaca or InstructGPT. Here is a list of the ones I know about (this is an evolving document; a short example of loading one of the open checkpoints follows the list).
- GPT-J (6B) (EleutherAI)
- GPT-Neo (1.3B, 2.7B) (EleutherAI)
- GPT-NeoX (20B) (EleutherAI)
- Pythia (1B, 1.4B, 2.8B, 6.9B, 12B) (EleutherAI)
- Polyglot (1.3B, 3.8B, 5.8B) (EleutherAI)
- J1 (7.5B, 17B, 178B) (AI21)
- LLaMA (7B, 13B, 33B, 65B) (Meta)
- OPT (1.3B, 2.7B, 6.7B, 13B, 30B, 66B, 175B) (Meta)
- Fairseq (1.3B, 2.7B, 6.7B, 13B) (Meta)
- Cerebras-GPT (1.3B, 2.7B, 6.7B, 13B) (Cerebras)
- GLM-130B (Tsinghua)
- YaLM (100B) (Yandex)
- UL2 20B (Google)
- PanGu-α (200B) (Huawei)
- Cohere (Medium, XLarge)
- Claude (instant-v1.0, v1.2) (Anthropic)
- CodeGen (2B, 6B, 16B) (Salesforce)
- NeMo (1.3B, 5B, 20B) (NVIDIA)
- RWKV (1.5B, 3B, 7B, 14B)
- BLOOM (1.1B, 1.7B, 3B, 7.1B, 176B) (BigScience)
- GPT-4 (OpenAI)
- GPT-3.5 (OpenAI)
- GPT-3 (ada, babbage, curie, davinci) (OpenAI)
- Codex (cushman, davinci) (OpenAI)
- T5 (11B) (Google)
- CPM-Bee (10B) (OpenBMB)
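Most of the open checkpoints above are published on the Hugging Face Hub, so "accessible via open-source checkpoints" usually amounts to a few lines of code. Below is a minimal sketch, assuming the transformers and torch packages are installed; Pythia-1B is used only as an example, and any other hosted checkpoint ID from the list can be substituted.

```python
# Minimal sketch: load an open 1B+ checkpoint and sample from it.
# Assumes `pip install transformers torch`; "EleutherAI/pythia-1b" is one
# example ID -- swap in any other Hugging Face-hosted model from the list.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The largest open-source language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The proprietary entries (GPT-4, Claude, Cohere, J1) are reachable only through their vendors' APIs, not as downloadable weights.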
Fine-tuned models
- Alpaca (7B, fine-tuned from LLaMA) (Stanford)
- InstructGPT (OpenAI)