# Supported Models
Aphrodite supports a wide variety of generative Transformer models from Hugging Face Transformers. The following is a list of the model architectures that are currently supported.
## Decoder-only Language Models
Architecture | Example HF Model |
---|---|
AquilaForCausalLM | BAAI/AquilaChat-7B |
ArcticForCausalLM | Snowflake/snowflake-arctic-instruct |
BaiChuanForCausalLM | baichuan-inc/Baichuan2-13B-Chat |
BloomForCausalLM | bigscience/bloomz |
ChatGLMModel | THUDM/chatglm3-6b |
CohereForCausalLM | CohereForAI/c4ai-command-r-v01 |
DbrxForCausalLM | databricks/dbrx-instruct |
DeciLMForCausalLM | Deci/DeciLM-7B |
FalconForCausalLM | tiiuae/falcon-7b |
GemmaForCausalLM | google/gemma-7b |
Gemma2ForCausalLM | google/gemma-2-9b |
GPT2LMHeadModel | gpt2 |
GPTBigCodeForCausalLM | bigcode/starcoder |
GPTJForCausalLM | PygmalionAI/pygmalion-6b |
GPTNeoXForCausalLM | EleutherAI/pythia-12b |
InternLMForCausalLM | internlm/internlm-7b |
InternLM2ForCausalLM | internlm/internlm2-7b |
JAISLMHeadModel | core42/jais-13b |
JambaForCausalLM | ai21labs/Jamba-v0.1 |
LlamaForCausalLM | meta-llama/Meta-Llama-3.1-8B |
MiniCPMForCausalLM | openbmb/MiniCPM-2B-dpo-bf16 |
MistralForCausalLM | mistralai/Mistral-7B-v0.1 |
MixtralForCausalLM | mistralai/Mixtral-8x7B-v0.1 |
MPTForCausalLM | mosaicml/mpt-7b |
NemotronForCausalLM | nvidia/Minitron-8B-Base |
OLMoForCausalLM | allenai/OLMo-7B-hf |
OPTForCausalLM | facebook/opt-66b |
OrionForCausalLM | OrionStarAI/Orion-14B-Chat |
PhiForCausalLM | microsoft/phi-2 |
Phi3ForCausalLM | microsoft/Phi-3-medium-128k-instruct |
Phi3SmallForCausalLM | microsoft/Phi-3-small-128k-instruct |
PersimmonForCausalLM | adept/persimmon-8b-chat |
QwenLMHeadModel | Qwen/Qwen-7B |
Qwen2ForCausalLM | Qwen/Qwen2-72B |
Qwen2MoeForCausalLM | Qwen/Qwen1.5-MoE-A2.7B |
StableLmForCausalLM | stabilityai/stablelm-3b-4e1t |
Starcoder2ForCausalLM | bigcode/starcoder2-3b |
XverseForCausalLM | xverse/XVERSE-65B-Chat |
> **INFO:** On ROCm platforms, Mistral and Mixtral are capped to a maximum context length of 4096 tokens due to sliding-window issues.
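For reference, here is a minimal offline-inference sketch for a decoder-only model from the table above. It assumes the vLLM-style `LLM` and `SamplingParams` entry points exposed by the `aphrodite` package; the explicit `max_model_len` shows how you would apply the context cap from the note above.

```python
# Minimal sketch, assuming Aphrodite's vLLM-style Python API.
from aphrodite import LLM, SamplingParams

# Load a model listed in the table above. On ROCm, cap Mistral's context
# explicitly at 4096 tokens, per the note above.
llm = LLM(model="mistralai/Mistral-7B-v0.1", max_model_len=4096)

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```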
## Encoder-Decoder Language Models
Architecture | Example HF Model |
---|---|
BartForConditionalGeneration | facebook/bart-large-cnn |
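Encoder-decoder models can be driven through the same offline API. The sketch below assumes that a plain text prompt is routed to the encoder (with the decoder starting from its default start token), as in vLLM-style engines; treat it as illustrative rather than canonical.

```python
# Sketch: abstractive summarization with BART. Assumes the plain string
# prompt is used as the encoder input.
from aphrodite import LLM, SamplingParams

llm = LLM(model="facebook/bart-large-cnn")
article = (
    "Aphrodite is an open-source inference engine that serves Hugging Face "
    "Transformer models with high throughput on a variety of hardware."
)
outputs = llm.generate([article], SamplingParams(temperature=0.0, max_tokens=64))
print(outputs[0].outputs[0].text)
```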
## Multimodal Language Models
Architecture | Supported Modalities | Example HF Model |
---|---|---|
Blip2ForConditionalGeneration | Image | Salesforce/blip2-opt-6.7b |
ChameleonForConditionalGeneration | Image | facebook/chameleon-7b |
FuyuForCausalLM | Image | adept/fuyu-8b |
InternVLChatModel | Image | OpenGVLab/InternVL2-8B |
LlavaForConditionalGeneration | Image | llava-hf/llava-v1.5-7b-hf |
LlavaNextForConditionalGeneration | Image | llava-hf/llava-v1.6-mistral-7b-hf |
PaliGemmaForConditionalGeneration | Image | google/paligemma-3b-pt-224 |
Phi3VForCausalLM | Image | microsoft/Phi-3.5-vision-instruct |
MiniCPMV | Image | openbmb/MiniCPM-V-2_6 |
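Multimodal models take the image alongside the text prompt. The sketch below assumes a vLLM-style `multi_modal_data` field on the input to `generate()` and uses the LLaVA-1.5 prompt template; the image path is hypothetical, and each model's card documents its own exact template.

```python
# Sketch: single-image inference with LLaVA-1.5, assuming a vLLM-style
# multi_modal_data field on the generate() input.
from PIL import Image
from aphrodite import LLM, SamplingParams

llm = LLM(model="llava-hf/llava-v1.5-7b-hf")
image = Image.open("example.jpg")  # hypothetical local image path

# LLaVA-1.5 chat template; the <image> token marks where the image goes.
prompt = "USER: <image>\nWhat is shown in this image?\nASSISTANT:"

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```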
If your model uses any of the architectures above, you can run it seamlessly with Aphrodite.
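One practical caveat: several of the architectures above (for example ChatGLMModel, BaiChuanForCausalLM, and InternLM2ForCausalLM) ship custom modeling code on the Hugging Face Hub, so loading them typically requires opting into remote code execution, as sketched below.

```python
# Sketch: models with custom code on the Hub need trust_remote_code=True.
from aphrodite import LLM

llm = LLM(model="THUDM/chatglm3-6b", trust_remote_code=True)
```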