Supported Models

Aphrodite supports a wide range of generative Transformer models from Hugging Face Transformers. The model architectures currently supported are listed below.

Decoder-only Language Models

| Architecture | Example HF Model |
| --- | --- |
| AquilaForCausalLM | BAAI/AquilaChat-7B |
| ArcticForCausalLM | Snowflake/snowflake-arctic-instruct |
| BaiChuanForCausalLM | baichuan-inc/Baichuan2-13B-Chat |
| BloomForCausalLM | bigscience/bloomz |
| ChatGLMModel | THUDM/chatglm3-6b |
| CohereForCausalLM | CohereForAI/c4ai-command-r-v01 |
| DbrxForCausalLM | databricks/dbrx-instruct |
| DeciLMForCausalLM | DeciLM/DeciLM-7B |
| FalconForCausalLM | tiiuae/falcon-7b |
| GemmaForCausalLM | google/gemma-7b |
| Gemma2ForCausalLM | google/gemma-2-9b |
| GPT2LMHeadModel | gpt2 |
| GPTBigCodeForCausalLM | bigcode/starcoder |
| GPTJForCausalLM | pygmalionai/pygmalion-6b |
| GPTNeoXForCausalLM | EleutherAI/pythia-12b |
| InternLMForCausalLM | internlm/internlm-7b |
| InternLM2ForCausalLM | internlm/internlm2-7b |
| JAISLMHeadModel | core42/jais-13b |
| JambaForCausalLM | ai21labs/Jamba-v0.1 |
| LlamaForCausalLM | meta-llama/Meta-Llama-3.1-8B |
| MiniCPMForCausalLM | openbmb/MiniCPM-2B-dpo-bf16 |
| MistralForCausalLM | mistralai/Mistral-7B-v0.1 |
| MixtralForCausalLM | mistralai/Mixtral-8x7B-v0.1 |
| MPTForCausalLM | mosaicml/mpt-7b |
| NemotronForCausalLM | nvidia/Minitron-8B-Base |
| OLMoForCausalLM | allenai/OLMo-7B-hf |
| OPTForCausalLM | facebook/opt-66b |
| OrionForCausalLM | OrionStarAI/Orion-14B-Chat |
| PhiForCausalLM | microsoft/phi-2 |
| Phi3ForCausalLM | microsoft/Phi-3-medium-128k-instruct |
| Phi3SmallForCausalLM | microsoft/Phi-3-small-128k-instruct |
| PersimmonForCausalLM | adept/persimmon-8b-chat |
| QwenLMHeadModel | Qwen/Qwen-7B |
| Qwen2ForCausalLM | Qwen/Qwen2-72B |
| Qwen2MoeForCausalLM | Qwen/Qwen1.5-MoE-A2.7B |
| StableLmForCausalLM | stabilityai/stablelm-3b-4e1t |
| Starcoder2ForCausalLM | bigcode/starcoder2-3b |
| XverseForCausalLM | xverse/XVERSE-65B-Chat |

INFO

On ROCm platforms, the maximum context length for Mistral and Mixtral is capped at 4096 tokens due to sliding window issues.

Encoder-Decoder Language Models

| Architecture | Example Model |
| --- | --- |
| BartForConditionalGeneration | facebook/bart-large-cnn |

Multimodal Language Models

| Architecture | Supported Modalities | Example Model |
| --- | --- | --- |
| Blip2ForConditionalGeneration | Image | Salesforce/blip2-opt-6.7b |
| ChameleonForConditionalGeneration | Image | facebook/chameleon-7b |
| FuyuForCausalLM | Image | adept/fuyu-8b |
| InternVLChatModel | Image | OpenGVLab/InternVL2-8B |
| LlavaForConditionalGeneration | Image | llava-hf/llava-v1.5-7b-hf |
| LlavaNextForConditionalGeneration | Image | llava-hf/llava-v1.6-mistral-7b-hf |
| PaliGemmaForConditionalGeneration | Image | google/paligemma-3b-pt-224 |
| Phi3VForCausalLM | Image | microsoft/Phi-3.5-vision-instruct |
| MiniCPMV | Image | openbmb/MiniCPM-V-2_6 |

If your model uses any of the architectures above, you can seamlessly run your model with Aphrodite.
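
One way to check which architecture a model uses is to read the `architectures` field of its Hugging Face `config.json` and compare it against the names in the tables above. The sketch below is only an illustration of that lookup and is not how Aphrodite itself validates models. The `SUPPORTED_ARCHITECTURES` set is a hand-copied subset of the tables, and the helper name `supported_architectures` is hypothetical.

```python
import json

# Hand-copied subset of the architecture names listed in the tables above.
# This is an assumption for illustration; extend it with the names you need.
SUPPORTED_ARCHITECTURES = {
    "LlamaForCausalLM",
    "MistralForCausalLM",
    "MixtralForCausalLM",
    "Qwen2ForCausalLM",
    "BartForConditionalGeneration",
    "LlavaForConditionalGeneration",
}


def supported_architectures(config_text: str) -> list[str]:
    """Return the declared architectures that appear in SUPPORTED_ARCHITECTURES.

    `config_text` is the raw JSON text of a model's config.json.
    """
    config = json.loads(config_text)
    # The field may be missing or null in some configs; treat both as empty.
    declared = config.get("architectures") or []
    return [name for name in declared if name in SUPPORTED_ARCHITECTURES]


# A trimmed-down config.json, inlined so the example runs without downloads.
example_config = '{"architectures": ["LlamaForCausalLM"], "model_type": "llama"}'
print(supported_architectures(example_config))
```

An empty result means none of the declared architectures is in the copied subset. It does not by itself prove the model is unsupported, since the subset is incomplete.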