Installation with Neuron
Aphrodite supports inference on AWS Trainium and Inferentia chips. Paged Attention is not currently supported in the Neuron SDK, but naive continuous batching is supported in transformers-neuronx. The data types currently supported in the Neuron SDK are FP16 and BF16.
Requirements
- Linux
- Python 3.8 - 3.11
- Accelerator: NeuronCore_v2 (in trn1/inf2 instances)
- PyTorch 2.0.1/2.1.1
- AWS Neuron SDK 2.16/2.17
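A host can be checked against part of this list before any installation starts. The following is a minimal sketch and not part of Aphrodite: the function names `python_version_supported` and `check_host` are hypothetical, and only the OS and Python version requirements are covered (the accelerator, PyTorch, and Neuron SDK versions are not detected).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Aphrodite.

# Return 0 if the given version ("3.10" or "3.10.12") is within 3.8 - 3.11.
python_version_supported() {
    local major="${1%%.*}"      # text before the first dot
    local rest="${1#*.}"        # text after the first dot
    local minor="${rest%%.*}"   # second version component
    [ "$major" -eq 3 ] && [ "$minor" -ge 8 ] && [ "$minor" -le 11 ]
}

# Print a short report for the OS and Python requirements.
check_host() {
    if [ "$(uname -s)" = "Linux" ]; then
        echo "OS: Linux (ok)"
    else
        echo "OS: $(uname -s) (unsupported)"
    fi

    if command -v python3 >/dev/null 2>&1; then
        local ver
        ver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
        if python_version_supported "$ver"; then
            echo "Python: $ver (ok)"
        else
            echo "Python: $ver (outside 3.8 - 3.11)"
        fi
    else
        echo "Python: not found"
    fi
}

check_host
```

The version check is kept in its own function so it can be reused, for example to validate the interpreter inside a virtual environment rather than the system one.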
Building from Source
The following instructions are for Neuron SDK 2.16 and above.
Launch Trn1/Inf2 instances
The steps below launch a trn1/inf2 instance on which PyTorch Neuron (“torch-neuronx”) can be set up on Ubuntu 22.04 LTS.
- Follow the AWS instructions to launch an Amazon EC2 instance.
- Refer to these pages for more info about instance sizes and pricing: Trainium1, Inferentia2.
- Select the Ubuntu Server 22.04 LTS AMI.
- When launching, adjust your primary EBS volume size to a minimum of 512GB.
- After launching, follow the instructions in Connect to your instance.
Install drivers and tools
If you launched the instance from the Deep Learning AMI Neuron, this step is unnecessary. Otherwise, run the following commands:
# Configure Linux for Neuron repository updates
. /etc/os-release
sudo tee /etc/apt/sources.list.d/neuron.list > /dev/null <<EOF
deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main
EOF
wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | sudo apt-key add -

# Update OS packages
sudo apt-get update -y

# Install OS headers
sudo apt-get install linux-headers-$(uname -r) -y

# Install git
sudo apt-get install git -y

# Install Neuron Driver
sudo apt-get install aws-neuronx-dkms=2.* -y

# Install Neuron Runtime
sudo apt-get install aws-neuronx-collectives=2.* -y
sudo apt-get install aws-neuronx-runtime-lib=2.* -y

# Install Neuron Tools
sudo apt-get install aws-neuronx-tools=2.* -y

# Add PATH
export PATH=/opt/aws/neuron/bin:$PATH
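Once the packages are installed, a quick check that the Neuron tools resolve on PATH can catch a missed step early. This is a rough sketch: `check_tools` is a hypothetical helper, and it assumes that `neuron-ls` and `neuron-top` are the commands provided by aws-neuronx-tools under /opt/aws/neuron/bin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether each named command resolves on PATH.
# The return status is the number of commands that were not found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Same PATH entry as in the install commands above.
export PATH=/opt/aws/neuron/bin:$PATH

# Assumed tool names from the aws-neuronx-tools package.
check_tools neuron-ls neuron-top || echo "Some Neuron tools are not on PATH yet"
```

If both tools are found, running `neuron-ls` on a trn1/inf2 instance should list the available Neuron devices.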
Install transformers-neuronx
The backend we use for inference is transformers-neuronx. Follow the instructions below to install it:
# Install Python venv
sudo apt-get install -y python3.10-venv g++

# Create Python venv
python3.10 -m venv aws_neuron_venv_pytorch

# Activate Python venv
source aws_neuron_venv_pytorch/bin/activate

# Install Jupyter notebook kernel
pip install ipykernel
python3.10 -m ipykernel install --user --name aws_neuron_venv_pytorch --display-name "Python (torch-neuronx)"
pip install jupyter notebook
pip install environment_kernels

# Set pip repository pointing to the Neuron repository
python -m pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

# Install wget, awscli
python -m pip install wget
python -m pip install awscli

# Update Neuron Compiler and Framework
python -m pip install --upgrade neuronx-cc==2.* --pre torch-neuronx==2.1.* torchvision transformers-neuronx
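After the upgrade, it may be worth confirming that the installed versions match the pins used in the install command. The sketch below is illustrative only: `matches_pin` and `installed_version` are hypothetical helpers, the comparison uses shell glob matching (which only approximates pip's `==2.1.*` rule), and it assumes the virtual environment is active so that `python` is the venv interpreter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if version $1 matches the wildcard pin $2.
# Shell glob matching is only an approximation of pip's "==X.*" semantics.
matches_pin() {
    case "$1" in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# Print the installed version of package $1, or nothing if it is absent.
installed_version() {
    { python -m pip show "$1" 2>/dev/null || true; } | sed -n 's/^Version: //p'
}

# Compare each pinned package against the pin from the install command.
for spec in "neuronx-cc 2.*" "torch-neuronx 2.1.*"; do
    pkg="${spec%% *}"   # package name (text before the space)
    pin="${spec#* }"    # wildcard pin (text after the space)
    ver="$(installed_version "$pkg")"
    if [ -z "$ver" ]; then
        echo "$pkg: not installed"
    elif matches_pin "$ver" "$pin"; then
        echo "$pkg: $ver (matches $pin)"
    else
        echo "$pkg: $ver (does not match $pin)"
    fi
done
```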
Install Aphrodite from Source
git clone https://github.com/PygmalionAI/aphrodite-engine.git
cd aphrodite-engine
pip install -U -r requirements-neuron.txt
APHRODITE_TARGET_DEVICE="neuron" pip install -e .
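A simple smoke test after the editable install is to try importing the package from the active virtual environment. This is a sketch under assumptions: `can_import` is a hypothetical helper, and the importable module name is assumed to be `aphrodite`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if module $1 imports under the chosen interpreter.
# PYTHON defaults to "python", i.e. the interpreter of the activated venv.
can_import() {
    "${PYTHON:-python}" -c "import $1" >/dev/null 2>&1
}

# Assumption: the package installed above is importable as "aphrodite".
if can_import aphrodite; then
    echo "aphrodite: import ok"
else
    echo "aphrodite: import failed (is the venv active and did the build finish?)"
fi
```

A successful import only shows that the package is installed; it does not confirm that a model can be compiled and run on the NeuronCores.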