Aphrodite Tool Parser Plugin System
Tool parsers in Aphrodite help convert model outputs into OpenAI-compatible function/tool calls. This guide goes through creating and using custom tool parsers.
Creating a Tool Parser Plugin
Basic Structure
Create a new Python file (e.g. my_tool_parser.py
) with the following structure:
from typing import Dict, Sequence, Union
from aphrodite.endpoints.openai.protocol import ( ChatCompletionRequest, DeltaMessage, ExtractedToolCallInformation, ToolCall, FunctionCall)from aphrodite.endpoints.openai.tool_parsers.abstract_tool_parser import ( ToolParser, ToolParserManager)
@ToolParserManager.register_module(["my_parser"]) # register the unique tool nameclass MyToolParser(ToolParser): def __init__(self, tokenizer): super().__init__(tokenizer) # add any custom initialization here
def adjust_request(self, request: ChatCompletionRequest) -> ChatCompletionRequest: """Optional: Modify the request before processing""" return request
def extract_tool_calls( self, model_output: str, request: ChatCompletionRequest, ) -> ExtractedToolCallInformation: """Handle non-streaming tool call extraction""" # implementation here
def extract_tool_calls_streaming( self, previous_text: str, current_text: str, delta_text: str, previous_token_ids: Sequence[int], current_token_ids: Sequence[int], delta_token_ids: Sequence[int], ) -> Union[DeltaMessage, None]: """Handle streaming tool call extraction""" # implementation here
Required Methods
extracted_tool_calls
This method handles complete (non-streaming) responses.
- Input: Complete model output string.
- Output:
ExtractedToolCallInformation
obj containing:tools_called
: bool indicating if tool calls were detectedtool_calls
: List of parsed tool callscontent
: Any non-tool-call content Example:
def extract_tool_calls( self, model_output: str, request: ChatCompletionRequest,) -> ExtractedToolCallInformation: # Example: Parse format "FUNCTION_NAME(arguments)" if "(" not in model_output: return ExtractedToolCallInformation( tools_called=False, tool_calls=[], content=model_output )
try: name, args = model_output.split("(", 1) args = args.rsplit(")", 1)[0]
tool_calls = [ ToolCall( function=FunctionCall( name=name.strip(), arguments=args.strip() ) ) ]
return ExtractedToolCallInformation( tools_called=True, tool_calls=tool_calls, content=None ) except Exception: return ExtractedToolCallInformation( tools_called=False, tool_calls=[], content=model_output )
extract_tool_calls_streaming
This method handles streaming responses token by token:
- called for each new token
- must track state between calls
- return
None
if not ready to send delta - return
DeltaMessage
when ready to send updates
Optional Methods
adjust_request
Modify the request before processing if needed.
def adjust_request( self, request: ChatCompletionRequest) -> ChatCompletionRequest: # Example: Disable special token skipping request.skip_special_tokens = False return request
Using the Tool Parser
Starting the Server
Launch a local Aphrodite instance:
aphrodite run <your_model> \ --tool-parser-plugin /path/to/my_tool_parser.py \ --tool-call-parser my_parser \ --enable-auto-tool-choice \ --chat-template /path/to/chat_template.jinja
Making API calls
Use the OpenAI API format:
import openai
response = openai.ChatCompletion.create( model="your_model", messages=[ {"role": "user", "content": "What's the weather in Paris?"} ], tools=[{ "type": "function", "function": { "name": "get_weather", "description": "Get the weather in a location", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } } }], tool_choice="auto")
Example Implementations
For a more complete example set, see tool_parsers.