Providers like OpenAI and Anthropic offer a batch mode that lets you upload a large set of prompts to be processed asynchronously at a discount (typically 50% off). However, these APIs are often cumbersome to manage:
- You have to prepare a batch file, upload it, and then poll periodically for responses (sketched below).
- Large datasets typically won't fit in a single batch due to batch size limits, so you have to split your dataset into multiple smaller batches, which adds complexity you need to manage yourself.
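For context, here is a minimal sketch of what that manual workflow looks like against the raw OpenAI Batch API. The prompts, file name, and polling interval are illustrative, and error handling and batch splitting are omitted:

```python
import json
import time

from openai import OpenAI

client = OpenAI()

# 1. Prepare a JSONL batch file: one chat-completions request per line.
prompts = ["Hello!", "What is 2 + 2?"]  # illustrative prompts
with open("batch_input.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-mini",
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        f.write(json.dumps(request) + "\n")

# 2. Upload the file and create the batch job.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll until the batch reaches a terminal state, then download the results.
while (batch := client.batches.retrieve(batch.id)).status not in (
    "completed", "failed", "expired", "cancelled"
):
    time.sleep(60)
results_jsonl = client.files.content(batch.output_file_id).text
```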
With Curator, you only need to toggle a single flag to save $$$, with none of the headache!
Using batch mode
Let's look at a simple example of reannotating instructions from the WildChat dataset with new responses from gpt-4o-mini.
First, we need to load the WildChat dataset using the Hugging Face datasets library:
```python
from datasets import load_dataset

dataset = load_dataset("allenai/WildChat", split="train")
dataset = dataset.select(range(3_000))  # Select a subset of 3,000 samples
```
We then create a new `LLM` subclass and apply it to the dataset. All you need to do to enable batching is set `batch=True` when initializing your `LLM` object, and you're done!
```python
from bespokelabs import curator


class WildChatReannotator(curator.LLM):
    """A reannotator for the WildChat dataset."""

    def prompt(self, input: dict) -> str:
        """Extract the first message from a conversation to use as the prompt."""
        return input["conversation"][0]["content"]

    def parse(self, input: dict, response: str) -> dict:
        """Parse the model response along with the input to the model into the desired output format."""
        instruction = input["conversation"][0]["content"]
        return {"instruction": instruction, "new_response": response}


# Initialize the reannotator with batch processing
reannotator = WildChatReannotator(
    model_name="gpt-4o-mini",
    batch=True,  # Enable batch processing
    backend_params={"batch_size": 1_000},  # Specify batch size
)
reannotated_dataset = reannotator(dataset).dataset
```
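Curator handles batch file creation, uploads, and polling behind the scenes, and the `batch_size` setting means the 3,000 samples are split into batches of 1,000 for you. The result is a regular Hugging Face `Dataset`, so you can inspect it directly. A quick illustrative sanity check, assuming the run completes:

```python
# Columns match the keys returned by parse(): "instruction" and "new_response".
print(reannotated_dataset.column_names)
print(reannotated_dataset[0]["new_response"])
```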
Supported Models
Check out the how-to guides for using batch mode with our supported providers: