Using Ollama with Curator
You can use Ollama as a backend for Curator to generate structured synthetic data. In this example, we will generate a list of countries and their capitals, but the approach can be adapted for any data generation task.
Prerequisites
- Python 3.10+ 
- Curator: Install via pip install bespokelabs-curator
- Ollama: Download via https://ollama.com/download 
Steps
1. Create a curator.LLM subclass
Create a class that inherits from curator.LLM. Implement two key methods:
- prompt(): Generates the prompt for the LLM.
- parse(): Processes the LLM's response into your desired format.
Here’s the implementation:
from bespokelabs import curator
from pydantic import BaseModel, Field


class Location(BaseModel):
    country: str = Field(description="The name of the country")
    capital: str = Field(description="The name of the capital city")


class LocationList(BaseModel):
    locations: list[Location] = Field(description="A list of locations")


class SimpleOllamaGenerator(curator.LLM):
    # Ask the model for output that conforms to the LocationList schema.
    response_format = LocationList

    def prompt(self, input: dict) -> str:
        return "Return five countries and their capitals."

    def parse(self, input: dict, response: LocationList) -> list[dict]:
        # response is a parsed LocationList; emit one dict per location.
        return [{"country": output.country, "capital": output.capital} for output in response.locations]

2. Configure the Ollama Backend
- Pull the llama3.1:8b model and start the Ollama server:
ollama pull llama3.1:8b
ollama serve
- Initialize your generator with the Ollama configuration:
llm = SimpleOllamaGenerator(
    model_name="ollama/llama3.1:8b",  # Ollama model identifier
    backend_params={"base_url": "http://localhost:11434"},  # Ollama instance
)

3. Generate Data
Generate the structured data and output the results as a pandas DataFrame:
locations = llm()
print(locations.dataset.to_pandas())

Example Output
Using the above example, the output might look like this:
country   capital
France    Paris
Japan     Tokyo
Germany   Berlin
India     New Delhi
Brazil    Brasília
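
The table has one row per country because parse() in step 1 returns a list with one dictionary per location. The sketch below reproduces that flattening without Curator or Ollama. The dataclasses are stand-ins for the pydantic models, and it assumes (as the example output suggests) that each returned dictionary becomes one row of the dataset.

```python
from dataclasses import dataclass


# Stand-ins for the pydantic models from step 1, so this runs without Curator.
@dataclass
class Location:
    country: str
    capital: str


@dataclass
class LocationList:
    locations: list


def parse(input: dict, response: LocationList) -> list:
    """Same logic as SimpleOllamaGenerator.parse: one dict per location."""
    return [
        {"country": loc.country, "capital": loc.capital}
        for loc in response.locations
    ]


# A hand-built response in place of real model output.
response = LocationList(
    locations=[Location("France", "Paris"), Location("Japan", "Tokyo")]
)
rows = parse({}, response)
for row in rows:
    print(row["country"], row["capital"])
```

Returning a single dictionary instead of a list would, under the same assumption, produce a single row per input.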
Ollama Configuration
Set base_url in backend_params to the URL of the Ollama server you want to connect to.
Example:
backend_params={"base_url": "http://localhost:11434"}
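
Before running the generator, it can help to confirm that the server behind base_url is reachable and has the model pulled. The following is a minimal sketch using only the standard library. It assumes Ollama's /api/tags endpoint, which lists locally available models. The helper names (tags_url, model_names, list_local_models) are hypothetical and are not part of Curator.

```python
import http.client
import json
import urllib.error
import urllib.request
from typing import Optional


def tags_url(base_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint from a base_url."""
    return base_url.rstrip("/") + "/api/tags"


def model_names(payload: dict) -> list:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def list_local_models(base_url: str, timeout: float = 2.0) -> Optional[list]:
    """Return the models the server reports, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed response.
        return None


if __name__ == "__main__":
    models = list_local_models("http://localhost:11434")
    if models is None:
        print("Ollama is not reachable; is `ollama serve` running?")
    elif "llama3.1:8b" not in models:
        print("Server is up, but llama3.1:8b is missing; run `ollama pull llama3.1:8b`.")
    else:
        print("Ready:", models)
```

If the check passes, the same base_url value can be passed to backend_params unchanged.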
