# Curate Reasoning data with Claude-3.7 Sonnet

You can use the **Claude 3.7 Sonnet** reasoning model in **Curator** to generate synthetic data. In this example, we answer questions with reasoning traces from Claude 3.7 Sonnet, but the approach can be adapted to any data generation task.

## **Prerequisites**

* **Python 3.10+**
* **Curator**: Install via `pip install bespokelabs-curator`
* **Anthropic**: An Anthropic API key

## **Steps**

#### **1. Set up environment variables**

```sh
export ANTHROPIC_API_KEY=<your_api_key>
```

#### **2. Create a `curator.LLM` subclass**

Create a class that inherits from `curator.LLM`. Implement two key methods:

* `prompt()`: Generates the prompt for the LLM.
* `parse()`: Processes the LLM's response into your desired format.

Here’s the implementation:

```python
"""Example of reasoning on simple questions using curator."""

from bespokelabs import curator

class Reasoner(curator.LLM):
    return_completions_object = True

    def prompt(self, input):
        """Use the question directly as the prompt."""
        return input["question"]

    def parse(self, input, response):
        """Parse the LLM response to extract the reasoning trace and the final answer."""
        content = response["content"]
        thinking = ""
        text = ""
        for content_block in content:
            if content_block["type"] == "thinking":
                thinking = content_block["thinking"]
            elif content_block["type"] == "text":
                text = content_block["text"]
            elif content_block["type"] == "redacted_thinking":
                print("Encountered a redacted thinking block.")

        input["claude_thinking_trajectory"] = thinking
        input["claude_attempt"] = text
        return input
```
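To see what `parse()` does, you can exercise the same block-handling logic against a mocked Anthropic-style completions object. The response shape (a list of `thinking` and `text` content blocks) follows the example above; the mock values themselves are made up for illustration:

```python
# Standalone sketch of the parsing logic above, run against a mocked
# Anthropic-style completions object (the values are illustrative only).

def parse_blocks(response):
    """Extract the thinking trace and final text from content blocks."""
    thinking = ""
    text = ""
    for block in response["content"]:
        if block["type"] == "thinking":
            thinking = block["thinking"]
        elif block["type"] == "text":
            text = block["text"]
    return thinking, text

mock_response = {
    "content": [
        {"type": "thinking", "thinking": "List primes: 2, 3, 5, ..."},
        {"type": "text", "text": "The fifteenth prime number is 47."},
    ]
}

thinking, text = parse_blocks(mock_response)
print(thinking)  # List primes: 2, 3, 5, ...
print(text)      # The fifteenth prime number is 47.
```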

#### **3. Configure the Anthropic model**

```python
llm = Reasoner(
    model_name="claude-3-7-sonnet-20250219",
    generation_params={"max_tokens": 20000, "thinking": {"type": "enabled", "budget_tokens": 18000}},
    batch=False,
    backend="anthropic",
    backend_params={"require_all_responses": False},
)
```

Note that the thinking `budget_tokens` must be smaller than `max_tokens`, since the budget counts toward the overall token limit.

#### **4. Generate Data**

Generate the structured data and inspect the resulting dataset:

```python
ds = llm([
    {"question": "How to solve for world peace?"},
    {"question": "What is the fifteenth prime number?"},
])
print(ds.dataset)
print(ds.dataset[0])
```

### **Example Output**

Using the above example, the output might look like this:

| question                            | claude\_thinking\_trajectory                      | claude\_attempt                                   |
| ----------------------------------- | ------------------------------------------------- | ------------------------------------------------- |
| How to solve for world peace?       | This is a question about solving for world pea... | The Path to World Peace\n\nWorld peace is on...   |
| What is the fifteenth prime number? | Let me list out the prime numbers in order to ... | The fifteenth prime number is 47.\n\nThe seque... |
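As a quick sanity check on the second row, the fifteenth prime can be computed directly (a minimal sketch, independent of Curator):

```python
# Compute the n-th prime by trial division (fine for small n).
def nth_prime(n):
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime iff no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes[-1]

print(nth_prime(15))  # 47
```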

## **API Reference**

* Check out the complete [configuration reference](https://docs.bespokelabs.ai/bespoke-curator/api-reference/llm-api-documentation#online-mode-parameters).
