# Curate Reasoning Data with Claude 3.7 Sonnet

You can use the Claude 3.7 **Sonnet** reasoning model in **Curator** to generate synthetic data. In this example, we answer a few questions and capture the reasoning traces from Claude 3.7 Sonnet, but the approach can be adapted to any data generation task.

## **Prerequisites**

* **Python 3.10+**
* **Curator**: Install via `pip install bespokelabs-curator`
* **Anthropic**: Anthropic API key

## **Steps**

#### **1. Set up environment variables**

```sh
export ANTHROPIC_API_KEY=<your_api_key>
```
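Before making any calls, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch; `require_env` is a hypothetical helper for illustration and is not part of Curator.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the example.")
    return value


# Usage (raises RuntimeError if the variable is missing):
#   require_env("ANTHROPIC_API_KEY")
```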

#### **2. Create a curator.LLM subclass**

Create a class that inherits from `curator.LLM`. Implement two key methods:

* `prompt()`: Generates the prompt for the LLM.
* `parse()`: Processes the LLM's response into your desired format.

Here’s the implementation:

```python
"""Example of reasoning on simple questions using curator."""

from bespokelabs import curator


class Reasoner(curator.LLM):
    # Pass the full API response to parse() instead of only the text,
    # so the thinking blocks are available.
    return_completions_object = True

    def prompt(self, input):
        return input["question"]

    def parse(self, input, response):
        """Parse the LLM response to extract reasoning and solution."""
        content = response["content"]
        thinking = ""
        text = ""
        for content_block in content:
            if content_block["type"] == "thinking":
                thinking = content_block["thinking"]
            elif content_block["type"] == "text":
                text = content_block["text"]
            elif content_block["type"] == "redacted_thinking":
                print("Redacted thinking block! (notifying you for fun)")

        input["claude_thinking_trajectory"] = thinking
        input["claude_attempt"] = text
        return input
```
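To see what `parse()` does without calling the API, the block-splitting logic can be tried on a hand-written response. This is a sketch under stated assumptions: `split_content_blocks` is a hypothetical helper, and the sample dict only mimics the `content` shape that `parse()` reads. Unlike `parse()` above, which keeps the last block of each type, this variant concatenates blocks of the same type and counts redacted ones.

```python
def split_content_blocks(content):
    """Split content blocks into (thinking, text, redacted_count).

    Blocks of the same type are concatenated in order, so a response with
    several text blocks is not reduced to the last one.
    """
    thinking_parts = []
    text_parts = []
    redacted = 0
    for block in content:
        if block["type"] == "thinking":
            thinking_parts.append(block["thinking"])
        elif block["type"] == "text":
            text_parts.append(block["text"])
        elif block["type"] == "redacted_thinking":
            redacted += 1
    return "".join(thinking_parts), "".join(text_parts), redacted


# Hand-written sample that mimics the shape parse() reads; not real model output.
sample = {"content": [
    {"type": "thinking", "thinking": "List primes: 2, 3, 5, ..."},
    {"type": "text", "text": "The fifteenth prime number is 47."},
]}
thinking, text, redacted = split_content_blocks(sample["content"])
print(thinking)
print(text)
```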

#### **3. Configure the Anthropic model**

```python
llm = Reasoner(
    model_name="claude-3-7-sonnet-20250219",
    generation_params={"max_tokens": 20000, "thinking": {"type": "enabled", "budget_tokens": 18000}},
    batch=False,
    backend="anthropic",
    backend_params={"require_all_responses": False},
)
```
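In `generation_params`, `budget_tokens` caps the tokens Claude may spend on thinking. As far as Anthropic's extended-thinking documentation describes it, the budget should be at least 1,024 and smaller than `max_tokens`. The following is a minimal sketch of a pre-flight check under that assumption; `check_thinking_params` is a hypothetical helper and is not part of Curator or the Anthropic SDK.

```python
def check_thinking_params(generation_params: dict) -> int:
    """Validate a thinking configuration.

    Returns max_tokens minus the thinking budget, i.e. the headroom that
    remains for the visible answer if the whole budget is used.
    """
    max_tokens = generation_params["max_tokens"]
    thinking = generation_params.get("thinking", {})
    if thinking.get("type") != "enabled":
        return max_tokens  # no thinking budget to subtract
    budget = thinking["budget_tokens"]
    if budget < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if budget >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return max_tokens - budget


# Same values as the configuration in this step.
params = {"max_tokens": 20000, "thinking": {"type": "enabled", "budget_tokens": 18000}}
headroom = check_thinking_params(params)
print(headroom)
```

With the values used in this step, the check leaves 2,000 tokens of headroom for the final answer when the full thinking budget is consumed.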

#### **4. Generate Data**

Run the model on a list of questions and print the resulting dataset, followed by its first row:

```python
ds = llm([
    {"question": "How to solve for world peace?"},
    {"question": "What is the fifteenth prime number?"},
])
print(ds.dataset)
print(ds.dataset[0])
```
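To keep the generated rows for later use, they can be written to disk. The sketch below assumes only that each row behaves like a dict with string fields, as `ds.dataset[0]` does in the example above. `write_jsonl` and the output file name are hypothetical, and the sample row is hand-written rather than real model output.

```python
import json
import os
import tempfile


def write_jsonl(rows, path):
    """Write an iterable of dict-like rows to `path`, one JSON object per line."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(dict(row), ensure_ascii=False) + "\n")
            count += 1
    return count


# Stand-in rows; with Curator you would pass ds.dataset instead.
rows = [{
    "question": "What is the fifteenth prime number?",
    "claude_thinking_trajectory": "Let me list out the prime numbers in order...",
    "claude_attempt": "The fifteenth prime number is 47.",
}]
out_path = os.path.join(tempfile.gettempdir(), "claude_reasoning_sample.jsonl")
written = write_jsonl(rows, out_path)
print(written)
```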

### **Example Output**

Using the above example, the output might look like this:

| question                            | claude\_thinking\_trajectory                      | claude\_attempt                                   |
| ----------------------------------- | ------------------------------------------------- | ------------------------------------------------- |
| How to solve for world peace?       | This is a question about solving for world pea... | The Path to World Peace\n\nWorld peace is on...   |
| What is the fifteenth prime number? | Let me list out the prime numbers in order to ... | The fifteenth prime number is 47.\n\nThe seque... |

## **API Reference**

* Check out the complete [configuration reference](https://docs.bespokelabs.ai/bespoke-curator/api-reference/llm-api-documentation#online-mode-parameters).

