Using the SimpleStrat block to generate diverse data
StratifiedGenerator: Generate Balanced Question-Answer Pairs
Overview
Installation
pip install bespokelabs-curator
Quick Start Example
from datasets import Dataset
from bespokelabs.curator.blocks.simplestrat import StratifiedGenerator
# Create a simple dataset of questions
questions = Dataset.from_dict({"question": [f"{i}. Name a periodic element" for i in range(20)]})
# Initialize the generator with your preferred model
generator = StratifiedGenerator(model_name="gpt-4o-mini")
# Generate stratified QA pairs
qa_pairs = generator(questions).dataset
# Examine the results
print(f"Generated {len(qa_pairs)} QA pairs")
print(qa_pairs[0])  # View the first QA pair
How StratifiedGenerator Works
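As a rough illustration of the idea behind stratification, the sketch below first picks a stratum (a category of possible answers) and then picks an answer inside it, so that no single popular answer dominates. This is a generic stratified-sampling sketch in plain Python, not StratifiedGenerator's actual implementation; the `stratified_sample` helper, the strata, and the weights are all hypothetical and chosen only to match the "Name a periodic element" example.

```python
import random
from collections import Counter


def stratified_sample(strata, weights, n, seed=0):
    """Draw n items: choose a stratum by weight, then an item uniformly within it.

    Hypothetical helper for illustration only; not part of bespokelabs-curator.
    """
    rng = random.Random(seed)
    names = list(strata)
    picks = []
    for _ in range(n):
        # Step 1: choose which stratum to answer from.
        stratum = rng.choices(names, weights=[weights[s] for s in names], k=1)[0]
        # Step 2: choose an answer inside that stratum.
        picks.append((stratum, rng.choice(strata[stratum])))
    return picks


# Assumed strata for the prompt "Name a periodic element".
strata = {
    "noble gas": ["helium", "neon", "argon"],
    "alkali metal": ["lithium", "sodium", "potassium"],
    "halogen": ["fluorine", "chlorine", "bromine"],
}
# Equal weights spread the samples roughly evenly across strata.
weights = {"noble gas": 1.0, "alkali metal": 1.0, "halogen": 1.0}

samples = stratified_sample(strata, weights, n=20)
counts = Counter(stratum for stratum, _ in samples)
print(counts)
```

Sampling the stratum first is what keeps the output spread across categories; changing `weights` shifts how often each category appears.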
Advanced Usage
Customizing the Generator
Saving and Loading Results
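One simple way to persist generated pairs is a JSON Lines file, one row per line. The sketch below uses only the standard library and a small stand-in list of rows so it runs without an API key; the `save_jsonl` and `load_jsonl` helpers and the `question`/`answer` column names are assumptions, not part of the library. Because iterating a Hugging Face `Dataset` yields one dict per row, the same helper should accept `qa_pairs` directly; `Dataset` also provides its own `save_to_disk`, `to_json`, and `datasets.load_from_disk` if you prefer the native format.

```python
import json
import tempfile
from pathlib import Path


def save_jsonl(rows, path):
    """Write an iterable of dict rows to a JSON Lines file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def load_jsonl(path):
    """Read a JSON Lines file back into a list of dict rows (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Stand-in for generator(questions).dataset; the column names are assumed.
rows = [
    {"question": "0. Name a periodic element", "answer": "helium"},
    {"question": "1. Name a periodic element", "answer": "sodium"},
]

out_path = Path(tempfile.mkdtemp()) / "qa_pairs.jsonl"
save_jsonl(rows, out_path)
restored = load_jsonl(out_path)
print(f"Restored {len(restored)} QA pairs from {out_path.name}")
```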
Performance Considerations
Common Applications
Troubleshooting
Additional Resources