Aspect-based sentiment analysis
Introduction
In this notebook, we will demonstrate how to use Curator to distill capabilities from a large language model into a much smaller 8B-parameter model.
We will use the Yelp restaurant reviews dataset to train an aspect-based sentiment analysis model: we will generate a synthetic dataset with Curator and finetune a model using Together's finetuning API.
Example input:
The food was good, but the service was slow.
Example output:
{
"food_sentiment": "Positive",
"service_sentiment": "Negative"
}
Installation
!pip install bespokelabs-curator datasets together
Imports
from bespokelabs import curator
from datasets import load_dataset
from together import Together
import os
import json
Dataset Curation
The data curation process is straightforward: we prompt the model to analyze each review and output the sentiment for each aspect.
Note that we are not using structured outputs here, since the same prompt/curator block will also be used to evaluate the base model that we finetune below (Llama-3.1-8B-Instruct). We use JSON mode instead, because many small models do not support structured outputs.
We will run this curator block on the Yelp restaurant reviews dataset to generate aspect-based sentiment annotations for each review.
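Below is a minimal sketch of what such a curator block could look like. It assumes the `curator.LLM` subclass pattern with `prompt` and `parse` methods; the class name, aspect list, prompt wording, and the `review` column name are illustrative rather than the notebook's exact code.

```python
class AspectSentimentAnalyzer(curator.LLM):
    """Annotate a restaurant review with per-aspect sentiment labels (illustrative sketch)."""

    PROMPT = (
        "Analyze the following restaurant review and classify the sentiment "
        "(Positive, Negative, or Unknown) for each of these aspects: food, service, "
        "ambience, price, and the review overall. Respond with only a JSON object "
        "with the keys food_sentiment, service_sentiment, ambience_sentiment, "
        "price_sentiment, and overall_sentiment.\n\nReview: {review}"
    )

    def prompt(self, input: dict) -> str:
        # `input` is one row of the reviews dataset; the "review" column name is an assumption.
        return self.PROMPT.format(review=input["review"])

    def parse(self, input: dict, response: str) -> dict:
        # The model is instructed (not forced) to return JSON, so we parse it ourselves.
        sentiments = json.loads(response)
        return {"review": input["review"], **sentiments}


# Annotate the reviews with the teacher model (GPT-4o in this notebook).
analyzer = AspectSentimentAnalyzer(model_name="gpt-4o")
annotated = analyzer(reviews_dataset)  # `reviews_dataset` is the loaded Yelp reviews split
```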
Creating the finetuning dataset
We will create a train/test split and use the curated dataset to finetune a smaller model, as sketched below.
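A minimal sketch of the split, assuming the annotated data is a Hugging Face Dataset (the split ratio and seed are illustrative):

```python
# Split the curated annotations into train and test sets (illustrative 90/10 split).
split = annotated.train_test_split(test_size=0.1, seed=42)
train_dataset = split["train"]
test_dataset = split["test"]
```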
Evaluating the base model
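To evaluate the base model, we can run the same curator block with Llama-3.1-8B-Instruct on the test split and measure how often its labels agree with the teacher's. The sketch below is illustrative: the helper function, aspect keys, and model id are assumptions, and the notebook's exact aggregate metric may be defined differently.

```python
# Run the same prompt/parse block with the base model on the test split.
# (Exact backend/routing configuration for serving Llama-3.1-8B-Instruct is omitted here.)
base_analyzer = AspectSentimentAnalyzer(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"  # illustrative model id
)
base_predictions = base_analyzer(test_dataset)

ASPECTS = [
    "food_sentiment", "service_sentiment", "ambience_sentiment",
    "price_sentiment", "overall_sentiment",
]

def aspect_accuracies(predictions, references, aspects=ASPECTS):
    """Fraction of reviews where the predicted label matches the reference, per aspect."""
    scores = {}
    for aspect in aspects:
        matches = [p[aspect] == r[aspect] for p, r in zip(predictions, references)]
        scores[aspect] = sum(matches) / len(matches)
    # One plausible overall metric: the mean of the per-aspect accuracies.
    scores["overall_accuracy"] = sum(scores[a] for a in aspects) / len(aspects)
    return scores

print(aspect_accuracies(base_predictions, test_dataset))
```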
Output
Above, we can see that the base model's overall accuracy is only 82.7%, and some of the per-aspect accuracies, ambience in particular, are quite low.
Thus, we will use the curated dataset to finetune an 8B-parameter model. Below is the dataset, if you wish to analyze it further:
Formatting the dataset for finetuning
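A sketch of one way to format the training split for Together's finetuning API and launch a job. The chat-style `messages` schema, file name, model id, and hyperparameters are assumptions; the upload and job-creation calls follow Together's Python SDK, but check the current finetuning docs for the exact schema and parameters.

```python
# Convert each annotated review into a chat-style training example and write JSONL.
# (The messages schema is an assumption; see Together's finetuning docs for the expected format.)
def to_chat_example(row: dict) -> dict:
    labels = {aspect: row[aspect] for aspect in ASPECTS}
    return {
        "messages": [
            {"role": "user",
             "content": AspectSentimentAnalyzer.PROMPT.format(review=row["review"])},
            {"role": "assistant", "content": json.dumps(labels)},
        ]
    }

with open("aspect_sentiment_train.jsonl", "w") as f:
    for row in train_dataset:
        f.write(json.dumps(to_chat_example(row)) + "\n")

# Upload the file and launch a finetuning job via Together's Python SDK.
client = Together(api_key=os.environ["TOGETHER_API_KEY"])
train_file = client.files.upload(file="aspect_sentiment_train.jsonl")
job = client.fine_tuning.create(
    training_file=train_file.id,
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Reference",  # illustrative model id
    n_epochs=3,  # illustrative hyperparameter
)
print(job.id)
```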
Output
Comparing Results
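Finally, we can evaluate the finetuned model with the same curator block and accuracy helper, and put the two sets of scores side by side. The sketch below is illustrative (the finetuned model id comes from the completed finetuning job, and the DataFrame layout is just one option); it produces a comparison along the lines of the table that follows.

```python
import pandas as pd

# Evaluate the finetuned model on the same test split.
finetuned_model_id = "..."  # the model name reported by the completed finetuning job
ft_analyzer = AspectSentimentAnalyzer(model_name=finetuned_model_id)
ft_predictions = ft_analyzer(test_dataset)

base_scores = aspect_accuracies(base_predictions, test_dataset)
ft_scores = aspect_accuracies(ft_predictions, test_dataset)

# Lay out base vs. finetuned scores with the relative improvement in percent.
comparison = pd.DataFrame({
    "Metric": list(base_scores),
    "Base Model": [base_scores[k] for k in base_scores],
    "Finetuned Model": [ft_scores[k] for k in base_scores],
})
comparison["Improvement"] = (
    (comparison["Finetuned Model"] - comparison["Base Model"])
    / comparison["Base Model"] * 100
).round(2).astype(str) + " %"
print(comparison)
```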
| Metric | Base Model | Finetuned Model | Improvement |
| --- | --- | --- | --- |
| Overall Accuracy | 0.827 | 0.918 | 10.95 % |
| Food Sentiment | 0.866 | 0.904 | 4.32 % |
| Service Sentiment | 0.925 | 0.941 | 1.70 % |
| Ambience Sentiment | 0.705 | 0.886 | 25.70 % |
| Price Sentiment | 0.815 | 0.874 | 9.66 % |
| Overall Sentiment | 0.825 | 0.965 | 16.95 % |
Conclusion
We can see that the finetuned model has higher overall accuracy as well as better per-aspect accuracies. It is also about 13.8x cheaper than the teacher model ($0.18 per million tokens for the 8B model on together.ai vs. $2.50 per million tokens for GPT-4o)! As next steps, we can rerun with a larger dataset and better hyperparameter settings to match the performance of GPT-4o.