Aspect-based sentiment analysis

Introduction

In this notebook, we will demonstrate how to use Curator to distill capabilities from a large language model to a much smaller 8B parameter model.

We will use the Yelp restaurant reviews dataset to train a sentiment analysis model. We will generate a synthetic dataset using Curator and fine-tune a model using Together's fine-tuning API.


Example input:

The food was good, but the service was slow.

Example output:

{
    "food_sentiment": "Positive",
    "service_sentiment": "Negative"
}

Installation

!pip install bespokelabs-curator datasets together

Imports

from bespokelabs import curator
from datasets import load_dataset
from together import Together
import os
import json

Dataset Curation

The data curation process is straightforward: we use a prompt that instructs the model to analyze each review and output a sentiment label for each aspect.

Note that we are not using structured outputs here, since the same prompt/Curator block will also be used to evaluate the base model that we fine-tune below (Llama-3.1-8B-Instruct). We use JSON mode instead, because many small models do not support structured outputs.

We will run this Curator block on the Yelp restaurant reviews dataset to generate aspect-based sentiment annotations for each review, as sketched below.
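The Curator block itself can be a small subclass of curator.LLM with a prompt and a parse method. The sketch below illustrates the idea rather than reproducing the notebook's verbatim code: the prompt wording, the aspect keys, the dataset slice, and the way JSON mode is requested via generation_params are all assumptions.

```python
import json

from bespokelabs import curator
from datasets import load_dataset

ASPECTS = ["food_sentiment", "service_sentiment", "ambience_sentiment",
           "price_sentiment", "overall_sentiment"]


class AspectSentimentAnalyzer(curator.LLM):
    """Annotates a restaurant review with a sentiment label per aspect."""

    def prompt(self, input: dict) -> str:
        # Illustrative prompt; the notebook's exact wording may differ.
        return (
            "Analyze the restaurant review below. For each aspect "
            "(food, service, ambience, price, overall), answer with "
            "Positive, Negative, Neutral, or Not mentioned. "
            f"Respond with a JSON object with keys: {', '.join(ASPECTS)}.\n\n"
            f"Review: {input['text']}"
        )

    def parse(self, input: dict, response: str) -> dict:
        # In JSON mode the response is a plain string; keep the review text next to the labels.
        labels = json.loads(response)
        return {"text": input["text"], **{a: labels.get(a) for a in ASPECTS}}


# The teacher model annotates the reviews; the dataset name and slice are illustrative.
analyzer = AspectSentimentAnalyzer(
    model_name="gpt-4o",
    generation_params={"response_format": {"type": "json_object"}},  # JSON mode (assumed parameter)
)
reviews = load_dataset("yelp_review_full", split="train[:1000]")
annotated = analyzer(reviews)
```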

Creating the fine-tuning dataset

We will create a train/test split and use the curated dataset to fine-tune a smaller model, as in the sketch below.
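A minimal sketch of the split, assuming the Curator output is (or has been converted to) a Hugging Face Dataset; the split size and seed here are arbitrary choices:

```python
# `annotated` is the dataset produced by the Curator block above.
split = annotated.train_test_split(test_size=0.2, seed=42)
train_ds = split["train"]   # used to build the fine-tuning file
test_ds = split["test"]     # held out to compare the base and fine-tuned models
```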

Evaluating the base model
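One way to evaluate the base model is to reuse the same Curator block, pointed at Llama-3.1-8B-Instruct, and score its JSON output against the GPT-4o annotations on the held-out split. The sketch below assumes this setup; the model identifier, backend configuration, and the accuracy helper are illustrative rather than the notebook's exact code.

```python
# Reuse the same prompt/parse logic against the base model. How the Together
# backend is configured (API key, base URL) is elided here.
base_analyzer = AspectSentimentAnalyzer(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",  # assumed model id
    generation_params={"response_format": {"type": "json_object"}},
)
base_predictions = base_analyzer(test_ds)


def aspect_accuracies(predictions, references):
    """Fraction of reviews where the model's label matches the teacher's, per aspect."""
    return {
        aspect: sum(p[aspect] == r[aspect] for p, r in zip(predictions, references))
        / len(references)
        for aspect in ASPECTS
    }


print(aspect_accuracies(base_predictions, test_ds))
```

In practice a small model occasionally emits malformed JSON, so a real evaluation should also handle parse failures (for example, by counting them as incorrect).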

Output

Above, we can see that the base model's overall accuracy is 82.7% and that the per-aspect accuracies leave significant room for improvement.

We will therefore use the curated dataset to fine-tune an 8B parameter model. Below is the dataset if you wish to analyze it further:

Formatting the dataset for fine-tuning
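A sketch of this step, assuming Together's conversational JSONL format for fine-tuning data; the file name, epoch count, and SDK parameter names are assumptions, so check Together's current docs before running:

```python
import json
import os

from together import Together

# `analyzer`, `ASPECTS`, and `train_ds` come from the earlier cells.
with open("aspect_sentiment_train.jsonl", "w") as f:
    for example in train_ds:
        labels = {aspect: example[aspect] for aspect in ASPECTS}
        record = {
            "messages": [
                {"role": "user", "content": analyzer.prompt(example)},    # same prompt as during curation
                {"role": "assistant", "content": json.dumps(labels)},     # teacher's labels as the target
            ]
        }
        f.write(json.dumps(record) + "\n")

# Upload the file and launch a fine-tuning job on Together.
client = Together(api_key=os.environ["TOGETHER_API_KEY"])
train_file = client.files.upload(file="aspect_sentiment_train.jsonl")
job = client.fine_tuning.create(
    training_file=train_file.id,
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Reference",  # assumed base model id
    n_epochs=3,
)
print(job.id)
```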

Output

Comparing Results

| Metric | Base Model | Fine-tuned Model | % Improvement |
| --- | --- | --- | --- |
| Overall Accuracy | 0.827 | 0.918 | 10.95% |
| Food Sentiment | 0.866 | 0.904 | 4.32% |
| Service Sentiment | 0.925 | 0.941 | 1.70% |
| Ambience Sentiment | 0.705 | 0.886 | 25.70% |
| Price Sentiment | 0.815 | 0.874 | 9.66% |
| Overall Sentiment | 0.825 | 0.965 | 16.95% |

Conclusion

We can see that the fine-tuned model has higher overall accuracy as well as better per-aspect accuracies. It is also roughly 13.9x cheaper than the teacher model ($0.18 per million tokens for the 8B model on together.ai vs. $2.50 for GPT-4o). As next steps, we can rerun with a larger dataset and better hyperparameter settings to match the performance of GPT-4o.
