Finetuning a model to identify features of a product

Note: This example requires a GPU for finetuning. If you don't have a machine with GPUs handy, you can use the Colab version below with free T4 GPUs.

Open In Colab

We will walk through a small example that creates data with Ollama using Curator, finetunes a model with Unsloth, and then evaluates the finetuned model, again using Curator.

Imagine you are a product wizard at a fictional company called Azanom Inc., and you want to highlight the features of each product within its description.

Code for displaying product
from IPython.display import HTML, display
import re

def display_product(
    product_name,
    description,
    features,
    image_url,
):
"""Displays a product give its product_name, features, and description"""
  # Product description and features
  def highlight_features(text, features):
      # Sort features by length in descending order to handle overlapping matches
      sorted_features = sorted(features, key=len, reverse=True)

      # Create a copy of the text for highlighting
      highlighted_text = text

      # Replace each feature with its highlighted version
      for feature in sorted_features:
          pattern = re.compile(re.escape(feature), re.IGNORECASE)
          highlighted_text = pattern.sub(
              f'<span class="highlight">{feature}</span>',
              highlighted_text
          )

      return highlighted_text

  # Create HTML content with CSS styling
  html_content = f"""
  <style>
      .product-container {{
          max-width: 800px;
          margin: 20px auto;
          padding: 30px;
          font-family: 'Segoe UI', Arial, sans-serif;
          line-height: 1.6;
          background: white;
          border-radius: 12px;
          box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
      }}

      .product-title {{
          color: #1d1d1f;
          font-size: 28px;
          margin-bottom: 20px;
          text-align: center;
      }}

      .product-image {{
          width: 100%;
          max-width: 600px;
          height: auto;
          margin: 0 auto 30px;
          display: block;
          border-radius: 8px;
      }}

      .product-description {{
          color: #333;
          font-size: 16px;
          margin-bottom: 20px;
      }}

      .highlight {{
          background: linear-gradient(120deg, rgba(37, 99, 235, 0.1) 0%, rgba(37, 99, 235, 0.2) 100%);
          border-radius: 4px;
          padding: 2px 4px;
          transition: background 0.3s ease;
      }}

      .highlight:hover {{
          background: linear-gradient(120deg, rgba(37, 99, 235, 0.2) 0%, rgba(37, 99, 235, 0.3) 100%);
          cursor: pointer;
      }}
  </style>

  <div class="product-container">
      <h1 class="product-title">{product_name}</h1>
  """
  if image_url:
    html_content += f'<img class="product-image" src="{image_url}" width="300" alt="{product_name}">'
  html_content += f"""<p class="product-description">
          {highlight_features(description, features)}
      </p>
  </div>"""

  display(HTML(html_content))


display_product(
    product_name="Apple Airpods Pro",
    description="The Apple AirPods Pro are a pair of wireless earbuds that are designed for comfort and convenience. They are lightweight in-ear earbuds and contoured for a comfortable fit, and they sit at an angle for easy access to the controls. The AirPods Pro also have a stem that is 33% shorter than the second generation AirPods, which makes them more compact and easier to store. The AirPods Pro also have a force sensor to easily control music and calls, and they have Spatial Audio with dynamic head tracking, which provides an immersive, three-dimensional listening experience.",
    features=[
    "lightweight in-ear earbuds",
    "contoured design",
    "sits at an angle for comfort",
    "better direct audio to your ear",
    "stem is 33% shorter than the second generation AirPods",
    "force sensor to easily control music and calls",
    "Spatial Audio with dynamic head tracking",
    "immersive, three-dimensional listening experience"],
    image_url="https://store.storeimages.cdn-apple.com/4982/as-images.apple.com/is/airpods-pro-2-hero-select-202409_FMT_WHH?wid=750&hei=556&fmt=jpeg&qlt=90&.v=1724041668836")

Given a product and its description, your first instinct is to use GPT-4o to extract the features from the product description. But you quickly realize that you don't need a jackhammer to crack this nut, and you want a much cheaper and more scalable alternative.

So let's try to train a 1B model by generating data from an 8B model. Since the 8B model runs locally through Ollama, this data generation should cost $0. We can always use bigger models to generate higher-quality data.

Note that we have simplified this example for demonstration purposes.

Installation

Import the required library

Generate training data using Llama-3.1-8B

Our goal is to extract features from the product descriptions.

We will use Curator to easily generate a dataset of products and their features. We seed the dataset with personas from PersonaHub and create a product for each persona, which gives us a diverse set of products. To make the data generation process easy for the LLM, we ask it to wrap each feature in explicit opening and closing tags inside the output description. This way, we get high-quality descriptions that contain the given features verbatim.
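For instance, the tagged output could be post-processed like the following sketch, where the `<feature>` tag name and the `parse_tagged_description` helper are our own illustrative choices:

```python
import re

def parse_tagged_description(tagged: str) -> tuple[str, list[str]]:
    """Return the clean description and the list of tagged features.

    Assumes the model wraps each feature in <feature>...</feature> tags
    (the tag name is a hypothetical choice for this sketch).
    """
    features = re.findall(r"<feature>(.*?)</feature>", tagged, re.DOTALL)
    clean = re.sub(r"</?feature>", "", tagged)
    return clean, features

clean, features = parse_tagged_description(
    "A mug with a <feature>double-walled design</feature> and a "
    "<feature>spill-proof lid</feature>."
)
```

Because the features are copied verbatim from the tagged description, the extracted strings are guaranteed to appear in the final, untagged text.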

We can then create a ProductCurator object and curate products using personas

Next, let's create some products for the personas with ProductCurator! This can take a while. You can use Together.ai or Deepinfra through Curator and LiteLLM to speed this up.
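In code, the generation step might look like the sketch below. The class body follows Curator's `LLM` subclassing interface (`prompt`/`parse`) as we understand it; the prompt wording, the `ollama/llama3.1` model string, the PersonaHub dataset id, and the `<feature>` tags are illustrative assumptions, so check the Curator docs for exact usage.

```python
from bespokelabs import curator
from datasets import load_dataset

# Seed personas from PersonaHub (dataset id and config assumed here).
personas = load_dataset("proj-persona/PersonaHub", "persona", split="train")

class ProductCurator(curator.LLM):
    """Sketch: generate one product (name, description, features) per persona."""

    def prompt(self, input: dict) -> str:
        return (
            f"Create a product for the following persona: {input['persona']}\n"
            "Give a product name and a description, and wrap each feature "
            "verbatim in <feature>...</feature> tags inside the description."
        )

    def parse(self, input: dict, response: str) -> dict:
        # Keep the raw response; the tagged features are extracted downstream.
        return {"persona": input["persona"], "product": response}

product_curator = ProductCurator(model_name="ollama/llama3.1")
products = product_curator(personas)
```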

Here's an example of a generated product:

Example generated product

PERSONA: A Political Analyst specialized in El Salvador's political landscape.

PRODUCT: Salvadoria: El Salvador's Political Landscape Analyzer

DESCRIPTION: Salvadoria is a cutting-edge tool designed specifically for Political Analysts specializing in El Salvador's political landscape. It offers Advanced natural language processing for news articles and social media posts, allowing users to quickly analyze the tone, sentiment, and key themes of online discussions. The customizable keyword alert system enables analysts to track specific topics and hashtags in real-time, ensuring they stay up-to-date on the latest developments. Salvadoria also features an interactive map of El Salvador with election results, demographic data, and key infrastructure information, providing a comprehensive view of the country's political landscape. With access to a comprehensive database of past elections, including voter turnout, candidate performance, and electoral district boundaries, analysts can gain valuable insights into historical trends and patterns. The tool also includes an in-depth analysis of government spending, revenue, and budget allocation by department and agency, allowing users to identify areas of inefficiency or potential corruption. Salvadoria's real-time tracking of public opinion polls, surveys, and focus groups on various political issues keeps analysts informed about shifting public sentiment and policy preferences. Users can customize their dashboard with a range of visualizations and metrics using the customizable dashboard, while also exporting data in CSV format for further analysis or integration with other tools via the ability to export data in CSV format. Regular updates include new data, including special reports on election forecasts, economic indicators, and policy changes, which are integrated seamlessly through the integration with popular spreadsheet software.

FEATURES:

  • advanced natural language processing for news articles and social media posts

  • keyword alert system

  • interactive map of El Salvador with election results, demographic data, and key infrastructure information

  • comprehensive database of past elections

  • in-depth analysis of government spending, revenue, and budget allocation by department and agency

  • real-time tracking of public opinion polls, surveys, and focus groups

  • customizable dashboard

  • ability to export data in CSV format

  • integration with popular spreadsheet software

Evaluate the baseline performance on eval data from gpt-4o-mini

Set up the EvaluationLLM object using Curator

We can create an EvaluationLLM object to evaluate the performance of our models.

We also set up some utilities to run the evaluation, calculate precision, recall, and F1 metrics, and tabulate them in a nice format.
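Concretely, the per-example metrics can be computed with simple set arithmetic; this sketch (function name ours) treats extraction as exact match over normalized feature strings:

```python
def feature_metrics(predicted: list[str], gold: list[str]) -> dict:
    """Precision, recall, and F1 for exact-match feature extraction.

    Features are normalized (stripped, lowercased) before comparison.
    """
    pred = {p.strip().lower() for p in predicted}
    true = {g.strip().lower() for g in gold}
    tp = len(pred & true)  # features both predicted and in the gold set
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(true) if true else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}
```

Averaging these per-example scores over the eval set gives the tabulated numbers reported below.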

Create an eval set with gpt-4o-mini

To prevent bias from using the same model to generate both train and eval data, we will not split the newly created Llama-3.1-8B data into train and test sets. Instead, we will use all of it for training and generate eval data with a completely different LLM, gpt-4o-mini.

Run the evaluation and get results

We can see that Llama-3.2-1B is not able to extract the features as well as the 8B model (as expected). So, we will finetune it on the training set.

Finetune Llama3.2-1B using Unsloth

Prepare data for finetuning
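Preparing the generated data usually means converting each (description, features) pair into a chat transcript that the SFT trainer can consume; the prompt wording and message schema here are illustrative:

```python
def to_chat_example(description: str, features: list[str]) -> dict:
    """Convert one product into the messages format common to SFT trainers."""
    return {
        "messages": [
            {
                "role": "user",
                "content": (
                    "Extract the list of features from the following "
                    f"product description:\n\n{description}"
                ),
            },
            # The target output: one feature per line, copied verbatim.
            {"role": "assistant", "content": "\n".join(f"- {f}" for f in features)},
        ]
    }

example = to_chat_example(
    "Lightweight earbuds with a force sensor.",
    ["lightweight earbuds", "force sensor"],
)
```

Mapping this function over the Curator output yields a dataset ready for the trainer's chat template.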

Run SFT finetuning with Unsloth
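The training step could be sketched as below, using Unsloth's `FastLanguageModel` with TRL's `SFTTrainer`; the model name, LoRA rank, and hyperparameters are illustrative, and exact argument names vary between library versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load Llama-3.2-1B in 4-bit, then attach LoRA adapters for cheap finetuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,  # the chat-formatted dataset from the previous step
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```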

Save the finetuned model and serve it using Ollama
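Saving and serving might look like this sketch; `save_pretrained_gguf` is Unsloth's GGUF export helper, while the output filename, quantization method, and the Ollama model name are assumptions to verify against the Unsloth and Ollama docs:

```python
# Export the finetuned model to GGUF so Ollama can serve it.
model.save_pretrained_gguf(
    "finetuned_model", tokenizer, quantization_method="q4_k_m"
)

# Minimal Modelfile pointing Ollama at the exported weights
# (the .gguf filename below is an assumption).
with open("Modelfile", "w") as f:
    f.write("FROM ./finetuned_model/unsloth.Q4_K_M.gguf\n")

# Then, from a shell:
#   ollama create product-extractor -f Modelfile
#   ollama run product-extractor
```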

Final results

Running the evaluation on the finetuned model, we found that the F1 score for the Llama-3.2-1B model jumped from 0.496 to 0.688, a significant improvement!

Just for fun, let's try running our new finetuned model on a new example:

This is not bad for a quick start. In some cases, you will see that the LLM doesn't output the exact text from the description (which happens even with GPT-4o)!

Great next steps:

  1. Increase the number of training examples.

  2. Systematically evaluate the error types.

  3. Run this on your local machine and run curator-viewer to visualize your data.

  4. Create complex strategies for data curation (involving multiple curator.LLM stages).

  5. Star https://github.com/bespokelabsai/curator/!
