
PART 1: How to enhance your fine-tuning journey using GPT-4 and BARThez for summarization?

Pydathon
8 min read · Jun 16, 2023


Photo by Sergei A on Unsplash

In the bustling world of the 21st century, the information highway has led us to a crossroads where time has become the most valuable commodity. Emails, one of the chief modes of communication in this digital age, often take a toll on this precious resource. For professionals dealing with hundreds of emails every day, the task of parsing through them can be daunting, time-consuming, and frequently counterproductive. What if there was a way to streamline this process, making it more efficient and less time-intensive?

GPT-4 is good at summarizing. However, the model is not open source, and calling it every day can quickly get expensive. A practical alternative is an open-source model, with GPT-4 acting as a training-set generator.
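To make the idea concrete, here is a minimal sketch of using GPT-4 to produce reference summaries. It assumes the `openai` Python SDK (v1) and an `OPENAI_API_KEY` environment variable; the prompt wording is illustrative, not a prescribed one.

```python
def build_prompt(email_body: str) -> list[dict]:
    """Chat messages asking GPT-4 for a short French summary (illustrative prompt)."""
    return [
        {"role": "system",
         "content": "Tu résumes des e-mails en français, en une ou deux phrases."},
        {"role": "user", "content": email_body},
    ]

def summarize_with_gpt4(email_body: str) -> str:
    """Call the Chat Completions API; assumes `pip install openai` (v1 SDK)."""
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=build_prompt(email_body),
        temperature=0.3,  # low temperature for consistent training labels
    )
    return resp.choices[0].message.content
```

Each generated summary then becomes the target label for one email in the fine-tuning set.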

In this article, we'll try to answer two questions:

  • Is GPT-4 suitable to create a fine-tuning dataset?
  • Is it worth it to fine-tune an already fine-tuned model?

Goals

  1. Use the Gmail API to retrieve emails
  2. Generate a training set with the GPT-4 API
  3. Summarize French emails by fine-tuning BARThez
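The three steps above meet in one artifact: a file of (email, summary) pairs. Here is a small sketch of that format as JSON Lines, a common input layout for fine-tuning; the field names `"text"` and `"summary"` are assumptions for illustration, not fixed by any library.

```python
import json

def write_training_set(pairs: list[tuple[str, str]], path: str) -> int:
    """Write (email, summary) pairs as one JSON object per line; return count."""
    with open(path, "w", encoding="utf-8") as f:
        for email, summary in pairs:
            record = {"text": email, "summary": summary}
            # ensure_ascii=False keeps French accents readable in the file
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(pairs)
```

In later parts, the Gmail API fills in `email`, GPT-4 fills in `summary`, and BARThez is fine-tuned on the resulting file.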

Requirements

Written by Pydathon

Data Scientist. I like to explore different subjects, and I would like to become an ML engineer. I hope you'll like my writing! :)
