In this Python notebook, we will show you how to use a pre-trained model to predict the sentiment of a sentence. This notebook accompanies our state-of-the-art paper on Transformer models for eBISS 2023.
First, we need to install the dependencies. We will use the transformers library from HuggingFace, which provides several pre-trained models that we will fine-tune for our task. From this library we will also use the BERT tokenizer. A tokenizer is a function that splits a sentence into tokens, which are the basic units of a language. For example, the sentence "I love transformers" can be tokenized into the following tokens: ["I", "love", "transformers"]. Tokens are then mapped to vectors in a continuous space, called embeddings, which are used as input to the model.
We will also use the datasets library from HuggingFace, which provides a convenient way to load the IMDB dataset.
!pip install torch transformers datasets
!pip install accelerate -U
Defaulting to user installation because normal site-packages is not writeable Requirement already satisfied: torch in /home/jose/.local/lib/python3.10/site-packages (2.0.1) Requirement already satisfied: transformers in /home/jose/.local/lib/python3.10/site-packages (4.30.2) Requirement already satisfied: datasets in /home/jose/.local/lib/python3.10/site-packages (2.13.1) Requirement already satisfied: accelerate in /home/jose/.local/lib/python3.10/site-packages (0.20.3) (output truncated)
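Before loading any data, here is a quick illustration of the tokenization step described above. This is a minimal sketch (the variable names are only for illustration); the exact splits depend on the BERT vocabulary, which lower-cases the input and may break rare words into sub-word pieces.
from transformers import BertTokenizer

# Minimal sketch: tokenize the example sentence and map the tokens to integer ids.
sketch_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
tokens = sketch_tokenizer.tokenize("I love transformers")
print(tokens)                                          # lower-cased (sub-)word tokens
print(sketch_tokenizer.convert_tokens_to_ids(tokens))  # ids that the model turns into embeddings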
We will use the IMDB dataset, which contains 50,000 movie reviews from the Internet Movie Database. Each review is labeled as either positive or negative.
Our objective is to train a model that can predict the sentiment of a movie review.
First, we will use a pre-trained model to predict the sentiment of a movie review. Then, we will fine-tune the model on the IMDB dataset and compare the results.
from datasets import load_dataset
dataset = load_dataset('imdb')
/home/jose/.local/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html from .autonotebook import tqdm as notebook_tqdm Found cached dataset imdb (/home/jose/.cache/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0) 100%|██████████| 3/3 [00:00<00:00, 886.25it/s]
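Before looking at individual reviews, we can inspect the splits returned by load_dataset. As a quick sketch (the split and feature names come from the Hugging Face imdb dataset), the labeled data consists of a train and a test split of 25,000 reviews each:
# Quick sketch: inspect the available splits and their features.
print(dataset)
print(dataset['train'].features)  # 'text' plus a binary 'label' (0 = negative, 1 = positive)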
Let's also take a quick look at the reviews themselves. We will shuffle the dataset and print five random examples together with their labels.
# Select 5 random samples
dataset = dataset.shuffle(seed=42)  # shuffle with a fixed seed for reproducibility
for i in range(5):
    print('---')
    print(dataset['train'][i]['text'])
    print('negative' if dataset['train'][i]['label'] == 0 else 'positive')
--- There is no relation at all between Fortier and Profiler but the fact that both are police series about violent crimes. Profiler looks crispy, Fortier looks classic. Profiler plots are quite simple. Fortier's plot are far more complicated... Fortier looks more like Prime Suspect, if we have to spot similarities... The main character is weak and weirdo, but have "clairvoyance". People like to compare, to judge, to evaluate. How about just enjoying? Funny thing too, people writing Fortier looks American but, on the other hand, arguing they prefer American series (!!!). Maybe it's the language, or the spirit, but I think this series is more English than American. By the way, the actors are really good and funny. The acting is not superficial at all... positive --- This movie is a great. The plot is very true to the book which is a classic written by Mark Twain. The movie starts of with a scene where Hank sings a song with a bunch of kids called "when you stub your toe on the moon" It reminds me of Sinatra's song High Hopes, it is fun and inspirational. The Music is great throughout and my favorite song is sung by the King, Hank (bing Crosby) and Sir "Saggy" Sagamore. OVerall a great family movie or even a great Date movie. This is a movie you can watch over and over again. The princess played by Rhonda Fleming is gorgeous. I love this movie!! If you liked Danny Kaye in the Court Jester then you will definitely like this movie. positive --- George P. Cosmatos' "Rambo: First Blood Part II" is pure wish-fulfillment. The United States clearly didn't win the war in Vietnam. They caused damage to this country beyond the imaginable and this movie continues the fairy story of the oh-so innocent soldiers. The only bad guys were the leaders of the nation, who made this war happen. The character of Rambo is perfect to notice this. He is extremely patriotic, bemoans that US-Americans didn't appreciate and celebrate the achievements of the single soldier, but has nothing but distrust for leading officers and politicians. Like every film that defends the war (e.g. "We Were Soldiers") also this one avoids the need to give a comprehensible reason for the engagement in South Asia. And for that matter also the reason for every single US-American soldier that was there. Instead, Rambo gets to take revenge for the wounds of a whole nation. It would have been better to work on how to deal with the memories, rather than suppressing them. "Do we get to win this time?" Yes, you do. negative --- In the process of trying to establish the audiences' empathy with Jake Roedel (Tobey Maguire) the filmmakers slander the North and the Jayhawkers. Missouri never withdrew from the Union and the Union Army was not an invading force. The Southerners fought for State's Rights: the right to own slaves, elect crooked legislatures and judges, and employ a political spoils system. There's nothing noble in that. The Missourians could have easily traveled east and joined the Confederate Army.<br /><br />It seems to me that the story has nothing to do with ambiguity. When Jake leaves the Bushwhackers, it's not because he saw error in his way, he certainly doesn't give himself over to the virtue of the cause of abolition. positive --- Yeh, I know -- you're quivering with excitement. Well, *The Secret Lives of Dentists* will not upset your expectations: it's solidly made but essentially unimaginative, truthful but dull. 
It concerns the story of a married couple who happen to be dentists and who share the same practice (already a recipe for trouble: if it wasn't for our separate work-lives, we'd all ditch our spouses out of sheer irritation). Campbell Scott, whose mustache and demeanor don't recall Everyman so much as Ned Flanders from *The Simpsons*, is the mild-mannered, uber-Dad husband, and Hope Davis is the bored-stiff housewife who channels her frustrations into amateur opera. One night, as Dad & the daughters attend one of Davis' performances, he discovers that his wife is channeling her frustrations into more than just singing: he witnesses his wife kissing and flirting with the director of opera. (One nice touch: we never see the opera-director's face.) Dreading the prospect of instituting the proceedings for separation, divorce, and custody hearings -- profitable only to the lawyers -- Scott chooses to pretend ignorance of his wife's indiscretions.<br /><br />Already, the literate among you are starting to yawn: ho-hum, another story about the Pathetic, Sniveling Little Cuckold. But Rudolph, who took the story from a Jane Smiley novella, hopes that the wellworn-ness of the material will be compensated for by a series of flashy, postmodern touches. For instance, one of Scott's belligerent patients (Denis Leary, kept relatively -- and blessedly -- in check) will later become a sort of construction of the dentist's imagination, emerging as a Devil-on-the-shoulder advocate for the old-fashioned masculine virtues ("Dump the b---h!", etc.). When not egged-on by his imaginary new buddy, Scott is otherwise tormented by fantasies that include his wife engaged in a three-way with two of the male dental-assistants who work in their practice. It's not going too far to say that this movie is *Eyes Wide Shut* for Real People (or Grown-Ups, at least). Along those lines, Campbell Scott and Hope Davis are certainly recognizable human beings as compared to the glamourpuss pair of Cruise and Kidman. Further, the script for *Secret Lives* is clearly more relevant than Kubrick's. As proof, I offer the depiction of the dentists' children, particularly the youngest one who is about 3 or 4 years old, and whose main utterance is "Dad! Dad! Dad! Dad! Dad! DAD!!!" This is Family Life, all right, with all its charms.<br /><br />The movie would make an interesting double-bill with *Kramer vs. Kramer*, as well. One can easily trace the Feminization of the American Male from 1979 to 2003. In this movie, Dad is the housewife as in *Kramer*, but he is in no way flustered by the domestic role, unlike Dustin Hoffman, who was too manly to make toast. Here, Scott gets all the plumb chores, such as wiping up the children's vomit, cooking, cleaning, taking the kids to whatever inane after-school activity is on the docket. And all without complaint. (And without directorial commentary. It's just taken for granted.)<br /><br />The film has virtues, mostly having to do with verisimilitude. However, it's dragged down from greatness by its insistence on trendy distractions, which culminate in a long scene where a horrible five-day stomach flu makes the rounds in the household. We must endure pointless fantasy sequences, initiated by the imaginary ringleader Leary. Whose existence, by the way, is finally reminiscent of the Brad Pitt character in *Fight Club*. And this finally drives home the film's other big flaw: lack of originality. In this review, I realize it's been far too easy to reference many other films. 
Granted, this film is an improvement on most of them, but still. *The Secret Lives of Dentists* is worth seeing, but don't get too excited about it. (Not that you were all that excited, anyway. I guess.) negative
These examples show the nature of the dataset: each one is a movie or series review, labeled as either positive or negative. The reviews are quite long and contain a lot of information. Classifying them is easy for us, but not so easy for a machine. We will now see how to train a model to perform this task.
Now we need to tokenize the reviews so that our model can process them.
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
def tokenize_function(examples):
    '''
    Tokenize the text.

    Input:
    - examples: a dictionary with key 'text' whose value is the text (or batch of texts) to tokenize.

    Tokenizer arguments:
    - padding='max_length': pads every sequence with padding tokens so that all
      sequences have the same length, equal to max_length.
    - truncation=True: inputs longer than the model can handle (or longer than the
      specified maximum length) are truncated.
    - max_length: the maximum sequence length; longer inputs are truncated and
      shorter inputs are padded to this length.

    Output: the tokenized text.
    '''
    return tokenizer(examples['text'], padding='max_length', truncation=True, max_length=128)
tokenized_datasets = dataset.map(tokenize_function, batched=True)
Loading cached processed dataset at /home/jose/.cache/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0/cache-2f019c60187261be.arrow Loading cached processed dataset at /home/jose/.cache/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0/cache-271ea7d24566f69c.arrow Loading cached processed dataset at /home/jose/.cache/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0/cache-76898d4e76125c65.arrow
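As a quick check (a sketch; the field names come from the BERT tokenizer), each example now also carries input_ids, token_type_ids, and attention_mask columns, all of length 128 because of the padding and truncation settings above:
# Quick sketch: inspect the new columns added by the tokenizer.
example = tokenized_datasets['train'][0]
print(list(example.keys()))
print(len(example['input_ids']), len(example['attention_mask']))  # both 128 (max_length)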
Now, we are going to load a pre-trained model. We will use the BERT model, which is a transformer model that was pre-trained on a large corpus of text.
Of course, there are alternatives, such as RoBERTa, a variant of BERT trained on a larger corpus with an improved training procedure, or DistilBERT, a smaller and faster distilled version of BERT. Both can be loaded through the same interface, as sketched below.
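The following is a minimal sketch (not used in the rest of this notebook) showing how an alternative checkpoint such as the public 'distilbert-base-uncased' model could be loaded with the Auto classes:
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Sketch only: swap in an alternative checkpoint by changing its name.
alt_tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
alt_model = AutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)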
from transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
2023-06-27 12:13:45.777585: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 2023-06-27 12:13:46.550359: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias'] - This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
import torch
from sklearn.metrics import classification_report

def evaluate_model(model, dataset, tokenizer, label_names):
    model.eval()  # evaluation mode (disables dropout)
    true_labels = []
    predictions = []
    with torch.no_grad():  # no gradients are needed for inference
        for example in dataset:  # iterate over the examples one at a time
            inputs = tokenizer(example['text'], return_tensors="pt", padding=True, truncation=True, max_length=128)
            outputs = model(**inputs)
            predictions.extend(outputs.logits.argmax(dim=1).tolist())
            true_labels.append(example['label'])
    print(classification_report(true_labels, predictions, target_names=label_names))
# Evaluate the vanilla model
print("\nPerformance of the Vanilla Model:")
evaluate_model(model, tokenized_datasets['test'], tokenizer, ['negative', 'positive'])
Performance of the Vanilla Model:
              precision    recall  f1-score   support

    negative       0.54      0.04      0.07     12500
    positive       0.50      0.97      0.66     12500

    accuracy                           0.50     25000
   macro avg       0.52      0.50      0.37     25000
weighted avg       0.52      0.50      0.37     25000
As we can see, the recall for the negative class is just 0.04, which means that the model almost never detects negative reviews.
As for the accuracy, we obtain 0.50, which is no better than random guessing on this balanced dataset, so we cannot rely on the pre-trained model alone for this task.
Let's see if we can improve these results by fine-tuning the model.
from transformers import TrainingArguments, Trainer
# Training arguments for fine-tuning
training_args = TrainingArguments(
    output_dir='./results',          # where checkpoints are written
    num_train_epochs=3,              # number of passes over the training set
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    warmup_steps=500,                # learning-rate warmup steps
    weight_decay=0.01,               # weight decay regularization
    logging_dir='./logs',
)

# Fine-tune the model (this will take a while)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
    tokenizer=tokenizer,
)
trainer.train()
/home/jose/.local/lib/python3.10/site-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning warnings.warn( 5%|▌ | 500/9375 [17:34<5:13:51, 2.12s/it]
{'loss': 0.5194, 'learning_rate': 5e-05, 'epoch': 0.16}
11%|█ | 1000/9375 [35:24<4:49:17, 2.07s/it]
{'loss': 0.4712, 'learning_rate': 4.71830985915493e-05, 'epoch': 0.32}
16%|█▌ | 1500/9375 [52:38<4:30:12, 2.06s/it]
{'loss': 0.4143, 'learning_rate': 4.436619718309859e-05, 'epoch': 0.48}
21%|██▏ | 2000/9375 [1:09:40<4:15:00, 2.07s/it]
{'loss': 0.3892, 'learning_rate': 4.154929577464789e-05, 'epoch': 0.64}
27%|██▋ | 2500/9375 [1:26:44<3:52:57, 2.03s/it]
{'loss': 0.366, 'learning_rate': 3.8732394366197184e-05, 'epoch': 0.8}
32%|███▏ | 3000/9375 [1:43:42<3:34:07, 2.02s/it]
{'loss': 0.3652, 'learning_rate': 3.5915492957746486e-05, 'epoch': 0.96}
37%|███▋ | 3500/9375 [2:00:52<3:23:14, 2.08s/it]
{'loss': 0.3024, 'learning_rate': 3.3098591549295775e-05, 'epoch': 1.12}
43%|████▎ | 4000/9375 [2:18:09<3:06:48, 2.09s/it]
{'loss': 0.2816, 'learning_rate': 3.028169014084507e-05, 'epoch': 1.28}
48%|████▊ | 4500/9375 [2:35:34<2:53:49, 2.14s/it]
{'loss': 0.2937, 'learning_rate': 2.746478873239437e-05, 'epoch': 1.44}
53%|█████▎ | 5000/9375 [2:52:36<2:30:23, 2.06s/it]
{'loss': 0.2793, 'learning_rate': 2.4647887323943664e-05, 'epoch': 1.6}
59%|█████▊ | 5500/9375 [3:09:54<2:16:06, 2.11s/it]
{'loss': 0.2936, 'learning_rate': 2.1830985915492956e-05, 'epoch': 1.76}
64%|██████▍ | 6000/9375 [3:27:01<1:55:10, 2.05s/it]
{'loss': 0.2666, 'learning_rate': 1.9014084507042255e-05, 'epoch': 1.92}
69%|██████▉ | 6500/9375 [3:44:11<1:39:51, 2.08s/it]
{'loss': 0.2068, 'learning_rate': 1.619718309859155e-05, 'epoch': 2.08}
75%|███████▍ | 7000/9375 [4:01:30<1:21:01, 2.05s/it]
{'loss': 0.129, 'learning_rate': 1.3380281690140845e-05, 'epoch': 2.24}
80%|████████ | 7500/9375 [4:18:41<1:04:45, 2.07s/it]
{'loss': 0.1335, 'learning_rate': 1.056338028169014e-05, 'epoch': 2.4}
85%|████████▌ | 8000/9375 [4:36:03<47:29, 2.07s/it]
{'loss': 0.1369, 'learning_rate': 7.746478873239436e-06, 'epoch': 2.56}
91%|█████████ | 8500/9375 [4:53:22<29:53, 2.05s/it]
{'loss': 0.1447, 'learning_rate': 4.929577464788732e-06, 'epoch': 2.72}
96%|█████████▌| 9000/9375 [5:10:50<12:56, 2.07s/it]
{'loss': 0.1279, 'learning_rate': 2.112676056338028e-06, 'epoch': 2.88}
100%|██████████| 9375/9375 [5:24:40<00:00, 2.08s/it]
{'train_runtime': 19480.7727, 'train_samples_per_second': 3.85, 'train_steps_per_second': 0.481, 'train_loss': 0.27903265055338544, 'epoch': 3.0}
TrainOutput(global_step=9375, training_loss=0.27903265055338544, metrics={'train_runtime': 19480.7727, 'train_samples_per_second': 3.85, 'train_steps_per_second': 0.481, 'train_loss': 0.27903265055338544, 'epoch': 3.0})
We can save the fine-tuned model to disk so that we can load it later and use it to make predictions. This is not strictly necessary, but training takes a long time, so saving the model lets us reuse it later without retraining.
model.save_pretrained('./fine-tuned_model')
tokenizer.save_pretrained('./tokenizer')
('./tokenizer/tokenizer_config.json', './tokenizer/special_tokens_map.json', './tokenizer/vocab.txt', './tokenizer/added_tokens.json')
Finally, we can evaluate the fine-tuned model on the test set and compare the results with those of the vanilla model.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
# Load the model and tokenizer
loaded_model = AutoModelForSequenceClassification.from_pretrained('./fine-tuned_model')
loaded_tokenizer = AutoTokenizer.from_pretrained('./tokenizer')
# Evaluate the fine-tuned model
print("\nPerformance of the Fine-tuned Model:")
evaluate_model(loaded_model, tokenized_datasets['test'], loaded_tokenizer, ['negative', 'positive'])
Performance of the Fine-tuned Model:
              precision    recall  f1-score   support

    negative       0.88      0.89      0.88     12500
    positive       0.89      0.88      0.88     12500

    accuracy                           0.88     25000
   macro avg       0.88      0.88      0.88     25000
weighted avg       0.88      0.88      0.88     25000
Observe how the precision for the negative class has increased from 0.54 to 0.88, and the recall from 0.04 to 0.89. This means that the fine-tuned model is much better at detecting negative reviews. The f1-score for the negative class has also increased from 0.07 to 0.88.
As for the positive class, the precision has gone from 0.50 to 0.89 and the recall from 0.97 to 0.88. The f1-score has also increased from 0.66 to 0.88. This means that the fine-tuned model is also better at detecting positive reviews, although the improvement is not as big as for the negative class.
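As a quick sanity check, the f1-score is the harmonic mean of precision and recall; for the negative class, this reproduces the scores reported above:
# f1 = 2 * precision * recall / (precision + recall)
for name, p, r in [('vanilla negative', 0.54, 0.04), ('fine-tuned negative', 0.88, 0.89)]:
    print(name, round(2 * p * r / (p + r), 2))  # ~0.07 and ~0.88, matching the reports above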
Also, the overall accuracy has increased from 0.5 to 0.88, which means that the fine-tuned model is much better at predicting the sentiment of a movie review.
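To close the loop, the fine-tuned model can be used directly to classify a new review. The sentence below is made up for illustration; this is a minimal sketch that reuses the loaded_model and loaded_tokenizer from above.
import torch

# Classify a single, made-up review with the fine-tuned model.
review = "A wonderful film with a clever script and great performances."
inputs = loaded_tokenizer(review, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = loaded_model(**inputs).logits
print('positive' if logits.argmax(dim=1).item() == 1 else 'negative')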
In this notebook, we have shown how to use a pre-trained model to predict the sentiment of a sentence. We have also shown how to fine-tune the model on a dataset and how to evaluate the fine-tuned model on a test set.
We have observed that the fine-tuned model is much better at predicting the sentiment of a movie review than the vanilla model. As we explained in our paper, this approach is very useful, since generic models that take a long time to train can be fine-tuned on a specific task and achieve state-of-the-art results, reducing the time and resources required to train a model from scratch.