Nov 2, 2023
I collected about 10 different datasets, re-formatted them into a seq2seq format (each input was converted into text with a short prompt prepended), and fine-tuned the rut5-base model on the combined data for a few days.
Here is a text in Russian that describes it in more detail: https://habr.com/ru/articles/581932/.
The code I used is at https://gist.github.com/avidale/4de1454bf41822dc862fddbd779d4cc6, but it is rather messy. For a new model, I would recommend writing the code from scratch and using my old code only as a source of inspiration rather than relying on it directly.
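
To illustrate the re-formatting step, here is a minimal sketch of turning several task-specific datasets into one seq2seq dataset. It is not taken from the gist: the task names, the prompt strings in `TASK_PROMPTS`, and the field names `input_text` / `target_text` are all hypothetical placeholders, and the actual prompts used for training may look quite different.

```python
import random

# Hypothetical task prompts: placeholders, not the ones used in the original training.
TASK_PROMPTS = {
    "paraphrase": "paraphrase | ",
    "translate_ru_en": "translate ru-en | ",
}


def to_seq2seq(task, source, target):
    """Prepend the task prompt to the input so one model can learn many tasks."""
    return {"input_text": TASK_PROMPTS[task] + source, "target_text": target}


def build_mixture(datasets, seed=0):
    """Merge {task: [(source, target), ...]} into one shuffled list of examples."""
    examples = [
        to_seq2seq(task, source, target)
        for task, pairs in datasets.items()
        for source, target in pairs
    ]
    # Shuffle so that training batches mix the tasks instead of seeing them in order.
    random.Random(seed).shuffle(examples)
    return examples


if __name__ == "__main__":
    toy_datasets = {
        "paraphrase": [("Привет, как дела?", "Здравствуй, как поживаешь?")],
        "translate_ru_en": [("Я люблю чай.", "I like tea.")],
    }
    for example in build_mixture(toy_datasets):
        print(example)
```

The resulting list of input/target pairs can then be tokenized and fed to any standard seq2seq fine-tuning loop. At inference time, the same prompt has to be prepended to select the task.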