David Dale
How to fine-tune a NLLB-200 model for translating a new language
NLLB is a translation model that supports 200 languages. I teach it one more language, Tyvan, and explain the code behind this update.
Oct 17, 2023

David Dale in Towards Data Science
Compressing unsupervised fastText models
A python package to reduce word embeddings models by 300 times, with almost the same performance on downstream NLP tasks.
Dec 14, 2021

David Dale in Towards Data Science
How to adapt a multilingual T5 model for a single language
Load embeddings only for the tokens from your language to reduce model size
May 4, 2021

David Dale
Do you have to try to love math?
If you don’t like and don’t understand math, does it mean you are stupid? Do you need to love math to achieve at…
Feb 13, 2018

David Dale in The Startup
A machine learning model to understand fancy abbreviations, trained on Tolkien
Recently I bumped into a question on Stackoverflow, how to recover phrases from abbreviations, e.g. turn “wtrbtl” into “water bottle”, and…
Jan 14, 2018