Building a Small Language Model

A functional prototype of this idea is still a work in progress; it will soon be available on my GitHub and deployed to Streamlit Cloud. In the meantime, I’m sharing the idea here.

Background

Language models are getting more powerful, cheaper, and smaller. Soon, the next hype will be about small language models that can run locally on your laptop, in your browser, or on IoT devices and smart glasses, with no internet connection. My idea is to use this blog to build a small language model that’s specialized for a particular task by distilling a larger model. I want to take inspiration from the Phi family of models. I also want to use this prototype to learn how to build a language model from scratch (a simple one, of course).
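
To make "distilling a larger model" a bit more concrete, here is a minimal sketch of the classic soft-label distillation loss, where the small (student) model is trained to match the softened output distribution of the large (teacher) model. This is only an illustration, not the prototype's actual code: the function names, the temperature of 2.0, and the toy four-word vocabulary are all assumptions I made up for the example, and it uses plain Python instead of a deep learning framework.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is the usual convention for keeping the
    gradient scale comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical next-token scores over a toy 4-word vocabulary.
teacher = [4.0, 1.0, 0.5, 0.1]
good_student = [3.8, 1.1, 0.4, 0.2]  # roughly agrees with the teacher
bad_student = [0.1, 0.5, 1.0, 4.0]   # ranks the words the other way round

loss_good = distillation_loss(teacher, good_student)
loss_bad = distillation_loss(teacher, bad_student)
print(f"good student loss: {loss_good:.4f}")
print(f"bad student loss:  {loss_bad:.4f}")
```

The student that agrees with the teacher gets the lower loss, and a student identical to the teacher gets a loss of zero. In a real training loop this term would typically be computed per token and often combined with the ordinary cross-entropy loss on the true labels.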