As Chief Scientist (Neural Networks) at Databricks, I lead our research team toward the goal of giving everyone the ability to build and fine-tune AI models with their own data. In 2020, I was part of a small group of machine learning academics and industry veterans that founded MosaicML. We’ve always been committed to supporting open scientific inquiry, both by sharing our knowledge and by providing tools to the community. Since joining Databricks, which shares similar academic roots, we have only deepened that commitment.
With that spirit in mind, we have been collaborating with scientists from the nonprofit Allen Institute for AI (AI2) on everything from technical knowledge-sharing to today’s big announcement: OLMo. In my opinion, AI2 is one of the best NLP labs in the world, even more so because they conduct their cutting-edge research with the unrestrained creativity, commitment to integrity, and resources of a nonprofit. We’ve found common ground in a belief in openness, a passion for doing rigorous science, and a love of building artifacts that we put into the hands of the community.
Today AI2 is releasing OLMo 7B, an open source, state-of-the-art large language model. Databricks is proud to have supported their work: OLMo (short for Open-source Large Language Model) was trained using our Mosaic AI Model Training Platform. The AI2 team is also sharing the pre-training data and the training code used to develop this model (the code is a derivative of the MosaicML LLM Foundry).
We’re thrilled to have played a part in the success of the OLMo project, but I want to give credit where credit is due. We shared our tools, but they did the hard work of building the models. Pete Walsh, Senior Software Engineer at AI2, said, “Mosaic was a game-changer for creating OLMo. Their platform allowed us to effortlessly scale up training and ablations when needed, while their command-line interface lets us iterate quickly by launching multi-node jobs right from our laptops.” AI2’s seamless experience using our training platform validated the work we’ve done to make building and fine-tuning large models as easy as possible. To learn more about the OLMo 7B model and its variants, check out AI2’s blog post or the model card on Hugging Face.