Where is Apple going with OpenELM?
Apple has released the full code, weights, checkpoints, and more for OpenELM, its latest language models. Here is what it means for its generative AI strategy.
Apple has recently released OpenELM, a family of open-source small language models (SLMs) designed to run on phones and laptops.
The standout feature of OpenELM models is their memory and compute efficiency. They build on several recent optimization techniques that reduce the memory and compute footprint of language models.
OpenELM also uses a layer-wise scaling scheme that allocates parameters to attention and feed-forward layers in a non-uniform fashion, in contrast to classic transformer models, which use the same configuration across all layers.
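The idea behind layer-wise scaling can be sketched in a few lines of code. In this illustrative example (the function name, parameter names, and values are hypothetical, not OpenELM's actual configuration), the number of attention heads and the feed-forward expansion ratio are interpolated linearly from the first layer to the last, so early layers are narrower and later layers are wider:

```python
# Hypothetical sketch of layer-wise scaling: instead of giving every
# transformer layer the same width, interpolate the attention-head count
# and feed-forward size linearly across layers.

def layer_wise_scaling(num_layers, d_model, head_dim,
                       min_ratio, max_ratio, min_mult, max_mult):
    """Return a (num_heads, ffn_dim) pair for each layer."""
    configs = []
    for i in range(num_layers):
        # Interpolation coefficient: 0.0 at the first layer, 1.0 at the last.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        # Scale the head count between min_ratio and max_ratio of the
        # maximum possible head count (d_model // head_dim).
        ratio = min_ratio + t * (max_ratio - min_ratio)
        num_heads = max(1, round(ratio * d_model / head_dim))
        # Scale the feed-forward hidden size between min_mult and
        # max_mult times the model dimension.
        mult = min_mult + t * (max_mult - min_mult)
        ffn_dim = round(mult * d_model)
        configs.append((num_heads, ffn_dim))
    return configs

# Example: a toy 4-layer model whose layers grow from 4 to 8 heads
# and from a 2x to a 4x feed-forward expansion.
for layer, (heads, ffn) in enumerate(
        layer_wise_scaling(num_layers=4, d_model=512, head_dim=64,
                           min_ratio=0.5, max_ratio=1.0,
                           min_mult=2.0, max_mult=4.0)):
    print(f"layer {layer}: heads={heads}, ffn_dim={ffn}")
```

Compared with a uniform transformer, this keeps the total parameter count similar while redistributing capacity toward the layers where it is most useful.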
More importantly, Apple has released everything about OpenELM, including the weights for eight models, training logs, multiple training checkpoints, and pre-training configurations. The stated goal is to “empower and strengthen the open research community,” which stands in contrast to Apple’s culture of secrecy.
However, in the context of developments in the generative AI market, this move makes sense: it can help Apple build the momentum and network effects it needs to corner the emerging market for small, on-device language models.
Read more about OpenELM and its business implications on TechTalks.
Read the full paper on Arxiv.
Review the model card and download the models on Hugging Face.
For more on AI research: