Creating foundation models through natural selection
Model merging is a cost- and compute-efficient way to create new models that combine the components and capabilities of existing foundation models. Currently, however, model merging relies on human intuition, and with more than 500,000 models available on Hugging Face, the space of possible merges is virtually impossible to explore manually.
Sakana AI's new algorithm, Evolutionary Model Merge, provides a systematic approach to finding optimal model merges. It is inspired by natural selection and can discover non-intuitive merging solutions that would go unnoticed by humans. Evolutionary Model Merge automatically combines layers and weights from different models to create new generations of models.
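To make the idea concrete, here is a minimal, hypothetical sketch of evolutionary model merging: a population of per-layer mixing coefficients between two parent "models" (toy weight matrices standing in for real checkpoints) is evolved by selection and mutation toward a fitness score. This is an illustration of the general principle, not Sakana AI's actual algorithm; the fitness function, population sizes, and mutation scheme are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "models": per-layer weight matrices for two parents
# (hypothetical stand-ins for real model checkpoints).
LAYERS = 4
model_a = [rng.normal(0.0, 1.0, (8, 8)) for _ in range(LAYERS)]
model_b = [rng.normal(0.5, 1.0, (8, 8)) for _ in range(LAYERS)]

def merge(alphas):
    """Interpolate each layer: alpha * A + (1 - alpha) * B."""
    return [a * wa + (1 - a) * wb for a, wa, wb in zip(alphas, model_a, model_b)]

def fitness(merged):
    # Stand-in evaluation: in practice this would be a benchmark score
    # (e.g., accuracy on a Japanese math test set). Here, higher fitness
    # means the merge is closer to an arbitrary target mixture.
    target = [0.3 * wa + 0.7 * wb for wa, wb in zip(model_a, model_b)]
    return -sum(np.abs(m - t).mean() for m, t in zip(merged, target))

# Simple evolutionary loop: keep the best candidates, mutate them,
# and repeat — "natural selection" over merge recipes.
pop = [rng.uniform(0, 1, LAYERS) for _ in range(16)]
for gen in range(50):
    scored = sorted(pop, key=lambda ind: fitness(merge(ind)), reverse=True)
    parents = scored[:4]
    pop = parents + [
        np.clip(p + rng.normal(0, 0.05, LAYERS), 0, 1)
        for p in parents for _ in range(3)
    ]

best = max(pop, key=lambda ind: fitness(merge(ind)))
print(np.round(best, 2))  # per-layer mixing coefficients, roughly near the 0.3 target
```

In a real setting, the search space also includes which layers to take from which model (not just interpolation weights), and fitness is measured on held-out benchmarks rather than a known target.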
This new technique fits Sakana AI’s vision to create AI systems that are inspired by nature.
The researchers at Sakana AI have already used Evolutionary Model Merge to create a Japanese LLM and VLM that outperform other SOTA models, with more to come. Interestingly, the models were created by combining models trained on Japanese and non-Japanese data, merging their different capabilities (e.g., processing Japanese prompts + reasoning on math problems).
I spoke to Sakana AI co-founder David Ha about the research, its results, and the next steps.
Read the full article on VentureBeat.