Microsoft unveils a cost-effective small language model

Microsoft has announced Phi-3 Mini, a low-cost AI model optimized specifically for smartphones and other local devices, the first of three Phi-3 models planned for release in the near future. With 3.8 billion parameters, the model is intended to give smaller organizations an inexpensive alternative to cloud-based large language models (LLMs), while remaining light enough to run on consumer GPUs or on the AI acceleration hardware found in smartphones and laptops.

According to the Redmond company, the new model outperforms the previous-generation Phi-2, introduced in December, and Microsoft even claims it performs on par with models roughly ten times its size, such as GPT-3.5, at a fraction of the footprint. The training data set builds on that of Phi-2, supplemented with web and synthetic data that have gone through a strict filtering process.

In Microsoft's tests, the new model outperformed other small models (Mistral, Gemma, Llama 3) on various math, programming and academic benchmarks. A drawback of the smaller training set is narrower general, factual knowledge, but the model is well suited to working with smaller, internal data sets, for example within an organization. Microsoft hopes this will make it an affordable option for applications that require language processing, even for companies with smaller budgets.

Most of the Redmond company’s rivals already offer smaller AI models, most of which are designed for simple, specific tasks such as document summarization or coding assistance. Google’s Gemma 2B and 7B are aimed mainly at chatbots and language tasks, Anthropic’s Claude 3 Haiku summarizes research papers, while Meta’s recently released Llama 3 8B model also targets coding assistance.

Phi-3 Mini is now available on Azure, Hugging Face and Ollama. Microsoft plans to follow it with Phi-3 Small (7 billion parameters) and Phi-3 Medium (14 billion parameters), which will be able to interpret even more complex instructions.
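
For a sense of what running the model locally involves, below is a minimal sketch of loading Phi-3 Mini through the Hugging Face transformers library. The model ID microsoft/Phi-3-mini-4k-instruct, the example prompt and the generation settings are illustrative assumptions, not details from the announcement; the model card on Hugging Face is the authoritative reference.

```python
# Minimal sketch: load Phi-3 Mini locally with Hugging Face transformers.
# The model ID and generation settings below are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # place layers on available GPU/CPU (requires accelerate)
    torch_dtype="auto",      # keep the dtype the checkpoint was saved in
    trust_remote_code=True,  # the model card may ship custom modelling code
)

# Build a chat-style prompt and generate a short answer locally.
messages = [
    {"role": "user", "content": "Summarize this meeting note in one sentence: ..."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

On Ollama, the equivalent would be a single command along the lines of `ollama run phi3`; the exact model name there may differ, so check the Ollama library listing.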

Tags: Microsoft, cost-effective, small language model
