Microsoft launches its smallest AI model yet

Author: Editorial
Published: 24.04.2024.
Photo: Shutterstock

The Phi-3 Mini, trained on “bedtime stories”, marks the beginning of a series of three compact models slated for release by the company.

Phi-3 Mini has 3.8 billion parameters and was trained on a smaller dataset than large language models such as GPT-4. It is now available on Azure, Hugging Face, and Ollama. Microsoft plans to follow it with Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters). A model's parameter count is a rough measure of its size and of how complex the instructions it can handle are.
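
To put those parameter counts in perspective, the memory needed just to hold a model's weights can be approximated as the number of parameters multiplied by the bytes used per parameter. The sketch below applies that arithmetic to the three Phi-3 sizes. The parameter counts come from the article; the numeric precisions (fp16, int8, int4) are illustrative assumptions rather than formats Microsoft is confirmed to ship, and the estimate ignores activations and other runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are taken from the article; the precisions below are
# illustrative assumptions, not confirmed release formats.
PARAMS = {
    "Phi-3 Mini": 3.8e9,
    "Phi-3 Small": 7e9,
    "Phi-3 Medium": 14e9,
}
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, n in PARAMS.items():
        sizes = ", ".join(
            f"{prec}: {weight_memory_gb(n, b):.1f} GB"
            for prec, b in BYTES_PER_PARAM.items()
        )
        print(f"{name} -> {sizes}")
```

Under these assumptions, Phi-3 Mini's weights would take roughly 7.6 GB at 16-bit precision and about 1.9 GB at 4-bit precision, which suggests why a model of this size is a plausible fit for laptops and phones.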

Phi-3 follows Phi-2, which was released in December. Microsoft says Phi-3 outperforms its predecessor and can deliver responses close to those of a model ten times its size. Eric Boyd, corporate vice president of Microsoft Azure AI Platform, said Phi-3 Mini is as capable as larger models like GPT-3.5, just in a more compact form.

Small AI models like Phi-3 are often cheaper to run than their larger counterparts and perform better on personal devices such as smartphones and laptops. Alongside Phi, the company has introduced another lightweight AI model, Orca-Math, which specializes in solving math problems.

Rival companies also boast their own small AI models, targeting various tasks ranging from document summarization to coding assistance. Google’s Gemma 2B and 7B are tailored for simple chatbots and language tasks, while Anthropic’s Claude 3 Haiku excels at reading dense research papers and summarizing them swiftly. Meta’s recently launched Llama 3 8B serves purposes like chatbots and coding assistance.

Boyd said developers trained Phi-3 with a “curriculum” inspired by how children learn from bedtime stories and other simplified literature. The model builds on what previous iterations learned, with improved coding and reasoning capabilities. While Phi-3 has some general knowledge, it cannot match the breadth of models like GPT-4, which are trained on vast datasets from the internet.

Many companies find that smaller models like Phi-3 work better for their custom applications, since their internal datasets tend to be modest in size anyway. Because these models need less computing power, they are also often much cheaper to run.

