Microsoft Unleashes the Power of Compact AI with Phi-3.5 Models

Microsoft has launched its latest Phi-3.5 model collection, a set of small language models (SLMs) designed to be both cost-effective and highly capable. According to Microsoft, the models outperform others of similar size as well as some much larger ones. The release gives Azure customers more options for building and enhancing generative AI applications.

Phi-3.5-MoE: A Mixture-of-Experts Marvel

The standout of this release is Phi-3.5-MoE, a Mixture-of-Experts model that combines 16 experts of 3.8 billion parameters each, for a total model size of 42 billion parameters. When two experts are used, the model activates 6.6 billion parameters, achieving performance that rivals much larger dense models. Phi-3.5-MoE supports more than 20 languages, performs well on multi-lingual tasks, and is efficient in both language understanding and reasoning.
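
The headline figures can look inconsistent at first: 16 experts of 3.8 billion parameters each would come to 60.8 billion, not 42 billion. A common explanation for Mixture-of-Experts models is that only part of each layer is replicated per expert, while the rest is shared. The Python sketch below works backward from the article's two totals under that simplifying assumption. The two-term split (`shared` plus `per_expert`) is a hypothetical model for illustration, not Microsoft's published breakdown.

```python
# Simplified two-term model (an assumption, not Microsoft's published breakdown):
#   total  = shared + NUM_EXPERTS    * per_expert
#   active = shared + ACTIVE_EXPERTS * per_expert
NUM_EXPERTS = 16      # from the article
ACTIVE_EXPERTS = 2    # from the article
TOTAL_B = 42.0        # total parameters in billions, from the article
ACTIVE_B = 6.6        # active parameters in billions, from the article


def solve_split(total, active, num_experts, active_experts):
    """Solve the two linear equations above for (shared, per_expert)."""
    per_expert = (total - active) / (num_experts - active_experts)
    shared = active - active_experts * per_expert
    return shared, per_expert


shared_b, per_expert_b = solve_split(TOTAL_B, ACTIVE_B, NUM_EXPERTS, ACTIVE_EXPERTS)

print(f"implied shared weights:          {shared_b:.2f}B")
print(f"implied expert-specific weights: {per_expert_b:.2f}B each")
print(f"share of weights active per token: {ACTIVE_B / TOTAL_B:.1%}")
```

Under this simplification, the shared portion plus one expert's own weights comes to roughly 4.1 billion rather than the quoted 3.8 billion, which suggests the real accounting has more terms than this sketch captures.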

Phi-3.5-mini: Small Yet Mighty

Phi-3.5-mini is another noteworthy addition, with substantial improvements in multi-lingual support, conversation quality, and reasoning. Despite its compact size of just 3.8 billion parameters, the model has undergone extensive further pre-training and post-training, the latter including Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO). The result is a model that competes with, and often surpasses, larger models on key benchmarks, making it suitable for a wide variety of applications.

Extended Context Length for Complex Tasks

One of the most notable features of Phi-3.5-mini is its support for a 128K-token context length, which suits it to tasks that involve long documents or multi-turn conversations. This puts it ahead of competitors such as the Gemma-2 family, which supports only an 8K context length, and positions it as a leader in long-context tasks.
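
To make the difference between the two window sizes concrete, the Python sketch below counts how many separate passes a long document would need at each size. It assumes "128K" and "8K" mean 128 × 1024 and 8 × 1024 tokens; the document length (`DOC_TOKENS`) and the space reserved for the prompt and reply (`RESERVED`) are hypothetical values chosen for illustration.

```python
import math


def passes_needed(doc_tokens: int, context_tokens: int, reserved_tokens: int) -> int:
    """Number of chunks needed to feed a document through a fixed context window."""
    usable = context_tokens - reserved_tokens
    if usable <= 0:
        raise ValueError("reserved_tokens must be smaller than context_tokens")
    return max(1, math.ceil(doc_tokens / usable))


DOC_TOKENS = 100_000   # hypothetical long document
RESERVED = 1_024       # hypothetical room kept for instructions and the reply

for label, window in [("128K window", 128 * 1024), ("8K window", 8 * 1024)]:
    print(f"{label}: {passes_needed(DOC_TOKENS, window, RESERVED)} pass(es)")
```

With these assumed numbers the document fits in a single pass at 128K but has to be split into 14 chunks at 8K, and each split is a point where cross-references in the text can be lost.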

Advancing Multi-Frame Image Understanding

Phi-3.5-vision brings advances in multi-frame image understanding and reasoning. The model has been fine-tuned based on customer feedback, resulting in significant performance improvements across several benchmarks. It supports tasks ranging from multi-image summarization to video analysis, opening up new possibilities for detailed image comparison and storytelling.

Robust Safety Measures Across the Board

Safety remains a top priority for Microsoft, and the Phi-3.5 models are no exception. Developed in accordance with the Microsoft Responsible AI Standard, these models undergo rigorous safety evaluations, including testing across multiple languages and risk categories. The Phi-3.5-MoE, in particular, incorporates a robust safety post-training strategy that combines open-source and proprietary datasets to ensure the model is both helpful and harmless.

Optimized for Azure: Speed and Efficiency

For Azure customers, the Phi-3.5 models are optimized to deliver faster and more predictable outputs. With ONNX Runtime, developers can optimize the models for a range of hardware targets. In addition, Guidance, Microsoft's open-source library for controlling model output, has been added to the Phi-3.5-mini serverless endpoint in Azure AI Studio. It makes outputs more predictable and reduces cost and latency by steering the model token by token during inference.
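
The idea of steering a model token by token can be illustrated with a toy constrained decoder. The Python sketch below is not the Guidance API and does not call any Azure endpoint; `toy_scores`, `yes_no_then_end`, and `constrained_decode` are hypothetical names, and the "model" is a fixed score table. It shows two effects: tokens outside the constraint are never chosen, and when the constraint leaves only one legal token the model does not need to be consulted at all, which is one plausible source of the cost and latency savings.

```python
import math


def toy_scores(prefix):
    """Hypothetical stand-in for a model's next-token scores; it always favors "maybe"."""
    return {"yes": 1.0, "no": 0.5, "maybe": 2.0, "<end>": 0.1}


def yes_no_then_end(prefix):
    """Constraint: the answer is exactly one of yes/no, followed by <end>."""
    if len(prefix) == 0:
        return ["yes", "no"]
    if len(prefix) == 1:
        return ["<end>"]
    return []  # nothing more may be generated


def constrained_decode(score_fn, allowed_fn, max_steps=8):
    """Greedy decoding restricted to the tokens the constraint allows at each step."""
    output, model_calls = [], 0
    for _ in range(max_steps):
        allowed = allowed_fn(output)
        if not allowed:
            break
        if len(allowed) == 1:
            # Only one legal continuation: emit it without asking the model.
            output.append(allowed[0])
            continue
        scores = score_fn(output)
        model_calls += 1
        # Pick the best-scoring token among the allowed ones only.
        output.append(max(allowed, key=lambda tok: scores.get(tok, -math.inf)))
    return output, model_calls


tokens, calls = constrained_decode(toy_scores, yes_no_then_end)
print(tokens, "model calls:", calls)
```

Left unconstrained, this toy model would keep emitting "maybe"; with the constraint it produces a well-formed yes/no answer and is consulted only once for two output tokens.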

Source: Microsoft

TheDayAfterAI News
