OpenAI’s New ‘o’ Series Is a Giant Leap Toward Multimodal AI Assistants

by shayaan

The race to dominate the AI frontier has just taken another plot twist, and this time it talks back, looks at you and may even listen with feeling.

OpenAI launched its new “o” series models today, introducing GPT-4o and its lightweight cousin, GPT-4o-mini (AKA o4 and o3). These new models are not just fine-tuned chatbots; they are omnimodal, which means they can understand and generate text, image, audio and video natively. No Frankenstein modules were bolted on to fake visual literacy.

This is effectively AI with eyes, ears and a mouth.

One model to rule them all?

OpenAI says the “o” stands for “omni,” and the implications are exactly what you would expect: a unified model that can take in a screenshot, hear your voice crack and fire off an emotionally calibrated answer, all in real time. It is the first real hint of a future where AI assistants are not just in your phone; they are your phone.

The o3 (mini) version is built for speed and affordability, with performance closer to Claude Haiku or a well-oiled Mistral, while still retaining the full multimodal superpower set. Meanwhile, o4 (the full-fat GPT-4o) is squarely aimed at the big leagues, matching GPT-4 Turbo in power while zipping through images and audio as if it were playing a casual round of charades.

And it’s not just speed. These models are cheaper to run, more efficient to deploy and can (here’s the kicker) run natively on devices. That’s right: real-time, multimodal AI without the latency of the cloud. Think of personal assistants that do not just listen to commands but respond like companions.

Beyond chatbots: Enter the agent era

With this release, OpenAI lays the foundation for AI agents: those smarter-than-smart assistants that not only talk and write, but also observe, act and handle tasks autonomously.

Want it to parse a Twitter thread, generate a chart, draft a tweet and announce it on Discord with a smug meme? That is not just within reach. It is practically on your desk, wearing a monocle, sipping espresso and correcting your grammar in a delightful baritone.

The o series models are intended to power everything from real-time voice bots to AR glasses, offering a hint of the “AI-first” hardware movement that has tech’s old guard (and the new) on edge. In the same way that the iPhone redefined mobile, these models mark the start of AI’s native interface era.

OpenAI versus the field

This is not happening in a vacuum. Google’s Gemini is evolving. Anthropic’s Claude punches above its weight. Meta has Llama in the lab. But OpenAI’s o series may have done something the rest have not yet nailed: real-time, unified multimodal fluency in a single model.

This could be OpenAI’s answer to the inevitable: hardware. Whether it is the rumored AI collaboration with Apple or its own “Jony Ive stealth mode” project, OpenAI is preparing for a world where AI is not just an app; it is the operating system.

Published by Andrew Hayward


Copyright © Sovereign Wealth Signals