AI Double Header: Google I/O follows ChatGPT 4o
ChatGPT 4o now available to paid users, and all users within 2 weeks!
Yesterday, May 13, 2024, OpenAI demoed and released their new ChatGPT model, GPT-4o. The “o” stands for “omni,” reflecting a single model that blends the modalities of a multimodal LLM (text, vision, audio). While this model may not have rolled out to all free users yet, it is expected to reach them within two weeks.
The result was clearly modeled on the 2013 movie Her (which I recently rewatched); the voice they used even sounds like Scarlett Johansson. It can chat with you in near real time, it can see the pictures you snap, and your whole voice conversation gets transcribed in the app. While the voice chat is fast and conversational, there is a delay before each response that I didn’t notice in the demo.
OpenAI also showcased a desktop app and announced that free-tier users would get access to the same caliber of models, as well as the GPT Store. All this comes on the back of other recent improvements, especially memory between chats. Even in the normal chat interface, you can tell ChatGPT to remember things about you, such as preferences and decisions, and it will keep them as ongoing context across your chats.
Then today, May 14, Google kicked off its developer conference, Google I/O, and as expected offered up a slew of AI announcements. They are of course working on a similar multimodal, Her/Jarvis-like voice assistant under the name Project Astra. There was a lot of focus on Android, showing Google’s strategy of deeply integrating their assistant into their phone OS. They said to expect these features to be released to the public “later this year.”
Apple’s developer conference, WWDC, is coming up June 10-14, and presumably will have some similar announcements for Apple devices and operating systems. One would expect to see a ChatGPT-caliber Siri in consumers’ hands in September, when most new iPhones have historically been announced.
Google also announced integrations with Workspace, bringing Gemini into Calendar, Tasks, and Keep. One can imagine that if you use Google products to manage your life and business, an AI assistant that can act on your Google accounts could indeed eliminate a lot of your daily grind.
And really, that is the last remaining gap in these tools, though they will continue to improve in a number of ways. Currently they are plenty capable and plenty intelligent; they are just stuck in a box. When working with them, you find yourself constantly copying and pasting, and clicking around to follow their instructions. I’m always left wondering: why can’t this thing just do the action itself? I almost feel like I’m ChatGPT’s assistant!
I believe that before the holiday shopping season comes around, all these companies will have assistants on the market that can actually do things for you. They will organize your email, book your travel, and fill out paperwork. In fact, we may find that we rarely use a mouse to click around and do things ourselves; instead, we will just type or talk to an AI assistant that takes the wheel of computing.
Stay tuned, we’ll be here playing with it, showing off demos and more! Let us know if there’s anything in particular you’d like to hear about, or a use case you’d like to see explored.