Amazon’s Bargain

Jeff Brown | Apr 23, 2025 | The Bleeding Edge | 6 min read


It’s been just over a decade since Amazon first released its Amazon Echo smart speaker in 2014.

It was empowered with Amazon’s smart assistant – Alexa – designed to make our lives easier and engage in natural language conversation.

I was so excited to experiment with the technology back then. I bought one right out of the gate.

Amazon Echo circa 2014 | Source:  Amazon

Echo was pretty big compared to today’s hockey puck-sized smart speakers, about the size of a tube of tennis balls.

And for those of us who used Alexa and the Amazon Echo back then, it was a huge disappointment.

A Painful User Experience

We expected to simply speak with Alexa – as if we were having a conversation with someone –  and it would provide accurate and useful information without us having to lift a finger.

But back then, there was no generative AI, and there were no large language models (LLMs). For Alexa to respond to a query, the software had to go through a multi-stage process.

Whether it was Alexa, Microsoft’s Cortana, or Apple’s Siri, they all suffered the same technological limitations:

  • First, the software used speech-recognition technology to convert speech into text.
  • Then it used natural language processing to understand context and intent.
  • Using this information, the software would then generate a text response. This was a real weak spot as the responses were typically rule-based, meaning they had to be programmed into the software to provide specific responses to specific inquiries. Knowledge graphs were also a widely used approach to connect information and help elicit more accurate responses.
  • And then the final step was to transform the text response into speech.
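The four stages above can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not how Alexa, Cortana, or Siri were actually built: every function name, the `RULES` table, and the canned replies are hypothetical, and the speech steps are stand-ins that simply treat the audio bytes as text.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    slots: dict = field(default_factory=dict)


def speech_to_text(audio: bytes) -> str:
    # Stage 1 (stand-in): pretend the audio is already UTF-8 text.
    return audio.decode("utf-8")


def understand(text: str) -> Intent:
    # Stage 2: crude keyword and pattern matching in place of real NLP.
    lowered = text.lower()
    if "weather" in lowered:
        return Intent("get_weather")
    match = re.search(r"what time is the (.+?) game", lowered)
    if match:
        return Intent("get_game_time", {"matchup": match.group(1)})
    return Intent("unknown")


# Stage 3: rule-based responses. Each one has to be programmed in by hand.
RULES = {
    "get_weather": lambda slots: "Here is today's forecast.",
    "get_game_time": lambda slots: f"Looking up the start time for {slots['matchup']}.",
}


def generate_response(intent: Intent) -> str:
    handler = RULES.get(intent.name)
    if handler is None:
        # No rule matches, so the assistant has nothing useful to say.
        return "Sorry, I don't know that one."
    return handler(intent.slots)


def text_to_speech(text: str) -> bytes:
    # Stage 4 (stand-in): pretend the text is synthesized audio.
    return text.encode("utf-8")


def answer(audio: bytes) -> bytes:
    return text_to_speech(generate_response(understand(speech_to_text(audio))))


print(answer(b"What's the weather like today?").decode())
print(answer(b"If I wanted to fly from Columbus to Sioux Falls with a stop in Denver, what is the best month to fly?").decode())
```

The weather question hits a rule. The multi-step flight question matches nothing and falls through to the canned apology, which is the weak spot of the rule-based stage.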

The above process had to happen in a matter of seconds, which is impressive at face value…

But ultimately, this technological architecture was responsible for the underwhelming user experience from these early versions of smart assistants.

Rule-based systems and knowledge graphs could get many straightforward, simple questions right, like, “What’s the weather like today?” Or, “What time is the Avalanche vs. Stars game tonight?”

But ask it a question that would take multiple steps to answer – like, “If I wanted to fly from Columbus to Sioux Falls with a stop in Denver for one night, and do it for less than $300, what is the best month to fly?” – forget about it.

Fortunately, those days are over.

Amazon’s Alexa just got a monster upgrade.

Alexa’s Monster Upgrade

It’s about time.

Earlier this month, Amazon released a new, generative AI reasoning model – Nova Sonic.

It will not only transform Alexa’s utility and user experience but also supercharge hundreds, maybe thousands, of companies to incorporate speech and agentic AI into their products and services.

Alexa+, which will be powered by Nova Sonic, will be released widely in the coming weeks for $19.99 a month. It’s free for Amazon Prime members.

This is going to be a very compelling product and a genuine AI-powered smart assistant capable of a wide range of tasks that will make our lives easier.

In some recent research that Amazon published about Nova Sonic, it actually refers to this frontier AI model as artificial general intelligence (AGI). It’s definitely not AGI, but it is still impressive and significant, nonetheless.

Source: Amazon

Nova Sonic is built on a frontier, multimodal large language model (LLM) and incorporates reasoning and agentic AI to enable what will become an incredibly useful digital assistant.

This technology is what will enable us to hand off the tasks that have consumed hours of our time every day to a genuinely smart assistant like Alexa+.

Not only does Nova Sonic enable natural language conversation, including multilingual capabilities and an understanding of accents, but it will also have the agency to transact, purchase, solve, and schedule on our behalf.

And just as the old technological architecture was the root cause of the weakness of Amazon Echo and the “old” Alexa, the new architecture of Nova Sonic is the reason for this incredible upgrade.

  • Nova Sonic no longer requires a multi-stage process to create a response. It processes speech and produces accurate speech output in a single step. It is a multimodal speech-to-speech artificial intelligence model.
  • The foundation of Nova Sonic is a state-of-the-art, pre-trained large language model.
  • It is designed for natural human conversations. It can capture and understand tone, inflection, and pacing.
  • It can actually adjust its prosody – the pattern of speech, intonation, and emotional state – to match that of the human that it is interacting with.
  • And Nova Sonic is an agentic AI, enabling it to act on our behalf to transact in the real world.

This is what we’ve been waiting for.

And it’s the same kind of technology that companies like OpenAI, xAI, Google (GOOGL), and Meta (META) have been working on.

Best-in-Class

Amazon’s (AMZN) Nova Sonic is a big deal for a few reasons. The first is that Amazon already has a distribution channel for its technology.

After all, there are more than 200 million Amazon Prime members around the world who will have access to Alexa+ in the coming weeks, more than 160 million of whom are in the U.S. And there are more than 300 million monthly users of Amazon globally.

But that’s not the biggest opportunity for Nova Sonic and Alexa+.

Amazon is making Nova Sonic available on Amazon Web Services’ (AWS) development platform through application programming interfaces (APIs). This means that any enterprise customer of AWS – who has a product or service that would benefit from implementing an agentic reasoning AI – will be able to “plug in” to Nova Sonic and “turn on” these kinds of capabilities.

In other words, companies won’t have to worry about developing their own frontier AI. All they have to do is plug into AWS’s Nova Sonic and they’re done. This is a well-established business model in high tech.

Twilio (TWLO) pioneered this model for messaging. It developed the back-end software that enabled any software application to message via SMS and digital messaging in just about every country on the globe. Now, any company can simply plug their software into Twilio’s APIs, giving them messaging capabilities.

It’s the same concept with Nova Sonic. And for a company like Amazon, which has lagged behind the industry in developing its own artificial intelligence, Nova Sonic is an impressive innovation that vaults Amazon to best-in-class with a speech-to-speech AI reasoning model.

Source:  Amazon

Above is a chart developed by Amazon that shows Nova Sonic compared to two OpenAI GPT-4o models (in red and orange) and Google’s Gemini 2.0 Flash model (in green). Shown above is the performance on four different speech benchmarks: MLS (Multilingual LibriSpeech), FLEURS, AMI, and SD-QA (Spoken Dialectal Question Answering).

Nova Sonic is best-in-class in both MLS and AMI, and it is competitive in both FLEURS and SD-QA.

But here’s the best part…

All at a Fraction of the Cost

Nova Sonic was optimized for two things: quick response time and low cost. Amazon traded off a bit of performance to lead on response time and cost.

Nova Sonic is about 80% less expensive to run than OpenAI’s GPT-4o. That’s a huge delta, and a price point that will be very attractive for enabling widespread enterprise and mass consumer adoption.

And Nova Sonic enables a response time of a mere 1.09 seconds on average, which compares favorably to OpenAI’s real-time API average of 1.18 seconds.
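To put those two figures in perspective, here is a quick back-of-the-envelope calculation. The 80% cost reduction and the two latency averages come from the figures above. The $10,000 monthly baseline is purely a hypothetical workload chosen for illustration, not a published price.

```python
def relative_saving(baseline: float, candidate: float) -> float:
    """Fraction by which `candidate` undercuts `baseline`."""
    return (baseline - candidate) / baseline


# Average response times cited above, in seconds.
openai_latency_s = 1.18
nova_latency_s = 1.09
latency_gain = relative_saving(openai_latency_s, nova_latency_s)

# Cost reduction cited above, applied to a HYPOTHETICAL monthly bill.
cost_reduction = 0.80
hypothetical_gpt4o_spend = 10_000.0  # assumption, for illustration only
nova_spend = hypothetical_gpt4o_spend * (1 - cost_reduction)

print(f"Latency improvement: {latency_gain:.1%}")
print(f"Spend on a $10,000 baseline: ${nova_spend:,.0f}")
```

The latency edge is modest, under a tenth of a second. The cost difference is where the gap really opens up.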

This is such an exciting development, not just for Amazon but also for the industry and society at large. We’re right at the cusp of having widespread access and availability to intelligent, agentic AIs that will act as our personalized digital assistants, capable of enabling us to recapture hours a day.

Amazon (AMZN) is a remarkable value today, having pulled back 24% from its all-time high. Its pullback is entirely due to the fear, uncertainty, and doubt around tariffs and structural changes being made by the U.S. government. This is temporary. This volatility is providing an incredible entry point for Amazon, which is only trading at an EV/Sales of 2.8 and an EV/EBITDA of 11.7.

For a company growing its revenues roughly 10% a year at 50% gross margins and generating $47 billion in free cash flow a year, that’s a ridiculous bargain.
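Two numbers fall straight out of the figures above, and a short calculation makes them explicit. This is simple arithmetic on the multiples and the pullback already quoted, not a valuation model. The variable names are just for illustration.

```python
# Figures quoted above.
ev_to_sales = 2.8
ev_to_ebitda = 11.7
pullback = 0.24

# EBITDA / Sales = (EV / Sales) / (EV / EBITDA), since EV cancels out.
implied_ebitda_margin = ev_to_sales / ev_to_ebitda

# A stock that has fallen 24% needs to rise by 1 / (1 - 0.24) - 1 to retake its high.
gain_to_recover = 1 / (1 - pullback) - 1

print(f"Implied EBITDA margin: {implied_ebitda_margin:.1%}")
print(f"Gain needed to retake the high: {gain_to_recover:.1%}")
```

In other words, the two multiples together imply an EBITDA margin of roughly 24%, and a return to the all-time high would be a gain of roughly 32% from here.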

Jeff

