If you watched, read or even heard about Apple’s big event on Wednesday, chances are you noticed that the upcoming Apple TV upgrade sounds impressive. One of its big draws is that users can search for shows and control the experience using their voice rather than relying on a remote control. This new Siri-powered voice search capability is a feature that demonstrates just how far artificial intelligence technology has come, as well as how far it has to go before it rivals the full power of the human mind.
In a nutshell, the new Apple TV voice search feature looks a lot like using Amazon’s music-playing, news-reading Echo intelligent speaker, only, you know, on a TV. Viewers will be able to search for shows or movies by name, actress or genre, or some combination of them. They’ll be able to fast-forward and rewind by exact times (by saying, for example, “Fast-forward seven minutes”) and even call up the weather or other information, all by pressing a button and speaking like they might to another human being. Or, I guess, Siri.
It would all be pretty remarkable if speech recognition weren’t already so commonplace. Whether we use iOS, Android or even Windows, we can all talk to our phones and get answers. On the TV, Roku 3 and Amazon Fire TV already support voice search (albeit with fewer bells and whistles than Apple showed Wednesday). And Amazon has its aforementioned Echo device that lets users play music, get the news, dim their lights and do a whole lot more using their voice.
Artificial intelligence is all around us
This is not an indictment of Apple’s innovative spirit, but rather an acknowledgement of how amazing advances in artificial intelligence have been in recent years. Mostly, the pervasiveness of high-quality speech recognition is owed to the untold millions of dollars that companies (primarily Google, Microsoft, Facebook and Baidu) have spent researching and commercializing a field of AI called deep learning.
Putting aside thin comparisons to how brains work (an unavoidable consequence of the so-called artificial “neural network” algorithms that form the technology’s foundation), the reality of deep learning models is that they’re very good at recognizing patterns. Train a voice-recognition system with enough voice samples, and it will learn to recognize spoken words. Train a computer vision system on enough images, and it will learn to recognize the objects (or faces) in them. The same goes for the meanings of words in text, the sounds in different types of music, the rules of video games, you name it.
Once reports started emerging about the successes these companies achieved with deep learning, their peers caught on pretty quickly. Apple, Amazon, Netflix, Pinterest, Twitter and other companies began buying up startups and hiring experts to get their own deep learning efforts off the ground. That’s why advanced speech recognition, computer vision and text analysis are so pervasive now, from Google Photos to Microsoft’s Skype Translator to SwiftKey’s predictive keyboard app that knows which word you’ll type next.
There are also artificial intelligence startups, often based on deep learning, that specialize in outsourcing a variety of these sci-fi tasks. Expect Labs specializes in voice search. Earlier this year, Facebook acquired Wit.AI, a startup building a speech-recognition system that lets developers turn regular applications into voice-powered ones. Clarifai analyzes images; MetaMind analyzes images and text; Dextro analyzes video; and AlchemyAPI, acquired earlier this year by IBM, analyzes images, text and news articles online.
Oh, yeah, IBM has Watson, too. Since winning at Jeopardy! in 2011, the Watson machine-learning software has been busy reading and learning from text documents in fields from retail to oncology.
Leaving art to the artists
When I watched Apple’s event Wednesday, though, I was also reminded that AI will probably never be the star of the show when it comes to entertainment, at least not anytime soon. Content is still king, and when we’re using Apple TV, or any AI for that matter, we’ll be using it to sort through and analyze creative content that no AI can yet create.
We want Apple TV to help us sort through the shows and movies on Netflix, Hulu and iTunes. We want Spotify to help us find new music we’ll like. We want Google Photos and Facebook to help us find and organize our favorite photos.
Sure, you’ll occasionally read headlines about computers that can rap, identify fashion trends or even devise recipes, but take a look at the finished products and they’re all a little less impressive. Mashing up lines and rhymes from a collection of rap songs is not a particularly creative endeavor, especially when Eminem’s creative way of rhyming words confounds the algorithms. Confirming correlations between what clothing designers come up with and what people wear is not the same as developing a new fall lineup that blows people’s minds.
One could argue that people are generally optimistic about IBM’s Chef Watson cookbook—an attempt to show that an AI system can mine recipes and then come up with its own unique ones—but at least one reviewer called its Austrian Chocolate Burrito the worst he’s ever had. Perhaps Watson should have listened to its human collaborator, chef Michael Laiskonis, and included cotija cheese in the recipe rather than cheese curds.
If there’s a moral to all of this, it’s that we should be amazed by AI and how prevalent it is becoming in our lives. When it comes to recognizing distinct breeds of dogs or tens of thousands of human faces, AI systems are actually better than people. The 21st century is shaping up to be a lot more like The Jetsons than many of us could have predicted even at the turn of the millennium (we didn’t even have the iPod in 2000, much less the iPhone or Siri).
But make no mistake: AI today is often just serving and surfacing the genius of the human mind. I’m happy for Apple TV and Siri to elevate my movie-watching experience, but it will be a while before I’m watching a good movie made by Siri.