In part one of this two-part series, "AI Wars: Hey Cortana is that you on my iPhone?", we focused on Microsoft's strategy of bringing Cortana to rival platforms.
We looked at the benefits such a move would bring to the Microsoft ecosystem. In this piece, we would like to look at what Cortana's rivals may bring to the table in an attempt to hamper our favorite AI's ambitious advance.
You Better Act Like You Know
As a child growing up in the '80s, one of my favorite cartoons was GI Joe. In the conscientious way of a bygone era, each episode concluded with a moral lesson. The familiar tagline was "Now you know, and knowing is half the battle."
In a war, one must know his enemy. 2015 is a big year for Microsoft. The company is pioneering a new genre of OS, not just an incremental upgrade. Windows 10 will be the world's first OS to run on all form factors. Microsoft is offering this upgrade for free for the first year after launch to millions of users with Windows 7 and above.
Cortana, Microsoft's Halo-inspired digital assistant, will be improved with new predictive abilities and integrated with Windows 10. She will also be going to iOS and Android as a bold step in the AI digital assistant war. These are big plays by Microsoft. Of course, nothing in tech happens in a vacuum.
During Apple's Spring Forward event, CEO Tim Cook demonstrated launching Siri by voice on the Apple Watch. With the simple words "Hey Siri" spoken to the Watch, Siri springs into action.
Wait! Cortana can do that on some Windows Phones. You are right. Sure she can. And it's cool. Really cool. And handy, especially since the Lumia device does not need to be plugged into a power source for passive listening to work, unlike the iPhone, which currently needs to be tethered to an outlet.
However, there's something to be said for the convenience of being able to initiate one's digital assistant through a wearable. "Hey, Cortana" directed to a Microsoft Band sadly won't call up your trusty assistant. The act of bypassing the Band and talking to one's pocket where the phone may be located, though not laborious, is cumbersome and in my experience has mixed results. Cortana often doesn't hear me if my device is in my inner breast pocket.
Raising your voice just above what's comfortable or appropriate to ensure that Cortana can hear you through a pocket, above all ambient sounds or conversations, can be a challenge. Raising a wrist, Michael Knight style, bypasses these simple barriers to using the passive listening functions in certain environments. Granted, the Microsoft Band provides this convenience, but it requires pushing a button.
With the "Hey Siri" passive listening function on the Watch, Apple promises easy access for users to engage their digital assistant. This new and exciting way to interact with Siri via the Watch will likely reignite interest. This interest may even approach the levels people had when Siri first launched in 2011, before it was relegated from tech media darling to the butt of many jokes and public scorn. #SiriFails.
Undoubtedly, "Hey Siri" will begin to fill the air as users interacting with their Apple Watches bring Siri back to the forefront. Exactly where Apple wants it. However, if it is no more capable than it has proven to be over the past nearly four years, a disappointing Siri via the Watch is just as disappointing as a disappointing Siri via the phone.
That, of course, is not where Apple wants Siri to be. Apple needs to take swift action to capitalize on what is sure to be renewed interest in its digital assistant. As the Swift programming language was a surprise announcement at last year's WWDC, Apple may have other surprises gracing its strategic branches. Cortana's power play for the hearts and minds of the masses via Windows 10 and a cross-platform launch may be just the blow that knocks Apple's next step loose. Descending upon the heads of the unwary, Apple, in the magical way it makes the old also-ran look new, may generate a eureka moment for the masses.
Siri, Don't Take it Personally
Before we get there, let's look at the here and now. Unlike Cortana and Google Now, Siri lacks its own search engine backbone (it uses Bing, as does Apple's Spotlight). It also lacks the benefits of machine learning that neural net technology, which works like the human brain, brings to the table. It functions without context awareness, the ability to learn about its users or the ability to act proactively. In sum, Siri is virtually a purely responsive digital assistant. It never takes the initiative. It is incapable of doing so.
Let's look at it this way. If Cortana and Siri were both vying for an assistant position, both candidates would ace the interview. They each have the ability to shine during the screening process. You know, those scenarios where basic questions are asked: fact grabbing, response times, setting appointments and the like. Yes, they would both get the job.
This is where the parallels end, however. Cortana is a self-starter. Once she gets the job, that is when she really shines. (By contrast Siri's brightest moments were during the "interview.")
Through permissions set in Cortana's user-controlled Notebook, and as a product of the technology upon which she is built, she learns about you. She learns interests, work schedules, workout schedules, favorite places and much more. She proactively supplies information and makes suggestions. Siri, on the other hand, without a neural net base, always needs a push to get anything done. Be honest, which would you hire?
Because Cortana is a personal assistant that learns the unique tastes and behaviors of individuals, digital assistant comparisons that just throw all comers in a ring and draw conclusions about which comes out on top based on metrics such as response time, fact fetching and the like are stopping at the "interview" stage. These comparisons ignore Cortana's key strength, which is arguably the essential feature of any digital assistant: the ability to get to know the user. How well an assistant serves each user is the true measure of a personal assistant. A real AI showdown would require a comparison along those lines. Microsoft claims that Cortana is the world's first personal digital assistant. Personally, I agree.
Through the "Hey Siri" feature of the Apple Watch, I believe Apple is setting the stage for a multistep strategy to make Siri more relevant. Most of the world is right-handed. Thus in a contest, a left-handed fighter or a fighter with a slick left hook can catch his opponent off guard.
Despite Siri's applications in home automation and automotive, Apple is keenly aware that the masses have made it a running joke. The company is also aware that competitors like Cortana and Google Now are far more competent assistants, based on more advanced technology that allows for proactive, context-relevant support and deep learning.
This reality has also had the effect (bolstered by Microsoft's Cortana vs. Siri campaign) of beginning to shape the collective consciousness beyond the ardent fan base. Cortana (and Google Now) are the real players when it comes to digital assistants. This is a perception and a reality that Apple will not tolerate.
Aware of its disadvantaged position, in 2013 Apple poached one of Microsoft's top managers, Alex Acero, who had spent nearly 20 years in Redmond researching speech technology. He now serves as a senior director of the Cupertino company's Siri group. They've also plucked from the Nuance tree Gunnar Evermann (Manager, Siri Speech) and Larry Gillick (Chief Speech Scientist, Siri). At an office located in Boston, MA, Apple appears to be investing in a crack in-house team of speech experts aimed at neural net technology. In their clandestine fashion, they have made no announcements along those lines.
Peter Lee, head of Microsoft's research arm and former boss of the Apple-poached Alex Acero, however, is convinced that Apple is building the foundations to deliver a neural-network-based Siri. In June of 2014 he was quoted as saying:
"All of the major players have switched over except Apple Siri…I think it is a matter of time."
At the time, he felt that Apple could make that happen within six months.
Rope-a-Dope - Apple's Strategy?
The famous boxer Muhammad Ali executed a fighting strategy where he allowed himself to be backed against the ropes while absorbing the blows of his opponent. To onlookers and his opponent, it appeared that he was losing the fight. His strategy, however, was to cause his opponent to tire out and become overconfident before Ali delivered his knockout blows. Rope-a-dope.
In the year since Cortana's release, Google Now has seen advancements in its predictive abilities, but Siri has not seen much in the way of growth. It is making moves in home automation and vehicles, yes. Advances in how it serves a unique user as a personal assistant? Not so much. Apple and Siri have been taking hits from the competition during this time. Aggressive ads showing Siri's inferiority to Cortana are popular. Siri is also regularly blasted on social media.
All of this while the company is quietly building a strong engineering and research speech team. Peter Lee estimated that it would take Apple six months before it would be able to introduce a neural-network-enhanced Siri. Six months from the time of that June statement would have been December 2014. So where's the new and improved Siri?
Well, just because you can doesn't mean you should. Timing is everything. Presentation matters. If any company prides itself on those two precepts, Apple does. Apple's strategy, it seems, is to get a new medium of interacting with Siri into the hands, or rather onto the wrists, of millions of users first. The company has reportedly ordered about 5 to 6 million Watches for the initial launch.
With this move, people will suddenly be interacting with Siri again. It may even re-enter the collective consciousness of the world in a positive manner, as it did in 2011. Yes, before it disappointed everyone and failed to meet expectations. This window of renewed excitement will be brief, for the newness of the "Hey Siri" Watch interaction will begin to wane, particularly since Siri through the Watch will be no better than a user's experience with it on the phone. So why not introduce a new and improved Siri now, before the Watch's launch?
To have introduced a new and improved Siri too early would not have had maximum impact. However, an April launch of the Apple Watch, bolstered by Apple's magical marketing, gives the company a new Siri portal strategically close to its early June Worldwide Developers Conference (WWDC).
This conference, which saw the announcement of Apple's Swift programming language and HealthKit last year, would be, I presume in Apple's view, the perfect stage to debut a new and improved Siri. Such a demonstration would likely include onstage use scenarios from the Watch, and the "re-introducing" of neural-net-powered predictive, deep learning, and context-aware abilities already found in Cortana and Google Now.
Like her rivals, Siri will finally be capable of getting to know a user.
Two More Things?
An announcement about an improved Siri less than two months after the Watch's debut, riding on the coattails of a populace whose excitement over Siri will likely have been reignited, could be quite the blow to Cortana. A left hook, if you will. The blow could be more impactful if Apple's classic "one more thing" moment actually includes two more things.
The first: that the iPhone would no longer need to be plugged in for passive listening to work. The second: that this new, improved Siri speech API is now open to the avidly dedicated Apple and Apple Watch developer base. Cortana leads in both of these areas. However, with little market share and a relatively small, yet dedicated, developer base, a widespread embracing of Cortana's open speech API has not yet occurred. A real pity. By contrast, Apple's massive and passionate developer base would likely be eager to try their hand at connecting their iOS apps to Siri's speech API, particularly given the new opportunities that interacting with it hands-free via the Watch would present.
Apple's goal is to build as comprehensive and self-sustained an ecosystem as possible. Beyond that, the company wants to create such cohesion between its products and services that a departure to, or use of, rival services like Cortana diminishes the experience within the Apple ecosystem. Such synergy of products and services within the ecosystem is a strong enticement to use Apple's integrated services.
The Cupertino company is naturally in a stronger position to assert this goal in some areas than in others. For instance, Microsoft Office is such a desired staple in the industry that it is conceivable that, now that it is available on iOS, it has supplanted Apple's productivity suite as the default for many users.
By contrast, Siri can be deeply integrated into iOS. Though Microsoft will be bringing a very powerful version of Cortana to the iPhone, it will be an app. Not a part of the core OS. This, despite Cortana's advanced functionality, gives Siri an inherent advantage in iOS as its abilities will be deeply interwoven into the OS. Apple will likely also ensure advantages of Siri's use within its ecosystem and across devices like phone, car, Apple TV and iPad.
Ok Now Google – Monkey See Monkey Do
Recent reports confirmed that Google is preparing to open Google Now APIs to developers. What will this allow? Well, again, it will allow developers to do what Microsoft has already allowed its developers to do with Cortana from her beginnings. They will be able to tie their apps into the digital assistant's Speech API.
Ideally, from a user's perspective, the implementation of such a feature would eventually make apps "invisible". As users interact with the digital assistant, the AI should be able to discern the requested function and choose the app that best carries out that function, just as Cortana launches Here Drive when I tell her to give me directions to a location.
From a developer's perspective, integration so deep that it makes their app invisible to the user may be a deterrent to embracing it. Developers want their apps to be noticed. Google, with its vast user base, may nonetheless garner more developer support for its digital assistant's open API than Cortana has received in the past year.
I do not think it is a coincidence that Google has announced their open API plans at this time.
About 5 to 10
At the time of writing, we are maybe 5 months away from the global release of Microsoft's new era of computing in Windows 10. Windows 10 pioneers a single OS for all form factors. It will work as the platform for Microsoft's enterprising HoloLens. It unifies Microsoft's store. It creates a platform for Universal Apps where developers can write once for all form factors. It serves as a unifying agent to support the transitioning of user experiences between devices. Windows 10 will also catapult Cortana into the mainstream through Microsoft's tremendous PC installed base.
Such attention and increased availability would indeed make Cortana, with her open API, a much more tempting prospect for developers. That, along with envelope-pushing predictive abilities thanks to Project Einstein and Cortana's eventual arrival on iOS and Android, has positioned Cortana to steal the digital assistant headlines in the coming months. Neither Apple nor Google wants its assistant overshadowed. In response to Microsoft's pioneering advances in predictive search, Google's official word is:
"Predictive search is so new that the more companies working on it, the better."
They also claim there is no timeline for the launch of their open API. That said, I would not be surprised if Google strives to have the Google Now API ready by Google I/O later this year to bolster Google Now against the coming Cortana-enhanced Windows 10 tsunami.
The War Has Just Begun
We are still in the early stages of the era of the new PC user interface. What is defined as a "personal computer" arguably falls under a much broader umbrella than what was categorized as a PC fifteen years ago. It can be argued that hybrids, tablets, and even smartphones are modern PCs. The ways we interact with our devices have evolved as our computing has moved toward greater mobility: from keyboard and mouse to touch and now voice.
Intelligent assistants that can be engaged via voice as the PC user interface are the new battleground. Whichever company masters the assistant's position as an ever-present, unobtrusive but "there when I need it" shadow may stand to define this category. The excitement those in the Microsoft camp are experiencing over Microsoft's investments and early successes with Cortana must be tempered with the reality that the competition is not standing still. Keeping a watchful eye and a thoughtful heart on all players and their actual and potential plays keeps you in the know. And as we learned from GI Joe, knowing is half the battle.
Food for Thought
It is conceivable that the next step for the digital assistant, beyond the digital assistant as UI, is the digital assistant as the OS. Microsoft's ambitious, ubiquitous "one OS" presence in Windows 10, with Cortana integrated, may hold the nascent seeds of this possibility. Ahhhh, but that is a topic for another post.
In case you missed it: "AI Wars Part I: Hey Cortana is that you on my iPhone?"