Microsoft's cognitive services and AI everywhere vision are making AI in our image

In 2014 I wrote that Microsoft's Cortana would be the next big thing. I may be right. Redmond's vision for its johnny-come-lately AI is that it, like the GUI before it, will be pivotal in the evolution of the personal computing user interface.

Microsoft's ambitions for Cortana were evident in 2014.

Microsoft envisions an unbounded AI that developers and partners will incorporate into a range of everyday and innovative devices enabled by the Cortana SDK.

They also see an AI that will follow users across platforms and devices while retaining access to all of their digital information.

Through Microsoft's Home Hub, a diverse range of appliances, and third-party Amazon Echo-like devices, Redmond is positioning Cortana and a cadre of bots as ambient, ever-present intelligences.

The fluid human-AI interactions science fiction has primed us for are still a long way off, though. As the perceptive powers and intelligence of AI systems evolve, however, "inputting" data about our world, actions, and intentions will continue to become more natural.

Microsoft realizes that the goal of a fluid human-AI relationship requires a human-AI symbiosis now, to evolve the system's knowledge base, perceptions, and behavior around meeting human needs.

Heads literally in the Cloud

Through the world's first cloud-based AI supercomputer and its Cognitive Services, Microsoft is investing in a comprehensive approach to improving both AI's intelligence and "senses," and thereby our interactions with AI systems.

Microsoft's corporate VP of products for technology and research Andrew Shuman adds:

…it's important...for us to make careful choices so that this technology ultimately translates into welfare for all, and that means design principles that focus on the human benefit of AI, transparency and accountability. Ultimately...humans and machines will work together to solve society's greatest challenges, to create magical experiences and change the world.

Part of the magic will be derived from the unique Azure-based AI supercomputer Microsoft has constructed.

Microsoft's AI supercomputer will power their AI vision.

This cloud-based intelligence is capable of translating all of Wikipedia in less than a tenth of a second, and the 38 million books in the Library of Congress in 76 seconds. It is this intelligence that Microsoft, as the platform and "do more" company, is making available to individuals and organizations. Nadella said:

...we are focused on empowering both people and organizations, by democratizing access to intelligence to help solve our most pressing challenges. To do this, we are infusing AI into everything we deliver across our computing platforms and experiences.

This isn't a purely altruistic move: Microsoft needs humans to make its AI better.

It takes two to tango

Microsoft has a four-pronged approach to its AI vision: harnessing agents, like Cortana, to facilitate the human-AI relationship; infusing every application everywhere with intelligence; making intelligent services accessible to all developers; and building infrastructure via the world's most powerful AI computer and making it available to everyone.

To achieve the first three objectives, Microsoft is leveraging the advantage of its pervasive products and services. By infusing Cortana and other intelligence throughout its services ecosystem (and applications with which it interacts) and opening the cognitive services that power them to developers, Redmond will build a market for its AI. They will also glean massive amounts of useful data.

This data will be synthesized into Microsoft's Azure-based AI supercomputer (objective four) to continually evolve it as it interacts with millions of users, and addresses millions of scenarios daily. Nadella bragged, "We want to bring intelligence to everything, to everywhere, and for everyone." This human/AI co-dependence, or symbiosis, will help foster the evolution required to make AI more useful, mature AI/human interaction and build the trust Microsoft hopes humans will eventually invest in AI systems.

Human-AI interaction now is key to developing useful AI systems.

Naturally, a system's ability to integrate into our lives by "perceiving" what we perceive and fluidly responding to our needs and actions is key to overcoming the disjointed, almost forced, interactions we have today. If Microsoft's investments in "Cognitive Services" (which endow AI with human-like perceptions) pay off, the unnatural focus we give our digital assistants to solicit their support may be overcome.

Becoming human

Microsoft's Cognitive Services include vision, face, emotion, video, language and other APIs that are positioned to endow AI systems with human-like perceptions. If developers embrace these tools, evolving AI systems will potentially be capable of greater autonomy by proactively acting on what they "perceive." Shuman added:

Microsoft released a whole set of APIs…that allow for a lot more natural software experience...

Imagine a refrigerator-embedded Cortana suggesting a favorite ice cream to a family member who she detects is sad based on face and emotion APIs. Her proactive suggestion would be founded on what she has learned about that family member over time as part of Microsoft's Home Hub vision.

AI will become a more natural and integrated part of our families.

The ubiquity achieved via a democratized Cortana, combined with the power of an AI supercomputer and expanded cognitive abilities, would enable a much more fluid AI/human interaction than we see today. Nadella extols Microsoft's progress:

"There are a few companies that are at the cutting-edge of AI…But when you just look at the capability around speech recognition, who has the state of the art? Microsoft does…image recognition? Microsoft again, and those…are judged by objective criteria."

Additionally, Microsoft's AI recently reached parity with professional human transcriptionists when transcribing human dialogue. As AI's range of perception expands and it becomes smarter, parity with human abilities may segue to parity with human behavior.

Final Thoughts

It will be interesting to see AI's evolution in relation to Microsoft's mobile and AR visions. A collision of these technologies with an ultra-mobile pen-focused cellular Surface (geared toward written/typed language interaction), paired with AR glasses (geared toward spoken language and gesture interaction), would be an interesting intersection for Microsoft's Conversations as a Canvas vision.

How might such a union of technology capture our natural, and oh so human, hodge-podge of verbal and non-verbal communication?

Jason Ward

Jason L Ward is a columnist at Windows Central. He provides unique big picture analysis of the complex world of Microsoft. Jason takes the small clues and gives you an insightful big picture perspective through storytelling that you won't find *anywhere* else. Seriously, this dude thinks outside the box. Follow him on Twitter at @JLTechWord. He's doing the "write" thing!

  • Thanks for reading, folks! Microsoft is infusing products and services everywhere with AI, which exposes (or will expose) it to millions of users in millions of scenarios. This allows humans to teach AI, and through increased interaction we learn a greater trust in our relationship with AI. Microsoft's AI supercomputer is also a first of its kind, enabling profound levels of computing for the masses. Where will all this lead?
  • Cortana might be a successful project only if MS starts shipping devices on which Cortana is installed and used. The obvious device is a mobile phone.
    Oops, MS abandoned the market and there are almost no phones that actually use Cortana.
    Oops, another problem: Cortana is available only in selected countries.
    Oops, did I mention that Cortana's voice recognition is not actually that good?
    Too many 'Oops' in my view. Half baked project.
  • I don't think that is fair to say. I use Cortana on my Xbox and PC for Cortana-specific stuff way, way more than on my phone. Sure, I use voice commands on my phone (and I also disagree with you there: she does very well in my car, almost surprisingly so), but I am not using true Cortana functions very much while out and about. When I am home, different story. I am setting reminders, notes, researching, etc. I am not saying that my use case is standard or common, but you also can't say that mobile is the only path for use. As well, if someone with a competing OS uses Cortana on Xbox or PC, then they would be more apt to install it on their phone. Windows Mobile is far from a make-or-break scenario for Cortana. I think Alexa is a great example of that.
  • I hate how Cortana needs internet access for absolutely everything. Pre-Cortana I could dial a person with a voice command, but with Cortana, if I'm in an area with bad reception she won't recognize anything, not even local commands...
  • I do agree with this 100%. I wish basic functions would still work. Text, SMS, Music, little things.
  • Not even so much that it needs Internet access, but it only seems to work over Wi-Fi. Don't see why it can't work over my data.
  • That may be an issue with your network/phone/signal. She works fine for me unless my network is trash, in an elevator for example.
  • I think this issue is specific to you, as I use Cortana all the time in my car.
  • That's how the voice recognition works. You can't trust a phone to have the processing power to accurately do such powerful voice recognition.
  • For basic functions that is not true. Before Cortana and vastly less powered phones, we had voice recognition for basic functions. Music, texting, etc.
  • Two things. First, think about how well those functions worked compared to Cortana and other advanced assistants. They were not nearly as accurate or able to accept as many commands. Second, you can't have a device that uses an assistant for advanced functions and local recognition for basic functions. The interface alone would just piss people off more than how it acts when there's no data connection, since you won't know which one you're using half the time to ensure you tailor your commands properly.
  • I agree, but.... super simple stuff like pausing music, or skipping tracks should not rely on a connection. But, I agree with your point.
  • Jason, your articles are always a great read... ;)
  • Thanks! I appreciate that!
  • I think your article is a pretty accurate summation of what Microsoft is driving at, at least in the abstract. I disagree with their approach in some aspects (e.g., I so vehemently hate Windows 10 it's not even funny). It makes sense that we are the teacher and Cortana, et al., is the student. It makes sense that this is necessary for Cortana as an AI to develop, although we could argue whether "evolve" is the proper descriptive as opposed to "develop".
    That being said--and fears of SkyNet aside--I think there are some legitimate concerns as we go along. As Cortana and the cloud AI acquire the information, telemetry and tools to observe and "read" humans, in the expressed effort to respond to us in a more human way (pronounced "comfortable"), there is the chance that WE become LESS human. By that I mean that there may be many of us over time who are not all that comfortable with an omnipresent and potentially powerful/influential being able to "sense" or "read" us. This could drive some of us to mask our natural behavior or otherwise neutralize our presence to avoid such an AI.
    Another issue is the predictability of Cortana's eventual inclusion of IoT, which would be critical TO such omnipresence and power. By extension, autonomous cars and, eventually, robots of various types would be a big part of that brave new world. Taking into consideration the likelihood of the "baking in" of something along the lines of the Three Laws of Robotics, and given the context of some of the quotes from Nadella and others, we could get into some serious debates over what qualifies as "harm" and "good of all". Even now, in the direct human social context, we can't agree on these things. There are those who actually believe that if parents are not teaching their children how to prevent global warming they are somehow harming, neglectful or abusive of the child. In Europe, there have been examples of the State removing children from the home of devout Christian families because the environment is harmful to the children because they aren't PC enough. Those kinds of things may be somewhat anecdotal now, but many things that used to be anecdotal have become more significant over time.
    It's not an unreasonable concern that an AI could develop (or, rather, be developed) to a point where it does things to save us from ourselves. In the end, the anger and mistrust that Will Smith's character feels in the movie adaptation of "I, Robot" because an AI made a different choice than he would have (resulting in the death of his family) could become reality.
    For me, I actually do want some degree of an omnipresent and omni-aware AI that responds to my direction. I don't necessarily want one that "solves problems" for me. I don't need an AI to make independent choices; it should rather be totally subservient. I don't need it to make suggestions for me, but instead to respond with options when I ASK. I don't need it to second-guess me.
    But Microsoft has a LONG way to go. They can't get Cortana to truly be omnipresent yet. Cortana exists as almost exclusively separate entities on every single device. Microsoft will need to solve the problem of enabling Cortana to be aware of her own presence on multiple devices (phones, tablets, consoles, refrigerators, etc.) and then interact with humans as one AI FROM one device, whatever that is, and be consistent with the response. Right now, it's a bloody nightmare walking in a room with Cortana listening on a phone, a tablet and an Xbox. If you make the mistake of saying, "Hey, Cortana" the result is complete chaos. Microsoft will have to reach a "Jarvis Epiphany" with how Cortana works before we'll see any useful progress.
  • and in summary....
  • If you can summarize it in a meaningful way, by all means, be my guest.
  • I think Microsoft is the most advanced company regarding AI. However, smart intelligence as it is today is nothing to be afraid of. Microsoft recently published MS MARCO (A Human Generated MAchine Reading COmprehension Dataset). This consists of real questions posted by real humans (anonymized, of course), and the answers to the queries are human generated; a subset of those queries has multiple answers. Agents like Cortana, Siri, Alexa, and Google Assistant make use of state-of-the-art deep search algorithms, but in the end this does not make AI self-conscious. The AI we should be worried about is the one being developed at Carnegie Mellon, with robots which today are capable of asking for external help when they don't know the answer, and this might make these robots much more self-conscious than we think.
  • To everyone in handful of countries that is...
  • What's Cortana's advantage over Siri/Google? Serious question. Siri/Google can speak many different languages; Cortana can speak.... one? Just checked, only English. Back in the 8.1 days I could ask Cortana to take me to Walmart; now on W10M she can't. Google has 80% (guessing) of the search data in the world, so how can Bing/Cortana compete?
  • Not sure where you checked, or where you learned to count...
    English (US/CA/AU & GB), Portuguese (PT & BR), French (FR & CA), Chinese (Simplified), German, Italian, Japanese, Spanish (ES & MX).
    I count 8 languages, and really 12, as locales are different speech engines, recognizers and grammar paths. As for the capabilities, the goal of Cortana isn't to answer every request; it's to build an extensible framework for every app and online bot to complement its grammar. While not used much yet, this has the capability to literally understand anything. I can control my lights because I have huetro installed, for example.
    When apps catch up, a TV app like HDHomeRun could provide understanding for things like "record the next CNN news" or "play the next episode of Humans I haven't watched".
    With IoT, it means devices can add things like "Brew me some coffee with sugar and milk", "What can I cook quickly with the things I have in the fridge", etc...
  • I can count OK, I am sure. From Canada: Cortana settings, Cortana language: Eng (Canada), Eng (USA), Eng (UK), Eng (AU) and Eng (India). As for the capabilities, it does not change the question; we of all people know better that "waiting" for apps to catch up doesn't work. Cortana is very limited right now compared to others.
  • Available languages are dependent on the current UI language, so you probably only see flavors of English because your current UI is in English as well. As for the extensibility, I agree Microsoft should provide more out of the box, but none really have a good solution right now, and I think it takes time to get these things right, and they're definitely working on it. Microsoft Cognitive Services have a new generation of speech recognition and user recognition based on audio; these could be the foundation for a better natural speech interface, and for automatic user selection, and therefore context, based on voice alone. Combine this with the rumours of the Family Hub mode and you have an assistant that can effectively interact with several family members at the same time, understanding "add a meeting with john tomorrow at 2 pm" and "play my favorite music" based on who is talking.
  • People have a very, very big misconception of just how big the Cortana AI product offering in Microsoft is. The comments above are just further proof. The Cortana that some people use on their phones is just one small part of it all. It would actually be good if Windows Central could do an article on how deep and far reaching Cortana is and goes. Machine Learning and Artificial Intelligence is so much more than a client on a geek's phone. Most people use it daily without even knowing they do - irrespective of the market they find themselves in!
  • I'm a hater and I'm just gonna sit here and complain about what isn't happening instead of being of any use whatsoever and doing anything productive...I mean what a bunch of losers at Microsoft trying to refine things and waiting instead of just giving us things when they aren't done so I can continue to talk about them making half baked products like it's my job because it's not like I have anything better to do. Lmao at companies with paid employees... why can't they just give me stuff? U_U
  • Jason, forget the naysayers. You're heading in the right direction. Sure, Cortana still only works in certain languages and countries. But in the languages and countries it works in, it simply blows the competition well out of the water. In the quest for anthropomorphic AI, Microsoft is pushing all the right buttons. Compare their approach to the others. IBM's Watson is a simplistic brute-force lookup engine. Google's is a hit-and-hope massive comparative-analytic system that might work - but then again might not. Already, using Cortana on devices such as mobile phones, HoloLens and others is a far better experience than any of the competitors. But this approach to AI (copying humans) will be a short-term market. Since AI is new to most people, the way they think best to deal with it is to make the device pretend it's a human. In the end this is a severe limitation of what we can do with the technology - or where it might go by itself.
  • Interesting ideas, including the way you categorized Google and IBM's offerings. What are your other thoughts on where AI can go/evolve outside of Microsoft's current approach?
  • After reading the comments above, I can partly agree and disagree with both sides. Yes, I found Cortana better in 8.1 than in 10, but at the same time it is also reaching far deeper with Win10. Cloud and IoT, with the newfound focus of MS to define and redefine and create and recreate new categories, make me excited for the things to come.
  • Why would anyone use Microsoft's solution when Google, Apple and Amazon have more users and more mature platforms? Microsoft's lack of mobile users is going to make their efforts much tougher. PC is much smaller than mobile and voice actions are a departure from most people's PC routine.
  • Do you have numbers to back that up? Just because it has the ability to install on more pieces of hardware (and I'm not positive they can be) doesn't mean it has more users. As for a reason why, well, I can say that Cortana works better than Google Now for sure. I can't speak about Siri since I haven't used it in many years.
  • You need numbers? There are maybe 400 million Windows 10 devices compared to billions of Android/iOS devices, and they outsell Windows 5:1. Not to mention PCs aren't historically used with voice actions/apps.
  • He meant numbers you're not making up on the spot.
  • He also left out how I mentioned that not everyone who owns the device uses the assistant. My parents are a perfect example. They own a Windows Phone and an Android phone (I think a Lumia 640 with W10M and an LG G4 with Marshmallow). Neither of them uses Cortana or Ok Google.
  • But if they did, they likely would not stray from the default. I question whether this is the future at all. People don't seem to want to talk to their devices, especially computers. Maybe if it gets good enough that will change, but Microsoft certainly isn't the only one with a dog in that race.
  • You're still avoiding the point. You initially said there are fewer Cortana users than users of other AI assistants. The only numbers you attempted to provide were how many users there are of Android devices. You haven't provided anything that determines the number of people that actually USE the AI assistants.
  • I actually mentioned all those points! I said people don't use them anywhere, especially on PC, and if they did, they would use the default. That is why it will be tough for Microsoft unless they come out with something revolutionary. So far, that has certainly not been the case. Google Assistant seems to be the best, but it isn't revolutionary.
  • "All those points" There's one point and you haven't provided any facts for it. I know you think just because you say something it is a fact, but that's not the case. Until you can provide any numbers for active users of Cortana vs Alexa vs Siri vs Ok Google, you've said nothing that has any facts behind it.
  • Yeah, I am just assuming they are all used infrequently. I would also assume Siri is used more than the others. I just meant to say that Cortana is the underdog as it is the default on the least number of mobile devices. I don't think that is a risky statement.
  • Where are my numbers wrong? I just gave ballparks.
  • @bleached; I have an Android phone and I have Cortana on it because Microsoft made it available for Android though not sure about Apple. 
  • Siri is Apple only. Google Now and Assistant is and will be everywhere.
  • Siri is just an expensive girlfriend.
  • Great read again Jason, thanks!
  • Microsoft is fantasizing about AI. Google is doing it.
  • What? Fantasizing about what? Both of your comments are silly.
  • Google is much, much better than Bing. You know why? Because Google has the AI. That's why.
  • With all respect, you should have a chat with a Bing search engineer from Microsoft so you understand how good this engine is. I haven't used Google for years thanks to the great results I get on Bing. The only issue with Bing today is internationalization: outside the US, Bing still provides results much less interesting than Google International (e.g., Google Mexico is still better than Bing Mexico).
  • I don't know about that anymore. I am getting the same results from both the Bing and Google search engines.
  • I am using Cortana more and more on my phone. Too bad it does not work for Canadian settings. Its integration with Bing is pretty good. Hoping it can play better with Skype (like a 3rd party: while on a call, "hey Cortana, send the file to <person>", or a website, or contact info or whatever), and waiting for other companies to jump on, like FB, new sites, banking. Great for reminders. Glad they are moving on this. Competition from Google, Amazon, Apple will be good too. Wonder if they talk to each other? Mr. V
  • Yesterday I downloaded the MARCO dataset from Microsoft, and it comes with a JSON database so you can experiment with your own AI algorithms for reading comprehension. For example, here is one question: "how old do boars have to be to breed?" The database provides 8 answers to this query, and the algorithm should pick the best one (all are answers made by humans, but some were made by experts and others were made on Yahoo Answers). So the question is how to pick the best one. Here are the 8 answers:
    1. Females should be about 5 to 6 months and have a body weight of over 500g. She must have her first litter before she is 7 months old as after that her pelvic bones will have fused so dangerous for her to breed her first litter after that.
    2. It is best to try to start breeding a female at 4 months old as guinea pigs only have 6 to 8 hours in an estrus cycle that they can get pregnant and their cycles run 17 days. It is very common for them not to get pregnant the first and second cycle.
    3. Females should be about 5 to 6 months and have a body weight of over 500g. She must have her first litter before she is 7 months old as after that her pelvic bones will have fused so dangerous for her to breed her first litter after that.
    4. Sows can breed at the tender age of five weeks but this is too young. A sow should weigh a minimum of 400gm, equivalent to about three months of age, before being mated and should preferably be a little older than this. Male guinea pigs should be about three to four months of age before being allowed to mate.
    5. Boars must be replaced when they become too large to serve most of the sows on the farm. Boars usually have a maximum working life of between 18 and 24 months. This means they should be replaced when they are 30 to 36 months old.
    6. Boars usually have a maximum working life of between 18 and 24 months. This means they should be replaced when they are 30 to 36 months old. It is very important to keep record of the boars' use so that infertile ones can be detected and replaced as soon as possible. A low sex drive (libido) can also be a problem.
    7. Boars must be replaced when they become too large to serve most of the sows on the farm. Boars usually have a maximum working life of between 18 and 24 months. This means they should be replaced when they are 30 to 36 months old.
    8. When breeding pigs, select sows that are at least 9 to 10 months old as this is the ideal age for breeding. For the boars, you can either buy them when they're at least 8 months old if you have a small number of sows then breed them with a larger number as they grow older.
  • Let's hope that we don't end up with a 'WestWorld'-esque scenario.
  • Who has the state of the art in speech recognition? Does he really believe Microsoft is ahead of Google in speech recognition? Or any other type of recognition?
  • Development of AI has nothing to do with how much money one can invest in R&D or with Microsoft or Google. It requires simple, logical, step-by-step thinking, and anyone willing to decode it can do it.
    As an analogy, Microsoft can invest billions but still it will not increase its probability of discovering something like theory of relativity. There is a limit to what money can buy.
  • Just a few minutes ago I read this: "...We are working intensely to make the machines smarter, but I would like to see progress made in making people wiser and making better use of the talents, because those are the keys to the future of humanity..." (Carlos López Otín, Professor of Biochemistry and Molecular Biology, Oviedo, Spain)
  • +1   I wish techno-journalists who haven't coded a day in their lives but have this oxymoronic obsession with technocratic overreaching would understand this.  Thanks for sharing!
  • Microsoft has been more advanced than others in a lot of areas, and while they have some great technology, they lack strategy and vision and marketing. As of 2012, they had 10 years and tens of millions invested in speech analytics, but they had no idea what to do with the technology. Incorporate it with MS Word! Idiots. Enter Apple and Google to show them how to use and market that technology, and MSFT jumps in a day late and a dollar short. MSFT can do AI, and they've been talking about it for years, but they have done nothing with it while competitors pass them up, and MSFT just becomes a me-too company. They should have stuck with mobile, the most obvious use case for speech and AI. But they lack vision beyond business apps. They used to have the OS, but they are giving it up to mobile OSs. They used to have a search engine, but other companies have eroded their search business. Web browser: neither IE nor Edge can keep up with Google's ad strategy and technology, so people are adopting Chrome. They were first to talk about tablets, but never delivered. Where is MSFT not losing technology business? They are scrambling to salvage what business they can. It's really sad.....