For those who have just "tuned into" the tech scene, the bot and AI future we're now entering may appear to be a new revolution fueled by the "me too" efforts of established companies and start-ups alike. This assumption would be an error. The industry has been moving toward this goal for decades. In fact, the technology has only recently caught up with the tech dreams of those who came before us.
Investments in machine learning, natural language processing and deep neural networks have finally begun to bear fruit: artificially intelligent sidekicks that know us and can act proactively on our behalf seem well within reach. Though many companies have invested in, advocated for and applied these technologies in their products through the years, Microsoft has been one of the earliest and most deeply invested leaders in the industry.
A vision forward
Cortana is now extensible to what Microsoft refers to as "experts", or bots that developers build to engage users as proactive, intelligent apps. As part of Microsoft's Conversations as a Canvas strategy, Cortana is, in essence, a cross-platform "platform" that can benefit from the large Android and iOS install base.
Cortana, Microsoft's evolving cloud-based, device-agnostic, intelligent agent, has long been in the making. And even now we are only glimpsing the coalescing of the fundamental variables that Redmond is working to materialize into a ubiquitous intelligence that unifies the entirety of a user's digital experiences. These efforts began under the tenure of Microsoft's first and second CEOs, Bill Gates and Steve Ballmer, respectively.
As Moses led the children of Israel out of Egypt but not into the Promised Land, Bill Gates and Steve Ballmer heralded the investments that have brought Microsoft to the border of a new age in personal computing. As Joshua inherited the charge of his predecessor and led Israel into Canaan, Redmond's current chief executive, Satya Nadella, is leading Microsoft into the age of AI and bots.
Though Nadella is at the helm during this pivotal transition, we mustn't forget those who brought the company to this point.
Bill opens the gates
Microsoft's first CEO Bill Gates was a visionary. His dream to put a computer on every desk and in every home was ultimately a means to bring Microsoft's personal computing environment to the masses. And it was largely achieved.
The present age of personal computing, which is no longer bound to the desk, required an evolution of the strategy to bring Microsoft's brand of personal computing to the masses. A personal computing environment that benefits from a consistent internet connection, is not bound to a single device and has access to a vast repository of information requires an intelligent UI to help users manage their digital experiences in such a dynamic environment. This advancement is a tremendous evolution from the ill-fated Microsoft Bob, which was introduced over 20 years ago.
In the video below Gates expounds on the utility of Microsoft Bob but also foreshadows a more advanced AI to come.
Years later Gates expressed a vision of an intelligent agent that would be personal, present and deeply integrated into a user's life. It would be wholly consistent with the modern and mobile state of personal computing:
"As everyone gets essentially what we'd call the personal agent—it's been talked about for decades and now really is possible—we see where you're going, we see your calendar, we see your various communications, some of those communications we can actually look at the tags, look at the speech, try to be helpful to you in your activities…I think that we will be more connected, so that when somebody wants to find a gift of a certain type, or take a trip in a certain way, that there will be a closer match."
Though many Windows Phone fans, myself included, initially hoped that Cortana would never go cross-platform, her venturing to iOS and Android coupled with her deeper integration within Microsoft's other services are key to Microsoft's vision for the AI. Microsoft does not view Cortana as a Microsoft-ecosystem-focused AI, but more of a boundless platform for intelligent interaction with all digital experiences.
Ballmer kept the ball rolling
Steve Ballmer succeeded Bill Gates as CEO of Microsoft in 2000. Though Ballmer's leadership style differed from that of his predecessor, the goal of bringing an ever-present intelligent agent to the masses remained.
During his tenure, investments in the foundational technologies required to make the dream a reality continued. Ballmer described his and Microsoft's AI vision in a July 11, 2013 memo:
"Our machine learning infrastructure will understand people's needs and what is available in the world, and will provide information and assistance,"…"We will be great at anticipating needs in people's daily routines and providing insight and assistance when they need it. When it comes to life's most important tasks and events, we will pay extra attention. The research done, the data collected and analyzed, the meetings and discussions had, and the money spent are all amplified for people during life's big moments."
"We will provide the tools people need to capture their own data and organize and analyze it in conjunction with the massive amount of data available over the Web."
"Our shell will natively support all of our essential services, and will be great at responding seamlessly to what people ask for, and even anticipating what they need before they ask for it."
Ballmer expanded on the all-encompassing nature of a ubiquitous cloud intelligence by sharing Microsoft's vision of a family of first-party devices that this intelligence and the Windows platform would unify. We saw the fruit of this vision two years later, on October 6, 2015, during Microsoft's Windows 10 Devices Event.
"A few years ago in a speech I gave at CES, I observed that there was a shift underway. We were headed from a phone, a PC and a TV to simply three screens and a cloud — and over time, a common software-based intelligence would drive all of these devices, bringing them together into one experience for the consumer."
"As devices proliferate, it has become clearer that consumers crave one experience across all of their technology…Going forward, our strategy will focus on creating a family of devices and services for individuals and businesses that empower people around the globe at home, at work and on the go, for the activities they value most.
"To take advantage of our critical competitive assets, we will center our work on…A business model based on partner and first-party devices with both consumer and enterprise services… A family of devices powered by a service-enabled shell… No technology company has as yet delivered a definitive family of devices useful all day for work and for play, connected with every bit of a person's information available through one cloud.
"Our devices must share a common user-interface approach tailored to each hardware form factor."
Reflecting on Ballmer's vision
Reflected in Ballmer's statements is Microsoft's commitment to the duo user by providing a portfolio of partner and first-party devices which are useful all day for work or play. These devices would be connected by an intelligent cloud and common UI. Ballmer's reference to the company's intelligent agent indicated that it would be user-focused and would know and support the user in both professional and personal settings.
Ballmer predicted this ubiquitous intelligence would unify a family of first-party devices.
This duo user focus is an important point when we consider that computing has moved to the cloud, and that Redmond's hardware is purposefully context-sensitive, easily transitioning between professional and personal productivity. Microsoft envisions a bot-connected Cortana that effortlessly flows with a user across services and devices, growing through the rich content of a user's experiences and data gleaned from the system. The system's intelligence would mature and become more proficient at anticipating and serving a user's needs.
Naturally, the success of a company's endeavors doesn't rest solely in the plans of its leaders. Scores of talented individuals are required to bring those dreams to fruition. Time and space will not allow us to acknowledge all who have contributed and are contributing to bringing Microsoft's AI vision to fruition. However, we've "heard" the broad overarching views of previous leaders. Let's now listen to the voices of some of those who have helped and are helping that vision to materialize.
Larry Heck talks Microsoft's AI tech
Larry Heck is currently a Principal Scientist with Google Research. Before joining the Mountain View company, as a Microsoft employee, Heck was credited with starting the conversational-understanding (CU) personal-assistant effort at Microsoft in 2009.
Prior to his current post he worked on a technology vision for virtual personal digital assistants for the Bing research and development team. At the request of Zig Serafin, who was appointed by Ballmer to unify Microsoft's speech efforts across the company, Heck joined his team as the chief scientist. Through this position, Heck and the team began building the plan that later evolved into what (or who) we know as Cortana.
Heck had this to say in an April 2014 interview about the current state of digital assistants:
"I believe the personal-assistant technology that's out there right now is comparable to the early days of search…in the sense that we still need to grow the breadth of domains that digital personal assistants can cover. In the mid-'90s, before search, there was the Yahoo! directory. It organized information, it was popular, but as the web grew, the directory model became unwieldy. That's where search came in, and now you can search for anything that's on the web."
"Current implementations target the most common functions, such as reminders and calendars, but as technology matures, the personal assistant has to extend to other domains so that users can get any information and conduct any transaction anytime and anywhere."…"Having a long-term vision means we have a long-term architecture. The goal is to support all types of human interaction, whether it's speech, text, or gestures, across domains of information and function and make it as easy as a natural conversation."
It is interesting that, two years after these statements, users can interact with Cortana through Microsoft's new inking platform in addition to the standard speech and text interactions.
AI's basic fundamentals according to Heck
"In fact, we've pioneered efforts in each of those areas."
According to Heck, "The base technologies for a virtual personal assistant include speech recognition, semantic/natural language processing, dialogue modeling between human and machines, and spoken-language generation," and "each area has in it a number of research problems that Microsoft Research has addressed over the years. In fact, we've pioneered efforts in each of those areas."
Heck goes on to emphasize Microsoft's long history in these critical areas of natural language processing, machine learning, deep learning and deep neural networks:
"The underlying vision for this work and where it can go was derived from Eric Horvitz's work on conversational interactions and understanding, which go as far back as the early '90s. Speech and natural language processing are research areas of long standing, and so is machine learning. Plus, Microsoft Research is a leader in deep-learning and deep-neural-network research."
Microsoft's years of investment in this increasingly competitive space have helped prepare the company for the challenges it will face in its quest to emerge as the leader in AI and bots.
Mike Calcagno and Larry Heck talk Microsoft's AI advantage
Mike Calcagno, Director of Engineering for Bing experiences, joined Larry Heck in an interview that gives a clear picture of the strategic advantage Microsoft brings to AI and bots. Because of its various tools and services, such as Bing and Skype, Microsoft has a wide net through which it can feed data back into the system to continually improve it. Heck explains:
"I think a lot of advancements in speech recognition is dependent on the data and how much data you have that's fed back into the system and the quality of data you have fed back. And the amount data we're getting right now that we're pulling back from particularly Bing voice search is tremendous. And we're extending speech recognition capabilities also into Skype, the Skype translator, for speech to speech translation so that there's speech recognition that occurs when the person is talking in English and translation and then synthesis into say German. Well, all of that data on the translator side, the voice search side and Cortana side and all of the other speech applications feeds back into a single recognizer we use across the board for the whole company.
"So I think one of the things that is an advantage for us is that we have so many different scenarios, and products that are out there that feed back into this speech recognition technology."
These varied points of data input provide Microsoft with a unique resource to help evolve its AI. Calcagno also stressed that Microsoft recognizes the usefulness of both Siri and Google Now, but that its goal was not to make Cortana a "knock-off" of either rival.
Long road ahead
Microsoft has a long history in machine learning, natural language processing, and deep neural networks. This history, coupled with widely used cross-platform tools and services that consistently glean human language data from users and reintroduce it into the system, positions Microsoft's efforts at the front of the AI race. Redmond's Bot Framework and Conversation Canvases also position Microsoft's ecosystem as the developer's platform or "devbox" for intelligent apps or bots.
Finally, Cortana, unlike her competitors, as a Conversation Canvas agent, is being positioned as an unbounded intelligent platform to complete computing functions independent of a user's device or operating system. She is slowly evolving into Microsoft's vision of an intelligent cloud-based UI that will know users, and as a meta-app interacting with bots, anticipate and facilitate a user's needs.
We are at the beginning of a long and competitive road.
Admittedly we are at the very early stages of this journey. As a parent joyfully (and fearfully) celebrates the transition of their child from infancy to the first day of school, we are celebrating the transitioning of Microsoft's AI and machine learning efforts into the mainstream. Still, as with a child in kindergarten, there remains a very long road ahead.
So as we watch Cortana's evolution, take the time now and then to revisit these visionary statements from those who came before. Yes, there is competition from the likes of Google, Viv and even Apple within its closed ecosystem. But taking a look back every so often will help us see just how this ambitious effort, which began many years ago, is materializing within an increasingly competitive space.
So what are your thoughts? Is Microsoft on course with its AI vision which it set forth years ago? Does competition from rivals require a shift in plans? Is Microsoft's industry-wide platform play too broad? Does it dampen the uniqueness of Microsoft's ecosystem? Sound off in comments and on Twitter!
If you missed Parts I or II, or these related pieces, check them out here!
- Part I: My evolving view of Microsoft's AI strategy
- Part II: Cortana to rule the world
- The untold app gap story part III: The mobile web is the path to bots
- The untold app gap story Part IV: Going from apps to bots