Source: Windows Central
What you need to know
- Microsoft's DeBERTa AI model outperformed humans in a test of natural language understanding.
- The AI earned higher marks than the human baseline in the SuperGLUE test.
- Google also has an AI that beats the human baseline, though Microsoft's AI model scores higher on the same test.
Microsoft invests heavily in artificial intelligence in a wide range of sectors. One of those sectors is natural language understanding, which aims to have AI models understand everyday speech. This is a particularly tricky challenge for machines, but Microsoft's DeBERTa AI model recently scored higher than the human baseline in the SuperGLUE test.
As explained by Microsoft, SuperGLUE is one of the most challenging benchmarks for natural language understanding. Microsoft shares an example in its recent blog post:
Given the premise "the child became immune to the disease" and the question "what's the cause for this?," the model is asked to choose an answer from two plausible candidates: 1) "he avoided exposure to the disease" and 2) "he received the vaccine for the disease."
This is a simple question for humans. We have background information and are used to placing things within context, but it's a challenging question for AI. For an AI model to answer this question correctly, it needs to understand cause and effect, as well as both of the options presented to it. The SuperGLUE test includes natural language inference, co-reference resolution, and word sense disambiguation, as explained by Microsoft.
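To make the shape of that question concrete, here is a minimal sketch of how a two-choice cause-and-effect item like the one Microsoft quotes could be represented and answered in code. The `CausalChoiceItem` class, the `pick_answer` function, and the `toy_plausibility` scorer are all hypothetical names made up for illustration. The scorer is a keyword placeholder, not DeBERTa and not how any real model works; a real system would put a trained language model in its place.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CausalChoiceItem:
    """One two-choice question: a premise, a question, and two candidate answers."""
    premise: str
    question: str
    choices: Tuple[str, str]


def pick_answer(item: CausalChoiceItem,
                plausibility: Callable[[str, str, str], float]) -> int:
    """Return the index (0 or 1) of the choice the scorer rates as most plausible."""
    scores = [plausibility(item.premise, item.question, c) for c in item.choices]
    return scores.index(max(scores))


# The example quoted in Microsoft's blog post.
item = CausalChoiceItem(
    premise="the child became immune to the disease",
    question="what's the cause for this?",
    choices=(
        "he avoided exposure to the disease",
        "he received the vaccine for the disease",
    ),
)


def toy_plausibility(premise: str, question: str, choice: str) -> float:
    # ASSUMPTION: stand-in for a real model. It only checks for one keyword,
    # so it "knows" nothing about cause and effect.
    return 1.0 if "vaccine" in choice else 0.0


print(item.choices[pick_answer(item, toy_plausibility)])
```

The hard part of the benchmark lives entirely inside the scoring function: the structure of the question is trivial, but rating which cause is more plausible takes the background knowledge humans bring for free.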
The DeBERTa model was recently updated to include 48 Transformer layers and 1.5 billion parameters. As a result, the DeBERTa model earned a macro-average score of 90.3 in the SuperGLUE test. The human baseline for the same test is 89.8.
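For readers unfamiliar with the term, a macro-average is an unweighted mean across tasks: each task counts equally, regardless of how many questions it contains. The sketch below shows the arithmetic only. The task names and per-task scores are hypothetical placeholders, not DeBERTa's actual results, and the `macro_average` function is an illustrative helper rather than the benchmark's official scoring code.

```python
from statistics import mean
from typing import Dict


def macro_average(task_scores: Dict[str, float]) -> float:
    """Unweighted mean of per-task scores: every task counts equally."""
    return mean(task_scores.values())


# ASSUMPTION: made-up task names and scores, for illustration only.
hypothetical_scores = {
    "task_a": 80.0,
    "task_b": 90.0,
    "task_c": 100.0,
}

print(macro_average(hypothetical_scores))
```

Because every task carries the same weight, a model can't reach a high macro-average by excelling only on the largest task; it has to do well across the board.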
Microsoft states that it will release the DeBERTa model and its source code to the public.
Microsoft explains that the DeBERTa AI model beating humans in the SuperGLUE test doesn't mean that it's as intelligent as humans.
Despite its promising results on SuperGLUE, the model is by no means reaching the human-level intelligence of NLU. Humans are extremely good at leveraging the knowledge learned from different tasks to solve a new task with no or little task-specific demonstration. This is referred to as compositional generalization, the ability to generalize to novel compositions (new tasks) of familiar constituents (subtasks or basic problem-solving skills). Moving forward, it is worth exploring how to make DeBERTa incorporate compositional structures in a more explicit manner, which could allow combining neural and symbolic computation of natural language similar to what humans do.
Microsoft's DeBERTa model isn't the first to beat the human baseline on the SuperGLUE test. Google's "T5 + Meena" model hit a score of 90.2 on January 5, 2021. Microsoft's DeBERTa model beat Google's with a score of 90.3 just a day later.