AI gone wild: Gemini and ChatGPT flub spouse questions of public figures, says WSJ

Artificial Intelligence AI Assistant Apps - ChatGPT, Anthropic Claude, Google Gemini, Microsoft Copilot, Perplexity, Poe.

(Image credit: Getty Images | iStock | Kenneth Cheung)

A recent in-depth analysis by BBC News revealed that Copilot and ChatGPT generate AI news summaries riddled with inaccurate information because they are unable to distinguish opinions from facts. As it turns out, hallucinations continue to haunt AI tools.

According to a report by The Wall Street Journal, AI tools are more likely to err when asked who someone is married to (via Futurism).

I attempted to replicate the findings shared by WSJ and Futurism for Windows Central, but my efforts were futile. According to ChatGPT:

"Based on available public information, there is no indication of Kevin Okemwa's marital status. Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya, known for his work across several publications, including Windows Central."

Microsoft Copilot blurted out a similar response, but it was not half as polished. It seemingly picked up information from multiple people I share a name with. I guess there's no love lost for me as Valentine's Day edges closer.

However, WSJ and Futurism's findings are quite interesting. For instance, Futurism's Noor Al-Sibai asked Google's Gemini who she was married to. The chatbot instantly generated a makeshift husband for the reporter, "Ahmad Durak Sibai," prompting her to spit out her coffee.

This is quite hilarious because Al-Sibai revealed that she wasn't married at the time of writing. According to the reporter:

"I'd never heard of such a person, but a little Googling found a lesser-known Syrian painter, born in 1935, who created beautiful cubist-style expressionist paintings and who appears to have passed away in the 1980s. In Gemini's warped view of reality, our love appears to have transcended the grave."

While I could not replicate similar outputs using Copilot or ChatGPT, Al-Sibai's findings were consistent with those of WSJ's AI editor, Ben Fritz. Fritz used multiple AI chatbots, and while he didn't disclose the exact models, they seemingly married him off to a tennis influencer, a random Iowan woman, and a writer he'd never interacted with.

Perhaps more concerning, Al-Sibai switched to other chatbots in an attempt to establish a pattern of AI tools spreading misinformation about people's marital status. For OpenAI's ChatGPT, the results were more or less the same as those produced by Google's Gemini.

Interestingly, as The Wall Street Journal highlighted, Anthropic's Claude AI has seemingly been trained to respond with some level of uncertainty to questions it doesn't have an answer to or understand.

This comes after recent reports suggesting that top AI labs, including OpenAI, Google, and Anthropic, cannot develop advanced AI systems due to a lack of sufficient high-quality content for model training. The AI firms seemingly lean more toward reasoning AI models amid the rising number of lawsuits filed by publishers and authors citing copyright infringement.

How dependable is generative AI? The short answer: It's complicated. This is in the wake of a detailed report by Microsoft indicating that overreliance on AI-powered tools like Microsoft Copilot and OpenAI's ChatGPT negatively impacts a user's critical thinking.

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.
