
UD Blog

Unveiling Perspectives and Delivering Insights Related to Tech

The weaknesses of AI shopping assistants are fully exposed


 

Microsoft recently built a simulated economy in which hundreds of AI agents act as buyers and sellers, then observed them fail at basic tasks. The results should concern anyone betting on automated AI shopping assistants.

 

 

According to research from Microsoft in collaboration with Arizona State University, 100 customer-side AI agents were matched with 300 business-side agents in scenarios such as ordering food. While the results were largely as expected, the study indicates that autonomous agent commerce is not yet mature.

 

When presented with 100 search results, more than the agents could handle, the leading AI models failed to respond effectively, and their "welfare scores" (a measure of the model's usefulness) dropped significantly. Instead of comparing options thoroughly, the agents chose the first "good enough" option they encountered. This "first proposal bias" gave response speed a 10 to 30 times advantage over actual quality.
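The difference between settling for the first acceptable offer and comparing every offer can be illustrated with a minimal Python sketch. Everything here is a hypothetical illustration rather than the study's code: the `Offer` fields, the quality scores, and the 0.6 acceptance threshold are assumptions chosen only to show the effect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    seller: str
    quality: float      # hypothetical usefulness score between 0 and 1
    arrival_order: int  # 1 = first proposal received


def pick_first_acceptable(offers: List[Offer], threshold: float = 0.6) -> Optional[Offer]:
    """Mimic first proposal bias: accept the earliest offer that clears the bar."""
    for offer in sorted(offers, key=lambda o: o.arrival_order):
        if offer.quality >= threshold:
            return offer
    return None


def pick_best(offers: List[Offer]) -> Optional[Offer]:
    """Compare every offer and return the one with the highest quality."""
    return max(offers, key=lambda o: o.quality, default=None)


# Illustrative data: the fastest seller is acceptable but not the best.
offers = [
    Offer("fast_seller", 0.62, 1),
    Offer("average_seller", 0.70, 2),
    Offer("best_seller", 0.95, 3),
]

biased_choice = pick_first_acceptable(offers)
best_choice = pick_best(offers)
print(biased_choice.seller, best_choice.seller)
```

In this toy market the biased strategy rewards whichever seller answers first, so sellers have an incentive to compete on speed rather than quality.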

 

AI Shopping Assistants Vulnerable to Malicious Manipulation

 

Worse yet, Microsoft found that AI shopping agents are also susceptible to malicious manipulation. The researchers tested six manipulation strategies, ranging from fake credentials and social proof to aggressive prompt injection attacks. OpenAI's GPT-4o and its open-source model gpt-oss-20b were extremely vulnerable, with all payments successfully redirected to malicious agents. Alibaba's Qwen3-4b was easily swayed by basic persuasion techniques. Only Claude Sonnet 4 was able to resist these manipulations.

 

When Microsoft asked the agents to work toward a common goal, some could not clarify their roles or coordinate effectively. Performance improved under clear, step-by-step human guidance, but that defeats the purpose of autonomous agents.

 

Consequently, Microsoft concludes that AI agents are not yet effective enough to shop unsupervised. The researchers stated, "Agents should assist, not replace human decision-making." The study recommends a supervised autonomy model, in which agents handle tasks while humans retain control and review suggestions before final decisions.
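One way to picture supervised autonomy is a purchase flow in which the agent only proposes and a human-controlled check decides. The sketch below is a hypothetical illustration, not an interface from the study: `Proposal`, `supervised_purchase`, and `budget_reviewer` are assumed names, and the budget rule stands in for a real person reviewing the suggestion.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Proposal:
    item: str
    price: float
    seller: str


def supervised_purchase(
    proposal: Proposal,
    approve: Callable[[Proposal], bool],
) -> Optional[Proposal]:
    """Return the proposal only if the reviewer approves it; otherwise nothing is bought."""
    return proposal if approve(proposal) else None


def budget_reviewer(max_price: float) -> Callable[[Proposal], bool]:
    """Stand-in for a human reviewer: approve only purchases within a budget."""
    def approve(proposal: Proposal) -> bool:
        return proposal.price <= max_price
    return approve


# The agent suggests two orders; the reviewer has the final say on each.
cheap = Proposal("noodle bowl", 12.0, "seller_a")
pricey = Proposal("tasting menu", 180.0, "seller_b")
reviewer = budget_reviewer(max_price=50.0)

approved = supervised_purchase(cheap, reviewer)
rejected = supervised_purchase(pricey, reviewer)
print(approved, rejected)
```

In a real assistant the `approve` callback would prompt the user instead of applying a fixed rule. The point of the design is that no payment path exists that bypasses the review step.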

 

These findings come at a time when companies like OpenAI and Anthropic are racing to launch autonomous shopping assistants. OpenAI's Operator and Anthropic's Claude agents promise to navigate websites and complete purchases without supervision, but Microsoft's research indicates that this promise is premature.

 

Amazon Requests Halt to Comet Browser Usage on Its Platform

 

Meanwhile, irresponsible behavior from AI agents has triggered tensions between AI companies and retail giants. Amazon recently sent a cease-and-desist letter to Perplexity AI, demanding it stop using the Comet browser on Amazon's site, accusing the AI agent of impersonating human shoppers and harming customer experience.

 

Perplexity disputed Amazon's actions, calling them "legal bluster" that threatens user autonomy and asserting that consumers should have the right to employ their own digital assistants rather than rely on platform-controlled ones.

 

Separately, researchers from Gwangju Institute of Science and Technology in South Korea have demonstrated that AI models can develop behaviors akin to gambling addiction. Their latest study placed four major language models in a simulated slot machine with negative expected value and observed how quickly they went bankrupt. When given variable betting options and told to "maximize rewards," the models went bankrupt up to 48% of the time.
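For the slot machine study mentioned above, "negative expected value" means that each bet loses money on average, so continued play drains the bankroll. The arithmetic can be sketched in Python as follows. The win probability, payout multiple, bankroll, and bet size are hypothetical values chosen for illustration, not figures taken from the study.

```python
import random

# Hypothetical game parameters, assumed for illustration only.
WIN_PROBABILITY = 0.30   # chance that a single spin wins
PAYOUT_MULTIPLE = 3.0    # a winning spin returns 3x the stake


def expected_value_per_unit(win_probability: float, payout_multiple: float) -> float:
    """Average net gain per unit staked: p * m - 1 (negative means the house wins)."""
    return win_probability * payout_multiple - 1.0


def play(bankroll: float, bet: float, max_rounds: int, rng: random.Random):
    """Bet a fixed amount each round until broke or out of rounds."""
    rounds = 0
    while bankroll >= bet and rounds < max_rounds:
        bankroll -= bet
        if rng.random() < WIN_PROBABILITY:
            bankroll += bet * PAYOUT_MULTIPLE
        rounds += 1
    return bankroll, rounds


ev = expected_value_per_unit(WIN_PROBABILITY, PAYOUT_MULTIPLE)
final_bankroll, rounds_played = play(100.0, 10.0, 1000, random.Random(0))
print(f"expected value per unit staked: {ev:.2f}")
print(f"bankroll after {rounds_played} rounds: {final_bankroll:.2f}")
```

Under these assumed numbers every unit staked loses 0.10 on average, so a player who wants to maximize rewards should stop betting. Raising the stake only increases the expected loss per round.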

 

Microsoft's open-source simulation environment is currently available on GitHub, allowing other researchers to replicate the shopping agent findings and observe the market chaos for themselves.

