AI’s Limits Exposed: New Study Finds Machines Struggle With Real Remote Work

Research shows AI agents fail most remote tasks, with the top performer automating just 2.5% of freelance work.

The study, called the Remote Labor Index (RLI), represents one of the most detailed attempts so far to measure AI’s performance on practical digital work.

Methodology

Researchers collected 240 completed projects from professional freelancers working through platforms such as Upwork.

Each project included the original brief, all input materials, and the final deliverable that a client had accepted.
These projects came from 23 categories of work, including product design, animation, architecture, game development, and data analysis.
Together they covered more than 6,000 hours of paid labor valued at about $140,000.
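To put those totals in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the three figures quoted above; because the article gives them as "more than 6,000" hours and "about $140,000", the results are rough averages, not numbers reported by the study. The function name and dictionary keys are illustrative.

```python
# Figures quoted in the article; both totals are approximate.
PROJECTS = 240
TOTAL_HOURS = 6_000        # article says "more than 6,000"
TOTAL_VALUE_USD = 140_000  # article says "about $140,000"


def per_project_averages(projects, hours, value_usd):
    """Return rough per-project hours, per-project value, and implied hourly rate."""
    return {
        "hours_per_project": hours / projects,
        "usd_per_project": value_usd / projects,
        "usd_per_hour": value_usd / hours,
    }


averages = per_project_averages(PROJECTS, TOTAL_HOURS, TOTAL_VALUE_USD)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

On these figures the average project works out to roughly 25 hours and about $583, or around $23 per hour. Since the hour count is a lower bound, the true per-project hours would be somewhat higher and the implied hourly rate somewhat lower.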

Results

Six advanced AI agents were then tested on the same projects.

The systems included Manus, Grok 4, Sonnet 4.5, GPT-5, ChatGPT agent, and Gemini 2.5 Pro.

The agents failed most of the tasks, and even the top performer automated just 2.5% of the freelance work.

Author’s summary: AI fails most remote tasks, with the top performer automating just 2.5%.

Digital Information World — 2025-11-01