Why Your Job Won’t Get Automated

Last week, Anthropic released a preview version of their Cowork product. It is touted as a tool that can automate non-technical work. At its core it uses Claude Opus 4.5 (an LLM) and Claude Code’s “harness” (code that orchestrates the LLM) to plan and execute tasks that involve computer use, for example researching a topic and creating a PowerPoint deck on it. The first time I used Claude Code was like the first time I rode in a fully self-driving car. However, much like the self-driving car, ride in it enough and you quickly realise its shortcomings.

The starkest examples occur when the system encounters novelty. Self-driving cars can handle well-signed and maintained roads, but when met with roadworks or another self-driving car they will freeze or worse (Google “Chinese driverless van” for a laugh), because the world has stepped outside the distribution of their training. LLM agents fail in the same way. The Claude Code equivalent is looping continuously around the same failed plan or deleting an entire set of test cases.

If you ask Cowork for a slide deck on the top five Premier League teams over the last three years, you may receive something that looks plausible, even impressive, complete with charts and confident prose, but with the ordering wrong for two of the years. Dig in and the failure mode becomes clear. The agent could not open the relevant table for those seasons, so it invented results without acknowledging that it was guessing. This is the equivalent of a human assistant who quietly fabricates the accounts, then hands you a glossy report.

Proponents reply that the agents work, and work well, and that you are just not using them properly. These are the same people who would redesign the whole world around self-driving cars, with standardised signals, stations and tracks, instead of investing in railways.

Coding agents are the most advanced use case because they operate within a narrow, formal domain that has a wealth of verifiable training data. Anthropic has even suggested that Claude Code now writes all of their code. What tends to be left out is the larger reality of software engineering. Humans still decide what to build, set the architecture, model the data, define interfaces, handle security, shape UI and UX, orchestrate systems, and then monitor and approve what the agent does. If this is the benchmark for the state of the art, the prospect of replacing jobs that span multiple domains, mixed incentives, and ambiguous goals looks remote.

The deeper mistake sits beneath the hype. Do not confuse the purpose of a job with the series of automatable tasks found inside it. Most organisations do not hire a person because they can make slides, write lists, or collate research. They hire a person because someone must be accountable, and because trust matters more than throughput.

19 December 2026
