From October 2022 to September 2023, Transport for London tested AI-powered CCTV at Willesden Green tube station, running live camera feeds through automated detection systems. According to Wired, the goal was to detect fare evasion, aggressive gestures, and safety risks. Instead, the system generated more than 44,000 alerts, nearly half of them false or misdirected. Children following parents through ticket barriers triggered fare-dodging alarms, and the algorithms struggled to distinguish folding bikes from standard ones.
The impact was immediate: Staff faced more than 19,000 real-time alerts requiring manual review, not because that many problems existed, but because the AI could not distinguish between appearance and intent. Trained to watch motion and posture rather than context, the system exposed a deeper flaw at the core of many AI tools today.
As AI spreads into daily life—from shops to airports—its inability to interpret why we move, rather than simply how, risks turning ordinary human behavior into false alarms.
The Limits of What Cameras Can “See”
Most vision AI excels at spotting patterns: crossing a line, entering a zone, breaking routine. But nuance, ambiguity, and cultural variation trip these systems up.
“In dynamic or crowded environments, one of the biggest challenges is when people or objects block each other from view,” says Tuan Le Anh, CEO of Vietnam-based Advanced Technology Innovations (ATIN). “When people overlap or move quickly in low lighting, the system might merge them into one person or, worse, duplicate them. It’s easy for cameras to miss key actions or misclassify what’s going on entirely.”
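To see how that merging happens, consider a simplified sketch of the greedy intersection-over-union matching that many trackers use to link detections across frames. The boxes, threshold, and scenario below are invented for illustration; this is not ATIN's code.

```python
# Hypothetical sketch: greedy IoU matching, a common way trackers
# associate detections across frames. When two people overlap, both
# tracks can match the same merged box, collapsing two identities
# into one -- the failure Le Anh describes.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Frame t: two pedestrians tracked separately.
tracks = {"person_1": (100, 50, 160, 200), "person_2": (300, 60, 360, 210)}

# Frame t+1: they pass each other and the detector returns one merged box.
detections = [(110, 50, 350, 210)]

for det in detections:
    # Every track overlapping the detection above threshold matches the
    # same box, so two identities silently become one.
    matched = [tid for tid, box in tracks.items() if iou(det, box) > 0.1]
    print(det, "matched to", matched)   # -> both person_1 and person_2
```

Real trackers add motion models and appearance features on top of this, but the underlying ambiguity is the same: when boxes overlap, geometry alone cannot say how many people are there.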
That lack of context has real consequences. A person running could be exercising, fleeing danger, or chasing a bus, but AI sees only the act, not the reason. Most systems process brief visual fragments without factoring in time, crowd dynamics, or audio. “They can say what is happening—like someone running—but not why,” Le Anh notes. “That lack of causal reasoning creates blind spots.”
In practice, this has led to retail cameras mistaking reaching motions for theft, public transit systems disproportionately flagging passengers of color, and healthcare monitors confusing routine gestures with signs of distress—sometimes while missing genuine emergencies.
Le Anh argues the solution lies in training AI to see the whole scene. “When you combine multiple data sources and let the model learn from patterns over time, you get closer to something that understands intent,” he says. “That’s where this technology can stop making the same mistakes and start becoming truly useful.”
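What "learning from patterns over time" can mean in practice is easiest to see in a toy form. The sketch below is purely illustrative (the signal names, weights, and threshold are invented, not Lumana's or ATIN's logic): instead of alerting on a single frame, it fuses several weak cues across a time window.

```python
# Illustrative only: fusing weak signals over a window instead of
# alerting on one frame. All names and weights here are assumptions.

from collections import deque

WINDOW = 30            # frames, roughly one second at 30 fps
ALERT_THRESHOLD = 0.7

scores = deque(maxlen=WINDOW)

def frame_score(pose_is_running, crowd_also_moving, distress_audio):
    """A lone runner in a calm scene scores higher than one runner in
    a crowd that is also rushing (say, everyone catching a train)."""
    score = 0.0
    if pose_is_running:
        score += 0.4
    if pose_is_running and not crowd_also_moving:
        score += 0.3   # running *against* the crowd is the unusual part
    if distress_audio:
        score += 0.3   # shouting, alarms, breaking glass
    return score

def should_alert(pose_is_running, crowd_also_moving, distress_audio):
    scores.append(frame_score(pose_is_running, crowd_also_moving, distress_audio))
    # Alert only if the evidence persists across the window.
    return sum(scores) / len(scores) >= ALERT_THRESHOLD

# One person sprinting for a bus while the crowd also moves: no alert.
print(should_alert(True, True, False))   # False
```

A single suspicious frame never clears the bar; only sustained, corroborated evidence does, which is the intuition behind combining data sources over time.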
False Patterns, Real Consequences
This problem reflects what Sagi Ben Moshe, CEO of Lumana, calls the “pattern-matching trap.” AI trained to classify pixels often latches on to surface details with no real meaning.
“One classic example came from military image-recognition projects,” Ben Moshe tells Fast Company. “They trained the system to detect tanks using photos that happened to be taken near trees. What happened is that the system learned to spot trees, not tanks. It worked great in testing, but failed in the field.”
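The failure is easy to reproduce in miniature. In the toy experiment below (synthetic data, invented for illustration), the "trees" feature happens to match the label perfectly while the actual "tank" evidence is noisy, and the classifier duly learns the background:

```python
# Toy reconstruction of the "tanks and trees" trap with synthetic data.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200

labels = rng.integers(0, 2, n)               # 1 = tank present
tank_signal = labels + rng.normal(0, 2.0, n)  # weak, noisy object evidence
trees = labels.astype(float)                  # spuriously perfect: all tank
                                              # photos were taken in forests
X = np.column_stack([tank_signal, trees])

model = LogisticRegression().fit(X, labels)
print("weight on tank evidence:", round(model.coef_[0][0], 2))
print("weight on trees:        ", round(model.coef_[0][1], 2))
# The trees weight dominates: in the field, a treeless tank goes
# unseen, and a forest with no tank raises an alarm.
```

The model scores perfectly on data drawn the same way, then collapses the moment tanks and trees stop traveling together, exactly the test-versus-field gap Ben Moshe describes.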
Lumana—fresh off a $40 million Series A funding round led by Wing Venture Capital and backed by Norwest and S Capital—designs video AI to avoid those pitfalls. Its “continuous learning models” track motion over time and in context.
“There’s a huge difference between seeing and understanding,” Ben Moshe says. “AI currently can detect a person, but it doesn’t know if that person is distressed, distracted, or just waiting for a ride. And when systems act on that incomplete view, we risk misunderstanding becoming automated at scale.”
The risks are highest in schools, hospitals, and stadiums—places where safety depends on accurate classification, and false positives can cause escalation or missed threats. Lumana’s approach integrates diverse data streams to reduce those errors.
Why AI Needs Physics, Not Just Pixels
Experts argue that real understanding requires more than 2D vision. AI must learn the same physical and spatial rules humans absorb as children: gravity, motion, and cause and effect.
“Today’s AI vision systems are amazing at spotting patterns, but terrible at explaining why something is happening,” Ben Moshe says. “They don’t have a built-in sense of physical logic. A toddler knows that if you push a ball, it rolls. An AI model doesn’t, unless it’s seen millions of videos of balls rolling in similar ways.”
Industry efforts are moving in that direction. Lumana builds structured models of objects, forces, and scenes, while ATIN explores transformer-based vision and 3D scene graphs to capture depth and relational context. But high-resolution, real-time interpretation demands vast processing power. As Ben Moshe puts it, “Not everyone can have an Nvidia H200 sitting in their building.”
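A 3D scene graph is less exotic than it sounds: objects become nodes with positions in space, and edges encode the spatial relations a flat bounding box loses. The sketch below is an assumed, minimal structure for illustration; neither company's actual representation is public.

```python
# Minimal scene-graph sketch (structure assumed for illustration).
# Nodes carry 3D positions; edges record spatial relations.

from dataclasses import dataclass, field

@dataclass
class SceneObject:
    label: str
    position: tuple   # (x, y, z) in meters, camera-relative

@dataclass
class SceneGraph:
    objects: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (subject, predicate, object)

scene = SceneGraph()
person = SceneObject("person", (1.2, 0.0, 4.5))
barrier = SceneObject("ticket_barrier", (1.3, 0.0, 5.0))
scene.objects += [person, barrier]

# Depth makes the relation explicit: the person is *in front of* the
# barrier, not passing through it -- a distinction 2D boxes can't encode.
if barrier.position[2] - person.position[2] > 0.3:
    scene.relations.append((person.label, "in_front_of", barrier.label))

print(scene.relations)   # [('person', 'in_front_of', 'ticket_barrier')]
```

With relations like these, a downstream model can reason about who is near what, and in which direction events are unfolding, rather than scoring raw pixels. The cost, as Ben Moshe notes, is compute.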
Building AI That Understands
As companies race to automate physical spaces, the stakes are clear: Unless AI learns context, we risk scaling human blind spots into automated ones.
“When you deploy AI that sees without understanding, you create systems that act with confidence but without context,” Ben Moshe says. “That’s a recipe for unfairness, distrust, and failure, especially in high-stakes environments.”
Ben Moshe and Le Anh agree: The future of AI won’t hinge on sharper cameras or better labels, but on reasoning—linking movement to meaning and time to intent. If AI is to coexist with humans, it must first understand us and our world.
Progress is happening, with models that integrate time, audio, and environmental cues. But real trust will depend on systems that are not only smarter but also transparent, interpretable, and aligned with human complexity.
When that shift comes, AI won't just recognize a face or track motion; it will grasp the context behind it. And that opens the door to technology that doesn't just watch us, but works with us to create safer, fairer, more responsive public spaces.