25,000 Viewers, Zero Milliseconds to React

What the IShowSpeed incident taught us about livestream safety.

Derek, Founder of Airlock

Key Takeaways

  1. The IShowSpeed incident proved that live content can go viral in seconds, with no way for a human to intervene in time.
  2. Human reaction time averages 250ms. AI frame classification takes under 10ms. That gap is the entire problem.
  3. A real-time safety layer running on-device can detect and act on risky content before it ever reaches viewers.

What happened

On August 16, 2023, Darren Watkins Jr. (better known as IShowSpeed) was streaming Five Nights at Freddy's to roughly 25,000 concurrent viewers on YouTube. A jump scare startled him, he reacted physically, and in the process accidentally exposed himself on camera (Complex, 2023).

The clip spread instantly. Within minutes, #IShowMeat was trending across X (formerly Twitter), Reddit, and TikTok. Screenshots were everywhere. Speed had over 20 million YouTube subscribers at the time. Today he has 52.2 million (Wikipedia, 2026).

YouTube didn't ban him (TMZ, 2023). There was no malice here. It was an accident. But the damage was done the instant it happened, and no amount of post-hoc moderation was going to undo it.

This isn't a one-off. It's a structural problem.

People watch a lot of live content. Global live stream watch time hit 32.5 billion hours in 2024, up 12% from the year before and double what it was in 2019 (Teleprompter.com, 2024). The live streaming market is now worth over $100 billion and is projected to reach $345 billion by 2030 (Teleprompter.com, 2024).

And none of that content has a safety net while it's live.

Platforms moderate after the fact. Twitch took 39,876 enforcement actions for adult nudity and sexual conduct in the second half of 2024 alone (Twitch H2 2024 Transparency Report). YouTube removed 178.5 million videos between 2019 and 2024 (Video Advertising Bureau, 2025). But removing a video after thousands of people have already seen and clipped it doesn't fix the problem. It just cleans up the mess.

Here's the core issue: human reaction time. A peer-reviewed study published in the Indian Journal of Physiology and Pharmacology found that average visual reaction time is approximately 250 milliseconds, with college-age individuals averaging around 190ms under ideal conditions (Jain et al., 2015). And that's for a simple motor response to a visual stimulus. Not the complex cognitive task of recognizing that something inappropriate just appeared on screen, deciding what to do about it, and then physically clicking a button.

In practice, a human moderator watching a stream has zero chance of catching a flash-frame incident. The clip gets captured, screenshotted, and shared before the streamer even says "oh my god."

What a safety layer actually looks like

Think of it like this: right now, your camera feed goes straight to your streaming software, and your streaming software sends it straight to the internet. There's nothing in between. If something goes wrong on camera, it goes wrong live.

A safety layer sits between the camera and the stream. It watches every frame and listens to every word in real time. When it spots something that shouldn't be on stream, it acts before the frame reaches your viewers.
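
In code, that layer is essentially a gate on every frame between capture and encode. Here's a minimal sketch of the idea in Python; `classify_nsfw`, the 0.95 threshold, and the black standby card are illustrative stand-ins, not Airlock's actual internals:

```python
import numpy as np

# Placeholder "safe scene" card shown instead of a risky frame. In a real
# setup this would be the streamer's standby image.
STANDBY_FRAME = np.zeros((720, 1280, 3), dtype=np.uint8)

def classify_nsfw(frame: np.ndarray) -> float:
    """Stand-in for a local model call returning a 0-1 risk confidence."""
    return 0.0  # a real implementation would run on-device inference here

def gate(frame: np.ndarray, threshold: float = 0.95) -> np.ndarray:
    """Pass the frame through only if it clears classification."""
    if classify_nsfw(frame) >= threshold:
        return STANDBY_FRAME  # the risky frame never reaches the encoder
    return frame
```

The important property is that the swap happens before encoding, so a risky frame never exists in the output at all.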

How fast? Modern AI content classification can process a single video frame in under 10 milliseconds on GPU hardware, with total pipeline latency (classification plus post-processing) in the 30-40 millisecond range (API4AI, 2025). Compare that to human reaction time of 190-250ms. The AI has already detected the problem, made a decision, and triggered a protective action before a person's brain has even finished processing what it saw.
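
To make the gap concrete, here's the same arithmetic as a short script, using only the figures cited above:

```python
# Back-of-the-envelope latency budget. At 60fps a new frame arrives every
# ~16.7ms, so sub-10ms classification fits inside a single frame interval.
CLASSIFY_MS = 10    # per-frame GPU classification (API4AI, 2025)
PIPELINE_MS = 40    # upper bound including post-processing
HUMAN_MS = 250      # average visual reaction time (Jain et al., 2015)

for fps in (30, 60):
    interval = 1000 / fps
    print(f"{fps}fps: new frame every {interval:.1f}ms; "
          f"classification uses {CLASSIFY_MS / interval:.0%} of that window")

print(f"Worst-case pipeline ({PIPELINE_MS}ms) still beats human reaction "
      f"({HUMAN_MS}ms) by {HUMAN_MS - PIPELINE_MS}ms")
```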

In the IShowSpeed case, a system like this would have detected the NSFW content within the first frame, triggered an automatic scene switch to a safe image, and the 25,000 viewers would have seen... nothing. Just a brief cut to a standby screen while Speed collected himself. No clip. No trending hashtag. No story.

Not every problem needs a nuclear option

One of the mistakes people make when they think about content safety is treating it as binary. Either the stream is fine or you kill it. That's not how it should work.

A better model is a graduated response. Think of it as a ladder:

  • Low confidence: warn the streamer privately
  • Medium confidence: mute the audio or bleep it
  • High confidence: blur the video feed
  • Very high confidence: switch to a safe scene automatically
  • Critical: alert a moderator or end the stream

You set every threshold. The system follows your rules.

The key word there is "confidence." Not every detection is certain. If the system is 60% sure something is off, maybe it just sends you a private warning. If it's 95% sure, it switches your scene. You decide where those lines are drawn.
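
In code, that ladder is just an ordered threshold table. A minimal sketch, where the cutoff values are purely illustrative defaults rather than anything prescriptive:

```python
# A sketch of the response ladder as a threshold table. The confidence
# cutoffs and action names are illustrative; in Airlock every threshold
# is configurable by the streamer.
LADDER = [  # (minimum confidence, action), most severe first
    (0.99, "alert_moderator_or_end_stream"),
    (0.95, "switch_to_safe_scene"),
    (0.85, "blur_video_feed"),
    (0.70, "mute_or_bleep_audio"),
    (0.60, "warn_streamer_privately"),
]

def respond(confidence: float) -> str | None:
    """Return the most severe action whose threshold the detection clears."""
    for threshold, action in LADDER:
        if confidence >= threshold:
            return action
    return None  # below every threshold: take no action

assert respond(0.62) == "warn_streamer_privately"
assert respond(0.96) == "switch_to_safe_scene"
```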

This matters because false positives are real. You don't want your stream cutting to a standby screen every time you lean back in your chair. Graduated response means the system is proportionate. It matches the severity of the action to the certainty of the detection.

Why this has to run on your machine

There's one more piece that matters here, and it's about privacy.

If a safety system is sending your camera feed to a cloud server for analysis, that means a third party is watching everything on your stream. Every frame. Every word. That defeats the purpose. You're solving one privacy problem by creating another one.

The right way to do this is local processing. The AI models run on your hardware. Your audio and video never leave your machine. No cloud round-trip, no third-party server seeing your footage, no latency penalty from network hops. The detection happens right where the camera feed lives.

It also means the system works even if your internet is spotty. The safety layer doesn't depend on a server being up. It's running next to OBS on your desktop.

This tech exists now

We built Airlock because this problem isn't theoretical. It happens to real streamers, with real audiences, and the current solution is "hope your moderator is fast enough" or "hope it doesn't happen to you."

Airlock is a desktop app that runs alongside OBS. It monitors your audio and video with on-device AI, applies the policies you configure, and takes protective action when it detects a problem. Profanity filtering, face blurring, sensitive info detection, brand masking, and a fully configurable response ladder. All local. All real-time.

We're in early access right now. If you want to be one of the first to try it, you can request access here.

Frequently Asked Questions

Could Airlock have prevented the IShowSpeed incident?

Yes. NSFW frame classification takes under 10ms on modern hardware (API4AI, 2025). Airlock would have detected the content in the first exposed frame and triggered an automatic scene switch before any viewer saw it. The entire incident would have been a brief, unexplained cut to a standby screen.

Does Airlock send my video to the cloud?

No. All processing happens locally on your computer. Your camera feed, audio, and any detected incidents never leave your machine. There is no cloud server involved in the analysis pipeline.

Will it cause false positives and interrupt my stream?

That's why the response is graduated. Low-confidence detections just warn you privately. Only high-confidence detections trigger visible actions like scene switching. You control all the thresholds, so you can tune it to be as cautious or as aggressive as you want.

Does it work with OBS?

Yes. Airlock connects to OBS via the OBS WebSocket v5 protocol. It can switch scenes, mute audio sources, and trigger other actions directly through OBS. No plugins or additional software needed.
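
For the technically curious, here's roughly what those actions look like from Python over OBS WebSocket v5. This sketch uses the third-party obsws-python client as an example; the client choice, connection details, and source names are assumptions, not Airlock's internal code:

```python
# Sketch of protective actions over OBS WebSocket v5 via the third-party
# obsws-python client. Host, port, password, and names are placeholders.
import obsws_python as obs

client = obs.ReqClient(host="localhost", port=4455, password="your-password")

def switch_to_safe_scene(scene: str = "Standby") -> None:
    """Send SetCurrentProgramScene to cut the output to a safe scene."""
    client.set_current_program_scene(scene)

def mute_source(source: str = "Mic/Aux") -> None:
    """Send SetInputMute to silence an audio input."""
    client.set_input_mute(source, True)
```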