For a decade, mobile games have been mostly finger games. Tap, swipe, drag, long-press. The phone was a thing you held and manipulated with thumbs. We think that’s about to change.
The 67 Challenge wasn’t really about the number 67. It was about a thing people noticed they enjoyed: their body in the camera, producing a number. That’s a game genre, and it’s been technically viable for about 18 months without anyone really showing up to claim it.
The three things that converged
On-device ML finally works. MediaPipe, Core ML, and TFLite can all do pose estimation at around 30 FPS on commodity phones. The inference runs locally, doesn’t need a network round-trip, and doesn’t cost anything per session. The models are small enough to ship inside an app bundle.
Front cameras got good. The iPhone front camera is now a reasonable sensor — wide angle, low noise, enough resolution that a 480x360 crop still produces useful pose inference. The same phone that ran GarageBand can now run a movement-tracking game.
TikTok retrained users. Three years ago, pointing a phone camera at yourself while doing a physical action felt weird to most adults. Today, tens of millions of adults do it every day. The cultural barrier that kept camera games niche is gone.
Why this is a category, not a trend
Trends come from a single viral moment. Categories come from a durable new primitive — a thing that was hard yesterday and easy today. Camera input is a new primitive.
Once it’s cheap to run real-time body tracking on a phone, you get a whole class of games that weren’t possible before: pump-speed games like 67 Speed, rhythm games where you dance on the beat, reaction games where you dodge, stretch games, pose-matching games, multiplayer games where two friends face two phones. The primitive is upstream; the genre branches are up to builders.
Why no one has dominated yet
Camera games look simple from the outside (“just wrap MediaPipe”) and are harder than they look from the inside. The tuning work (scoring thresholds, smoothing, camera placement hints, haptic timing) is the difference between a toy and a game. Most of the camera-input projects on GitHub are toys. What their authors ship next determines whether they become games.
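To make the tuning point concrete, here is a minimal sketch of two of those knobs, smoothing and scoring thresholds, applied to rep counting. It is an illustration under assumptions, not how any shipped game does it: the input is assumed to be one normalized landmark coordinate per frame (0.0 to 1.0, say a wrist height from a pose model), and the names RepCounter and count_reps and the values for alpha, high, and low are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RepCounter:
    """Counts up/down cycles in a noisy 1-D signal (hypothetical helper)."""

    alpha: float = 0.5   # smoothing factor: higher reacts faster, lower is steadier (assumed value)
    high: float = 0.6    # smoothed value must rise above this to enter the "up" state (assumed value)
    low: float = 0.4     # and fall below this to finish the rep (assumed value)
    smoothed: Optional[float] = None
    is_up: bool = False
    count: int = 0

    def update(self, value: float) -> int:
        """Feed one per-frame sample and return the running rep count."""
        # Exponential moving average to damp frame-to-frame landmark jitter.
        if self.smoothed is None:
            self.smoothed = value
        else:
            self.smoothed = self.alpha * value + (1 - self.alpha) * self.smoothed

        # Two thresholds (hysteresis) so that wobble around a single
        # cut-off line is not scored as a burst of reps.
        if not self.is_up and self.smoothed > self.high:
            self.is_up = True
        elif self.is_up and self.smoothed < self.low:
            self.is_up = False
            self.count += 1
        return self.count


def count_reps(samples: Iterable[float]) -> int:
    """Run a fresh counter over a whole sequence of samples."""
    counter = RepCounter()
    for sample in samples:
        counter.update(sample)
    return counter.count


if __name__ == "__main__":
    one_rep = [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
    print(count_reps(one_rep * 2))         # two clean up/down cycles
    print(count_reps([0.55, 0.45] * 10))   # jitter around the midpoint
```

The gap between high and low is the kind of thing that needs playtesting: too narrow and jitter scores points, too wide and honest reps get dropped. The same trade-off applies to alpha, where heavier smoothing hides noise but adds lag to a game that is supposed to feel fast.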
We also think distribution is harder than people expect. A camera game needs permission, needs a well-lit room, needs a few feet of space. Getting casual mobile users past those frictions in a 1-minute first-play window is its own research problem.
What we’re watching next
Multiplayer. Right now the state of the art is one person, one phone, one leaderboard. The next unlock is two people on two phones playing the same round — either head-to-head scoring or coordinated dance. The networking is easy; the UX of “point your camera at yourself while also looking at your friend’s screen” is hard.
Also watching: AR glasses. When the camera is on your face instead of in your hand, a whole new set of inputs opens up. We’re betting camera games transfer there naturally.
What we’re doing
Shipping Omoggle as well as we can, and treating it as the first entry in the category. The engineering is reusable (see how we built it). The brand is reusable. The leaderboard infrastructure is reusable. Whatever the next camera game is, we want to have built the one before it.