Engineering · April 4, 2026 · 8 min read

How We Built Omoggle in an Expo WebView

Why we run MediaPipe Face Mesh inside a WebView instead of native bindings, and how that decision lets Omoggle share camera scoring logic across app and web.

When we started Omoggle, we had a small decision tree: native iOS with Vision or ARKit, React Native with a custom bridge, or a WebView running browser JavaScript. We picked the WebView. Here is why, and what it cost us.

The decision

MediaPipe ships a first-class JavaScript build of Face Mesh. It’s well-maintained, stable, and fast enough to score a live duel on modern phones. The native React Native bindings we looked at were thinner, less maintained, and forced us to split the scoring path between app and web much earlier than we wanted.
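For flavor, here is roughly what the browser setup looks like. This is a sketch against the stock @mediapipe/face_mesh JavaScript API, not a dump of our production code, and the option values are illustrative.

```ts
import { FaceMesh } from "@mediapipe/face_mesh";

// Load the model assets from the CDN and configure a single-face pipeline.
const faceMesh = new FaceMesh({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${file}`,
});
faceMesh.setOptions({
  maxNumFaces: 1,
  refineLandmarks: true,
  minDetectionConfidence: 0.5,
  minTrackingConfidence: 0.5,
});

// Each result carries the landmark array we feed into scoring.
faceMesh.onResults((results) => {
  const face = results.multiFaceLandmarks?.[0];
  if (face) {
    // hand the landmarks off to the scoring code
  }
});

// Pump video frames into the model.
async function scoreLoop(video: HTMLVideoElement): Promise<void> {
  await faceMesh.send({ image: video });
  requestAnimationFrame(() => scoreLoop(video));
}
```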

Shipping native bindings means shipping Xcode work, build-system friction, and a separate Android story later. Shipping the WebView version means writing the scoring surface once and reusing it across the app shell and browser distribution.

We chose the WebView.

What the WebView cost us

Three things, honestly:

  • A WebView tax. WebView frame pacing isn’t quite native. On older iPhones we can still feel preview jitter before we feel scoring drift.
  • Message-passing overhead. The duel surface needs to communicate round state, scoring output, and call lifecycle events back to React Native. We use postMessage and a handler on the RN side (a sketch follows this list). It adds a little ceremony, but not enough to matter in a ten-second round.
  • Permissions edge cases. Camera permissions, audio permissions, autoplay policy, and the order they resolve in are still messier inside a WebView than in a plain browser tab.
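To make the second point concrete, here is the rough shape of the bridge. The message type and payload below are made up for this post; the mechanism is just react-native-webview's postMessage / onMessage pair. On the page side:

```ts
// Inside the WebView page: react-native-webview injects ReactNativeWebView,
// so the duel surface can post JSON strings up to the shell.
function postToShell(type: string, payload: unknown): void {
  (window as any).ReactNativeWebView?.postMessage(JSON.stringify({ type, payload }));
}

postToShell("round:score", { score: 87, confidence: 0.93 }); // illustrative event name and payload
```

And on the React Native side, a handler that parses the string back and routes it:

```tsx
import { WebView } from "react-native-webview";

export function DuelScreen() {
  return (
    <WebView
      source={{ uri: "https://omoggle.com/duel" }}
      onMessage={(event) => {
        const msg = JSON.parse(event.nativeEvent.data);
        if (msg.type === "round:score") {
          // update duel state with msg.payload
        }
      }}
    />
  );
}
```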

What the WebView gave us

Everything else. The same scoring surface can run inside the iPhone shell and in the browser build. The face-mesh logic is shared. When we tune a threshold or quality gate, both surfaces move together.
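One way to picture it: the thresholds live in a shared module that both the WebView bundle and the web build import. The file and values below are a toy illustration, not our real config.

```ts
// scoring/config.ts — imported by both the WebView bundle and the web build.
// Names and numbers are made up for this post.
export const SCORING = {
  minLandmarkConfidence: 0.6, // drop frames the mesh isn't sure about
  framesPerRound: 24,         // samples aggregated per round
  outlierTrimRatio: 0.1,      // trim top/bottom 10% before averaging
};
```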

The browser version also means creators can link straight to omoggle.com and anyone with a supported camera can try the core loop instantly.

The tuning we couldn’t avoid

The tuning work moved from raw landmark smoothing to quality control. A single bad frame can swing a face metric more than we want, so we aggregate multiple frames, trim outliers, and gate rounds on confidence before we write the result.
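The shape of that gate, as a sketch (the thresholds and names here are illustrative, not our production values):

```ts
interface FrameSample {
  score: number;       // per-frame metric derived from the mesh
  confidence: number;  // detection/tracking confidence for that frame
}

// Aggregate per-frame scores, trim outliers, and refuse to write a round
// that isn't confident enough.
function scoreRound(samples: FrameSample[], minConfidence = 0.6, trimRatio = 0.1): number | null {
  const confident = samples.filter((s) => s.confidence >= minConfidence);
  if (confident.length < samples.length * 0.5) return null; // gate: too few good frames

  const sorted = confident.map((s) => s.score).sort((a, b) => a - b);
  const trim = Math.floor(sorted.length * trimRatio);
  const kept = sorted.slice(trim, sorted.length - trim);
  if (kept.length === 0) return null;

  return kept.reduce((sum, x) => sum + x, 0) / kept.length; // trimmed mean
}
```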

That was the difference between a funny demo and a rating system we were willing to put in front of real users.

We also learned quickly that preview resolution is easy to overbuy. Lowering the working resolution cut latency without harming score quality, because the mesh model cared more about a stable face in frame than about excess pixels.
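In practice that just means asking getUserMedia for less. "ideal" constraints let the browser land near the target without failing outright; the numbers here are illustrative, not a recommendation.

```ts
// Ask for a modest preview; the mesh cares about a stable face, not extra pixels.
async function startPreview(video: HTMLVideoElement): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      facingMode: "user",
      width: { ideal: 640 },   // illustrative; tune per device tier
      height: { ideal: 480 },
      frameRate: { ideal: 30 },
    },
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
}
```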

What we’d do the same

The WebView decision, again. It is the reason we can move quickly on the app while still keeping a browser surface alive for instant distribution and SEO.

For anything involving a browser-first ML SDK, the WebView approach is underrated. You pay a small tax for a lot of reach.
