Discovery, segmentation, and the mobile-first call are in the previous case study. This picks up where that ends: scope defined, team assembled — now the harder part. No existing processes, first PM hire, small cross-functional team, AI-native app.
Before any frontend work, the goal was to prove the core technology worked. Partnered with ML engineers on the vector database and RAG architecture. Content extraction tested across real-world sources — web pages, articles, LinkedIn posts, Instagram, Facebook — to validate the AI could reliably parse and index diverse content formats.
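The pipeline itself isn't part of this case study, but the flow being validated (fetch a source, extract readable text, embed it, index it for retrieval) reduces to something like the sketch below. The libraries, the in-memory index, and every name in it are illustrative assumptions, not Loft's actual stack.

```python
# Hypothetical save-and-index flow; libraries and names are assumptions.
import numpy as np
import requests
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works here

def extract_text(url: str) -> str:
    """Fetch a page and pull out its readable text."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return " ".join(p.get_text(" ", strip=True) for p in soup.find_all("p"))

class VectorIndex:
    """Minimal in-memory stand-in for the vector database."""
    def __init__(self):
        self.vectors, self.items = [], []

    def add(self, url: str, text: str) -> None:
        self.vectors.append(model.encode(text))
        self.items.append({"url": url, "text": text})

    def search(self, query: str, k: int = 5) -> list[dict]:
        q = model.encode(query)
        sims = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
                for v in self.vectors]
        top = np.argsort(sims)[::-1][:k]
        return [self.items[i] for i in top]
```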
Platform blocking: Some platforms like Instagram don't expose content for extraction. Rather than treat this as a blocker, I framed the trade-off clearly: accept a small failure rate, add graceful error handling, move on. Not worth the weeks a perfect fix would take.
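In code, that trade-off is small. Reusing the extract_text and VectorIndex stand-ins from the sketch above: the link is always saved, and a blocked platform or failed extraction degrades quietly instead of erroring. The blocked-host list and status values are assumptions for illustration.

```python
from urllib.parse import urlparse

BLOCKED_HOSTS = {"instagram.com", "facebook.com"}  # assumed list, not Loft's

def save_bookmark(url: str, index: VectorIndex) -> dict:
    """Always keep the link; only enrich it when extraction is possible."""
    host = urlparse(url).netloc.removeprefix("www.")
    if host in BLOCKED_HOSTS:
        # Known-blocked platform: save the link, skip AI enrichment, no error shown.
        return {"url": url, "status": "saved_without_preview"}
    try:
        index.add(url, extract_text(url))
        return {"url": url, "status": "indexed"}
    except Exception:
        # Any other extraction failure degrades the same way.
        return {"url": url, "status": "saved_without_preview"}
```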
AI eval — Collections clustering: One core feature was auto-organizing bookmarks into Collections based on semantic similarity. Before committing to it, I worked with the team to define what "good" meant: if someone saves ten articles about leadership, do they cluster together? What's the acceptable error rate before the feature creates confusion? Built a lightweight eval framework — test set of saved content, ran the model, reviewed outputs, set a quality threshold. That validated it was ready to ship, not just technically functional.
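The eval itself stayed internal, but its shape (labeled test set, run the clustering, score it, gate on a threshold) is easy to sketch. The purity metric and the 0.85 threshold below are placeholders, not the team's actual numbers.

```python
# Hypothetical eval harness for Collections clustering.
from collections import Counter

def cluster_purity(predicted: list[int], expected: list[str]) -> float:
    """Share of items whose expected collection matches the majority
    label of the cluster they were assigned to."""
    clusters: dict[int, list[str]] = {}
    for cid, label in zip(predicted, expected):
        clusters.setdefault(cid, []).append(label)
    correct = sum(Counter(labels).most_common(1)[0][1] for labels in clusters.values())
    return correct / len(expected)

def run_eval(cluster_fn, test_set: list[dict], threshold: float = 0.85) -> bool:
    """cluster_fn takes a list of texts and returns one cluster id per item."""
    texts = [item["text"] for item in test_set]
    expected = [item["expected_collection"] for item in test_set]
    score = cluster_purity(cluster_fn(texts), expected)
    print(f"cluster purity: {score:.2f} (ship threshold {threshold})")
    return score >= threshold
```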
The designer proposed a home screen with a daily feed and recommendations engine. The timeline didn't support it.
Rather than write a spec arguing the point, I built a lo-fi prototype over a weekend (Top Collections, Top Tags, Simple CTA, Loft Tips) and shared it with the team. The goal wasn't pixel-perfection; it was to give the designer something concrete to react to and improve. That handoff, from rough concept to high fidelity, happened in a day.
Ran TestFlight builds every few days and shared them with early testers. The feedback was clear: saving content in-app felt no better than the existing workaround of copying a link and sending it to yourself on WhatsApp. Brought it to engineering, and the share-sheet solution came out of that conversation: embed Loft directly in the iOS and Android share sheets so saving is one tap, no context switch. Built it, re-tested, and shipped it in the launch build.
Paywall SDK failure: The paywall SDK broke mid-sprint; product IDs stopped syncing with App Store Connect. Engineering was split on how to proceed. My role wasn't to diagnose the failure; it was to communicate the impact clearly to leadership and marketing and to unblock the product questions so engineering could move. When RevenueCat was proposed as the replacement, I had the answers ready: how the freemium and pro tiers were structured, which features were gated, and what the paywall UX needed to communicate. The integration moved fast once those questions were off the table.
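For illustration only, the gating half of those answers can live in a single entitlements map that both the paywall and the app read from; the tier names and gated features below are assumptions, not Loft's actual packaging.

```python
# Assumed feature gating for freemium vs. pro; names are illustrative.
ENTITLEMENTS = {
    "free": {"save", "collections"},
    "pro":  {"save", "collections", "ai_clustering", "unlimited_saves"},
}

def can_use(feature: str, tier: str) -> bool:
    """Single source of truth the app and the paywall copy both check."""
    return feature in ENTITLEMENTS.get(tier, set())
```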
App Store rejection: First submission rejected for unclear data collection disclosure. Fixed the privacy policy language, resubmitted, approved shortly after.
Targeted a tighter window. Shipped later. Slippage came from model optimization, scope decisions, the paywall tool switch, and the App Store review cycle. Each delay had a clear cause — and a decision attached to it.
The hardest part of execution isn't making decisions — it's translating them into work that engineering and design can actually pick up and run with. Clear tickets, sprint-sized chunks, staying present enough to unblock both teams — that's what moves an idea from a spec into something a user can hold, test, and tell you is wrong.
Validation doesn't start at launch. It starts the moment the first build lands in front of a real person.