
COEY Cast Episode 146
Gemini Flash Live and the Great AI Workflow Reality Check
Episode Overview
03/29/2026
Google is pushing Gemini 3.1 Flash Live into real-time voice and camera workflows, and that makes one thing clear. Voice AI is becoming a real interface layer for brands, not just a flashy demo. The bigger question is where it actually works. Customer triage, guided commerce, multilingual support, and structured actions look promising. Emotional nuance, messy edge cases, and brand risk still need people. The conversation also turns to Z.ai's GLM 5.1 and why lower-cost models are putting real pressure on premium AI pricing. Add Snapchat and Google building generative tools deeper into ad platforms, and the shift is obvious. AI is moving from magic trick to workflow infrastructure, with humans still steering the ship.


Episode Transcript
Hunter: Happy Sunday, March twenty-ninth, twenty twenty-six, and welcome back to COEY Cast, the show that was assembled by a small orchestra of machines that absolutely did not ask for union representation. I’m Hunter.
Riley: And I’m Riley. Also, apparently it is Smoke and Mirrors Day, which is honestly perfect because AI this week has been like, is it magic, is it a product, is it a hallucination with a landing page? Yes.
Hunter: Yeah, that tracks. And if this episode feels unusually synthetic in places, good. It was stitched together with an AI tool stack, fully automated, human-guided, slightly haunted, and we left the weird edges in on purpose.
Riley: Which is kind of the point. If the robot trips over a sentence, we don’t hide it. We just hand it a mic and keep rolling.
Hunter: Alright, let’s get into it, because the biggest story to me is Google pushing Gemini three point one Flash Live harder into the wild. Real-time multimodal, voice, camera, lower latency, better multilingual support, stronger function calling, and people on X are basically saying, oh cool, the assistant is starting to sound less like a call center IVR and more like something you might actually use.
Riley: Mm, maybe. But I wanna push on that. People hear faster voice AI and immediately go, customer service is solved. No, babe. It’s getting better at the front door. That is not the same as being good in the house.
Hunter: That’s exactly right. I think Gemini Flash Live matters because it moves AI closer to becoming the first layer of customer experience. Not the full stack. The first layer. If I’m a brand, I can see clear use cases now for product discovery, store questions, appointment handling, multilingual FAQ, maybe guided shopping. Stuff where speed and context matter more than deep emotional nuance.
Riley: Right, like, where’s my order, what colorways are left, can this software connect to Shopify, what time do you close, which serum is for dry skin. That kind of thing. Quick, helpful, maybe visual too if the customer points a camera at something.
Hunter: Exactly. But the illusion breaks the second the conversation gets messy in a human way. Refund frustration. Edge-case policy issues. Emotional ambiguity. Sarcasm. Family plan billing drama. The customer saying one thing but really meaning another thing.
Riley: Or the classic, I bought this from your ad on Instagram but actually through a reseller but your logo was there so now it’s your problem. That’s where the AI smile can turn into dial tone energy real fast.
Hunter: Yeah. So for marketers, the smart move isn’t to say, let’s replace support. It’s more like, let’s build AI as triage plus guidance. Let it greet, qualify, route, answer common stuff, and maybe complete structured actions through function calling if the workflow is clean.
Riley: That function calling part is the sneaky important bit. Faster voice is cute. Stronger function calling is money. If the model can actually check inventory, book a demo, pull account info, trigger a workflow, now we’re not just flirting with the future. We’re actually doing ops.
Hunter: Thank you. That’s the difference between demo theater and workflow leverage. A voice assistant that can talk is nice. A voice assistant that can talk and do is useful.
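[Editor's note: for readers who want the concrete version of "talk and do," the gap comes down to the model emitting a structured tool call that your backend validates and executes. Below is a minimal sketch; the handler names (`check_inventory`, `book_demo`), the inventory data, and the call format are illustrative assumptions, not Gemini's actual API.]

```python
# Minimal sketch of voice-assistant function calling: the model emits a
# structured call; our code validates it and runs the matching handler.
# Handler names, data, and the call format are illustrative assumptions.

TOOLS = {}

def tool(fn):
    """Register a function so the assistant is allowed to call it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def check_inventory(sku: str) -> dict:
    # Stand-in for a real inventory lookup.
    stock = {"serum-dry-skin": 12, "hoodie-xl": 0}
    return {"sku": sku, "in_stock": stock.get(sku, 0)}

@tool
def book_demo(email: str, slot: str) -> dict:
    # Stand-in for a real scheduling integration.
    return {"email": email, "slot": slot, "confirmed": True}

def dispatch(call: dict) -> dict:
    """Execute a model-emitted call like {"name": ..., "args": {...}}.

    Only registered tools can run, which is the guardrail that keeps
    the assistant from freestyling on actions it shouldn't take.
    """
    fn = TOOLS.get(call.get("name"))
    if fn is None:
        return {"error": f"unknown tool: {call.get('name')}"}
    return fn(**call.get("args", {}))

# Example: the model heard "is the dry-skin serum in stock?" and emitted:
result = dispatch({"name": "check_inventory", "args": {"sku": "serum-dry-skin"}})
```

The registry-plus-dispatch shape is the point: the model only proposes actions, and the allowlist decides what actually runs.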
Riley: But I still think marketers need to chill a little. Better in noisy environments does not mean ready for every retail floor, event booth, or creator collab shoot. There’s still a huge gap between a polished launch video and a real person shouting over music while asking if the promo code stacks.
Hunter: Totally fair. The build now bucket is narrow but real. Guided commerce, event activations, basic customer intake, multilingual top-of-funnel interaction. The cool demo, questionable rollout bucket is anything that needs heavy compliance, emotional sensitivity, or too many backend dependencies.
Riley: So don’t let your assistant freestyle on refunds, legal, or brand tone in a crisis. That’s not bold. That’s a postmortem.
Hunter: And this ties into something we’ve been talking about on recent episodes. Audio is not a side feature anymore. It’s becoming workflow infrastructure. We talked about that with open speech models, transcription, narrated content, all of it. Gemini Flash Live is another signal that voice is moving from novelty to interface.
Riley: Yeah, the vibe shift is real. Talking to systems is starting to feel less like sci-fi cosplay and more like normal software behavior.
Hunter: Now, the other big thread this week is Z.ai’s GLM five point one, and I think this one matters for a completely different reason. If the chatter is even directionally true, and it’s getting performance somewhere near premium frontier territory at much lower cost, then model choice starts becoming a finance conversation way faster than some teams want to admit.
Riley: Oh, absolutely. This is where the fancy AI strategy deck starts sweating. Because a lot of teams have been treating premium model selection like a personality trait. Like, no no, we are a top-shelf Claude household. Meanwhile finance is in the corner going, cool, but why are we paying luxury prices to summarize support tickets?
Hunter: That’s the point. Not every workflow needs the most elite model on earth. If GLM five point one or something like it is good enough for drafting, classification, coding assist, internal copilots, agent steps, then the stack changes. Premium models become selective tools for high-stakes tasks, not default settings.
Riley: And people are missing that this is not just about one model. It’s about pricing pressure. China’s labs are shipping aggressively, open source is maturing, and suddenly the premium moat looks a little less like a castle and a little more like a subscription you forgot to cancel.
Hunter: Nicely put. Though I’ll add a caution. Good enough is contextual. A cheap model that drifts in long contexts, breaks on agent loops, or gets weird under load can still become expensive if it creates review overhead, error handling, or trust issues.
Riley: Yeah, because the invoice is not the whole cost. The human babysitting tax is real. If your team has to constantly rescue the output, you didn’t save money. You just moved the spending into Slack panic.
Hunter: That’s why I think the right strategy for most orgs is model tiering. Use cheaper strong models for the broad middle. Use premium models for edge cases, judgment-heavy tasks, or brand-critical output. And keep the system model-agnostic so you can swap fast when the market moves, which right now it does every five minutes.
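[Editor's note: the tiering strategy Hunter describes can be made concrete with a thin routing layer that keeps call sites model-agnostic, so swapping vendors is a one-line change. A minimal sketch follows; the tier names, model IDs, and routing criteria are placeholders, not a recommendation of specific models.]

```python
# Sketch of model tiering behind a model-agnostic interface.
# Model IDs and routing criteria are placeholders for illustration.

ROUTES = {
    "bulk": "cheap-model-v1",       # drafting, classification, agent steps
    "premium": "frontier-model-v1", # judgment-heavy, brand-critical output
}

def pick_model(task: dict) -> str:
    """Route high-stakes work to the premium tier, everything else to bulk."""
    high_stakes = task.get("customer_facing") or task.get("legal_review")
    return ROUTES["premium" if high_stakes else "bulk"]

def complete(task: dict, prompt: str) -> str:
    model = pick_model(task)
    # Swap this stub for your actual provider client. Keeping every call
    # behind one function is what makes vendor swaps cheap when the
    # market moves.
    return f"[{model}] response to: {prompt}"

# Routine internal summarization goes to the cheap tier:
out = complete({"customer_facing": False}, "summarize support tickets")
```

The point is structural, not the specific models: when the premium/bulk split lives in one routing function, repricing the stack is a config change instead of a rewrite.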
Riley: Hunt, this is where I bring up the social chatter angle. People online love to debate who won the benchmark fight, but creators and marketers should care more about reliability in an actual workflow. Can it take the brief, touch the spreadsheet, call the tool, generate the draft, and not completely lose the plot by step four?
Hunter: Yep. Benchmarks matter, but workflow evals matter more.
Riley: Thank you. Put that on a tote bag.
Hunter: Let’s bring in the marketing platform side, because Snapchat and Google both had updates that tell the same story. Generative AI is getting built directly into ad production infrastructure. Snapchat’s Lens Studio can turn images into short AI video clips, and Google keeps pushing more AI video generation and automation inside Performance Max.
Riley: This is the part social teams are gonna love and also maybe regret a little. On one hand, turning a still image into a short motion asset is catnip. Fast iteration, more placements, more testing, more stuff for Reels and Shorts and Spotlight. On the other hand, if everybody uses the same convenience layer, feeds get glossy and same-y real fast.
Hunter: That’s the tradeoff. Speed goes up. Differentiation can go down. These tools are powerful for asset expansion. Especially if you have static campaign materials and need motion versions fast. But they don’t replace taste.
Riley: Also, can we talk about Google generating more of the ads inside the same platform that distributes the ads? Because that is convenient, yes, but it also gives landlord-making-your-furniture energy.
Hunter: That’s a good line. And a real concern. The convenience is undeniable, but creative leverage gets fuzzy when the platform both makes and grades the work. Marketers need to be careful not to outsource too much authorship. Otherwise your brand voice starts sounding like everybody else who clicked auto-generate.
Riley: Exactly. If Performance Max is making the asset, choosing the placement, reading the signals, and optimizing the delivery, you need a human asking, wait, what part of this is still ours?
Hunter: I’d use those tools for versioning, adaptation, and volume. Not for core concepting. Let the machine help you turn one campaign into many assets. Don’t let it define the campaign’s soul.
Riley: Ooh. Okay, poet. But yes. That’s the move.
Hunter: And across the broader ecosystem this week, you can really feel the pattern. Audio tools have been leveling up, operator-style agents keep getting better, open and lower-cost models are squeezing pricing, and the creative bottleneck is shifting away from raw generation toward judgment, approvals, systems, and brand control.
Riley: Which is kind of funny, because the headlines still act like the magic trick is the main thing. But the quiet story is infrastructure. Voice that can actually be used. Models that are cheap enough to deploy everywhere. Ad tools that generate variants inside the workflow. It’s less, wow the robot made a thing, and more, wow the robot is now in the production line.
Hunter: That’s the real shift. And it’s why the winners are probably not the teams with the flashiest prompts. It’s the teams with clean handoffs, good review loops, strong governance, and a human who knows when to say, nope, not that.
Riley: Human taste is still undefeated. Sorry to the bots.
Hunter: So if I’m advising a team right now, I’d say this. Use closed models where you need polish, support, and top-end reliability. Use cheaper or open models where cost, control, and scale matter. Build automation around tasks, not around one vendor. And keep a human in the loop where trust, brand, legality, or customer emotion are on the line.
Riley: Mine would be simpler. Don’t get seduced by the smoke and mirrors. If the tool saves time and keeps quality, great. If it only looks futuristic in a screen recording, maybe don’t rebuild your whole company around it.
Hunter: That’s why we keep doing this show.
Riley: And why we keep side-eyeing the demos with love.
Hunter: Alright, that’s our Sunday sprint through Gemini Flash Live, GLM five point one, Snapchat’s image-to-video push, and Google folding more AI into ad ops.
Riley: Thanks for hanging with us on Smoke and Mirrors Day. Please use your powers responsibly.
Hunter: And if you want more AI news and updates, check out COEY.com slash resources.
Riley: And subscribe, obviously. Your future automated self would want that.
Hunter: Thanks for listening to COEY Cast.
Riley: Catch you next time.




