COEY Cast Episode 117

Voice Is the New Landing Page Open vs Closed and Real Time Video

  • Riley Reylers
  • Hunter Glasdow

Episode Overview

02/28/2026

Voice agents just leveled up with OpenAI's gpt-realtime-1.5, and the real win is not small talk. It is inbound and post-click voice concierges that can query your product data, booking tools, and CRM without turning into a spam cannon. The conversation breaks down consent, compliance, and “no manipulation” guardrails. Then it shifts to Meta Llama 3.3 70B and open coding models as the hidden engine for automation glue. Finally, it unpacks PrunaAI P-Video and why real-time video generation is perfect for drafts but not a replacement for human editors.


Episode Transcript

Hunter: It’s Saturday, February 28th, 2026, and apparently it’s Public Sleeping Day… which is hilarious because the AI industry did not get that memo. Also Rare Disease Day, which is actually important, so shoutout to everyone pushing awareness and research. This is COEY Cast. I’m Hunter.

Riley: And I’m Riley. And yes, the show is made completely by robots and automation gremlins again. If it gets a little weird, that’s not a bug, that’s the vibe.

Hunter: Today’s big story is OpenAI dropping gpt-realtime-1.5 for voice agents. And, uh, I feel like the entire internet instantly decided voice is the new landing page.

Riley: Wait, “voice is the new landing page” is so cursed. But also… true. Like, people are hyped because it’s low-latency speech to speech, streaming audio, and better tool calling. It’s the closest thing to having Siri but, like, actually employed.

Hunter: Exactly. The upgrade that matters is less “look, it can talk” and more “it can talk while doing.” Better function calling is the difference between a demo and a system. You can pipe it into your product catalog, your booking calendar, your CRM, your support docs, and now the agent isn’t just chatting, it’s operating.

Riley: Okay, but answer the question everyone keeps dodging. What’s the first voice agent use case that’s actually worth shipping for marketing?

Hunter: A post-click voice concierge. Not cold calls. Not replacing your whole sales team. I mean: someone lands on your site, hits a “talk to an expert” button, and the agent helps them pick the right plan, handles objections, and then schedules a human if the vibe is high intent.

Riley: Mmm. So, voice as a conversion assist, not voice as a replacement human. I like that. Because the fantasy people are stuck in is like, “My AI will call ten thousand leads and book my calendar.” And I’m like… congrats, you invented a spam cannon with a friendly tone.

Hunter: Yeah, stop fantasizing about outbound robo-sales that improvises in real time. It’s not just ethically messy, it’s operationally dumb. One misheard detail, one consent issue, one weird persuasion moment, and you’re trending for the wrong reason.

Riley: Hold up, though. You said “post-click.” Isn’t the bigger win inbound phone too? Like the moment someone calls your business number and instead of being trapped in “press one to suffer,” they get a real conversation that can also upsell?

Hunter: Totally. Inbound is clean. People initiated. Consent is simpler. You can still do the cross-sell, you can still do lead qualification, but you’re not ambushing anyone. Plus you can have a really crisp rule like, “If the person asks pricing, the agent can answer from approved pricing tables only.” Human in the loop for edge cases.
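
The "approved pricing tables only" rule Hunter describes could be sketched as a tool function the agent calls instead of answering from memory. This is a minimal illustration, not anything shown in the episode: the table contents, the `ToolResult` type, and the `get_price` name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical approved pricing table. Plan names and prices are invented
# for illustration; a real deployment would load this from a reviewed source.
APPROVED_PRICING = {
    "starter": 29.00,
    "team": 99.00,
}


@dataclass
class ToolResult:
    answer: Optional[str]  # text the agent may say, or None if nothing is approved
    escalate: bool         # True means hand the caller to a human


def get_price(plan_name: str) -> ToolResult:
    """Answer a pricing question only from the approved table."""
    key = plan_name.strip().lower()
    if key in APPROVED_PRICING:
        price = APPROVED_PRICING[key]
        return ToolResult(answer=f"The {key} plan is ${price:.2f} per month.", escalate=False)
    # Unknown plan or edge case: never improvise a number, escalate instead.
    return ToolResult(answer=None, escalate=True)
```

Under these assumptions, a question about a plan that is not in the table produces no answer at all and sets `escalate`, which is the human-in-the-loop path for edge cases.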

Riley: Okay, but where’s the line between helpful product advisor and creepy persuasive robot? Because real-time adaptation is where it gets spicy. If it hears hesitation in my voice and suddenly goes into therapist mode, I’m out.

Hunter: The line is intent and transparency. Helpful is, “Here are options, here are tradeoffs, want me to book a demo?” Creepy is, “I noticed you sound uncertain, so I’m going to pressure you with scarcity and emotional language.” You can literally design against that. Put guardrails in the system prompt, constrain the allowed tactics, and log for review.
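
One way to picture "constrain the allowed tactics, and log for review" is an allowlist check that runs on every drafted reply. This sketch assumes some upstream step already labels each reply with a tactic name; that labeling step, the tactic names, and `review_reply` are hypothetical and not from the episode.

```python
import logging

# Every decision is logged so a human can audit the agent's behavior later.
logger = logging.getLogger("voice_agent.review")

# Hypothetical allowlist of tactics, mirroring the "helpful" examples above.
ALLOWED_TACTICS = {"list_options", "explain_tradeoffs", "offer_demo"}


def review_reply(reply_text: str, tactic: str) -> bool:
    """Return True if the drafted reply uses an allowed tactic; log either way."""
    allowed = tactic in ALLOWED_TACTICS
    logger.info("tactic=%s allowed=%s reply=%r", tactic, allowed, reply_text)
    return allowed
```

A reply tagged with something like `scarcity_pressure` would be rejected here and still show up in the review log, so the team can see how often the agent tries it.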

Riley: And it needs a “no manipulation” style guide the way brands have a visual style guide. Like, what emotions are allowed? What claims are allowed? What’s the escalation policy when someone says, “Are you recording me?” Because people will ask.

Hunter: That’s the unsexy constraint, by the way. Everyone on X is debating latency and cost, but the hidden constraint is consent plus compliance. Especially if you’re recording calls, or generating speech that sounds like a person, or doing multilingual stuff across regions.

Riley: Yeah. Also integration debt. Like, sure, the model can call functions better… but your internal systems are still a haunted house. Your product data is stale, your support docs contradict each other, your CRM fields are a junk drawer.

Hunter: Exactly. The model is only as good as your truth layer. If your catalog says the plan includes something it doesn’t, the agent will confidently sell imaginary features. The workflow win is: build a single source of truth that the voice agent can query.

Riley: So for creators and marketers listening, the play is not “ship a voice agent.” The play is “ship a voice agent on top of clean, structured answers.” Like an approved Q and A library, a product matrix, and an escalation route.

Hunter: Yes. And gpt-realtime-1.5 makes it feel natural enough that people will actually use it. The conversation flow matters. Interruptions, corrections, “wait I meant next Tuesday,” all that. That’s where voice used to fall apart.

Riley: Speaking of “fall apart,” can we talk about Meta quietly dropping Llama 3.3 70B vibes again? People are framing it like efficiency wars. Less compute, better coding performance. That’s not sexy marketing… but it’s secretly the marketing engine.

Hunter: One hundred percent. Open coding models don’t write your tagline. They build your automations. They generate the glue code for scrapers, pipeline fixes, ad QA bots, landing page experiments, internal agents that actually talk to your tools without paying twelve different SaaS tolls.

Riley: But what breaks first when everybody starts running open models in-house-ish? Security? Governance? Or like… org politics?

Hunter: Org politics. Security and governance are “known problems.” Politics is the silent killer. Marketing wants speed, IT wants control, legal wants zero risk, finance wants cost down. Open models unlock capability, but they also force you to decide who owns the automation layer.

Riley: Mmm. And the temptation is to avoid per-seat pricing by building everything yourself… and then you accidentally become a model hosting company. Suddenly you’re on call because the GPU node had a bad day.

Hunter: The smartest middle path is not “self-host everything” or “SaaS everything.” It’s routing. Use open models for predictable, high-volume tasks like code generation, data cleanup, classification. Use closed models when you need best-in-class multimodal reasoning or premium voice. And keep your orchestration layer portable so you can swap models without rewriting the whole business.
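
The routing idea can be sketched as a small lookup table that lives in the orchestration layer, outside any model. The two handler functions below are placeholders standing in for real model clients, and the task type names are assumptions based on the examples Hunter lists.

```python
from typing import Callable, Dict


def open_model(prompt: str) -> str:
    # Placeholder for a call to a self-hosted or open-weights model.
    return f"[open] {prompt}"


def closed_model(prompt: str) -> str:
    # Placeholder for a call to a hosted, closed model API.
    return f"[closed] {prompt}"


# Swapping a model means editing this table, not rewriting business logic.
ROUTES: Dict[str, Callable[[str], str]] = {
    "code_generation": open_model,
    "data_cleanup": open_model,
    "classification": open_model,
    "multimodal_reasoning": closed_model,
    "premium_voice": closed_model,
}


def route(task_type: str, prompt: str) -> str:
    """Send a task to whichever model is configured for its type."""
    try:
        handler = ROUTES[task_type]
    except KeyError:
        raise ValueError(f"No route configured for task type: {task_type}")
    return handler(prompt)
```

Because callers only ever use `route`, the table is the single place where an open model can be traded for a closed one or the reverse.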

Riley: Translation for the people: don’t marry one model. Date around.

Hunter: Exactly, Riles. And keep your business logic outside the model. The model is the assistant, not the brainstem.

Riley: Okay, last monster story: PrunaAI P-Video. People are claiming real-time-ish multimodal generation, fast draft previews, up to 1080p and forty-eight frames per second. On DeepInfra trials, self-host options… the group chat is foaming.

Hunter: If it’s as fast and cheap as people claim, the workflow shift is brutal in a good way. “Generate-first-draft in seconds” collapses ideation and rough cut into the same moment. You’re not storyboarding for days, you’re vibing options in one sitting.

Riley: Wait, but what’s the realistic ceiling for brand-safe, ad-ready output? Because I’ve seen “fast video gen” before. It’s fast… and then the hands melt, the logo morphs, and the product turns into a different product.

Hunter: Brand-safe, ad-ready is still human territory. The ceiling right now is: it’s amazing for drafts, animatics, concepting, background plates, quick variations, maybe product motion for simple shots. But anything with strict brand marks, precise claims, regulated industries, or complex narrative continuity still needs a human editor and usually some traditional tools.

Riley: Yeah, it’s like the history of Photoshop filters. Everyone spammed them, then the people with taste won. Same with AI video. The winners will be teams who can direct, curate, and edit. Not just generate.

Hunter: And there’s a risk: creative testing velocity goes up, personalization gets easier, but we also get a flood of mediocre content. The moat becomes taste plus distribution plus a system that keeps quality from collapsing.

Riley: Also, let’s not forget audio. We talked recently about sound becoming “selectable” with stuff like SAM Audio, and Google’s Veo and Flow making video feel more like filmmaking. Now you’ve got voice agents getting real-time, and video gen getting faster… the stack is basically turning into a content factory.

Hunter: Yeah, and the factory needs a foreman. That’s the human. You set intent, you set boundaries, you approve outputs. Automation removes the grind, but it doesn’t replace judgment.

Riley: Spiciest take I’ve seen on X lately about open versus closed is that people aren’t even optimizing for capability anymore. They’re optimizing for control and cost. And, honestly… vibes. Like “does this model feel reliable or does it feel like a chaos demon?”

Hunter: The chaos demon metric is real. For marketing ops, reliability beats brilliance. If you’re going to automate workflows, you want boring consistency. Save the wild creativity for ideation sandboxes, not your live pipelines.

Riley: So if you had to pick: what’s inevitable, and what’s a bubble? Voice agents, open coding LLMs, or real-time video gen.

Hunter: Inevitable is voice agents for inbound and post-click conversion assist. Bubble is the idea that fully autonomous outbound voice persuasion is going to be a normal, accepted thing. That’s going to get regulated and socially rejected fast.

Riley: I’ll co-sign that. Also, real-time video gen is inevitable for drafts, but the bubble is thinking it eliminates production. It just moves production into curation and editing.

Hunter: Public Sleeping Day moral of the story: don’t sleep on the boring parts. Consent, compliance, truth layers, escalation rules, and edit passes. That’s how you ship the magic without the mess.

Riley: And on Rare Disease Day, just a quick reminder that tech hype is fun, but real impact is the point. Build responsibly. Don’t be weird with people’s voices.

Hunter: Appreciate you hanging with us on COEY Cast. Subscribe wherever you listen.

Riley: And go check out COEY.com slash resources for AI news and updates.

Hunter: Catch you next time, and if you’re celebrating Public Sleeping Day, do it after you turn off your agent automations.
