Wednesday, January 28, 2026
Show HN: LemonSlice – Upgrade your voice agents to real-time video https://ift.tt/FVHekoZ
Hey HN, we're the co-founders of LemonSlice (https://lemonslice.com). We train interactive avatar video models. Our API lets you upload a photo and immediately jump into a FaceTime-style call with that character. Here's a demo: https://ift.tt/IwUk6Qg

Chatbots are everywhere. Voice AI has recently taken off. But we believe video avatars will be the most common form factor for conversational AI. Most people would rather watch something than read it. The problem is that generating video in real time is hard, and overcoming the uncanny valley is even harder. We haven't broken the uncanny valley yet. Nobody has. But we're getting close, and our photorealistic avatars are currently best-in-class (judge for yourself: https://ift.tt/DQKBEs4). Plus, ours is the only avatar model that can do animals and heavily stylized cartoons. Try it: https://ift.tt/GaA8Cyw. Warning! Talking to this little guy may improve your mood.

Today we're releasing our new model* - Lemon Slice 2, a 20B-parameter diffusion transformer that generates infinite-length video at 20 fps on a single GPU - and opening up our API.

How did we get a video diffusion model to run in real time? There was no single trick, just a lot of them stacked together. The first big change was making our model causal. Standard video diffusion models are bidirectional (they look at frames both before and after the current one), which means you can't stream. From there it was about fitting everything on one GPU. We switched from full to sliding-window attention, which killed our memory bottleneck. We distilled from 40 denoising steps down to just a few; quality degraded less than we feared, especially after using GAN-based distillation (though tuning that adversarial loss to avoid mode collapse was its own adventure). And the rest was inference work: modifying RoPE from complex to real (this one was cool!), precision tuning, fusing kernels, a special rolling KV cache, lots of other caching, and more. We kept shaving off milliseconds wherever we could and eventually got to real time.

We set up a guest playground for HN so you can create and talk to characters without logging in: https://ift.tt/KxWo8ZD. For those who want to build with our API (we have a new LiveKit integration that we're pumped about!), grab a coupon code in the HN playground for your first Pro month free ($100 value). See the docs: https://ift.tt/z37P5uY. Pricing is usage-based at $0.12-0.20/min for video generation.

Looking forward to your feedback! And we'd love to see any cool characters you make - please share their links in the comments.

*We did a Show HN last year for our V1 model: https://ift.tt/FTIxjWf. It was technically impressive at the time, but it looks rough compared to what we have today.

January 27, 2026 at 11:25PM
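To make the streaming idea in the paragraph above concrete: the sketch below is not LemonSlice's code, just a minimal, generic illustration of causal sliding-window attention with a rolling KV cache in PyTorch. The window size, tensor shapes, and names (`RollingKVCache`, `stream_step`, `WINDOW`) are invented for illustration only.

```python
# Minimal sketch of causal, sliding-window attention with a rolling KV cache.
# Nothing here comes from LemonSlice's model; shapes, window size, and class
# names are illustrative assumptions.
import torch
import torch.nn.functional as F

WINDOW = 16  # hypothetical number of past frames each new frame may attend to


class RollingKVCache:
    """Keeps only the last WINDOW frames' keys/values, so memory stays constant."""

    def __init__(self, window: int = WINDOW):
        self.window = window
        self.k = None  # (batch, heads, frames, dim)
        self.v = None

    def append(self, k_new: torch.Tensor, v_new: torch.Tensor):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        # Drop anything older than the attention window (the "rolling" part).
        self.k = self.k[:, :, -self.window:]
        self.v = self.v[:, :, -self.window:]
        return self.k, self.v


def stream_step(q_new, k_new, v_new, cache: RollingKVCache):
    """One streaming step: the newest frame attends only to cached past frames.

    Causality needs no mask here because the cache never holds future frames.
    """
    k, v = cache.append(k_new, v_new)
    return F.scaled_dot_product_attention(q_new, k, v)


if __name__ == "__main__":
    batch, heads, dim = 1, 8, 64
    cache = RollingKVCache()
    for frame in range(100):  # memory never grows, however long generation runs
        q = torch.randn(batch, heads, 1, dim)
        k = torch.randn(batch, heads, 1, dim)
        v = torch.randn(batch, heads, 1, dim)
        out = stream_step(q, k, v, cache)
    print(out.shape)  # torch.Size([1, 8, 1, 64])
```

The point of the pattern is that per-frame cost and memory are bounded by the window rather than by the length of the video, which is what makes "infinite-length" streaming on a single GPU plausible.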
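The post also mentions rewriting RoPE "from complex to real". The exact kernel isn't described; the sketch below only shows the standard equivalence between the complex-multiplication and real cos/sin formulations of rotary position embeddings, which is the usual starting point for that kind of rewrite. Both function names and the test shapes are illustrative.

```python
# Rotary position embeddings (RoPE) can be written with complex multiplication
# or with an equivalent real-valued cos/sin rotation. This only demonstrates
# that the two standard formulations agree; it is not LemonSlice's kernel.
import torch


def rope_complex(x: torch.Tensor, pos: torch.Tensor, theta: float = 10000.0):
    """x: (seq, dim) with even dim; rotate adjacent pairs via complex multiply."""
    dim = x.shape[-1]
    freqs = 1.0 / theta ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = pos[:, None] * freqs[None, :]                # (seq, dim/2)
    rot = torch.polar(torch.ones_like(angles), angles)    # e^{i*angle}
    x_c = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    return torch.view_as_real(x_c * rot).reshape_as(x)


def rope_real(x: torch.Tensor, pos: torch.Tensor, theta: float = 10000.0):
    """Same rotation written with real cos/sin only."""
    dim = x.shape[-1]
    freqs = 1.0 / theta ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = pos[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]                   # even/odd pairs
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out


if __name__ == "__main__":
    x = torch.randn(8, 64)
    pos = torch.arange(8, dtype=torch.float32)
    assert torch.allclose(rope_complex(x, pos), rope_real(x, pos), atol=1e-5)
```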