MindMakers

How TwelveLabs Helps Enterprises Transform Video Content Into Competitive Edge

Episode Summary

What will unlock real-world AI beyond chatbots and copilots? On this episode of MindMakers, host John Kim speaks with Jae Lee, CEO and Co-Founder of TwelveLabs, to explore how video is defining the next wave of artificial intelligence.

Episode Notes

Drawing from his background as a cybersecurity leader in the Korean military and his work pioneering video understanding models, Jae shares how TwelveLabs is building infrastructure for what they call video superintelligence. You’ll hear how enterprises across industries such as media, sports, and defense are using TwelveLabs to index petabytes of video and unlock semantic search, highlight generation, and autonomous analysis. 

Jae also shares why they had to build their own inference stack from scratch, how they think about long-term memory in multimodal systems, and what makes video the richest and most underutilized data source in AI today. 

If you’re building AI infrastructure, leading digital transformation, or preparing your organization for agentic workflows, this episode offers an inside look at the technical and strategic challenges behind enterprise-scale video intelligence.

Guest Bio

Jae Lee is the visionary Co-Founder and CEO of TwelveLabs, an industry-leading AI company transforming video search and understanding. Under his leadership, TwelveLabs has pioneered the development of advanced multimodal video foundation models that empower businesses and developers to extract deep, context-rich insights from video content with near-human intelligence.

Harnessing cutting-edge generative AI and machine learning technologies, TwelveLabs is redefining next-generation video processing, retrieval, and interactive question-answering capabilities. By seamlessly integrating visual, audio, and textual data, their models capture the full spectrum of meaning embedded in video content—setting a new standard for scalable, personalized, and intelligent video experiences across media, security, sports analytics, and enterprise intelligence.

Jae also serves on the board of the Republic of Korea Foundation Model Association, collaborating with influential leaders from South Korea's largest conglomerates, including Samsung, SK, and LG, to shape the future of AI. He holds a BSc in Electrical Engineering and Computer Science from UC Berkeley.

Guest Quote

“We need to really ask ourselves: what can humans uniquely do? I think this tech will probably help us understand who we are better. Because we need to know what we are uniquely good at, what we are uniquely capable of doing, to find something that can’t be displaced. It’s a deeper challenge for humanity.”

Time Stamps

00:00 Episode Start

01:00 Jae's background and early career

04:45 The idea that sparked TwelveLabs

07:30 Why video models are so powerful

11:20 Real-world applications of TwelveLabs

14:00 Building your own infrastructure

17:45 Unlocking productivity across industries

21:05 TwelveLabs’ workflow

23:30 Hiring top talent to scale your teams

28:25 Driving adoption in regulated sectors

34:00 How to stay ahead in the world of AI

36:00 Where to build the best tech

38:15 The future of TwelveLabs

41:00 Clearing up misconceptions

45:00 Jae's Human Prompt

Links