EXCLUSIVE: Voice-synthesis outfit Neosapience has closed an $11.5M funding round and said it will invest more in “emotionally intelligent” and conversational AI voice services ahead of an expected IPO. The injection of funds will also allow the company to boost its localization efforts, working in a variety of languages beyond its core of English, Chinese, Japanese, Korean, Spanish and Vietnamese.
The company is gearing up for a public offering on the Korean stock market, which is expected to happen by the end of next year. Its key product is Typecast, and it claims the platform’s USP is its ability to discern and assign human emotions from a script without extensive prompts or pre-recorded samples.
Neosapience is based in Seoul, Korea, and San Francisco. It said Typecast can analyze any script and then “automatically generates the appropriate emotional delivery for each sentence without human intervention, making the resulting output indistinguishable from a human voice.” Applications span entertainment and marketing, with a focus on providing voice services for digital creators as the creator economy booms.
Use of AI tools in the entertainment biz remains hugely contentious, with above- and below-the-line talent and company staffers all concerned about the impact on jobs and on the creative process. Within that, AI-powered voice is obviously a particular concern for voice artists. Adoption of such services is, however, only increasing. Ukraine-based Respeecher has deployed its AI voice services on major movies including The Brutalist and series such as The Mandalorian, while the likes of Michael Caine and Liza Minnelli have lent their voices to ElevenLabs’ “Iconic Voice Marketplace”.
“Audiences are global, but content creation is still limited by language and budget. We are changing that,” Taesu Kim, co-founder and CEO of Neosapience, told Deadline. “Typecast automatically analyzes a script, line by line, to understand tone, emotion and context. We call this ‘smart emotion’: it understands context to express emotion automatically, which means the performance matches the storytelling.”
He added: “With just a few minutes of human audio we can develop a voice with AI that expands on the original, delivering rich performances well beyond what was recorded. For example, we can create voices that are even more emotive than the speaker is capable of producing on their own.”