Nari Labs: Advanced Open-Source Text-to-Speech
Experience Nari Labs' revolutionary multi-speaker TTS technology, which generates natural, expressive voices and complete multi-character dialogue directly from a single text input
What is Nari Labs?
Team & Background
Nari Labs is a pioneering AI voice technology company founded between late 2024 and early 2025. Operating remotely and based primarily in Korea, the team is dedicated to creating high-quality open-source voice models with minimal resources, making advanced speech technology accessible to researchers and developers worldwide.
Open-Source Mission
With a focus on democratizing voice AI technology, Nari Labs releases their models under the permissive MIT license, which allows commercial use and redistribution. Their flagship technologies aim to provide open-source alternatives to commercial solutions, delivering comparable or superior quality with complete freedom for adaptation and integration.
Nari Labs: Next-Gen TTS
A powerful multi-speaker text-to-speech technology that generates natural, expressive voices and complete multi-character dialogue from a single text input
Model Architecture
Built on a diffusion-decoder framework with a multi-speaker conditional encoder, Nari Labs' technology outputs 24 kHz, dual-channel audio directly. At approximately 1.6B parameters, it balances quality and efficiency for real-time generation.
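As a concrete illustration, loading the model and generating audio might look like the sketch below. The `nari_tts` package, `NariTTS` class, and checkpoint name are hypothetical placeholders rather than the confirmed Nari Labs API; only the 24 kHz, dual-channel output format comes from the description above.

```python
# Hypothetical usage sketch -- package, class, and checkpoint names are
# illustrative placeholders, not the confirmed Nari Labs API.
import soundfile as sf

from nari_tts import NariTTS  # hypothetical package

# Load the ~1.6B-parameter checkpoint (hypothetical identifier).
model = NariTTS.from_pretrained("nari-labs/tts-1.6b")

# Generate 24 kHz, dual-channel audio from plain text.
audio = model.generate("Hello from Nari Labs!")  # ndarray, shape (n_samples, 2)
sf.write("hello.wav", audio, samplerate=24000)
```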
Language Support
Initially optimized for English, Nari Labs' models have been successfully fine-tuned by the community for other languages including Chinese and Japanese, demonstrating their adaptability across linguistic boundaries.
Performance
Capable of real-time or near-real-time inference on consumer-grade GPUs like the RTX 4090, Nari Labs delivers professional-quality voice synthesis with an emotional range comparable to leading commercial solutions.
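A common way to check a real-time claim is the real-time factor (RTF): wall-clock generation time divided by the duration of the generated audio, where an RTF below 1.0 means faster than real time. A minimal benchmark sketch, reusing the hypothetical `NariTTS` API from the example above:

```python
# Benchmark sketch using the hypothetical NariTTS API from the earlier example.
import time

from nari_tts import NariTTS  # hypothetical package

model = NariTTS.from_pretrained("nari-labs/tts-1.6b")

start = time.perf_counter()
audio = model.generate("A short benchmark sentence for timing synthesis.")
elapsed = time.perf_counter() - start

audio_seconds = len(audio) / 24000  # 24 kHz sample rate
rtf = elapsed / audio_seconds       # RTF < 1.0 means faster than real time
print(f"Generated {audio_seconds:.2f}s of audio in {elapsed:.2f}s (RTF {rtf:.2f})")
```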
Key Features & Capabilities
Nari Labs leverages cutting-edge AI technology to deliver exceptional text-to-speech capabilities with unique multi-character dialogue generation.
Multi-Character Dialogue
Generate complete conversations with multiple distinct voices from a single text input, with automatic character switching and appropriate emotional tones.
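For instance, a single tagged script could drive an entire conversation. The `[S1]`/`[S2]` speaker-tag syntax below is an assumed convention for illustration, not documented input format, and `NariTTS` is the same hypothetical placeholder used above.

```python
# Sketch: one text input, multiple voices. The [S1]/[S2] speaker tags are
# an assumed convention, not confirmed syntax.
from nari_tts import NariTTS  # hypothetical package

model = NariTTS.from_pretrained("nari-labs/tts-1.6b")

script = (
    "[S1] Did the new checkpoint finish training? "
    "[S2] It did, and the listening tests look great. "
    "[S1] Perfect, let's publish the demo."
)
audio = model.generate(script)  # voice switches automatically at each tag
```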
Emotional Control
Fine-tune voice generation with explicit emotion tags or prosody embeddings to achieve the perfect tone, from excited and energetic to calm and contemplative.
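Explicit emotion control could be exposed in a couple of ways; both the inline tag and the `emotion` keyword argument below are assumptions for illustration, not confirmed syntax.

```python
# Sketch: two assumed ways to steer emotion -- an inline tag or a keyword
# argument. Neither is confirmed Nari Labs syntax.
from nari_tts import NariTTS  # hypothetical package

model = NariTTS.from_pretrained("nari-labs/tts-1.6b")

# Inline emotion tag embedded in the text (assumed convention).
audio_a = model.generate("<excited> We just hit a major milestone!")

# Emotion passed as a generation parameter (assumed parameter).
audio_b = model.generate("Take a deep breath and relax.", emotion="calm")
```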
Speed Control
Adjust speaking pace to match your needs, whether for natural conversation, rapid information delivery, or dramatic emphasis in storytelling.
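Pace control is often exposed as a simple multiplier on speaking rate; the `speed` parameter below is an assumption in the same hypothetical API.

```python
# Sketch: speaking-rate control via a hypothetical `speed` multiplier
# (1.0 = default pace; the values here are illustrative).
from nari_tts import NariTTS  # hypothetical package

model = NariTTS.from_pretrained("nari-labs/tts-1.6b")

narration = model.generate("Once upon a time...", speed=0.85)         # slower, for drama
briefing = model.generate("Here are today's headlines.", speed=1.25)  # faster delivery
```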
High-Quality Audio
Produces 24 kHz, dual-channel audio with a HiFi-GAN-V2 vocoder and optional SDPA-diffusion for enhanced quality, delivering crisp, natural-sounding speech.
Open-Source Freedom
Released under MIT license, allowing full commercial use and redistribution of model weights, enabling seamless integration into your applications and services.
Community Ecosystem
Benefit from a growing ecosystem of fine-tuned models, integration guides, and creative applications built around Nari Labs by an active developer community.
Technical Details
Explore the advanced technology behind Nari Labs' exceptional performance
Training Data
- Large-scale public dialogue speech datasets (LibriTTS, LibriLight-TTS)
- Custom YouTube data with automatic alignment
- Diverse speaker profiles for multi-voice capabilities
Model Architecture
- Diffusion-decoder framework with multi-speaker conditional encoder
- Speaker and emotion embeddings explicitly concatenated during training
- HiFi-GAN-V2 vocoder with optional SDPA-diffusion for enhanced quality
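The pieces listed above can be sketched as a module skeleton. This is a structural illustration only: it assumes PyTorch, simplified layer sizes, and a plain feed-forward stand-in for the diffusion decoder, and is not the actual Nari Labs implementation.

```python
# Structural sketch of the described pipeline in PyTorch. Layer sizes and
# wiring are illustrative assumptions, not the actual Nari Labs code.
import torch
import torch.nn as nn


class SpeakerConditionalEncoder(nn.Module):
    """Encodes text tokens with speaker/emotion embeddings explicitly concatenated."""

    def __init__(self, vocab=256, dim=512, n_speakers=64, n_emotions=8):
        super().__init__()
        self.text_emb = nn.Embedding(vocab, dim)
        self.speaker_emb = nn.Embedding(n_speakers, dim)
        self.emotion_emb = nn.Embedding(n_emotions, dim)
        self.proj = nn.Linear(3 * dim, dim)

    def forward(self, tokens, speaker_id, emotion_id):
        t = self.text_emb(tokens)                        # (B, T, D)
        s = self.speaker_emb(speaker_id)[:, None, :].expand_as(t)
        e = self.emotion_emb(emotion_id)[:, None, :].expand_as(t)
        # Explicit concatenation of speaker and emotion conditioning.
        return self.proj(torch.cat([t, s, e], dim=-1))   # (B, T, D)


class DiffusionDecoder(nn.Module):
    """Feed-forward stand-in for the diffusion decoder predicting acoustic features."""

    def __init__(self, dim=512, mel_bins=80):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, mel_bins)
        )

    def forward(self, cond):
        return self.net(cond)  # (B, T, mel_bins) acoustic features


# A vocoder (HiFi-GAN-V2 in the description above) would then turn the
# acoustic features into a 24 kHz, dual-channel waveform.
enc = SpeakerConditionalEncoder()
dec = DiffusionDecoder()
tokens = torch.randint(0, 256, (1, 32))
cond = enc(tokens, torch.tensor([3]), torch.tensor([1]))
mel = dec(cond)
print(mel.shape)  # torch.Size([1, 32, 80])
```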
Nari Labs Roadmap
Nari Labs' vision for the evolution of their voice AI technology
Model Scaling
Expansion to 4-5B parameters for enhanced quality and capabilities while maintaining efficient inference
Streaming TTS
Real-time streaming capabilities for immediate voice generation as text is input
Multilingual Expansion
Native support for additional languages beyond English, with preserved emotional range and natural prosody
Creative Tools Integration
Enhanced compatibility with video generation tools like Pika and Runway for complete AI-powered storytelling workflows
Frequently Asked Questions
Find answers to commonly asked questions about Nari Labs and their technology
What makes Nari Labs different from other TTS providers?
Nari Labs uniquely combines multi-character dialogue generation, high-quality audio output, and open-source accessibility. Their technology can generate entire conversations with distinct voices from a single text input, a capability typically limited to proprietary commercial solutions.
What hardware requirements are needed to run Nari Labs models?
For real-time or near-real-time inference, a consumer-grade GPU like the RTX 4090 is recommended, though the models can run on less powerful hardware with longer generation times. Memory requirements are moderate thanks to the compact 1.6B-parameter size.
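As a back-of-the-envelope check on that memory claim: 1.6B parameters stored in 16-bit precision occupy roughly 3 GB of VRAM for the weights alone, before activations and framework overhead.

```python
# Rough VRAM estimate for the weights alone (excludes activations,
# attention buffers, and framework overhead).
params = 1.6e9          # ~1.6B parameters
bytes_per_param = 2     # fp16 / bf16
weight_gb = params * bytes_per_param / 1024**3
print(f"~{weight_gb:.1f} GB for weights in 16-bit precision")  # ~3.0 GB
```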
Can I use Nari Labs technology for commercial projects?
Yes, Nari Labs releases their technology under the MIT license, which allows commercial use and redistribution of the model weights. You can integrate it into commercial products and services; the license's only requirement is retaining the copyright and license notice, and further attribution is appreciated.
Does Nari Labs support languages other than English?
While primarily optimized for English, the community has successfully fine-tuned Nari Labs models for other languages including Chinese and Japanese. Nari Labs plans to expand native multilingual support in future versions.
How can I contribute to the Nari Labs ecosystem?
You can contribute by fine-tuning the models for new languages, developing integration tools, reporting issues, or creating demonstrations. The GitHub repository and Hugging Face Space provide starting points for community engagement.
What's next for Nari Labs?
Nari Labs is working on scaling their models to 4-5B parameters, implementing real-time streaming TTS, expanding multilingual support, and enhancing integration with creative tools for complete AI-powered storytelling workflows.