speech-provider

A unified interface for browser speech synthesis and Eleven Labs voices.

Installation

# Using npm
npm install speech-provider

# Using yarn
yarn add speech-provider

# Using bun
bun add speech-provider

Documentation

Full API documentation is available at https://osteele.github.io/speech-provider/.

Usage

import { getVoiceProvider } from 'speech-provider';

// Use browser voices only
const browserProvider = getVoiceProvider({});

// Use Eleven Labs voices if API key is available
const elevenLabsProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Use Eleven Labs with custom cache duration
const provider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // Cache for 1 day
});

// Get voices for a specific language
const voices = await provider.getVoices({ lang: 'en-US', minVoices: 1 });

// Get default voice for a language
const defaultVoice = await provider.getDefaultVoice({ lang: 'en-US' });

// Create and play an utterance
if (defaultVoice) {
  const utterance = defaultVoice.createUtterance('Hello, world!');
  utterance.onstart = () => console.log('Started speaking');
  utterance.onend = () => console.log('Finished speaking');
  utterance.start();
}

Features

  • Unified interface for both browser speech synthesis and Eleven Labs voices
  • Automatic fallback to browser voices when Eleven Labs API key is not provided
  • Type-safe API with TypeScript support
  • Simple voice selection by language
  • Event listeners for speech start and end events
  • Automatic caching of Eleven Labs API responses to reduce API calls
  • Configurable cache duration for Eleven Labs responses

Used In

This package is used in Mandarin Sentence Practice, a web application for practicing Mandarin Chinese with listening and translation exercises. The app uses this package to provide high-quality text-to-speech for Mandarin sentences, with automatic fallback to browser voices when Eleven Labs is not available.

API

getVoiceProvider(options)

Creates a voice provider based on the available API keys. Falls back to browser speech synthesis if no API keys are provided.

function getVoiceProvider(options: {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour); null disables caching
}): VoiceProvider;
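
In practice the API key often comes from configuration that may be missing or blank. The following is a minimal sketch, not part of the library: `buildProviderOptions` and `hoursToSeconds` are hypothetical helpers, and `VoiceProviderOptions` is a local copy of the documented option names. The resulting object could then be passed to `getVoiceProvider`.

```typescript
// Local copy of the documented option names (null disables caching,
// per the Caching section of this README).
interface VoiceProviderOptions {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number | null; // seconds
}

// Hypothetical helper: convert hours to the seconds that cacheMaxAge expects.
function hoursToSeconds(hours: number): number {
  return Math.round(hours * 3600);
}

// Hypothetical helper: treat a missing or blank key as "no key", so the
// caller ends up with browser voices only.
function buildProviderOptions(
  rawKey: string | null | undefined,
  cacheHours?: number
): VoiceProviderOptions {
  const key = rawKey?.trim();
  const options: VoiceProviderOptions = { elevenLabsApiKey: key ? key : null };
  if (cacheHours !== undefined) {
    options.cacheMaxAge = hoursToSeconds(cacheHours);
  }
  return options;
}
```

Normalizing blank strings to `null` is an assumption of this sketch; it keeps an empty configuration value from being sent to Eleven Labs as if it were a real key.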

createElevenLabsVoiceProvider(apiKey, options?)

Creates an Eleven Labs voice provider with optional configuration.

function createElevenLabsVoiceProvider(
  apiKey: string,
  options?: {
    validateResponses?: boolean;
    printVoiceProperties?: boolean;
    cacheMaxAge?: number; // Cache duration in seconds (default: 1 hour)
  }
): VoiceProvider;

Caching

Caching behavior differs by provider:

  • Browser voices are cached automatically by the browser's speech synthesis engine
  • Eleven Labs responses are cached using IndexedDB with a default duration of 1 hour
  • Cache duration can be configured when creating the provider
  • Cached responses are automatically invalidated after the specified duration
  • Cache can be disabled by setting cacheMaxAge: null in the provider options

Examples of cache configuration:

// Use default 1-hour cache
const defaultCacheProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Cache for 1 day
const dailyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // 24 hours in seconds
});

// Cache for 1 week
const weeklyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 604800 // 7 days in seconds
});

// Disable caching (preferred approach)
const uncachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: null
});

// Alternative way to disable caching
const uncachedProviderAlt = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 0
});
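
The expiry rules described in this section can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the library's implementation: it uses a `Map` instead of IndexedDB, and `MaxAgeCache`, `CacheEntry`, `get`, and `set` are hypothetical names. It treats both `null` and `0` as "caching disabled", in line with the configuration examples.

```typescript
// Illustrative in-memory model of a max-age cache (not the library's code).
interface CacheEntry<T> {
  value: T;
  storedAt: number; // milliseconds since the epoch
}

class MaxAgeCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly maxAgeSeconds: number | null;

  // null or 0 disables caching; 3600 s mirrors the documented 1-hour default.
  constructor(maxAgeSeconds: number | null = 3600) {
    this.maxAgeSeconds = maxAgeSeconds;
  }

  set(key: string, value: T, now: number = Date.now()): void {
    if (!this.maxAgeSeconds) return; // caching disabled: store nothing
    this.entries.set(key, { value, storedAt: now });
  }

  get(key: string, now: number = Date.now()): T | undefined {
    const entry = this.entries.get(key);
    if (!entry || !this.maxAgeSeconds) return undefined;
    const ageSeconds = (now - entry.storedAt) / 1000;
    if (ageSeconds >= this.maxAgeSeconds) {
      this.entries.delete(key); // expired: invalidate and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

Passing `now` explicitly makes the expiry logic easy to exercise in tests without waiting for real time to pass.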

VoiceProvider Interface

interface VoiceProvider {
  name: string;
  getVoices({ lang, minVoices }: { lang: string; minVoices: number }): Promise<Voice[]>;
  getDefaultVoice({ lang }: { lang: string }): Promise<Voice | null>;
}

Voice Interface

interface Voice {
  name: string;
  id: string;
  lang: string;
  provider: VoiceProvider;
  description: string | null;
  createUtterance(text: string): Utterance;
}
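
Because every `Voice` carries a `lang` tag, application code can apply its own ordering to a list of voices. The sketch below is illustrative only: `VoiceLike`, `normalizeTag`, `primarySubtag`, and `voicesForLang` are hypothetical names, and the matching rule (exact tag first, then same primary subtag) is an assumption, not a description of how the library itself selects voices.

```typescript
// Minimal stand-in for the Voice fields this sketch reads.
interface VoiceLike {
  name: string;
  lang: string;
}

// Normalize tags such as "en_US" or "EN-us" to "en-us" for comparison.
function normalizeTag(tag: string): string {
  return tag.trim().toLowerCase().replace(/_/g, '-');
}

// Primary language subtag: "zh" for "zh-CN".
function primarySubtag(tag: string): string {
  return normalizeTag(tag).replace(/-.*$/, '');
}

// Hypothetical helper: exact tag matches first, then voices that share
// only the primary subtag (for example "en-GB" when "en-US" was requested).
function voicesForLang<T extends VoiceLike>(voices: T[], requested: string): T[] {
  const wanted = normalizeTag(requested);
  const exact = voices.filter((v) => normalizeTag(v.lang) === wanted);
  const related = voices.filter(
    (v) => normalizeTag(v.lang) !== wanted && primarySubtag(v.lang) === primarySubtag(wanted)
  );
  return [...exact, ...related];
}
```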

Utterance Interface

interface Utterance {
  start(): void;
  stop(): void;
  set onstart(callback: () => void);
  set onend(callback: () => void);
}
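
The `onend` callback makes it straightforward to wrap an utterance in a Promise. This is a minimal sketch, assuming only the members listed above: `UtteranceLike` is a simplified local stand-in for `Utterance`, and `speakAndWait` and `speakInSequence` are hypothetical helpers that are not exported by the library.

```typescript
// Simplified local stand-in for the Utterance interface documented above.
interface UtteranceLike {
  start(): void;
  stop(): void;
  onstart: () => void;
  onend: () => void;
}

// Hypothetical helper: start an utterance and resolve once it has finished.
// The interface exposes no error event, so this promise never rejects.
// Note: this replaces any onend handler already set on the utterance.
function speakAndWait(utterance: UtteranceLike): Promise<void> {
  return new Promise<void>((resolve) => {
    utterance.onend = () => resolve();
    utterance.start();
  });
}

// Hypothetical helper: speak several utterances one after another.
async function speakInSequence(utterances: UtteranceLike[]): Promise<void> {
  for (const utterance of utterances) {
    await speakAndWait(utterance);
  }
}
```

With helpers like these, `await speakAndWait(voice.createUtterance('Hello, world!'))` would pause until speech ends, which is convenient when sentences must not overlap.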

Browser Compatibility

Browser Speech Synthesis

The browser speech synthesis provider (BrowserVoiceProvider) is supported in all modern browsers:

  • Chrome/Edge: Full support (voices load asynchronously)
  • Firefox: Full support
  • Safari: Full support (iOS and macOS)
  • Opera: Full support

Note: Voice availability and quality vary by browser and operating system. Chrome and Edge typically offer the best selection of voices.
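
"Voices load asynchronously" refers to the underlying Web Speech API: `speechSynthesis.getVoices()` can return an empty list at first, and a `voiceschanged` event signals when voices become available. The sketch below illustrates that pattern and is not the library's code. `SynthLike` is a local structural subset of the browser's `SpeechSynthesis` object, declared so the example type-checks without DOM typings, and `whenVoicesReady` is a hypothetical helper.

```typescript
// Structural subset of the Web Speech API's SpeechSynthesis object.
interface SynthLike<V> {
  getVoices(): V[];
  addEventListener(type: 'voiceschanged', listener: () => void): void;
  removeEventListener(type: 'voiceschanged', listener: () => void): void;
}

// Call onReady once with the voice list, waiting for the "voiceschanged"
// event when the list is still empty.
function whenVoicesReady<V>(synth: SynthLike<V>, onReady: (voices: V[]) => void): void {
  const initial = synth.getVoices();
  if (initial.length > 0) {
    onReady(initial);
    return;
  }
  const listener = () => {
    synth.removeEventListener('voiceschanged', listener);
    onReady(synth.getVoices());
  };
  synth.addEventListener('voiceschanged', listener);
}
```

In a browser this could be called as `whenVoicesReady(window.speechSynthesis, (voices) => { ... })`. If the event never fires, the callback is never invoked, so production code may also want a timeout.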

ElevenLabs Provider

The ElevenLabs provider (ElevenLabsVoiceProvider) requires:

  • IndexedDB: For caching API responses (supported in all modern browsers)
  • Fetch API: For making API requests (supported in all modern browsers)
  • Audio API: For playing synthesized speech (supported in all modern browsers)

Minimum Requirements

  • Modern browser with ES2022 support
  • IndexedDB support (for ElevenLabs caching)
  • No Internet Explorer support

Server-Side Rendering (SSR)

The library is designed for client-side use. When used in SSR environments:

  • Browser voice provider gracefully handles the absence of window.speechSynthesis
  • Returns empty arrays when browser APIs are unavailable
  • Safe to import in SSR frameworks (Next.js, Nuxt, etc.) but should only be used client-side
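
Application code that decides whether to render speech controls may still want its own client-side check. The following is a minimal sketch of such a guard, assuming feature detection on the global object; `canUseBrowserSpeech` and `clientOnly` are hypothetical helpers and not part of the library.

```typescript
// True only where the Web Speech API's speechSynthesis global exists,
// which excludes Node.js and other server-side runtimes.
function canUseBrowserSpeech(): boolean {
  return typeof globalThis === 'object' && 'speechSynthesis' in globalThis;
}

// Hypothetical helper: run a client-only callback, or return a fallback
// value during server-side rendering.
function clientOnly<T>(run: () => T, fallback: T): T {
  return canUseBrowserSpeech() ? run() : fallback;
}
```

For example, `clientOnly(() => true, false)` could gate a "Play" button so that it is rendered only where browser speech synthesis is present.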

Contributing

Contributions are welcome! Please read the CONTRIBUTING.md guide for details on our code of conduct and the process for submitting pull requests.

Changelog

See CHANGELOG.md for a list of changes and version history.

License

Copyright 2025 by Oliver Steele

Available under the MIT License
