speech-provider

A unified interface for browser speech synthesis and Eleven Labs voices.

Installation

# Using npm
npm install speech-provider

# Using yarn
yarn add speech-provider

# Using bun
bun add speech-provider

Documentation

Full API documentation is available at https://osteele.github.io/speech-provider/.

Usage

import { getVoiceProvider } from 'speech-provider';

// Use browser voices only
const browserProvider = getVoiceProvider({});

// Use Eleven Labs voices if API key is available
const elevenLabsProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Use Eleven Labs with custom cache duration
const provider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // Cache for 1 day (value is in seconds)
});

// Get voices for a specific language
const voices = await provider.getVoices({ lang: 'en-US', minVoices: 1 });

// Get default voice for a language
const defaultVoice = await provider.getDefaultVoice({ lang: 'en-US' });

// Create and play an utterance
if (defaultVoice) {
  const utterance = defaultVoice.createUtterance('Hello, world!');
  utterance.onstart = () => console.log('Started speaking');
  utterance.onend = () => console.log('Finished speaking');
  utterance.start();
}

Features

Unified interface for both browser speech synthesis and Eleven Labs voices
Automatic fallback to browser voices when Eleven Labs API key is not provided
Type-safe API with TypeScript support
Simple voice selection by language
Event listeners for speech start and end events
Automatic caching of Eleven Labs API responses to reduce API calls
Configurable cache duration for Eleven Labs responses

Used In

This package is used in Mandarin Sentence Practice, a web application for practicing Mandarin Chinese with listening and translation exercises. The app uses this package to provide high-quality text-to-speech for Mandarin sentences, with automatic fallback to browser voices when Eleven Labs is not available.

API

`getVoiceProvider(options)`

Creates a voice provider based on the available API keys. Falls back to browser speech synthesis if no API keys are provided.

function getVoiceProvider(options: {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number; // Cache duration in seconds (default: 1 hour)
}): VoiceProvider;
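
The fallback rule described above can be sketched as a small decision function. This is only an illustration of the documented behavior, not the library's implementation: `chooseProviderKind`, `ProviderKind`, and `ProviderOptions` are hypothetical names, and treating a blank string as "no key" is an assumption.

```typescript
// Illustrative only: `chooseProviderKind` is a hypothetical helper,
// not an export of speech-provider.
type ProviderKind = 'elevenlabs' | 'browser';

interface ProviderOptions {
  elevenLabsApiKey?: string | null;
}

function chooseProviderKind(options: ProviderOptions): ProviderKind {
  // Assumption: undefined, null, and blank strings all count as "no key".
  const key = options.elevenLabsApiKey?.trim();
  return key ? 'elevenlabs' : 'browser';
}

// With no usable key, the browser provider is selected.
const kind = chooseProviderKind({ elevenLabsApiKey: null });
console.log(kind); // logs "browser"
```

In application code this means a missing environment variable can be passed straight through as `elevenLabsApiKey` without a separate branch.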

`createElevenLabsVoiceProvider(apiKey, options?)`

Creates an Eleven Labs voice provider with optional configuration.

function createElevenLabsVoiceProvider(
  apiKey: string,
  options?: {
    validateResponses?: boolean;
    printVoiceProperties?: boolean;
    cacheMaxAge?: number; // Cache duration in seconds (default: 1 hour)
  }
): VoiceProvider;

Caching

The library caches Eleven Labs API responses automatically, while browser voices rely on the browser's own caching:

Browser voices are cached automatically by the browser's speech synthesis engine
Eleven Labs responses are cached using IndexedDB with a default duration of 1 hour
Cache duration can be configured when creating the provider
Cached responses are automatically invalidated after the specified duration
Cache can be disabled by setting cacheMaxAge: null in the provider options

Examples of cache configuration:

// Use default 1-hour cache
const defaultCacheProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Cache for 1 day
const dailyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // 24 hours in seconds
});

// Cache for 1 week
const weeklyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 604800 // 7 days in seconds
});

// Disable caching (preferred approach)
const uncachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: null
});

// Alternative way to disable caching
const uncachedProviderAlt = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 0
});
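
The expiry rule described above (a duration in seconds, with `null` or `0` disabling the cache) can be sketched with an in-memory `Map`. The library itself uses IndexedDB, so this is a simplified stand-in under stated assumptions: `CacheEntry`, `isFresh`, and `readCache` are hypothetical names and do not appear in the package.

```typescript
// Hypothetical in-memory model of the documented expiry rule.
interface CacheEntry<T> {
  value: T;
  storedAt: number; // milliseconds since the epoch
}

// An entry is fresh only if caching is enabled and the entry is
// younger than cacheMaxAge (given in seconds, as in the options above).
function isFresh(storedAt: number, cacheMaxAge: number | null, now: number = Date.now()): boolean {
  if (cacheMaxAge === null || cacheMaxAge <= 0) return false;
  return now - storedAt < cacheMaxAge * 1000;
}

// Returns the cached value, or undefined when it is missing or no longer
// fresh. Entries that fail the freshness check are dropped on read.
function readCache<T>(
  cache: Map<string, CacheEntry<T>>,
  key: string,
  cacheMaxAge: number | null,
  now: number = Date.now()
): T | undefined {
  const entry = cache.get(key);
  if (!entry) return undefined;
  if (!isFresh(entry.storedAt, cacheMaxAge, now)) {
    cache.delete(key);
    return undefined;
  }
  return entry.value;
}
```

Passing `now` explicitly keeps the check deterministic in tests. Production code would rely on the default `Date.now()`.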

`VoiceProvider` Interface

interface VoiceProvider {
  name: string;
  getVoices({ lang, minVoices }: { lang: string; minVoices: number }): Promise<Voice[]>;
  getDefaultVoice({ lang }: { lang: string }): Promise<Voice | null>;
}

`Voice` Interface

interface Voice {
  name: string;
  id: string;
  lang: string;
  provider: VoiceProvider;
  description: string | null;
  createUtterance(text: string): Utterance;
}
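
Because every `Voice` carries a `lang` tag, a caller can narrow a list of voices further on the client. The sketch below is a hypothetical helper (`filterByLang`, `matchesLang`, and `VoiceLike` are not part of the package) and assumes BCP 47 style tags such as `en-US`. How the library itself matches languages is not specified here.

```typescript
// Structural subset of the Voice interface above, enough for filtering.
interface VoiceLike {
  name: string;
  lang: string;
}

// True when the voice's tag equals the request or is a regional
// variant of it, e.g. "en-US" matches a request for "en".
function matchesLang(voiceLang: string, requested: string): boolean {
  const v = voiceLang.toLowerCase().replace('_', '-');
  const r = requested.toLowerCase().replace('_', '-');
  return v === r || v.startsWith(r + '-');
}

function filterByLang<T extends VoiceLike>(voices: T[], lang: string): T[] {
  return voices.filter((voice) => matchesLang(voice.lang, lang));
}
```

For example, `filterByLang(voices, 'zh')` keeps both `zh-CN` and `zh-TW` voices, while `filterByLang(voices, 'zh-CN')` keeps only the first.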

`Utterance` Interface

interface Utterance {
  start(): void;
  stop(): void;
  set onstart(callback: () => void);
  set onend(callback: () => void);
}
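
Since `Utterance` reports completion through the `onend` callback, it can be convenient to wrap playback in a Promise. The following is a minimal sketch, not part of the package: `speak` and `UtteranceLike` are hypothetical names, and the Promise never rejects because the interface above exposes no error callback.

```typescript
// Structural subset of the Utterance interface above.
interface UtteranceLike {
  start(): void;
  onend: () => void;
}

// Starts the utterance and resolves once its onend callback fires.
// Note: this overwrites any onend handler set earlier.
function speak(utterance: UtteranceLike): Promise<void> {
  return new Promise<void>((resolve) => {
    utterance.onend = () => resolve();
    utterance.start();
  });
}
```

With this helper, sequential playback becomes `await speak(voice.createUtterance('First')); await speak(voice.createUtterance('Second'));`.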

Browser Compatibility

Browser Speech Synthesis

The browser speech synthesis provider (BrowserVoiceProvider) is supported in all modern browsers:

Chrome/Edge: Full support (voices load asynchronously)
Firefox: Full support
Safari: Full support (iOS and macOS)
Opera: Full support

Note: Voice availability and quality vary by browser and operating system. Chrome and Edge typically offer the best selection of voices.
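
The asynchronous voice loading noted for Chrome and Edge comes from the Web Speech API itself: `speechSynthesis.getVoices()` can return an empty list until the `voiceschanged` event fires. The sketch below shows that underlying pattern under stated assumptions. `loadBrowserVoices` and `SynthLike` are hypothetical names, the 2-second timeout is an arbitrary choice, and the library's own handling may differ.

```typescript
// Minimal shape of window.speechSynthesis needed here, so the helper
// can also be exercised with a fake object outside the browser.
interface SynthLike {
  getVoices(): { name: string; lang: string }[];
  addEventListener(type: 'voiceschanged', listener: () => void): void;
  removeEventListener(type: 'voiceschanged', listener: () => void): void;
}

// Resolves with the voice list, waiting for 'voiceschanged' if the list
// is still empty. Falls back to whatever is available after timeoutMs.
function loadBrowserVoices(
  synth: SynthLike,
  timeoutMs: number = 2000
): Promise<{ name: string; lang: string }[]> {
  return new Promise((resolve) => {
    const voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices);
      return;
    }
    let timer: ReturnType<typeof setTimeout> | undefined;
    const finish = (): void => {
      clearTimeout(timer);
      synth.removeEventListener('voiceschanged', finish);
      resolve(synth.getVoices());
    };
    timer = setTimeout(finish, timeoutMs);
    synth.addEventListener('voiceschanged', finish);
  });
}
```

In a browser this would be called as `loadBrowserVoices(window.speechSynthesis)`.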

ElevenLabs Provider

The ElevenLabs provider (ElevenLabsVoiceProvider) requires:

IndexedDB: For caching API responses (supported in all modern browsers)
Fetch API: For making API requests (supported in all modern browsers)
Audio API: For playing synthesized speech (supported in all modern browsers)
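
The requirements listed above can be checked at runtime before choosing a provider. This is a hedged sketch: `detectSupport` and `SupportReport` are hypothetical names, and interpreting "Audio API" as the global `Audio` constructor is an assumption about what the provider uses for playback.

```typescript
interface SupportReport {
  speechSynthesis: boolean;
  indexedDB: boolean;
  fetch: boolean;
  audio: boolean;
}

// Inspects a global-like object (globalThis by default) for the APIs
// named above. Accepting the scope as a parameter keeps it testable.
function detectSupport(scope: object = globalThis): SupportReport {
  const s = scope as Record<string, unknown>;
  return {
    speechSynthesis: s['speechSynthesis'] != null,
    indexedDB: s['indexedDB'] != null,
    fetch: typeof s['fetch'] === 'function',
    // Assumption: playback goes through the HTMLAudioElement constructor.
    audio: typeof s['Audio'] === 'function',
  };
}
```

An application might, for instance, pass an API key only when `indexedDB`, `fetch`, and `audio` are all reported as available.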

Minimum Requirements

Modern browser with ES2022 support
IndexedDB support (for ElevenLabs caching)
No Internet Explorer support

Server-Side Rendering (SSR)

The library is designed for client-side use. When used in SSR environments:

Browser voice provider gracefully handles the absence of window.speechSynthesis
Returns empty arrays when browser APIs are unavailable
Safe to import in SSR frameworks (Next.js, Nuxt, etc.) but should only be used client-side
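
One way to keep usage client-side, as recommended above, is to guard provider creation behind an environment check. The helper names below (`isBrowserLike`, `clientOnly`) are hypothetical and not part of the package. Checking for `window` and `document` is a common heuristic rather than a guarantee.

```typescript
// Heuristic: a browser-like environment exposes window and document
// on its global object.
function isBrowserLike(scope: object = globalThis): boolean {
  return 'window' in scope && 'document' in scope;
}

// Runs `create` only in a browser-like environment. On the server it
// returns `fallback` without calling `create` at all.
function clientOnly<T>(create: () => T, fallback: T, scope: object = globalThis): T {
  return isBrowserLike(scope) ? create() : fallback;
}
```

In a Next.js or Nuxt component this could look like `const provider = clientOnly(() => getVoiceProvider({}), null);`, with the rendering code treating `null` as "speech not available yet".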

Contributing

Contributions are welcome! Please read the CONTRIBUTING.md guide for details on our code of conduct and the process for submitting pull requests.

Changelog

See CHANGELOG.md for a list of changes and version history.

License

Available under the MIT License.

