How to Contribute

We welcome contributions to the OSR; however, there are a few guidelines and ground rules.

1. Material

We reserve the right to reject material for reasons of recording quality, file format, or suitability of content.

2. Contributor approval

We require any individual whose voice is recorded to sign a release giving their approval for the recording to be published or broadcast according to the conditions and scope of the Open Speech Repository.

Please email us before you record or send in material so that we can provide a contributor reference number and release form.

3. Source text for recordings

The source text for the recordings must be representative of the range of sounds typical of the language being recorded; for English, an example is the Harvard Phonetically Balanced Sentences. Test sentences should be simple and neutral in meaning, and each should take 2-3 seconds to read. Test subjects are expected to read 3-5 sentences, giving speech files approximately ten seconds in length.

The source text must be contributed with the recordings and will be published on the site alongside them, so that others can provide additional recordings of the same material.

4. Audio file recording

Audio files should be recorded using a low-noise microphone under quiet conditions, generally in accordance with ITU Recommendation P.800. Each file should comprise a single speaker reading approximately ten seconds of speech (as described in section 3 above).

Files should be recorded at 16-bit resolution with a 16 kHz sample rate. Files may be provided either as simple 16-bit linear PCM files or in 16-bit linear PCM WAV format.
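
As an illustration only, a contributor could check a WAV file against the resolution and sample rate given above before sending it in. The sketch below uses Python's standard wave module, which reads uncompressed PCM WAV files. The function name check_wav and the wording of the messages are assumptions made for this example and are not part of any Open Speech Repository tooling. It does not apply to headerless PCM files, which carry no format information to inspect.

```python
import wave

EXPECTED_SAMPLE_RATE_HZ = 16000   # 16 kHz sample rate, per the guideline
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples are 2 bytes wide


def check_wav(path):
    """Return (problems, duration_seconds) for the WAV file at `path`.

    `problems` is a list of human-readable strings and is empty when the
    file matches the expected sample width and sample rate.
    Hypothetical helper, written for illustration only.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()
        rate = wav.getframerate()
        frames = wav.getnframes()

    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {width * 8} bits, expected 16")
    if rate != EXPECTED_SAMPLE_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected 16000")

    # Duration in seconds is the frame count divided by frames per second.
    duration = frames / rate
    return problems, duration
```

The returned duration can be compared by eye with the target of approximately ten seconds; the sketch does not enforce a duration limit because the guideline gives only an approximate figure.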

5. File name convention

File names should be constructed using the following format:

OSR_[language]_[contributor]_[index]_[format].[extension]

In this format, [language] is the customary two-character identifier for the country typically associated with the language, [contributor] is a three-character ID assigned by Open Speech Repository, [index] is a four-digit index number, [format] is a two-digit format identifier, and [extension] is pcm or wav.
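
The naming format above can be sketched as a small builder and parser. This is a minimal illustration, not an official tool: the names build_name and parse_name are hypothetical, and the pattern assumes that the language identifier consists of letters and the contributor ID of letters or digits, which the guideline does not state explicitly. The values used in the usage note ("us", "abc", 12, 8) are placeholders, not real identifiers.

```python
import re

# Assumption: language = 2 letters, contributor = 3 letters or digits.
# The guideline only says "two character" and "three character".
FILENAME_RE = re.compile(
    r"OSR_(?P<language>[A-Za-z]{2})"
    r"_(?P<contributor>[A-Za-z0-9]{3})"
    r"_(?P<index>[0-9]{4})"
    r"_(?P<format>[0-9]{2})"
    r"\.(?P<extension>pcm|wav)"
)


def build_name(language, contributor, index, fmt, extension):
    """Assemble a file name, zero-padding the numeric fields.

    Raises ValueError if the result does not fit the naming format.
    """
    name = f"OSR_{language}_{contributor}_{index:04d}_{fmt:02d}.{extension}"
    if FILENAME_RE.fullmatch(name) is None:
        raise ValueError(f"does not match the naming format: {name}")
    return name


def parse_name(name):
    """Return the fields of a file name as a dict of strings, or None."""
    match = FILENAME_RE.fullmatch(name)
    return match.groupdict() if match else None
```

With these placeholder values, build_name("us", "abc", 12, 8, "wav") produces OSR_us_abc_0012_08.wav, and parse_name on that string returns the five fields as text, with the index as "0012" and the format as "08".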

6. Release

The contributor (the person supervising the recording) must submit a release form that describes the speech material and includes a declaration that each person whose voice is recorded has given their permission for the resulting audio file to be published.

Conditions of use: The material on this site is freely available for use in VoIP testing, research, development, marketing and any other reasonable application. The material may be copied, downloaded, broadcast, modified, or incorporated into web sites or test equipment. We do require that you identify the source of the speech materials as "Open Speech Repository".