Convert speech to text

Speech can be converted to text either manually or automatically. Manual transcription is still far superior to automatic transcription in terms of quality. However, automatic processing of speech into text also has many advantages. The biggest is the speed with which speech is converted into text.

Manually converting or transcribing an interview into text can take a lot of time. Depending on typing speed, it takes around 3 to 7 times the length of the recording to do this by hand. With a suitable speech-to-text program, the same job can be done within seconds to a few minutes.
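The rule of thumb above is easy to turn into a quick estimate. The following is only a minimal sketch: the function name is our own invention for illustration, and the factors 3 and 7 are simply the range quoted above, not measured values.

```python
from __future__ import annotations


def manual_transcription_minutes(audio_minutes: float) -> tuple[float, float]:
    """Return a (low, high) estimate of typing time in minutes.

    Uses the rule of thumb of 3 to 7 times the audio length.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must not be negative")
    return audio_minutes * 3, audio_minutes * 7


# Example: a one-hour interview
low, high = manual_transcription_minutes(60)
print(f"Estimated typing time: {low / 60:.0f} to {high / 60:.0f} hours")
```

For a 60-minute recording this gives a range of 3 to 7 hours of typing, which is why speed is the main argument for automatic conversion.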

You can also find many other useful tips in our eBook "Recording, typing, analyzing - a guide to conducting interviews and transcriptions".

The book is available as a free download.

Automatically convert speech to text

With the help of artificial intelligence (AI), it is possible to automatically convert audio files into text, and a large number of programs now do this. The best-known providers include Google (Speech-to-Text), Apple (Siri), Amazon (Alexa) and Microsoft (Cortana), while Voicedocs and EML are less prominent. Most of these programs can process files in the common audio formats (MP3 and WAV).

Other file formats or video files can be converted online or with special programs (e.g. with the Online Audio Converter or with the VLC Media Player). When automatically converting speech to text, the files are usually saved temporarily. With regard to data protection, you should therefore check with the individual providers in advance.
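If you prefer to script this step, the check and the conversion can be sketched in a few lines of Python. Please treat this as an illustration under assumptions: the function names are hypothetical, the list of accepted formats is just the MP3/WAV pair mentioned above, and we use the command-line tool ffmpeg here in place of the tools named in the text. The sketch only builds the command and does not run it.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: the speech-to-text program accepts only these two formats.
SUPPORTED_SUFFIXES = {".mp3", ".wav"}


def needs_conversion(path: str) -> bool:
    """Return True if the file is not already an MP3 or WAV file."""
    return Path(path).suffix.lower() not in SUPPORTED_SUFFIXES


def ffmpeg_command(source: str) -> list[str]:
    """Build an ffmpeg call that writes a WAV file next to the source.

    -i names the input file, -vn drops any video stream.
    """
    target = str(Path(source).with_suffix(".wav"))
    return ["ffmpeg", "-i", source, "-vn", target]


# Example: a video recording of an interview
if needs_conversion("interview.mp4"):
    print(" ".join(ffmpeg_command("interview.mp4")))
```

If ffmpeg is installed, the resulting list can be passed to `subprocess.run(..., check=True)` to perform the conversion locally, which also means the file does not have to be uploaded to an online converter.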

Study on converting German speech to text

In a detailed study, we tested the performance of programs that convert German audio to text and compared the results of the six programs mentioned. The lesser-known speech-to-text programs from Voicedocs and EML performed best in many categories for German. We are happy to make this study available on request.

The quality of the generated transcripts is currently still heavily dependent on the audio files, i.e. the number of speakers, the recording conditions (quiet or noisy environment), the vocabulary (simple or specialized) and deviations from the standard language (accents or dialects). Under perfect conditions, automatic speech recognition can achieve acceptable results, but with any restriction (e.g. as few as two speakers), the quality of speech-to-text conversion drops significantly.
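If you want to compare programs on your own recordings, one common yardstick is the word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the automatic transcript into a correct reference transcript, divided by the number of words in the reference. This is not necessarily the measure used in our study; the sketch below is only an assumption-laden illustration, and the function name and the sample sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing in hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Example: one wrong word out of five
manual = "das ist ein kurzer test"
automatic = "das ist ein kurze test"
print(f"WER: {word_error_rate(manual, automatic):.0%}")
```

A lower value is better. Note that this simple version ignores punctuation handling and speaker assignment, both of which matter in real transcripts.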

The quality of automatic speech-to-text programs varies greatly for German. You should always run a test beforehand.

To avoid any unpleasant surprises, it is advisable to have a test transcript produced before committing to automatic speech-to-text transcription. This is possible with us free of charge. We can provide you with a free sample transcript of the first 2 minutes of your file without obligation. All you have to do is send us your file, and you will then receive the sample transcript and detailed information on the result.

All transcripts created with AI are checked manually by us. During the post-editing process, we correct gross errors and assign the speech to the individual speakers. In general, most speech-to-text programs do not yet reliably assign speakers, so this has to be entered manually. Subsequent correction is therefore necessary even with the best program and the best quality.

Overall, the effort required for automatic transcription from speech to text is therefore still very high. It is only recommended where the material to be transcribed meets the requirements (good audio quality, preferably only one speaker, no dialect). Consistently good, reliable and correct transcripts can still only be achieved through manual transcription. Our manually created transcripts generally have a quality level of at least 97% and are therefore significantly more correct than any transcript created using AI.

Further questions and answers

✅ How can you convert speech into text?

There are basically two methods for converting speech to text:

With automatic speech recognition, a machine converts the spoken word into text. This already works reasonably well for recordings with one person without dialect or background noise. With several speakers, the quality is currently mediocre at best.

With manual transcription, a person types out the voice recording. Manual transcription still achieves a much higher quality than machine transcription.

✅ Is it possible to convert speech to text online?

For the automatic conversion of speech to text, there are a number of providers, some of whom offer speech recognition free of charge. However, the quality is currently only mediocre for recordings with more than one speaker.

For manual transcription, there are a number of transcription services and transcription agencies such as the German market leader abtipper.de.

✅ Are there programs that allow you to convert audio to text free of charge?

There is a whole range of providers offering automatic speech recognition systems, some of which are free of charge.

abtipper.de is the market leader for the transcription of audio & video files.