automatic chinese transcription

transcribing voice recordings is a painful process for non-professional transcribers like me. if only one could pipe the recordings into a computer and have the transcription done automatically. wait, yes, it’s possible now, albeit with low-to-middling accuracy.

thanks to Si Hui, who pointed me to someone’s attempt at converting recorded audio to text on a Mac. that started my exploration in the Windows environment (:

the software that you will need to transcribe Chinese audio recordings into Chinese text:

1. Audacity (or any audio playback software that allows you to select the output device)

2. VB-Audio Virtual Cable (or any other software that captures all sound output and routes it to a virtual recording device that other software can listen to; this is the Windows equivalent of macOS’s Soundflower)

3. 讯飞输入法 iFly Input PC version (this is the software responsible for the Chinese audio-to-text conversion)

install the above software in any order. to get them working together, follow these steps:

1. load Audacity. open the voice recording. in the “Audacity Device Toolbar”, select Playback Device and set it to CABLE Input (VB-Audio Virtual Cable).

[screenshot: 150606-audacity-playbackoutput]

2. load 讯飞输入法’s 语音悬浮窗 (the floating voice window; find it in the Start menu). click on the 语音悬浮窗 until it shows “点击说话” (“click to speak”), which means recognition is paused.

[screenshot: 150606-ifly-pause]

3. load any word processor or text editor (e.g. Microsoft Word, Notepad).

4. go back to Audacity. begin playback of the voice recording.

[screenshot: 150606-audacity-play]

5. click on the 语音悬浮窗 until it shows “请说话” (“please speak”), which means voice recognition has begun.

[screenshot: 150606-ifly-start]

6. switch to the word processor and keep it in focus (with the cursor blinking in the text area).

7. your automated transcription should now begin, with the recognised text appearing at the cursor in the word processor.

things to note:
1. 讯飞输入法 requires an ACTIVE internet connection to function.
2. 讯飞输入法 may stop by itself from time to time. i think this happens when the server is too busy. when it does, repeat step 5 to reactivate the audio-to-text conversion.
3. if you would rather use Windows 8.1’s built-in voice recognition software, please go ahead, but be prepared for near-zero accuracy. maybe the version in the soon-to-be-released windoze 10 will be better 😛
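
if the virtual cable does not show up under its exact name in step 1, it helps to search the playback device list for it. below is a minimal Python sketch of that lookup, not part of my actual workflow. the function name `find_cable_device` and the device list `example_devices` are made up for illustration, and the list is typed in by hand; reading the real device list from Windows is left out.

```python
from typing import Optional, Sequence


def find_cable_device(device_names: Sequence[str],
                      keyword: str = "CABLE Input") -> Optional[int]:
    """Return the index of the first device whose name contains `keyword`.

    The match is case-insensitive. Returns None when nothing matches.
    """
    needle = keyword.lower()
    for index, name in enumerate(device_names):
        if needle in name.lower():
            return index
    return None


# hypothetical playback device list, hard-coded for illustration only
example_devices = [
    "Speakers (Realtek High Definition Audio)",
    "CABLE Input (VB-Audio Virtual Cable)",
    "Headphones (USB Audio Device)",
]

if __name__ == "__main__":
    position = find_cable_device(example_devices)
    if position is None:
        print("virtual cable not found; is VB-Audio Virtual Cable installed?")
    else:
        print("playback device to select:", example_devices[position])
```

the match is by substring rather than by exact name because the text in brackets may differ from one machine to another.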

once it’s done, it’s time to listen to your original voice recording and begin cleaning up the automatically generated text. the recognition accuracy is still not high, but this process has sped up my rate of transcription significantly. your mileage may vary, but i hope it helps in your work too.

oh btw, my all-time favourite software to assist in transcription remains VoiceWalker.

enjoy (: