automatic chinese transcription

transcribing voice recordings is a painful process for non-professional transcribers like me. if only one could pipe the recordings into a computer and have the transcription done automatically. wait, yes, it’s possible now, albeit with low-to-middling accuracy.

thanks to Si Hui, who introduced me to someone’s attempt to convert recorded audio to text on a Mac, which started my exploration in the Windows environment (:

the software that you will need to transcribe Chinese audio recordings into Chinese text:

1. Audacity (or any audio playback software that lets you choose the output device)

2. VB-Audio Virtual Cable (or any other software that captures the sound output and routes it to whatever software is listening for audio input; this is the Windows equivalent of macOS’s Soundflower)

3. 讯飞输入法 iFly Input PC version (this is the software responsible for the Chinese audio-to-text conversion)

install the above software in any order. to get them working together, follow these steps:

1. load Audacity and open the voice recording. in the “Audacity Device Toolbar”, select Playback Device and set it to CABLE Input (VB-Audio Virtual Cable).


2. load 讯飞输入法’s 语音悬浮窗 (the floating voice window; find it in the Start menu). click on the 语音悬浮窗 until it shows “点击说话” (“click to speak”), which pauses recognition.


3. load any word processor (e.g. microsoft word, notepad).

4. go back to Audacity. begin playback of the voice recording.


5. click on the 语音悬浮窗 so that it shows “请说话” (“please speak”); voice recognition begins.


6. switch to the word processor and keep it in focus (with the cursor blinking in the text edit area)

7. your automated transcription should now begin, with the recognised text appearing in the word processor.

things to note:
1. 讯飞输入法 requires an ACTIVE internet connection to function.
2. 讯飞输入法 may stop by itself from time to time. i think this happens when the server is too busy. one will need to repeat step 5 to reactivate the audio-to-text conversion.
3. if you would like to use Windows 8.1’s built-in voice recognition software, pls go ahead. but i can safely advise: be prepared to get near-zero accuracy. maybe the soon-to-be-released windoze 10 version will be better 😛

once it’s done, it’s time to listen to your original voice recording and begin cleaning up the automatically generated text. although the recognition accuracy is still not high, this process has sped up my rate of transcription significantly. your mileage may vary, but i hope it helps in your work too.

oh btw, my all-time favourite software to assist in transcription remains VoiceWalker.

enjoy (:

voice transcription software

it’s been a while since i needed this. a search quickly turns up Express Scribe Transcription from NCH. installed it and tried using it, but found that its Play with Stopping function isn’t good enough for my poor memory.

hunted for a much older piece of software, and found the good old VoiceWalker. there are 2 versions, according to the Department of Linguistics, University of California, Santa Barbara. you can download version 2 or version 1 from the local mirror. i think i have been using version 2 all along.


simply love, and can’t do without, the Walk (F5) and Looping functions.

the only catch is, VoiceWalker only supports wav; mp3 is a no go. but fret not, downloading DVDVideoSoft’s Free Audio Converter will do.
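
if you prefer a script to a GUI converter, here is a minimal sketch of the same mp3-to-wav step. note that it swaps in ffmpeg, which is not the tool i used above, and assumes ffmpeg is already installed and on your PATH. the function names and the example filename are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path):
    """Return the ffmpeg argument list that writes a .wav next to the mp3."""
    src = Path(mp3_path)
    dst = src.with_suffix(".wav")
    # -i names the input file; ffmpeg picks the wav format from the output extension
    return ["ffmpeg", "-i", str(src), str(dst)]


def convert_to_wav(mp3_path):
    """Run the conversion (assumes ffmpeg is installed; not the tool from this post)."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(mp3_path), check=True)
```

calling `convert_to_wav("interview.mp3")` should leave an `interview.wav` in the same folder, ready to open in VoiceWalker.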

another thing to note is the file size of .wav. i just witnessed a 134mb mp3 converted into a whopping 984mb wav. that’s not too environmentally friendly 😛
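
a rough sanity check of that size jump, as a small sketch. i never checked the actual settings of my files, so the numbers here are assumptions: a 192 kbps mp3, and a CD-quality wav (44.1 kHz, 16-bit, stereo). since both files have the same duration, the sizes scale with the bitrates.

```python
def wav_bitrate_kbps(sample_rate_hz=44100, bits_per_sample=16, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def estimate_wav_mb(mp3_mb, mp3_kbps, wav_kbps):
    """Same duration, so file size scales with the ratio of the bitrates."""
    return mp3_mb * wav_kbps / mp3_kbps


wav_kbps = wav_bitrate_kbps()                   # 1411.2 kbps for CD-quality audio
estimated = estimate_wav_mb(134, 192, wav_kbps)  # 192 kbps mp3 is an assumption
print(f"estimated wav size: {estimated:.1f} mb")  # prints "estimated wav size: 984.9 mb"
```

under those assumptions the estimate lands very close to what i saw, so the blow-up is simply what uncompressed audio costs, not a fault of the converter.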