Playing with the Dictation Program

Here are two voices at different rates, with no delay between words. (Well, I didn't put in any delay. Cepstral might have added some.)

zz600-0050r-000d-Law-68wpm means:
file name zz600
rate setting of 0050 (50% of the voice's default rate)
delay of zero
voice Lawrence (or Allison)
68wpm
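
For anyone who wants to sort or filter the files by script, here is a minimal sketch of splitting a name like the one above into its parts. It assumes the parts are always hyphen-separated in this order and that the base name contains no hyphen. The function name parse_label and the dictionary keys are made up for illustration and are not part of the dictation program.

```python
import re

# Assumed layout: <name>-<rate>r-<delay>d-<voice>-<wpm>wpm
LABEL_PATTERN = re.compile(
    r"^(?P<name>[^-]+)"
    r"-(?P<rate>\d+)r"
    r"-(?P<delay>\d+)d"
    r"-(?P<voice>[^-]+)"
    r"-(?P<wpm>\d+)wpm$"
)


def parse_label(label):
    """Split a recording label into its named parts (hypothetical helper)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"label does not follow the assumed layout: {label!r}")
    return {
        "name": match["name"],
        "rate_percent": int(match["rate"]),  # 0050 -> 50
        "delay": int(match["delay"]),        # 000 -> 0, units as set in the program
        "voice": match["voice"],             # abbreviated, e.g. Law for Lawrence
        "wpm": int(match["wpm"]),
    }


if __name__ == "__main__":
    print(parse_label("zz600-0050r-000d-Law-68wpm"))
```

Names without a stated rate and delay would not match this pattern and raise a ValueError.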

zz600 is straight text, all 600 words from assignment 54 of GSF2, used for calibration.

zzphrase is part of passage 206, where the phrases have been joined. Listen to the entire passage to get the full effect.

If the rate and delay are stated in the file name, the wpm is calculated from the actual file length. If they're not stated, the program used the old calibration curve to achieve the stated speed. It's usually close enough for study purposes.
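
The wpm calculation is just word count divided by the length of the recording in minutes. Here is a rough sketch of that arithmetic. It assumes the recording is an uncompressed WAV file that Python's standard wave module can read, and the function names are made up for illustration. The dictation program may measure the file differently.

```python
import wave


def wav_duration_seconds(path):
    """Length of a WAV file in seconds (assumes uncompressed PCM WAV)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def words_per_minute(word_count, duration_seconds):
    """Speed of a recording: words divided by minutes."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return word_count / (duration_seconds / 60.0)


if __name__ == "__main__":
    # A 600-word passage that runs 10 minutes is 60 wpm.
    print(words_per_minute(600, 600.0))
```

For a real file, pass wav_duration_seconds("some-file.wav") as the second argument, where the file name is whatever you saved the recording as.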

Another glitch came up with joining phrases. Sometimes what looks like a phrase isn't one. This happens mostly when several common words appear together.

If anyone has another program that uses SSML tags, I can send you some SSML files to experiment with. To fully automate the process, you have to be able to call the program from the command line.
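
To give an idea of what such a file might look like, here is a minimal sketch that wraps a list of words in standard SSML prosody and break tags. It is not the output of my program. The function name build_ssml is made up, the speak attributes follow the W3C SSML 1.1 spec, and how a given engine interprets a rate of 50% is an assumption you would need to check against that engine.

```python
from xml.sax.saxutils import escape

# Root element as described in the W3C SSML 1.1 spec; some engines accept less.
SPEAK_OPEN = (
    '<speak version="1.1" '
    'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
)


def build_ssml(words, rate_percent, delay_ms):
    """Wrap words in one prosody element, with an optional pause between words."""
    safe_words = [escape(word) for word in words]  # protect &, < and >
    if delay_ms > 0:
        separator = f' <break time="{delay_ms}ms"/> '
    else:
        separator = " "  # no explicit pause requested
    body = separator.join(safe_words)
    return (
        f"{SPEAK_OPEN}"
        f'<prosody rate="{rate_percent}%">{body}</prosody>'
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml(["Dear", "Sir"], rate_percent=50, delay_ms=250))
```

Saving the returned string to a text file gives something you can feed to any program that reads SSML.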

SAPI tags are, in theory, as easy to add as SSML tags, but I haven't tried them. I'd like to hear samples before automating it.

The purpose of these recordings is to help build speed. They're not as good as a human voice, but I did 1/3 of the book in an afternoon. The only part that takes time is typing and/or scanning and proof-reading. (We won't count debugging the program. Another purpose was to give me much-needed programming practice.)

Attachment: play-with-dictation-settings.zip

(by Cricket for everyone)

Labels: audiofiles, dictation