LSA 352 Homework 2: Summer 2007
Homework 2: TTS
Due: Thursday July 19 before class
WARNING: Please read this entire page before you start!!!!!
NOTE: These exercises are from Alan Black's course.
(2.0) Make sure you can get Festival to say "Hello World". Don't wait for this until the night before the homework is due.
(2.1) Make Festival say your entire name (first and last). If Festival doesn't say it correctly, fix it by adding explicit pronunciations to the lexicon. If it does say it correctly, find a friend's name that it doesn't say correctly and add a pronunciation to fix it.
(2.2) Copy 'text2pos' (in /afs/ir.stanford.edu/class/cs224s/newfestival/festival/examples)
and modify it to output the number of nouns (of any type) in a given file.
You run 'text2pos' by just typing text2pos on the command line; you don't run it inside the festival prompt. And then it takes its input from STDIN (for you non-UNIX people, this means you type text2pos
(2.3) Copy 'text2pos' and modify it to output the number of vowels (phoneme
vowels not letter vowels) in a given file.
(2.4) Add a token-to-word rule to say money values in dollars and cents
in a standard full form.
For example, given $56.54, Festival should say "fifty six dollars
and fifty four cents". Hints for how to do this are given below.
Please send a plain text e-mail or file containing your code/commands, sample output, and responses to jurafsky@stanford.edu.
Here are various useful hints and help for getting started.
First, what should I read to understand Festival?
Second, where is Festival?
Kevin McGowan kindly created a page with some instructions for installing Festival on your Mac;
his page is here.
His page is designed only for Intel Macs; I've confirmed that his instructions work well
on an Intel Mac (my MacBook Pro) but haven't tried an old PowerPC Mac yet.
Note that before performing Kevin's step 2 (modifying the EST_math.h file and compiling festival)
you have to have installed the Apple Developer Tools, called Xcode, here.
In order to get the tools, you may have to sign up for a free developer account at that website.
Note: which parts of Festival do you need to install?
A minimal installation could have just the following:
The class copy of Festival is compiled for Dell (Intel) Linux machines.
That means it runs on certain machines that you can ssh to, in particular
myth, firebird, and raptor, which are among what are called the Sweet
Hall machines; it will not run on the Sun machines (some of which
are also in Sweet Hall), only on the Linux machines.
You can only hear the output from festival if you are physically
sitting at the machines listening to the speakers.
So if you aren't installing your own copy, you'll need to do
the tricky file maneuvering I describe below.
If you have access to another Linux machine, some versions
of linux come preloaded with Festival. If you can find one that does,
(like if you have your own linux machine or something)
feel free to use that instead of the class copy of Festival.
It's also supposedly possible to install it on Windows.
And in theory it should be compilable on a Mac.
The simplest way to run festival is to create
a small "Scheme" script file called "myrules.scm", which has
(voice_kal_diphone) as its first line,
and the lines afterward have your "lex.add.entry" commands.
Then you run your new file from the shell by typing: festival myrules.scm
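For advanced questions, a useful festival script, saytime, and other examples are in /afs/ir/class/cs224s/newfestival/festival/examples
I recommend you add the following to your PATH variable: /afs/ir/class/cs224s/newfestival/festival/bin/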
Recall that there are three commonly-used ways to run festival:
You can have festival synthesize things directly from the shell:
But for this homework you should be able to use the default voice, so you
probably shouldn't need to reset this.
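For Exercise 2.4, you will need to add a new definition for token_to_words.
Another hint: use previous_token_to_words to convert the numbers themselves into words.
Another hint: once you decide on the condition, remember that you need to return
a list of words.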
Those of you who don't have local Linux access and want to use festival
on the Linux machines remotely
may have noticed that if you run festival remotely to Sweet Hall,
you can't hear the sound. Unless you have remarkably good hearing.
Since it comes out somewhere in Sweet Hall. (Probably startling everyone there).
The following are instructions for how to make festival store the sound in a file
so you can open it locally. Those of you using Macs may have to
do this if you can't get the Mac sound to work.
In your home directory (on whatever machine you are running festival from),
put a .festivalrc file with the following three lines:
Name your file whatever you want (and if you are properly kerberos logged-in,
you can use an AFS path to put it someplace you can
get it without having to sftp or scp it).
Hopefully this will work for you! I learned this from the Festival Manual section 6.3, which has some further
information. If you can't get it to work, you might have to run your homework
on a Linux machine at the actual console.
http://festvox.org/festival/downloads.html
http://www.cstr.ed.ac.uk/downloads/festival/1.95
festival-1.96-beta.tar.gz
festlex_CMU.tar.gz
festlex_POSLEX.tar.gz
festvox_kallpc16k.tar.gz
festvox_kedlpc8k.tar.gz
festvox_kedlpc16k.tar.gz
speech_tools-1.2.96-beta.tar.gz
klog
and give your password when it asks. Then you can
cd around over to the directory for cs224s, my class from winter
quarter where the files are.
/afs/ir/class/cs224s/newfestival/festival/bin/festival
(voice_kal_diphone)
festival myrules.scm
saytime
and other examples are in
/afs/ir/class/cs224s/newfestival/festival/examples
/afs/ir/class/cs224s/newfestival/festival/bin/
Alan Black's Hints for Exercise 2.0 and 2.1
echo My name is ... | festival --tts
or within the command interpreter with the command:
(SayText "My name is ...")
Alternatively, you can write a script
like the example script saytime (in "examples/saytime").
If your name is not pronounced properly, you can add new
entries to the lexicon using the function lex.add.entry.
For example, the default synthesizer pronounces Ronald Reagan's
second name wrongly, so we can redefine the pronunciation as
(lex.add.entry
'("reagan" n (((r ey) 1) ((g ax n) 0))))
To find out what the phoneme set is and what formats are possible, it is often useful
to look up similar words. Use the lex.lookup
function as in
(lex.lookup 'reagan)
then copy the entry changing it as desired.
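To make the shape of a lexicon entry concrete, here is a minimal sketch in Python (used only for illustration; it is not Festival code). It assumes an entry is a word, a part of speech, and a list of syllables, each syllable being a list of phones plus a stress value, as in the Reagan example above. The helper name format_lex_entry is hypothetical.

```python
def format_lex_entry(word, pos, syllables):
    """Build the text of a lex.add.entry command.

    syllables is a list of (phones, stress) pairs, for example
    [(["r", "ey"], 1), (["g", "ax", "n"], 0)].
    This helper is hypothetical; it only assembles a string.
    """
    syl_text = " ".join(
        "((" + " ".join(phones) + ") " + str(stress) + ")"
        for phones, stress in syllables
    )
    return "(lex.add.entry '(\"" + word + "\" " + pos + " (" + syl_text + ")))"


if __name__ == "__main__":
    # Reproduces the Reagan entry from the hint above, on one line.
    print(format_lex_entry("reagan", "n", [(["r", "ey"], 1), (["g", "ax", "n"], 0)]))
```

You would paste the printed line into your Scheme file or the festival prompt; the phones themselves still have to come from the phone set of the voice you are using.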
To keep the pronunciation, add it to your `.festivalrc' in your
home directory. This file is automatically loaded every time you
run Festival, so it will always know about your name.
Because there are different lexicons for different languages/dialects, you
must select the lexicon/voice before setting the new
pronunciation.
(voice_kal_diphone)
(lex.add.entry ...)
Alan Black's Hints for Exercise 2.2
(set! total_ns (+ 1 total_ns))
(format t "Total number of nouns %d\n" total_ns)
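The counting logic behind this hint can be sketched outside Festival. The Python below is only an illustration, and it rests on two assumptions you should check against your own text2pos output: that each token is printed as word/tag, and that noun tags are the lowercase Penn Treebank-style ones beginning with "nn" (nn, nns, nnp, nnps).

```python
def count_nouns(tagged_text):
    """Count tokens whose part-of-speech tag marks a noun of any type.

    Assumes whitespace-separated word/tag tokens and noun tags that
    start with "nn"; both are assumptions about the tagger output.
    """
    total_ns = 0
    for token in tagged_text.split():
        word, sep, tag = token.rpartition("/")
        if sep and tag.lower().startswith("nn"):
            total_ns += 1
    return total_ns


if __name__ == "__main__":
    sample = "the/dt dog/nn chased/vbd two/cd cats/nns in/in Boston/nnp"
    print("Total number of nouns %d" % count_nouns(sample))
```

In your Scheme version the same test goes inside the loop over words, with the counter updated as in the hint above.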
Alan Black's Hints for Exercise 2.3
# See `/afs/ir/class/cs224s/newfestival/festival/lib/synthesis.scm' for the definition of the Tokens UttType, which gives the list of extra modules to call. You want to look at the Segment relation.
(if (string-equals (item.feat seg "ph_vc") "+")
(set! total_vs (+ 1 total_vs))
)
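To see why this counts phoneme vowels rather than letter vowels, here is a small Python sketch of the same test (again only an illustration, not Festival code). Each segment is modeled as a dictionary carrying a "ph_vc" feature that is "+" for vowels, mirroring the feature used in the hint; the transcription of "through" is a hypothetical example.

```python
def count_vowels(segments):
    """Count segments whose ph_vc feature is "+", i.e. phoneme vowels."""
    total_vs = 0
    for seg in segments:
        if seg.get("ph_vc") == "+":
            total_vs += 1
    return total_vs


if __name__ == "__main__":
    # Hypothetical segments for "through": two letter vowels (o, u)
    # but only one phoneme vowel (uw).
    through = [
        {"name": "th", "ph_vc": "-"},
        {"name": "r", "ph_vc": "-"},
        {"name": "uw", "ph_vc": "+"},
    ]
    print("Total number of vowels %d" % count_vowels(through))
```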
Alan Black's Hints for Exercise 2.4
For token_to_words, the normal convention is to save the existing definition and call it for tokens that
don't match what you are looking for. Thus your file will look
something like
(set! previous_token_to_words token_to_words)
(define (token_to_words token name)
  (cond
   ;; here insert the condition to recognize money tokens
   ;; return list of words
   (t
    (previous_token_to_words token name))))
Use previous_token_to_words to do the
hard parts (e.g., converting 56 to "fifty six" and 54 to "fifty four").
Your code should just do the easy part (adding the "dollars" and "cents"
in the right places).
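The division of labor described above can be sketched in Python (for illustration only; your actual rule has to be written in Festival's Scheme). The function number_words is a hypothetical stand-in for what previous_token_to_words does inside Festival, and it only covers 0 to 99, so the pattern here accepts at most two dollar digits. The names MONEY and money_words are likewise made up for this sketch.

```python
import re

# A money token: "$", one or two dollar digits, ".", exactly two cent digits.
MONEY = re.compile(r"^\$(\d{1,2})\.(\d\d)$")

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_words(n):
    """Hypothetical stand-in for the number expansion Festival already does."""
    if n < 20:
        return [ONES[n]]
    tens, ones = divmod(n, 10)
    words = [TENS[tens]]
    if ones:
        words.append(ONES[ones])
    return words


def money_words(name):
    """Return a list of words for a money token, or None if it is not one."""
    m = MONEY.match(name)
    if m is None:
        return None  # the caller falls back to the default rule
    dollars, cents = int(m.group(1)), int(m.group(2))
    return number_words(dollars) + ["dollars", "and"] + number_words(cents) + ["cents"]


if __name__ == "__main__":
    print(" ".join(money_words("$56.54")))
```

Note that the result is a list of words, not a single string, and that a non-matching token is handed back to the default rule rather than handled here; those are the two points the hints above ask you to get right.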
A final bit on using the Sweet Hall Linux machines remotely
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Required_Format 'wav)
(Parameter.set 'Audio_Command "cp $FILE myfilename.wav")