Synthesizing speech from text has become a hot area for
programmers, thanks to the trend of reading e-books instead of large
bound volumes and news on the Web rather than on paper. On the flip side,
reading content on a monitor for hours on end is irritating at best and can
give you a headache. Software that reads text aloud is a handy alternative.
Many such tools are available, some free and some paid, but a good
programming API was missing from the scene for a long time. FreeTTS is an
open-source project that has created a text-to-speech synthesizer in Java. For
the source code in this article, visit the developer section at forums.pcquest.com.
Speech synthesis involves splitting the text into suitable
tokens, analyzing the tokens and normalizing them into an easily readable form.
This step converts abbreviations, numbers and the like into fully readable text.
Then, words are converted into phonemes (pronunciation units) by looking them up
in a lexicon. If a word has no predefined lexicon entry (like a name, say
Nagaradjane, which is too difficult for anyone to pronounce), the synthesizer
uses a set of rules to convert letter combinations into suitable phonemes. Once
the phonemes are decided, reading quality is improved by fixing the location and
length of pauses and the pitch, tone and stress of the voice. Finally, audio
data is generated and spoken out. Speech synthesis is therefore a fairly long
procedure.
The FreeTTS team claims that its Java implementation performs better than a
comparable C-based speech synthesis engine. The improvement is attributed to the
availability of extensive collection classes, the use of multi-threading and the
optimization provided by HotSpot technology.
Speech synthesis is complementary to speech recognition. While speech synthesis
creates audio output with an accent decided by the engine, speech recognition
requires the engine to generate text from audio data with highly varying
accents. This is like posing a problem with several possible solutions to an
elementary-school student. The engine processes the audio data, compares it with
predefined patterns of audio data for the chosen accent, and then guesses the
text that could match the audio. Speech recognition is usually considered
error-prone, since pronunciation and accent vary greatly from region to region.
Hence a recognition system, on average, guesses 60 to 90% of the spoken words
correctly.
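The tokenizing and normalizing step described above can be pictured with a toy program. This is only a sketch and is not taken from FreeTTS: the class name NormalizeSketch, its tiny abbreviation table and the digit-by-digit reading of numbers are all assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: a toy text normalizer, not FreeTTS code.
class NormalizeSketch {

    // Hypothetical abbreviation table; a real engine uses far larger ones.
    private static final Map<String, String> ABBREVIATIONS = new HashMap<>();
    static {
        ABBREVIATIONS.put("dr.", "doctor");
        ABBREVIATIONS.put("mr.", "mister");
        ABBREVIATIONS.put("etc.", "et cetera");
    }

    private static final String[] DIGITS = {
        "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"
    };

    // Splits the text on whitespace, then expands known abbreviations
    // and all-digit tokens into words; other tokens are lower-cased.
    static String normalize(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (ABBREVIATIONS.containsKey(lower)) {
                out.add(ABBREVIATIONS.get(lower));
            } else if (lower.matches("[0-9]+")) {
                // Simplification: read a number one digit at a time.
                for (char c : lower.toCharArray()) {
                    out.add(DIGITS[c - '0']);
                }
            } else {
                out.add(lower);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Dr. Rao owns 16 books etc."));
    }
}
```

Run as is, the program prints "doctor rao owns one six books et cetera". A real synthesizer would of course say "sixteen" and would go on to look up pronunciations and add pauses, pitch and stress, which this sketch leaves out.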
Synthesizers are available in many forms. MS Office comes
with its own speech synthesis and recognition system. Adobe Acrobat Reader
can read PDF files aloud (only on the Windows platform as of version
7). There is MS Reader for reading e-books. IBM has been releasing ViaVoice for
a long time. Cross-platform synthesis engines such as Festival are available free
of cost. FreeTTS is one such speech synthesizer. It is provided as a Java
library containing the features required for generating speech from text input.
It's an open-source project hosted at FreeTTS.sourceforge.net.
Java and speech
Long before any implementation existed, Sun Microsystems, in collaboration
with other major IT players, developed a specification for speech handling
called JSAPI (Java Speech API), covering both speech synthesis and recognition.
This ambitious specification provides for generating speech with a markup language
called JSML (Java Speech Markup Language). JSML formats voices
in much the way HTML formats text displayed by browsers. One can specify the sex
and age of the speaker, and formatting tags are available for paragraphs,
sentences, emphasis and breaks. JSAPI also includes facilities for speech
recognition, that is, it can generate text from audio input. But a full-fledged
implementation of JSAPI is still not available. FreeTTS is a step towards
fulfilling the JSAPI specification. While it doesn't implement the
entire specification, it demonstrates Java's capability to meet the
speech-related demands set out in JSAPI. As of now, FreeTTS has two male
voices called Kevin and Alan (both can speak at 8 kHz or 16 kHz). Alan is a
good-quality voice but can speak only numbers. Kevin is a general-purpose voice
for reading text as well as numbers. FreeTTS is expected to grow further,
incorporate a variety of voices and offer backbone support for a JSAPI
implementation.
Using FreeTTS in Java programs
Using FreeTTS in Java programs is simple. After downloading the FreeTTS
package, set the 'classpath' environment variable to include
'{tts-dir}\lib\freetts.jar;{tts-dir}\lib\jsapi.jar'. Including jsapi.jar
lets you develop JSAPI-based programs in addition to FreeTTS-based programs.
The FreeTTS library is contained in the 'com.sun.speech.freetts' package. The
following code snippet shows how easy it is, even for novice programmers, to
generate voice from text. The task is to get a Voice instance from the
VoiceManager class, allocate the Voice and call the 'speak' method on it.
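import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

class FreeTTSTest {
    public static void main(String[] args) throws Exception {
        // Look up the 16 kHz Kevin voice by name
        Voice v = VoiceManager.getInstance().getVoice("kevin16");
        if (v == null) {
            System.err.println("Voice kevin16 not found; check the classpath.");
            System.exit(1);
        }
        v.allocate();        // Load the resources the voice needs
        v.setPitch(150.0F);  // Optional - set baseline pitch for the voice in Hz
        v.setVolume(1.0F);   // Optional - set volume of output, 0 to 1
        v.setRate(160.0F);   // Optional - set words spoken per minute
        String s =
            "Hello " + System.getProperty("user.name") + "!\n" +
            "I am kevin speaking at 16 kHz!\nWill you take me as your friend?";
        v.speak(s);
        v.deallocate();      // Release the resources held by the voice
        System.exit(0);
    }
}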
Compile this code with the command 'javac -cp {tts-dir}\lib\freetts.jar;.
FreeTTSTest.java' and run it with the command 'java -cp {tts-dir}\lib\freetts.jar;.
FreeTTSTest'. Enjoy your computer saying 'Hello {your name}! I am
Kevin speaking at 16 kHz! Will you take me as your friend?' with metallic
friendliness in its voice. If you want Kevin to speak at 8 kHz, use
'VoiceManager.getInstance().getVoice("kevin8")' in place of the corresponding
statement in the listing.
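The commands above follow Windows conventions: a backslash between directories and a semicolon between classpath entries. On Linux and other Unix-like systems the separators are '/' and ':' instead. As a small sketch (the class ClasspathSketch and its build method are made up for illustration and are not part of FreeTTS), the following program assembles the '-cp' value for whichever platform it runs on:

```java
import java.io.File;

// Illustrative helper, not part of FreeTTS.
class ClasspathSketch {

    // Joins freetts.jar, jsapi.jar and the current directory into one
    // classpath string using the given directory and entry separators.
    static String build(String ttsDir, String fileSep, String pathSep) {
        String lib = ttsDir + fileSep + "lib" + fileSep;
        return lib + "freetts.jar" + pathSep + lib + "jsapi.jar" + pathSep + ".";
    }

    public static void main(String[] args) {
        // Assumed default directory; pass the real FreeTTS directory as the first argument.
        String ttsDir = args.length > 0 ? args[0] : "freetts-1.2.1";
        // File.separator and File.pathSeparator hold the platform's own separators.
        System.out.println("-cp " + build(ttsDir, File.separator, File.pathSeparator));
    }
}
```

The separators are passed in as parameters only so that the method can be tried with either convention; in normal use the values from java.io.File are the ones to rely on.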
Speaking with JSAPI
A partial implementation of JSAPI is available with FreeTTS. The library
consists of two main packages for simple synthesis: javax.speech and
javax.speech.synthesis. The job is to get a javax.speech.synthesis.Synthesizer
instance using the javax.speech.Central class. This requires the help of the
javax.speech.synthesis.SynthesizerModeDesc class, which holds the locale for the
voice (US) and the mode name for the voice.
import javax.speech.*;
import javax.speech.synthesis.*;

public class JSTest {
    public static void main(String[] args) throws Exception {
        // Describe the synthesizer wanted: general mode, US locale
        SynthesizerModeDesc desc = new SynthesizerModeDesc();
        desc.setModeName("general");
        desc.setLocale(java.util.Locale.US);
        Synthesizer reader = Central.createSynthesizer(desc);
        reader.allocate();
        reader.resume();
        // Select the voice by name, then set pitch, volume and speaking rate
        Voice v = new Voice();
        v.setName("kevin16");
        SynthesizerProperties props = reader.getSynthesizerProperties();
        props.setVoice(v);
        props.setPitch(150.0F);
        props.setVolume(1.0F);
        props.setSpeakingRate(160.0F);
        String s = "Hello " + System.getProperty("user.name") + "!\n" +
            "I am kevin speaking at 16 kHz!\nWill you take me as your friend?";
        reader.speakPlainText(s, null);
        // Wait until everything queued has been spoken
        reader.waitEngineState(Synthesizer.QUEUE_EMPTY);
        reader.deallocate();
        System.exit(0);
    }
}
This program will run only if the file voices.txt, found in the
{tts-dir}\lib folder, is copied to the lib folder of the JRE or to the
user's home directory under the name voices.properties. This can be done by
issuing the command 'copy {tts-dir}\lib\voices.txt {jre-dir}\lib\voices.properties'
or 'copy {tts-dir}\lib\voices.txt {home}\voices.properties'.
Here {tts-dir} refers to the directory containing FreeTTS (typically
c:\freetts-1.2.1), {jre-dir} refers to the JRE directory
(typically c:\jre1.5.0 or c:\jdk1.5.0\jre) and {home} refers
to the home directory of the user.
If the file is not copied, an error is thrown reporting a problem
with voice allocation. Compile this program with the command
'javac -cp {tts-dir}\lib\freetts.jar;{tts-dir}\lib\jsapi.jar;. JSTest.java'
and run it with 'java -cp {tts-dir}\lib\freetts.jar;{tts-dir}\lib\jsapi.jar;. JSTest'.
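The copy step described above can also be done from Java, which avoids platform-specific shell commands. This is a minimal sketch that assumes the file layout given in this article; the class VoicesFileSketch and the method copyVoicesFile are hypothetical names, not part of FreeTTS or JSAPI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Illustrative helper, not part of FreeTTS or JSAPI.
class VoicesFileSketch {

    // Copies {ttsDir}/lib/voices.txt to {targetDir}/voices.properties,
    // overwriting any existing copy, and returns the path of the new file.
    static Path copyVoicesFile(Path ttsDir, Path targetDir) throws IOException {
        Path source = ttsDir.resolve("lib").resolve("voices.txt");
        Path target = targetDir.resolve("voices.properties");
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java VoicesFileSketch {tts-dir}");
            return;
        }
        // Use the user's home directory as the target.
        Path home = Paths.get(System.getProperty("user.home"));
        System.out.println("Copied to " + copyVoicesFile(Paths.get(args[0]), home));
    }
}
```

Run it once with the FreeTTS directory as the argument, for example 'java VoicesFileSketch c:\freetts-1.2.1', before starting JSTest.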
V Nagaradjane