
FreeTTS: Speaking with Java

PCQ Bureau

Synthesizing speech from text has become a hot area for programmers, thanks to the trend of reading e-books instead of large bound volumes and reading news on the Web rather than on paper. On the flip side, reading content on a monitor for hours together is irritating at best, if it doesn't give you a headache. Software tools that read text aloud are a handy alternative. Many such tools are available, some free and some paid. But a good programming API has been missing from the scene for a long time. FreeTTS is an open-source project that has created a text-to-speech synthesizer in Java. For the source code in this article, visit the developer section at forums.pcquest.com.



Direct Hit!
Applies to: Java programmers
USP: Explains speech synthesis using FreeTTS
Links: http://freetts.sourceforge.net
Google keywords: text to speech java

Speech synthesis involves splitting the text into suitable tokens, analyzing the tokens and normalizing them into an easily readable form. This step converts abbreviations, numbers and the like into fully spelled-out text. Then, words are converted into pronunciation units (phonemes) by looking them up in a lexicon. If a word has no predefined lexicon entry (a name like Nagaradjane, say, which is difficult for anyone to pronounce), the synthesizer uses a set of rules to convert letter combinations into suitable pronunciation units. Once the pronunciation is decided, reading quality is improved by fixing the location and length of pauses and the pitch, tone and stress of the voice. Then, audio data is generated and played. Hence, speech synthesis involves a fairly long procedure. The FreeTTS team claims that its Java engine performs better than a C-based speech synthesis engine (Flite, from which FreeTTS is derived). The performance improvement is attributed to the availability of extensive collection classes, the use of multi-threading and the optimization provided by HotSpot technology.

Speech synthesis is complementary to speech recognition. While speech synthesis creates audio output with an accent decided by the engine, speech recognition requires the engine to generate text from audio in highly varying accents. This is like posing a problem with multiple possible solutions to an elementary-school student. The engine processes the audio data, compares it with predefined audio patterns for the chosen accent, and then guesses the text that could match the audio. Speech recognition is usually considered error-prone, since pronunciation and accent vary greatly from region to region. Hence, a recognition system, on average, guesses 60 to 90% of the spoken words correctly.
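The tokenizing and normalizing step at the start of the synthesis procedure can be pictured with a toy example. The sketch below is not FreeTTS code: the class name, the three-entry abbreviation table and the digit-by-digit reading of numbers are all assumptions made to keep the illustration short. A real engine uses large lexicons and reads "42" as "forty-two".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy text normalizer: illustrates the tokenize-and-normalize step only. */
class NormalizeSketch {

    // Hypothetical abbreviation table; real engines use far larger ones.
    private static final Map<String, String> ABBREVIATIONS = Map.of(
            "Dr.", "doctor",
            "Mr.", "mister",
            "etc.", "et cetera");

    private static final String[] DIGITS = {
            "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine"};

    /** Splits on whitespace and rewrites each token into speakable words. */
    static List<String> normalize(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields a single empty token
            }
            if (ABBREVIATIONS.containsKey(token)) {
                words.add(ABBREVIATIONS.get(token));
            } else if (token.matches("\\d+")) {
                // Simplification: read numbers one digit at a time.
                for (char c : token.toCharArray()) {
                    words.add(DIGITS[c - '0']);
                }
            } else {
                // Strip leading/trailing punctuation and lowercase the word.
                String word = token.replaceAll("^\\W+|\\W+$", "").toLowerCase();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [doctor, rao, lives, at, four, two, beach, road, et cetera]
        System.out.println(normalize("Dr. Rao lives at 42 Beach Road, etc."));
    }
}
```

The output of such a step is what the later stages (lexicon lookup, letter-to-sound rules, pauses and pitch) would work on.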


Synthesizers are available in many forms. MS Office comes with its own speech synthesis and recognition system. Adobe Acrobat Reader can read PDF files aloud (only on the Windows platform, as of version 7). There is MS Reader for reading e-books. IBM has been releasing ViaVoice for a long time. Cross-platform synthesis engines like Festival are available free of cost. FreeTTS is one such speech synthesizer. It is provided as a Java library containing the features required for generating speech from text input. It's an open-source project hosted at freetts.sourceforge.net.

Java and speech



Sun Microsystems, in collaboration with other major IT players, developed a specification for speech handling called JSAPI (Java Speech API), covering both speech synthesis and recognition, long before any actual implementation existed. This ambitious specification provides for generating speech with a markup language called JSML (Java Speech Markup Language). JSML formats voices in much the way HTML formats text displayed by browsers. One can specify the sex and age of the speaker, with formatting tags available for paragraphs, sentences, emphasis and breaks. JSAPI also includes facilities for speech recognition, ie, it can generate text from audio input. But a full-fledged implementation of JSAPI is still not available. FreeTTS is a step in the direction of fulfilling the JSAPI specification. While it doesn't implement the entire specification, it demonstrates Java's capability to meet the speech-related demands set out in JSAPI. As of now, FreeTTS has two male voices called Kevin and Alan. Kevin, available at 8 kHz and 16 kHz, is a general-purpose voice for reading text as well as numbers. Alan is a good-quality voice but is limited to a narrow domain (speaking numbers, such as the time of day). FreeTTS is expected to grow further, incorporate a variety of voices and offer the backbone for a JSAPI implementation.

Using FreeTTS in Java programs



Using FreeTTS in Java programs is simple. After downloading the FreeTTS package, set the 'classpath' environment variable to include '{tts-dir}\lib\freetts.jar;{tts-dir}\lib\jsapi.jar'. Including jsapi.jar lets you develop JSAPI-based programs in addition to FreeTTS-based programs. The FreeTTS library is contained in the 'com.sun.speech.freetts' package. The following code snippet shows how easily even a novice programmer can generate voice from text. The task is to get a Voice instance from the VoiceManager class, allocate the Voice and call the 'speak' method on it.


import com.sun.speech.freetts.*;

class FreeTTSTest {
    public static void main(String[] args) throws Exception {
        // Look up the 16 kHz Kevin voice by name
        Voice v = VoiceManager.getInstance().getVoice("kevin16");
        v.allocate(); // Load the voice data before speaking
        v.setPitch(150.0F); // Optional - set pitch for the voice
        v.setVolume(1.0F); // Optional - set volume of output 0 to 1
        v.setRate(160.0F); // Optional - set words spoken per minute
        String s = "Hello " + System.getProperty("user.name") + "!\n" +
                "I am kevin speaking at 16 kHz!\nWill you take me as your friend?";
        v.speak(s);
        v.deallocate(); // Release the resources held by the voice
        System.exit(0);
    }
}




Compile this code with the command 'javac -cp {tts-dir}\lib\freetts.jar;. FreeTTSTest.java' and run it with 'java -cp {tts-dir}\lib\freetts.jar;. FreeTTSTest'. Enjoy your computer saying 'Hello {your name}! I am Kevin speaking at 16 kHz! Will you take me as your friend?' with metallic friendliness in its voice. If you want Kevin to speak at 8 kHz, use 'VoiceManager.getInstance().getVoice("kevin")' in place of the statement given in the listing (the 8 kHz voice is registered simply as 'kevin').
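If the program fails with a NoClassDefFoundError, the classpath most likely does not include the FreeTTS jars. The following is a minimal sketch of a sanity check that uses only the standard library; the class and method names (ClasspathCheck, isOnClasspath) are made up for this illustration and are not part of FreeTTS.

```java
/** Checks whether the FreeTTS and JSAPI classes are visible on the classpath. */
class ClasspathCheck {

    /** Returns true if the named class can be located (without initializing it). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "com.sun.speech.freetts.VoiceManager", // provided by freetts.jar
            "javax.speech.Central"                 // provided by jsapi.jar
        };
        for (String name : required) {
            System.out.println(name + (isOnClasspath(name) ? " found" : " MISSING"));
        }
    }
}
```

Run it with the same -cp value you intend to use for the speech programs; any line reported as MISSING points to a jar that is absent from the classpath.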


Speaking with JSAPI



A partial implementation of JSAPI is available with FreeTTS. The library consists of two main packages for simple synthesis: javax.speech and javax.speech.synthesis. The job is to get a javax.speech.synthesis.Synthesizer instance using the javax.speech.Central class. This requires the help of the javax.speech.synthesis.SynthesizerModeDesc class, which holds the locale for the voice (US) and the mode name for the voice.

import javax.speech.*;
import javax.speech.synthesis.*;

public class JSTest {
    public static void main(String[] args) throws Exception {
        // Describe the synthesizer wanted: general-purpose mode, US locale
        SynthesizerModeDesc desc = new SynthesizerModeDesc();
        desc.setModeName("general");
        desc.setLocale(java.util.Locale.US);

        Synthesizer reader = Central.createSynthesizer(desc);
        reader.allocate();
        reader.resume();

        // Select the voice by name
        Voice v = new Voice();
        v.setName("kevin16");

        reader.getSynthesizerProperties().setVoice(v);
        reader.getSynthesizerProperties().setPitch(150.0F);
        reader.getSynthesizerProperties().setVolume(1.0F);
        reader.getSynthesizerProperties().setSpeakingRate(160.0F);

        String s = "Hello " + System.getProperty("user.name") + "!\n" +
                "I am kevin speaking at 16 kHz!\nWill you take me as your friend?";

        reader.speakPlainText(s, null);
        // Block until everything queued has been spoken
        reader.waitEngineState(Synthesizer.QUEUE_EMPTY);
        System.exit(0);
    }
}


This program will run only if the file called voices.txt in the {tts-dir}\lib folder is copied, under the name voices.properties, to the lib folder of the JRE or to the user's home directory. This can be done with the command 'copy {tts-dir}\lib\voices.txt {jre-dir}\lib\voices.properties' or 'copy {tts-dir}\lib\voices.txt {home}\voices.properties'. Here {tts-dir} refers to the directory containing FreeTTS (typically c:\freetts-1.2.1), {jre-dir} refers to the JRE directory (typically c:\jre1.5.0 or c:\jdk1.5.0\jre) and {home} refers to the home directory of the user.


If the file is missing, an error is thrown reporting a problem in voice allocation. Compile the program with the command 'javac -cp {tts-dir}\lib\freetts.jar;{tts-dir}\lib\jsapi.jar;. JSTest.java' and run it with 'java -cp {tts-dir}\lib\freetts.jar;{tts-dir}\lib\jsapi.jar;. JSTest'.
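The step of copying voices.txt can also be scripted in Java rather than typed at the command prompt, which avoids slips in the paths. The sketch below assumes the file names and locations given above (voices.txt under {tts-dir}\lib, copied as voices.properties into the user's home directory). The class name VoicesFileCopy and the default install path are placeholders for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Copies the FreeTTS voices file to where the JSAPI example expects it. */
class VoicesFileCopy {

    /** Copies {ttsDir}/lib/voices.txt to {targetDir}/voices.properties. */
    static Path copyVoicesFile(Path ttsDir, Path targetDir) {
        Path source = ttsDir.resolve("lib").resolve("voices.txt");
        Path target = targetDir.resolve("voices.properties");
        try {
            // Overwrite an older copy if one is already present.
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed install location; pass the real FreeTTS directory as the first argument.
        Path ttsDir = Paths.get(args.length > 0 ? args[0] : "c:\\freetts-1.2.1");
        Path home = Paths.get(System.getProperty("user.home"));
        System.out.println("Copied to " + copyVoicesFile(ttsDir, home));
    }
}
```

Copying into the home directory, as done here, needs no administrator rights, unlike writing into the JRE's lib folder.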

V Nagaradjane
