Goodbye Keyboards

PCQ Bureau

Handwriting recognition, a recent development in computing, has found wide application the world over, and being able to write on your PDA is just one example. It promises to make data entry simpler and faster by letting you enter text into your device in a more "natural" way than typing. It can also reduce the workload of people who have to feed large amounts of data into computers on a regular basis.

There are two types of handwriting recognition: Optical Character Recognition (OCR) and pen computing. OCR is used for entering large amounts of data, as in maintaining medical records in hospitals. Instead of having a data operator enter all the records into a computer, you can scan the records, use OCR software to read the data, and save the result as a document file. This is called offline handwriting recognition, because handwriting that is already on paper is put through the recognition process. Pen computing, on the other hand, uses online handwriting recognition. Handheld devices are a typical example, where the device recognizes your handwriting as you write.

A number of techniques are used for handwriting recognition, and they differ between OCR and pen computing. While OCR works on bitmapped images of text, online recognition works on vectorized input. So, in pen computing, information like the location (co-ordinates) of the input, the pressure of the pen on the input area, its angle, speed, etc, helps the system adapt to a user’s writing style.
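
To make the difference from a bitmap concrete, here is a minimal sketch of how pen input might be stored as a sequence of samples rather than pixels. The `PenSample` type, its field names, and the 0-to-1 pressure scale are assumptions made for illustration, not the format of any particular device.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class PenSample:
    """One reading from the pen (hypothetical layout)."""
    x: float         # horizontal position on the input area
    y: float         # vertical position on the input area
    pressure: float  # pen pressure, assumed to range from 0.0 to 1.0
    t: float         # time of the reading, in seconds


def stroke_speeds(stroke):
    """Return the pen speed between each pair of consecutive samples."""
    speeds = []
    for a, b in zip(stroke, stroke[1:]):
        dt = b.t - a.t
        dist = hypot(b.x - a.x, b.y - a.y)
        # Guard against two samples carrying the same timestamp.
        speeds.append(dist / dt if dt > 0 else 0.0)
    return speeds


stroke = [
    PenSample(0.0, 0.0, 0.5, 0.00),
    PenSample(3.0, 4.0, 0.7, 0.10),
    PenSample(3.0, 10.0, 0.6, 0.20),
]
speeds = stroke_speeds(stroke)
```

Because every sample carries position and time, quantities such as speed can be derived directly, which a scanned bitmap cannot offer.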

Handwriting can be recognized either by character or by word. Some systems start by segmenting the whole text into words or parts of words and then recognize them using built-in dictionaries, while others segment text on the basis of strokes and characters. Most online recognition systems follow the latter approach.

Let’s look at some of the techniques used for handwriting recognition.

Optical Character Recognition

This process begins with scanning the document you want recognized. The scanning resolution has to be adjusted according to the quality of the original document. After this, the OCR software/hardware extracts text from the document image, a process known as document analysis. If the original document is of poor quality, enhancement techniques can be used. One is to separate text from other "noise", like postmarks on envelopes, ink stains, etc. The system then uses sophisticated algorithms to isolate individual characters from the text image. This is easier to do for printed text, or where text has been entered in boxes (as in many medical or other forms), than for cursive, handwritten text.

The next stage is character recognition, which involves feature extraction and classification. Features of the various characters are determined and then passed to the classifier, which tries to recognize them. Several classification methods can be used for this. The most popular one is called template matching. Here, individual pixel images are used as features, and are compared to a set of templates from each character class. Each comparison gives a similarity measure between the character image and the template. After all templates have been compared to the character image, the character is identified on the basis of the closest matching template. Template matching is trainable, that is, you can change the template characters and use fonts other than those that come with the OCR system.
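
The template matching described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the 3x3 bitmaps, the `TEMPLATES` table, and the "fraction of agreeing pixels" similarity measure are all invented for the example, and real OCR systems use far larger images and more robust measures.

```python
# Hypothetical 3x3 templates, one per character class ("1" = ink, "0" = blank).
TEMPLATES = {
    "I": ["010",
          "010",
          "010"],
    "L": ["100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010"],
}


def similarity(image, template):
    """Fraction of pixels that agree between two equal-sized bitmaps."""
    total = 0
    same = 0
    for img_row, tpl_row in zip(image, template):
        for a, b in zip(img_row, tpl_row):
            total += 1
            same += (a == b)
    return same / total


def classify(image, templates=TEMPLATES):
    """Compare the image to every template and return the closest match."""
    scores = {label: similarity(image, tpl) for label, tpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# A "T" with one stray pixel in the bottom-right corner.
noisy_t = ["111",
           "010",
           "011"]
label, scores = classify(noisy_t)
```

Training, in this picture, amounts to editing or extending the `TEMPLATES` table, for example by adding templates for a new font.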

The final stage in recognition is called contextual processing, where a character is determined by the context in which it is used. For example, if the software cannot decide whether a character is the letter "O" or the number "0", and the character appears between two numbers, it’ll recognize it as a number because it’s unlikely for "O" to appear in this context. Similarly, the software can also use built-in dictionaries or spell checkers to verify its results.
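
The "O" versus "0" rule can be written as a small post-processing pass. The function below is a simplified sketch, with the name `resolve_o_zero` chosen for illustration; it only looks at the immediate neighbours and leaves the character unchanged when the context is mixed or when an ambiguous character sits next to another ambiguous one.

```python
def resolve_o_zero(chars):
    """Decide between the letter 'O' and the digit '0' using neighbours."""
    out = list(chars)
    for i, ch in enumerate(out):
        if ch not in ("O", "0"):
            continue
        left = out[i - 1] if i > 0 else ""
        right = out[i + 1] if i + 1 < len(out) else ""
        if left.isdigit() and right.isdigit():
            out[i] = "0"   # between two digits: treat as a number
        elif left.isalpha() and right.isalpha():
            out[i] = "O"   # between two letters: treat as a letter
    return "".join(out)


fixed_number = resolve_o_zero("1O5")
fixed_word = resolve_o_zero("B0X")
```

A production system would more likely weigh this evidence against the classifier’s own confidence than apply a hard rule.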

You can see the final result on the OCR system’s output interface, and can then save or export it to applications like word processors, spreadsheets, or databases.

Online character recognition

The recognition process remains the same here: input, feature extraction and recognition, and the final output. However, it’s more complicated than its offline counterpart, because it involves recognizing handwritten input on the fly. That input not only varies widely among users, but is also often cursive, making it difficult to isolate individual characters during recognition. Most of the techniques used in this case involve assigning a measure of likelihood to the occurrence of a certain character. Let’s look at two such techniques.

Neural networks

Neural networks simulate the way our brain works to process information. Like people, they learn from experience. A network has to be trained at the development stage, where it’s exposed to data representative of what it’s designed to recognize.

Neural networks use a set of interconnected processing elements, called nodes, to identify patterns in the data they’re exposed to. A network consists of an input layer, one or more hidden layers, and an output layer (see diagram). The input layer is where the input text is fed in. Each node in this layer is connected to all nodes in the hidden layer(s), where the information is processed and interdependencies in it are determined. So what is learned in any hidden node is based on all the inputs. The output layer presents results based on the information processed in the hidden layer(s).
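
The layered structure can be illustrated with a tiny forward pass. This is a sketch only: the layer sizes, the hand-picked weights, and the choice of a sigmoid activation are assumptions for the example, and the training step that would normally set the weights is not shown.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Fully connected layer: every node sees every input."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> one hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Illustrative weights: 3 inputs -> 2 hidden nodes -> 2 outputs.
HIDDEN_W = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.0, -1.0], [-1.0, 1.0]]
OUTPUT_B = [0.0, 0.0]

outputs = forward([1.0, 0.0, 1.0], HIDDEN_W, HIDDEN_B, OUTPUT_W, OUTPUT_B)
```

Each hidden node combines all three inputs, which is why anything it learns depends on the whole input rather than on a single value.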

In handwriting recognition, neural networks are used for classification or character recognition. Let’s look at how the Apple Print-Recognizer, in Apple’s Newton MessagePad, was developed using neural networks.

The MessagePad used individual strokes and characters at the input stage. As data was fed into the PDA, each individual stroke was recorded and grouped with those preceding and succeeding it in all possible combinations. This data was fed into the neural network. The hidden layers of the network were split into two parallel parts, which were joined together at the final output layer. One part of the network accepted an anti-aliased image of a stroke feature and other details, while the other part accepted an anti-aliased image of a character. Both parts were further connected to several other hidden layers. Each layer accepted different kinds of data to determine various features of the input character. For example, one layer could focus on the top part of the image, and another on the bottom part. This was done to make recognition more accurate.

The network classified these inputs into possible characters and assigned probabilities to them. The combined results of both hidden layers were sent to the output layer. These outputs indicated what the next character in the current word could be. They were then passed to a search engine, which searched built-in dictionaries to determine what a particular word could be, given the string of possible characters.
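
The idea of combining per-character probabilities with a dictionary can be sketched as follows. This is not Apple’s actual search; the probability tables, the word list, and the scoring rule (multiplying the probabilities of each letter) are assumptions chosen to show the principle.

```python
def score_word(word, char_probs):
    """Multiply the probability assigned to each letter of the word."""
    if len(word) != len(char_probs):
        return 0.0
    score = 1.0
    for ch, probs in zip(word, char_probs):
        score *= probs.get(ch, 0.0)
    return score


def best_word(char_probs, dictionary):
    """Return the dictionary word with the highest score, or None."""
    scored = [(score_word(w, char_probs), w) for w in dictionary]
    score, word = max(scored)
    return word if score > 0 else None


# Hypothetical classifier output: one probability table per character position.
char_probs = [
    {"c": 0.6, "e": 0.4},
    {"a": 0.7, "o": 0.3},
    {"t": 0.4, "l": 0.6},
]
dictionary = ["cat", "cot", "eat", "col", "dog"]

# Taking the top guess at each position gives "cal", which is not a word.
greedy = "".join(max(p, key=p.get) for p in char_probs)
word = best_word(char_probs, dictionary)
```

Here the dictionary overrules the classifier’s top guess for the last letter, because "cat" is the highest-scoring string that is actually a word.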

The Apple Print-Recognizer was also exposed to the same input data repeatedly after varying combinations of scaling, skewing, etc, so that it could recognize different kinds of handwriting.
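
Generating scaled and skewed variants of a training sample can be sketched like this. The functions and the particular scale and skew factors are assumptions for illustration; the source does not say which values or which other distortions were used.

```python
def scale(points, sx, sy):
    """Stretch or shrink a stroke along each axis."""
    return [(x * sx, y * sy) for x, y in points]


def skew(points, k):
    """Horizontal shear: shift x in proportion to y, so the shape leans."""
    return [(x + k * y, y) for x, y in points]


def augment(points, scales=(0.8, 1.0, 1.2), skews=(-0.3, 0.0, 0.3)):
    """Return every combination of the given scale and skew factors."""
    variants = []
    for s in scales:
        for k in skews:
            variants.append(skew(scale(points, s, s), k))
    return variants


stroke = [(0.0, 0.0), (0.0, 2.0), (1.0, 2.0)]
variants = augment(stroke)
```

One handwritten sample becomes nine here, each standing in for a slightly different writing style.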

Fuzzy logic

Fuzzy logic is used to deal with situations where black-and-white, or "completely true" and "completely false", statements don’t help. For example, a statement like "The room is warm" may be true for some and false for others, so it’s neither completely true nor completely false. Fuzzy logic thus deals with "degrees of truth", using functions called membership functions to assign numbers to these degrees of truth.
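
A membership function for "The room is warm" might look like the sketch below. The triangular shape and the 15, 25 and 35 degree Celsius breakpoints are assumptions picked for the example; a real system would tune both.

```python
def warm(temp_c, low=15.0, peak=25.0, high=35.0):
    """Degree of truth (0.0 to 1.0) of 'the room is warm' at temp_c."""
    if temp_c <= low or temp_c >= high:
        return 0.0
    if temp_c <= peak:
        # Rising edge: gets warmer as we approach the peak.
        return (temp_c - low) / (peak - low)
    # Falling edge: beyond the peak the room is drifting towards "hot".
    return (high - temp_c) / (high - peak)


degrees = {t: warm(t) for t in (10, 20, 25, 30)}
```

So a room at 20 degrees is "warm" to degree 0.5 rather than simply true or false.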

Handwriting recognition products use fuzzy logic systems to identify characters and assign degrees of truth to the occurrence of particular characters. Fuzzy variables are defined for each character and are stored in prototype memory. The definition is abstract: for example, "T" consists of two straight lines, meeting at the top. When you enter text in your device, features extracted from the image, for example, line segment types, are compared to the fuzzy variables. The comparison computes the degree of truth with which the input matches a particular fuzzy variable. In the end, a process of "defuzzification" gives a "crisp" final guess on what the character is. The process uses various mathematical techniques at each step.
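
The compare-then-defuzzify flow can be sketched in miniature. Everything here is an assumption made for illustration: the feature names, the `PROTOTYPES` table, the use of the minimum as a fuzzy AND, and the choice of "pick the highest degree" as the defuzzification step. Commercial products may use quite different features and mathematics.

```python
# Hypothetical prototype memory: how strongly each feature should be present.
PROTOTYPES = {
    "T": {"top_bar": 1.0, "center_stem": 1.0, "left_stem": 0.0, "bottom_bar": 0.0},
    "I": {"top_bar": 0.0, "center_stem": 1.0, "left_stem": 0.0, "bottom_bar": 0.0},
    "L": {"top_bar": 0.0, "center_stem": 0.0, "left_stem": 1.0, "bottom_bar": 1.0},
}


def match_degree(observed, prototype):
    """Degree of truth that the observed features fit the prototype.

    Each feature contributes 1 minus its distance from the expected value,
    and the features are combined with min, a common fuzzy AND.
    """
    return min(1.0 - abs(observed[name] - expected)
               for name, expected in prototype.items())


def recognize(observed, prototypes=PROTOTYPES):
    """Return the crisp best guess plus the degree for every character."""
    degrees = {ch: match_degree(observed, p) for ch, p in prototypes.items()}
    crisp = max(degrees, key=degrees.get)  # defuzzification by maximum
    return crisp, degrees


# Features extracted from a slightly sloppy handwritten "T".
observed = {"top_bar": 0.8, "center_stem": 0.9, "left_stem": 0.1, "bottom_bar": 0.0}
crisp, degrees = recognize(observed)
```

The input matches "T" to a degree of about 0.8 and "I" to about 0.2, and the final step collapses those degrees into the single answer "T".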

Unistrokes

Natural handwriting recognition is a complex task, as you would appreciate from the above discussion. People write in various ways, and the language in which they write has its own complexities. One easy solution is to create a new, universal script that is easy for your device to recognize and not too difficult to learn. For example, the Palm Pilot uses a script called Graffiti, where you enter characters with its stylus, which are then recognized with near-100 percent accuracy.

Graffiti is a kind of unistroke script, where each character is written with a single stroke, without lifting the pen from the writing surface. You can even write one character on top of the other, because the system recognizes your input character by character. The benefit of this script is that recognition rates are very high once you’ve mastered it.
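
To see why single-stroke scripts are easy to recognize, consider this sketch, which reduces a stroke to a sequence of directions and looks it up in a table. The `ALPHABET` table is hypothetical and does not reproduce the real Graffiti shapes, and the sketch assumes that y grows upward.

```python
def direction(dx, dy):
    """Dominant direction of a move: R, L, U or D (y grows upward here)."""
    if abs(dx) >= abs(dy):
        return "R" if dx > 0 else "L"
    return "U" if dy > 0 else "D"


def to_directions(points):
    """Reduce a stroke to its sequence of directions, merging repeats."""
    dirs = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # the pen did not move between these samples
        d = direction(x1 - x0, y1 - y0)
        if not dirs or dirs[-1] != d:
            dirs.append(d)
    return "".join(dirs)


# Hypothetical unistroke alphabet, NOT the actual Graffiti definitions.
ALPHABET = {"DR": "L", "RD": "7", "UD": "A", "D": "I"}


def recognize(points):
    """Return the character for a stroke, or None if it is not in the table."""
    return ALPHABET.get(to_directions(points))


down_then_right = [(0, 2), (0, 1), (0, 0), (1, 0), (2, 0)]
char = recognize(down_then_right)
```

Since every character is exactly one stroke, there is no segmentation problem to solve: the stroke ends when the pen lifts, and the lookup is all that remains.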

Most handwriting recognition products use one or more of the above techniques. These are still fairly new technologies, both in terms of research and applications, but they hold great possibilities for the future.

Pragya Madan