In Search of Better 'Search'

PCQ Bureau

If you had to find twenty facts about, say, Iceland, and were given an Internet connection, how would you go about it? Most probably, you would fire up a well-known search engine like Google, enter 'Iceland' as the keyword and hit the Search button. This would throw up lots of websites related to Iceland. You would then visit some of those displayed on the first results page to gather your facts.

Sure, this modern-day search technology works, but it has several limitations. For one, the search results simply point you to websites where you 'might' find the information you're looking for. Two, you have to judge the accuracy of the facts thrown up yourself. Three, finding the right information is time consuming because you have to go through so many links. With the amount of information on the Internet growing by leaps and bounds, and other content like audio, video and images also growing, this method of searching will soon lose its effectiveness. For instance, what if you have a photograph and want to find similar ones on the Internet? Or what if you want to download a particular song, but remember only its tune and not its lyrics? How do you find it on the Internet? These are some of the capabilities being developed for Internet search.

Wolfram|Alpha, the Computational engine



Computer scientist Stephen Wolfram, the inventor of Mathematica (a multi-faceted program created in 1988 to provide a uniform system for all forms of algorithmic computation), has come up with yet another approach, called a computational engine.

A computational engine gets you direct information in visual representations, unlike regular search engines like Google, which simply return links to Web pages.

Instead of searching the Web and returning links, the computational engine, called Wolfram|Alpha, generates output by doing computations on its own internal knowledge base. It brings you systematic factual knowledge: things that are known and are somehow public. It deals only with facts, not opinions. Interestingly, the data in Wolfram|Alpha is derived by computation, often based on multiple sources; the engine deploys formulas and algorithms to compute answers for searchers. You can ask Wolfram|Alpha many things. For example, you can ask about the molecular weight of cholesterol, the location of a gene in the human genome, the number of people named John born in a particular year, the life expectancy of 50-year-olds in a country, the performance of Google stock, the height of Mt. Everest, etc.
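
To make the idea of computing an answer (rather than looking up a page) concrete, here is a minimal Python sketch. It is not how Wolfram|Alpha is implemented; the one-entry knowledge base, the small table of atomic weights and the function names are all assumptions made for illustration. It answers the cholesterol example by computing the molecular weight from the chemical formula C27H46O.

```python
import re

# Rounded standard atomic weights in g/mol. This tiny table is an
# illustrative assumption; a real engine would curate far more data.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Hypothetical one-entry knowledge base: substance name -> formula.
KNOWLEDGE_BASE = {"cholesterol": "C27H46O"}


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C27H46O'."""
    total = 0.0
    # Each match is (element symbol, optional count), e.g. ('C', '27').
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total


def answer(substance: str) -> float:
    """Look up the formula, then compute the answer from it."""
    return molecular_weight(KNOWLEDGE_BASE[substance.lower()])


print(round(answer("cholesterol"), 2))  # 386.66
```

The point of the sketch is that the number is never stored anywhere: only the formula and the atomic weights are, and the answer is derived from them on request.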

Components of Alpha



The main technologies behind this engine include:

  • Data curation: Wolfram|Alpha uses public and licensed proprietary data sources, and the company uses automated processes and human choices to prepare the data.
  • Algorithms: Alpha must pick the right computational processes to present its results. Inside Wolfram|Alpha are 5 million to 6 million lines of Mathematica code that implement all those methods and models.
  • Linguistic analysis: Alpha has to understand what a person typed.
  • Presentation: Inside Alpha, there are tens of thousands of possible graphs.

Wolfram|Alpha can carry out complex math problems in algebra, matrices, calculus, trigonometry, etc.

Picture Based Search



So, you can't remember who that person was in your wedding album or at the conference? Not a problem: scan the photo, upload it to an image-based search site and ask it to find the person for you. This is what the next-generation search engines are working on. Today, a picture search for a celebrity like Katrina Kaif is more likely to succeed. A site called Picollator.com is trying to search for other pictures you upload to it as well. The site uses pattern recognition technologies to identify similar-looking images on the Internet. But as it has to run complex algorithms on so many images available on the Net, it's very slow as of today. The search is also not very accurate, but it works.
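
The article does not say which pattern recognition techniques Picollator uses, so the following Python sketch should be read only as one simple, well-known way to compare images: an "average hash". Each image is reduced to a string of bits (is this pixel brighter than the image's mean?), and two images are scored by how many bits agree. The tiny 4x4 grayscale grids and the function names are assumptions for illustration, and both images are assumed to have the same dimensions.

```python
def average_hash(pixels):
    """Return a bit list: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def similarity(img_a, img_b):
    """Score from 0.0 (every bit differs) to 1.0 (identical hashes)."""
    hash_a, hash_b = average_hash(img_a), average_hash(img_b)
    return 1 - hamming_distance(hash_a, hash_b) / len(hash_a)


# Made-up 4x4 grayscale "photos" (0 = black, 255 = white).
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
brighter = [[p + 20 for p in row] for row in original]   # same scene, brighter
inverted = [[255 - p for p in row] for row in original]  # negative image

print(similarity(original, brighter))  # 1.0
print(similarity(original, inverted))  # 0.0
```

Even this toy version hints at why such searches are slow: the query image has to be compared against a hash for every image in the index, and real systems use far richer features than sixteen bits.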

Google is also doing something similar with the web-based version of Picasa, its image and picture management portal. With this application, you can find similar faces across your complete photo library and tag them with a name. It cannot, of course, find every picture of a person, but it recognizes a person's face very well across certain clothing and ambient conditions.

But don't think that such technology is good only for leisure. It makes great business sense as well, and that's why the makers of VizSeek (http://www.vizseek.com) came up with the idea of a search engine that can find any tool from just a photograph or a doodled sketch of it. The site was devised by engineers who kept in mind that remembering the name of a tool or a part during repair work can sometimes be very difficult.

Search enhancements from Google



Google has rolled out a similar service in search, which makes it possible to search and compare public data in an interactive graph. Its new search features include Google Squared, Google Options and a tool for Android.

Google Squared: This extracts information from the Web and displays it in a table. For instance, if you type “fantasy television shows,” it may return a table of shows with information like their release dates, directors, actors, etc. Users can click on individual entries to check the source and, if a value is incorrect, correct it through new searches. Finally, you can also save the customized table for future reference.



Google Squared is different from Wolfram|Alpha, which, rather than searching the Web for data, taps databases licensed by Wolfram Research. In Alpha, the emphasis is on computing and visualizing a range of data on subjects like astronomy, computer science and weather from its own sources. Google will be opening Squared up to users later this month on Google Labs.

Google Options: After doing a search, you will see a new icon saying 'Show options.' For a search on 'Switches', for example, clicking on 'Show options' offers you a range of choices on what sorts of results you want: 'videos,' 'forums,' 'reviews,' results sorted by time frame (past 24 hours, past week, past year), or the most recently created pages or images. This option is available now.

Google Options enables you to view your search results in terms of 'videos', 'forums' and 'reviews', and also by time frame, as shown on the left side.

Sound Based Search



This is something very interesting for people who like music. Remember how difficult and frustrating it becomes when you forget the name of a song you want to search for online? To top it all, sometimes you even forget the lyrics. All you remember is the tune. But being a diehard music fan, you can't just let go of the urge to listen to that song.

That's the type of customer Musipedia.com is trying to tap. On this website, you can find any music and purchase or download it. You can search by typing the name of the song, by playing its melody on a virtual keyboard, by whistling the melody into the computer's microphone, or even by tapping its rhythm on the keyboard. The website recognizes the timing and notes of the song and searches for the matching song instantly. Then you can either play or purchase the song. I am not very sure about other uses of such technology, but I whistled out some five songs to it and it was able to find only two for me. So either my whistling is bad (which is quite possible) or this technology has a long way to go before it's accepted by actual netizens.
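
One common way to match a whistled tune, whatever key it is whistled in, is to compare melodic contours instead of exact notes. The Python sketch below encodes a melody as a string of U (up), D (down) and R (repeat) steps, in the style of Parsons code, and picks the catalogue entry whose contour is closest by edit distance. It ignores timing entirely, and the two-song catalogue, the MIDI pitch numbers and the function names are assumptions for illustration rather than a description of Musipedia's actual matching.

```python
def parsons_code(pitches):
    """Encode a melody as its contour: U (up), D (down), R (repeat)."""
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        code.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(code)


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        cur = [i]
        for j, ch_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                   # delete from a
                           cur[j - 1] + 1,                # insert into a
                           prev[j - 1] + (ch_a != ch_b)))  # substitute
        prev = cur
    return prev[-1]


# Hypothetical catalogue: opening notes as MIDI pitch numbers.
CATALOGUE = {
    "Twinkle Twinkle Little Star": [60, 60, 67, 67, 69, 69, 67],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
}


def best_match(whistled_pitches):
    """Return the catalogue title whose contour is closest to the query."""
    query = parsons_code(whistled_pitches)
    return min(CATALOGUE,
               key=lambda title: edit_distance(query, parsons_code(CATALOGUE[title])))


# The same tune whistled five semitones higher still has the same contour.
print(parsons_code([65, 65, 72, 72, 74, 74, 72]))  # RURURD
print(best_match([65, 65, 72, 72, 74, 74, 72]))    # Twinkle Twinkle Little Star
```

A contour throws away a lot of information, which is one plausible reason a slightly off-key whistle can still find the right song, and also why short or simple tunes can be confused with one another.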

Plagiarism Search



This is something very useful for media companies like ours. Checking the authenticity of a guest article, or checking whether someone else on the Internet is using one of your articles, was never easy before plagiarism searches came in. These websites use the APIs of Google or similar search engines. Their main aim is to take each sentence of a web page, search for exactly the same or similar sentences and word sequences in other articles, and then give the article a plagiarism score. They also highlight the copied or similar text in all the articles. One example of such a site is http://copyscape.com.
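
The sentence-by-sentence scoring described above can be sketched in a few lines of Python. This is a simplified illustration, not Copyscape's method: it compares a candidate text against a single known source held in memory (a real service would query a search engine's index), it measures overlap with three-word sequences ("shingles"), and the 0.5 threshold and all function names are assumptions.

```python
import re


def sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def shingles(sentence, n=3):
    """Return the set of n-word sequences in a sentence, lowercased."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def plagiarism_score(candidate, source, n=3, threshold=0.5):
    """Return (fraction of candidate sentences flagged, flagged sentences)."""
    source_shingles = set()
    for sentence in sentences(source):
        source_shingles |= shingles(sentence, n)

    candidate_sentences = sentences(candidate)
    flagged = []
    for sentence in candidate_sentences:
        own = shingles(sentence, n)
        # Flag the sentence if enough of its word sequences occur in the source.
        if own and len(own & source_shingles) / len(own) >= threshold:
            flagged.append(sentence)

    score = len(flagged) / len(candidate_sentences) if candidate_sentences else 0.0
    return score, flagged


source = "Semantic search is faster and more accurate. It relies on structured data."
candidate = "Semantic search is faster and more accurate! I wrote this line myself today."

score, flagged = plagiarism_score(candidate, source)
print(score)    # 0.5
print(flagged)  # ['Semantic search is faster and more accurate!']
```

The flagged list is what a site would highlight for the reader, and the score is the share of sentences that were flagged.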

Another intuitive use of such a service is hunting down phishing websites. A bank can pass its site's content to copyscape.com or a similar website to check whether someone is phishing its site. As a phishing site must have the same text and a similar layout, it would be easily caught.

Musipedia's virtual keyboard lets you play the tune of the music you are searching for.

Semantic Search



This is something that might change the complete search paradigm in the near future. Semantic search is very practical: it is easier to use than traditional search, faster and more accurate. Semantic search refers to the technology of precise vocabulary-based search. Though this kind of natural language processing has been in progress for years, it is only recently that it has started to take off. Start-ups like Powerset, Textdigger and Hakia are working on semantic search engines.

A semantic web agent does not necessarily include artificial intelligence. Instead, it relies on structured sets of information and inference rules that allow it to understand the relationships between data sources. A computer may not understand information the way humans can, but it has enough information to create logical connections and take decisions accordingly. In the semantic web, the data itself becomes a part of the Web (unlike the World Wide Web, which holds endless information in the form of documents) and is processed irrespective of platform, application or domain. We can search for documents on the World Wide Web, but their interpretation is left to us. The semantic web, on the other hand, is about data as well as documents on the Web, so that machines can process and even act on the data in practical ways. So while the non-semantic web treats the word 'snake' as just the string 'snake', the semantic web treats it as an animal.

Let's take another example. A semantic search engine can answer questions like 'Which Indian author won the Booker Prize in 1997?' It applies reasoning based on the fact that the Web knows the difference between the names of Indian Booker winners, the respective years and even the names of the books.
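
Both examples, the snake and the Booker question, can be illustrated with a toy version of "structured sets of information and inference rules". In the Python sketch below, knowledge is stored as (subject, predicate, object) triples, one inference rule makes 'is_a' transitive, and the Booker question becomes a query over the triples. The predicate names and helper functions are hypothetical, and the Booker facts are filled in from general knowledge since the text above does not list them.

```python
# Hypothetical knowledge base of (subject, predicate, object) triples.
FACTS = {
    ("snake", "is_a", "reptile"),
    ("reptile", "is_a", "animal"),
    ("Arundhati Roy", "nationality", "Indian"),
    ("Arundhati Roy", "won_booker_in", 1997),
    ("Arundhati Roy", "wrote", "The God of Small Things"),
    ("Kiran Desai", "nationality", "Indian"),
    ("Kiran Desai", "won_booker_in", 2006),
    ("Kiran Desai", "wrote", "The Inheritance of Loss"),
}


def infer_is_a(facts):
    """Apply one inference rule until nothing new appears:
    if A is_a B and B is_a C, then A is_a C."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        is_a_pairs = [(s, o) for s, p, o in facts if p == "is_a"]
        for a, b in is_a_pairs:
            for c, d in is_a_pairs:
                if b == c and (a, "is_a", d) not in facts:
                    facts.add((a, "is_a", d))
                    changed = True
    return facts


def subjects(facts, predicate, obj):
    """Return every subject that has the given predicate and object."""
    return {s for s, p, o in facts if p == predicate and o == obj}


KB = infer_is_a(FACTS)

# The machine now "knows" a snake is an animal, although no one stated it.
print(("snake", "is_a", "animal") in KB)  # True

# 'Which Indian author won the Booker Prize in 1997?'
winners = subjects(KB, "nationality", "Indian") & subjects(KB, "won_booker_in", 1997)
print(winners)  # {'Arundhati Roy'}
```

Nothing here needs artificial intelligence in the usual sense: the answer falls out of set operations on data whose meaning has been made explicit.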

If we search for the keywords 'Semantic Web' in Google, it shows all sites containing information about it. However, in a semantic web search such as the one provided by Powerset, you get the definition of 'Semantic Web' along with relevant links.

So the emphasis in the semantic web shifts to the back end. There is a rich set of links from the semantic web to HTML documents. These relations characteristically unite a concept in the semantic web with the pages that are most relevant to it.

The Backend for the Bots



We talked about semantic search, which does not necessarily include artificial intelligence; instead, it relies on structured sets of information and inference rules that allow it to understand the relationships between data sources. Now imagine if we could also rely on artificial intelligence and NLP (Natural Language Processing). We could connect a robot to the Internet that can listen to one's voice and respond to questions in a very friendly manner. And as it is connected to the Internet, no question goes unanswered. So we can actually turn the Web into a brain for our robots. This reminds me of VIKI (Virtual Interactive Kinetic Intelligence) from the Hollywood blockbuster I, Robot. I just hope that in the real world it doesn't go bad as VIKI did.

Today, in the real world, we don't have VIKI, but we do have ALICE, an AI chat bot that works on AIML (Artificial Intelligence Markup Language). Before I go on, take a look at my interaction with Alicebot. You can visit her at
http://alicebot.blogspot.com/
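
As a side note on how AIML-style bots work: AIML pairs an input pattern, which may contain a '*' wildcard, with a response template that can reuse whatever the wildcard matched. The Python sketch below imitates that idea only; it does not parse real AIML files, and the four pattern and template pairs are invented for illustration and are not ALICE's actual rules.

```python
import re

# Hypothetical categories: (pattern, template). '*' matches one or more
# words and is substituted into the template as {star}.
CATEGORIES = [
    ("HELLO", "Hi there!"),
    ("WHAT IS *", "I would look up '{star}' for you."),
    ("MY NAME IS *", "Nice to meet you, {star}."),
    ("*", "I do not have an answer for that yet."),  # catch-all fallback
]


def normalize(text):
    """Strip punctuation and collapse runs of whitespace."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", text).split())


def respond(text):
    """Return the template of the first category whose pattern matches."""
    sentence = normalize(text)
    for pattern, template in CATEGORIES:
        # Turn the pattern into a regular expression: '*' becomes (.+).
        regex = re.escape(pattern).replace(r"\*", "(.+)")
        match = re.fullmatch(regex, sentence, flags=re.IGNORECASE)
        if match:
            star = match.group(1) if match.groups() else ""
            return template.format(star=star)
    return ""


print(respond("Hello!"))                     # Hi there!
print(respond("What is the Semantic Web?"))  # I would look up 'the Semantic Web' for you.
print(respond("Tell me a joke"))             # I do not have an answer for that yet.
```

Because the more specific patterns are listed before the catch-all, the bot appears to understand the question, although it is only matching text against patterns.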

So, in this interaction, I was able to talk to ALICE in normal language and ask her for some information. She was able to understand the meaning and intent of my question and respond with the most appropriate answer. Just imagine if you could have a similar interface for Google or Wikipedia. What would the level of user interaction be? And by coupling it with voice recognition and text-to-speech, we could actually have a VIKI in place. Let me leave you with these thoughts on the future of search. For any queries or questions on this topic, visit us at
http://forums.pcquest.com

With help from Anindya Roy
