We all face a deluge of information around us: web pages, email, books,
magazines, research papers and more. Ever since the Internet evolved, sharing
content across networks has been a breeze, and it is growing at 33%
annually. While all this has helped in knowledge sharing, there has also been a
fair share of abuse. It is just a matter of time before a PhD student can gather
information on their subject and pass it off as original research to an unsuspecting
professor. Software has long been available to detect plagiarism
from websites or the intranet. Most of these tools only compare strings of words to
check for similarities and do not consider semantics before labeling the
content as plagiarized. So, an offender can bypass them by using
synonyms or changing the tenses of sentences. This project was funded by the Ministry of
Communications and IT to set up a Patent Referral Center at the institute, to
detect and discard untenable patents. The software has a built-in dictionary
that stores the pre-computed hashes of synonyms of all common words in English.
It compares a submitted document with similar documents on the Internet and in any
soft repository that may be assigned, and labels the document as plagiarized if its
similarity exceeds a certain percentage.
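
The idea of a synonym-hash dictionary combined with a similarity threshold can be illustrated with a minimal sketch. This is not the project's actual code: the tiny synonym list, the function names, the use of three-word shingles and the 60% threshold are all assumptions made for illustration, and tense changes are not handled here.

```python
import hashlib
import re

# Hypothetical toy synonym groups; the real dictionary is described as
# covering the synonyms of all common English words.
SYNONYM_GROUPS = [
    ("big", "large", "huge"),
    ("fast", "quick", "rapid"),
    ("car", "automobile"),
]


def word_hash(word):
    """Short, stable hash of a single word."""
    return hashlib.sha1(word.encode("utf-8")).hexdigest()[:12]


# Pre-computed once: every synonym maps to the hash of its group's first word,
# so swapping a word for a synonym does not change the hash sequence.
SYNONYM_HASHES = {w: word_hash(group[0]) for group in SYNONYM_GROUPS for w in group}


def canonical_hashes(text):
    """Tokenize text and replace each word by its synonym-group hash."""
    words = re.findall(r"[a-z']+", text.lower())
    return [SYNONYM_HASHES.get(w) or word_hash(w) for w in words]


def shingles(hashes, n=3):
    """Set of overlapping n-word windows over the hash sequence."""
    return {tuple(hashes[i:i + n]) for i in range(len(hashes) - n + 1)}


def similarity(suspect, reference, n=3):
    """Percentage of the suspect's shingles that also occur in the reference."""
    s = shingles(canonical_hashes(suspect), n)
    r = shingles(canonical_hashes(reference), n)
    if not s:
        return 0.0
    return 100.0 * len(s & r) / len(s)


def is_plagiarized(suspect, reference, threshold=60.0):
    """Label the suspect text once similarity reaches the (assumed) threshold."""
    return similarity(suspect, reference) >= threshold
```

With this sketch, "The large automobile is quick on the road" scores as fully similar to "The big car is fast on the road", which a plain string comparison would miss.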
Project Specs
Project Head: Prof R C Tripathi, Dean (R&D) & Head (IPRs)
Deployment Location:
Team Size: 6
Tech Used:
Intended audience: Universities,
Project status: The system has already
Implementation