Indian Institute of Information Technology, Allahabad : Content Plagiarism Detection
Intuitive software that lets professors check content for plagiarism and goes beyond the older technique of 'word string' comparison
Wednesday, May 07, 2008
We all face a deluge of information around us: web pages, email, books,
magazines, research papers and what not. Ever since the Internet evolved,
sharing content across networks has been a breeze, and it is growing at 33%
annually. While all this has helped in knowledge sharing, there has also been a
fair share of spoils. It takes little time for a PhD student to gather
information on a subject and pass it off as original research to an
unsuspecting professor. Software to detect plagiarism from websites or an
intranet has been available for ages. Most of it only compares strings of words
to check for similarities and does not check for semantics before labeling
content as plagiarized. So, an offender could bypass it by using synonyms or
changing the tenses of sentences.

This project was funded by the Ministry of Communications and IT to set up a
Patent Referral Center at the institute, to detect and discard untenable
patents. The software has an in-built dictionary that stores the pre-computed
hashes of synonyms of all common words in English. It compares a document with
similar documents on the Internet and in any soft repository that may be
assigned, and labels the document as plagiarized if it crosses a certain
percentage of similarity.
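
The article does not describe the matching algorithm in detail, so the following is only a minimal sketch of the general idea: words are replaced by a pre-computed hash shared by all their synonyms, and a document is flagged once its similarity to a source crosses a threshold. Everything named here is an assumption made for illustration: the toy `SYNONYM_GROUPS` dictionary, the use of MD5, the three-word shingles with Jaccard similarity, and the 0.5 threshold. Python is used for brevity, although the project itself lists C++ and PHP.

```python
import hashlib
import re

# Hypothetical toy synonym groups; the project's real dictionary is not published.
SYNONYM_GROUPS = [
    {"big", "large", "huge"},
    {"quick", "fast", "rapid"},
    {"car", "automobile"},
]


def word_hash(word):
    """Short, stable hash of a single word (MD5 is an arbitrary choice here)."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()[:8]


# Pre-computed lookup: every word in a group maps to the hash of one
# canonical member (the alphabetically first), so synonyms collide on purpose.
SYNONYM_HASH = {
    word: word_hash(min(group)) for group in SYNONYM_GROUPS for word in group
}


def normalize(text):
    """Lower-case, tokenize, and replace each word by its synonym-group hash."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [SYNONYM_HASH.get(token, word_hash(token)) for token in tokens]


def shingles(tokens, n=3):
    """Set of overlapping n-token windows; short inputs yield a single shingle."""
    if not tokens:
        return set()
    if len(tokens) < n:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def similarity(candidate, source, n=3):
    """Jaccard similarity of the two documents' synonym-normalized shingles."""
    a = shingles(normalize(candidate), n)
    b = shingles(normalize(source), n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_plagiarized(candidate, source, threshold=0.5):
    """Flag the candidate when similarity reaches the (assumed) threshold."""
    return similarity(candidate, source) >= threshold
```

With this sketch, `similarity("The quick car is big", "The fast automobile is large")` treats the two sentences as identical, whereas a plain word-string comparison would match only "the" and "is". Tense changes are not handled here; covering them would need an additional step such as stemming or lemmatization.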
Project Specs
Project Head: Prof R C Tripathi, Dean (R&D) & Head (IPRs)
Deployment Location: Allahabad
Team Size: 6
Tech Used: C++, PHP, MySQL and Yahoo Web API
Intended Audience: Universities, research institutions, publishing houses, patent offices and legal professionals
Project Status: The system has already been deployed at IIIT Allahabad to screen research papers for the last two International Conferences on 'Wireless Communications and Sensor Networks'
Implementation Partner: In-house