Proposed System for Plagiarism Detection

Chapter 3
The Proposed System

Introduction

This chapter introduces ZPLAG, the proposed system, and explains its most important design issues in detail.
Advanced search engines make it very easy for students to find documents and journals, so electronic theft (plagiarism) is no longer a local or regional problem; it has become a global problem occurring in many fields.
Because of the huge amount of information and the interconnected networks that carry it, detecting plagiarism is a difficult task, and detecting plagiarism in Arabic text is undoubtedly the most difficult of all.
The growth of e-learning systems in Arab countries calls for special techniques to detect plagiarism in Arabic text. Although general search engines such as Google could be used, copying and pasting sentences into a search engine one at a time is a very difficult way to find plagiarized passages.
For this reason, a good tool must be developed for detecting plagiarism in Arabic text, one that detects it automatically, in order to protect e-learning systems and to facilitate and accelerate the learning process.
This thesis presents ZPLAG, a web-based system that enables specialists to detect plagiarism in Arabic electronic texts. It can be integrated with e-learning systems to protect student work, research papers, and scientific theses from plagiarism.
The thesis also describes the major components of the system, including the preparation stage. Finally, we evaluate an experimental system on a set of Arabic documents and texts and compare the results with those of some existing systems, particularly TurnItIn.
The chapter is organized as follows: Section 3.2 presents an overview of Arabic e-learning, Section 3.3 presents a general overview of the proposed system, Section 3.4 explains the architecture of ZPLAG in detail, and Section 3.5 summarizes the chapter.

General Overview of the Proposed System

The proposed system consists of three phases: (1) the preparation phase, (2) the processing phase, and (3) the similarity detection phase. Figure 3.1 depicts these phases.

Figure 3.1 Proposed system phases
Preparation phase: this phase is responsible for collecting the documents and preparing them for the next phase. It consists of five modules: the text editor module, the check language module, the check spelling module, the check grammar module, and the sentence analysis module.
The text editor module allows the user to enter text or upload a text file in document format; these files are processed in the next phase.
The check language module identifies the language in which the input file is written. If it is Arabic, the Arabic process is used; if it is English, the English process is used.
The check spelling module checks whether the words are spelled correctly or contain misspellings.
Processing phase: this phase consists of three modules (stop-word removal and rooting form a single module), explained as follows:

Tokenization: breaks up the input text into tokens.
SWR (stop-word removal): removes common words that appear in the text but carry little meaning.
Rooting: removes prefixes, infixes, and/or suffixes from words to obtain their roots or stems.
Replacement of synonyms: words are converted to their synonyms.

Similarity detection phase: this phase consists of three modules: fingerprinting, document representation, and similarity detection. To calculate the fingerprints of a document, the text is first cut into small pieces called chunks, using a chosen chunking method [12]. The unit of a chunk can be a sentence or a word. In sentence-based chunking, the document is cut into chunks of C consecutive sentences. For example, for a document containing the sentences ds1 ds2 ds3 ds4 ds5, C=3 yields the chunks ds1 ds2 ds3, ds2 ds3 ds4, and ds3 ds4 ds5. In word-based chunking, the document is cut into chunks of C consecutive words. For example, for a document containing the words dw1 dw2 dw3 dw4 dw5, C=3 yields the chunks dw1 dw2 dw3, dw2 dw3 dw4, and dw3 dw4 dw5. Word-based chunking gives higher precision in similarity detection than sentence-based chunking.

The Architecture of the Proposed System

The following properties should be satisfied by any system detecting plagiarism in natural language:

Insensitivity to small matches.
Insensitivity to punctuation, capitalization, etc.
Insensitivity to permutations of the document content.
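
The second property can be illustrated with a small normalization step. This is only a sketch under assumptions: the function name `normalize` is hypothetical, and the source does not say how ZPLAG achieves this property. It lowercases the text and drops every character in a Unicode punctuation category, which also covers Arabic punctuation such as the Arabic comma.

```python
import unicodedata


def normalize(text):
    """Lowercase the text, drop punctuation, and collapse runs of whitespace."""
    kept = [
        ch for ch in text.lower()
        # Unicode general categories starting with "P" are punctuation.
        if not unicodedata.category(ch).startswith("P")
    ]
    return " ".join("".join(kept).split())


print(normalize("Hello,   World!"))  # hello world
```

Two texts that differ only in case, punctuation, or spacing map to the same normalized string, so later stages compare them as equal.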

The main system architecture of ZPLAG is illustrated in Figure 1 and comprises the following stages:

Preparation: text editor, check language, check spelling, and check grammar.
Preprocess: synonym replacement, tokenization, rooting, and stop-word removal.
Fingerprinting: n-grams are used, where the user chooses the parameter n.
Document representation: for each document, create a document tree structure that describes its internal representation.
Selection of a similarity metric: a similarity metric is used to find the longest match of two hash strings.
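
The last stage in this list, finding the longest match of two hash strings, can be sketched as follows. The sketch assumes each document has already been reduced to a sequence of chunk hashes and uses Python's standard `difflib.SequenceMatcher` as a stand-in; the source does not name the metric ZPLAG actually uses, and `longest_common_run` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def longest_common_run(hashes_a, hashes_b):
    """Return (start_a, start_b, length) of the longest contiguous run of
    fingerprints that appears in both sequences."""
    matcher = SequenceMatcher(None, hashes_a, hashes_b, autojunk=False)
    match = matcher.find_longest_match(0, len(hashes_a), 0, len(hashes_b))
    return match.a, match.b, match.size


# Made-up fingerprint values, for illustration only.
suspect = [11, 42, 57, 90, 13]
source = [70, 42, 57, 90, 88]
print(longest_common_run(suspect, source))  # (1, 1, 3)
```

A long shared run suggests a copied passage, whereas isolated single-hash matches can be ignored, which is in keeping with the requirement of insensitivity to small matches.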

As mentioned in the previous section, the system architecture is broken down into three main phases. Each phase is decomposed into a set of modules according to system functionality. The following subsections describe each phase and its modules in detail.
3.4.1 The Preparation Phase
The main task of this phase is to prepare the data for the next phase. It consists of the text editor module, the check language module, the check spelling module, and the check grammar module.
3.4.1.1 Text Editor Module
Figure 3.2 illustrates the text editor module. The users of this module are faculty members and students. They need a text area and a way to upload their files, so a browse button helps them locate the file path. The module then checks the file format, which is very important because the service accepts only doc and docx files. After the user uploads the file, the text editor module saves it in the database.

Figure 3.2 text editor module
 
3.4.1.2 Check Language Module
The raw text of the document is treated separately as well. To extract terms from the text, classic Natural Language Processing (NLP) techniques are applied. Figure 3.3 illustrates the check language module and its functions. The module fetches the file from the system database, where all files are stored, and reads it. It then determines whether the language is Arabic, English, or combo (both Arabic and English), marks the document with its language, and saves the file back to the system database.

Figure 3.3 check language module
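
The language check can be sketched with a simple character-range heuristic. This is a minimal illustration rather than the module's actual method: it assumes that any character in the Unicode Arabic block (U+0600 to U+06FF) marks Arabic text and that any basic Latin letter marks English text. The function name `detect_language` and the extra "unknown" label are assumptions.

```python
def detect_language(text):
    """Label text as 'arabic', 'english', 'combo', or 'unknown'."""
    # Assumption: the Arabic Unicode block U+0600..U+06FF identifies Arabic.
    has_arabic = any("\u0600" <= ch <= "\u06FF" for ch in text)
    # Assumption: basic Latin letters identify English.
    has_latin = any("a" <= ch <= "z" or "A" <= ch <= "Z" for ch in text)
    if has_arabic and has_latin:
        return "combo"
    if has_arabic:
        return "arabic"
    if has_latin:
        return "english"
    return "unknown"


print(detect_language("plagiarism"))      # english
print(detect_language("في البيت"))        # arabic
print(detect_language("ZPLAG في البيت"))  # combo
```

The returned label corresponds to the mark that the module stores with the document before saving it back to the database.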
3.4.1.3 Check Spelling Module
Figure 3.4 illustrates the check spelling module and its functions. After fetching the document from the system database, where all files are stored, the module reads the file and runs a web spelling checker. It then makes all possible replacements for the words that fail the spelling check and saves the file back to the system database.

Figure 3.4 check spelling module
3.4.1.4 Check Grammar Module
For English documents, Figure 3.5 illustrates the check grammar module and its functions. After fetching the document from the system database, where all files are stored, the module reads the file and runs a web grammar checker. It then marks each sentence with the appropriate grammar mark and saves the file back to the system database.
 

Figure 3.5 check grammar module
3.4.2 The Processing Phase
3.4.2.1 The Tokenization Module
The tokenization module fetches the document from the system database, where all files are stored, and reads it. It breaks the file down into paragraphs, the paragraphs into sentences, and the sentences into words, and then saves the file back to the system database.
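
The three-level breakdown can be sketched with regular expressions. The splitting rules below are assumptions made for illustration: paragraphs are separated by blank lines, sentences end with ".", "!", "?" or the Arabic question mark, and words are runs of word characters. Arabic diacritics and abbreviations are not handled, and `tokenize` is a hypothetical name.

```python
import re


def tokenize(text):
    """Split text into paragraphs, then sentences, then words (nested lists)."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    document = []
    for paragraph in paragraphs:
        # Split after sentence-ending punctuation that is followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?؟])\s+", paragraph.strip()) if s]
        # \w+ also matches Arabic letters in Python 3 strings.
        document.append([re.findall(r"\w+", s) for s in sentences])
    return document


print(tokenize("One two. Three four!\n\nFive six?"))
# [[['One', 'two'], ['Three', 'four']], [['Five', 'six']]]
```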
3.4.2.2 The Stop Words Removal and Rooting Module
The raw text of the document is treated separately as well. To extract terms from the text, classic Natural Language Processing (NLP) techniques are applied. Figure 3.6 illustrates the stop-word removal and rooting module and its functions:

Figure 3.6: SWR and Rooting module
SWR: common stop words in English include a, an, the, in, of, on, are, be, if, into, and which, among others. Stop words in Arabic include من , إلى , عن , على , في , among others. These words do not contribute significant meaning to the documents, so they are removed to reduce noise and computation time.
Word stemming: each word is changed into its basic form.
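
Stop-word removal can be sketched as a simple set lookup. The word lists below contain only the examples named above, so they are far from complete, and `remove_stop_words` is a hypothetical name. Stemming is not shown because it requires a language-specific stemmer, and the source does not say which one ZPLAG uses.

```python
# Only the example stop words listed in the text; real lists are much longer.
ENGLISH_STOP_WORDS = {"a", "an", "the", "in", "of", "on", "are", "be", "if", "into", "which"}
ARABIC_STOP_WORDS = {"من", "إلى", "عن", "على", "في"}
STOP_WORDS = ENGLISH_STOP_WORDS | ARABIC_STOP_WORDS


def remove_stop_words(words):
    """Return the words that are not stop words (case-insensitive for English)."""
    return [w for w in words if w.lower() not in STOP_WORDS]


print(remove_stop_words(["The", "cat", "sat", "on", "the", "mat"]))
# ['cat', 'sat', 'mat']
```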
3.4.2.3 Replacement of Synonym
Replacing words with their synonyms may help to detect advanced forms of hidden plagiarism. The first synonym in the list of synonyms of a given word is considered the most frequent one.
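
A minimal sketch of this step follows. It assumes that each word is replaced by the first entry in its synonym list, so that paraphrases using different synonyms collapse to the same wording. The tiny `SYNONYMS` table and the function name are hypothetical; a real system would draw on a full thesaurus.

```python
# Hypothetical synonym table: the first entry is treated as the most frequent.
SYNONYMS = {
    "big": ["large", "huge"],
    "huge": ["large", "big"],
    "quick": ["fast", "rapid"],
}


def replace_with_first_synonym(words, synonyms=SYNONYMS):
    """Replace each word by the first synonym in its list, if it has one."""
    return [synonyms[w][0] if synonyms.get(w) else w for w in words]


print(replace_with_first_synonym(["a", "big", "house"]))   # ['a', 'large', 'house']
print(replace_with_first_synonym(["a", "huge", "house"]))  # ['a', 'large', 'house']
```

Both inputs produce the same output, which is how a reworded sentence can still be matched against its source.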
3.4.3 The Similarity Detection Phase
3.4.3.1 The Fingerprinting Module
The similarity detection phase consists of three modules: fingerprinting, document representation, and similarity detection. To calculate the fingerprints of a document, the text is first cut into small pieces called chunks, using a chosen chunking method [12]. The unit of a chunk can be a sentence or a word. In sentence-based chunking, the document is cut into chunks of C consecutive sentences. For example, for a document containing the sentences ds1 ds2 ds3 ds4 ds5, C=3 yields the chunks ds1 ds2 ds3, ds2 ds3 ds4, and ds3 ds4 ds5. In word-based chunking, the document is cut into chunks of C consecutive words. For example, for a document containing the words dw1 dw2 dw3 dw4 dw5, C=3 yields the chunks dw1 dw2 dw3, dw2 dw3 dw4, and dw3 dw4 dw5. Word-based chunking gives higher precision in similarity detection than sentence-based chunking. ZPLAG uses word-based chunking: in every sentence of a document, the words are first chunked, and a hash function is then applied to each chunk.
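
The word-based example above can be reproduced with a sliding window of C words, followed by hashing each chunk. The source does not name the hash function, so SHA-256 from Python's standard library is used here purely as an assumption; `word_chunks` and `fingerprint` are hypothetical names.

```python
import hashlib


def word_chunks(words, c):
    """Overlapping chunks of c consecutive words (sliding window, step 1)."""
    return [tuple(words[i:i + c]) for i in range(len(words) - c + 1)]


def fingerprint(chunks):
    """Hash each chunk; SHA-256 is an arbitrary choice for illustration."""
    return [hashlib.sha256(" ".join(chunk).encode("utf-8")).hexdigest()
            for chunk in chunks]


words = ["dw1", "dw2", "dw3", "dw4", "dw5"]
chunks = word_chunks(words, 3)
print(chunks)
# [('dw1', 'dw2', 'dw3'), ('dw2', 'dw3', 'dw4'), ('dw3', 'dw4', 'dw5')]
print(len(fingerprint(chunks)))  # 3
```

Identical chunks always produce identical hashes, so two documents can be compared by their hash sequences instead of their raw text.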
3.4.3.2 The Document Representation Module
Document representation: for each document, create a document tree structure that describes its internal representation.
3.4.3.3 The Similarity Detection Module
A tree representation is created for each document to describe its logical structure. The root represents the document itself, the second level represents the paragraphs, and the leaf nodes contain the sentences.
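
The three-level tree described above can be sketched with a small node class. The names `Node` and `build_tree` are hypothetical, and the input is assumed to be a list of paragraphs, each given as a list of sentence strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # "document", "paragraph", or "sentence"
    text: str = ""                 # only sentence leaves carry text
    children: List["Node"] = field(default_factory=list)


def build_tree(paragraphs):
    """Build root (document) -> paragraphs -> sentence leaves."""
    root = Node("document")
    for sentences in paragraphs:
        paragraph = Node("paragraph")
        paragraph.children = [Node("sentence", s) for s in sentences]
        root.children.append(paragraph)
    return root


tree = build_tree([["First sentence.", "Second sentence."], ["Third sentence."]])
print(len(tree.children))                 # 2
print(tree.children[0].children[1].text)  # Second sentence.
```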

Summary

Electronic theft, generally known as plagiarism or academic dishonesty, is a growing phenomenon. With easy access to information on the World Wide Web and the large number of digital libraries, it has become one of the most important issues plaguing universities, scientific centers, and research institutions, so ways must be found to prevent its spread and to preserve the ethical principles that govern academic environments.
This chapter presented a detailed description of the proposed system for plagiarism detection in electronic resources, its phases, and its functions.
 
