User Guides
Sapiens Aperio Veritas Engine (SAVE)
0.0 SAVE Tutorial Videos
Before reading the manual, please watch the tutorial videos below. They show how to use the SAVE service in different modes.
0.3 Video: Private + Internet Mode
0.4 Video: Private + Pairwise Mode
1.0 Introduction
Sapiens Aperio Veritas Engine (SAVE) is a plagiarism detection
service that provides 2 modes of submitting files and 2 modes of
detecting plagiarism to cover different scenarios.
1.1 Different Modes of Submitting Files
The 2 different modes of submitting files are:
* non-private mode
* private mode
In non-private mode, you submit the document to be checked
directly. The supported file formats are currently .txt, .pdf, .doc, .docx,
.xlsx and .pptx. You can also copy and paste the clear text into the SAVE
webpage.
If you’d like to check multiple files at once, you can submit a
zipped file containing all of them.
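As a rough sketch of preparing such a zipped file, the Python snippet below bundles several documents into one archive using the standard `zipfile` module. The function name `zip_documents` and the choice to store files without folder nesting are assumptions made for this example; any ordinary ZIP tool can be used instead.

```python
import zipfile
from pathlib import Path


def zip_documents(paths, archive_path):
    """Bundle the given documents into one ZIP archive for upload."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, paths):
            # Assumption: store each file under its bare name, without nested folders.
            archive.write(path, arcname=path.name)
    return archive_path
```

For example, `zip_documents(["ex2.docx", "ex3.pdf"], "submission.zip")` would create `submission.zip` containing the two files.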
In private mode, no clear text is submitted to the SAVE
service. Instead, you use SAVE app (a separate client-side program) to
encode the documents to be checked into a special FASTA-format .fasta file.
The special FASTA encoding is a one-way encryption, which means
it is not invertible: even though the FASTA-encoded files are submitted to
the SAVE service, the service cannot reverse the process and recover the
original clear text. This ensures privacy.
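To see how a one-way encoding can hide the text and still allow comparison, here is a toy illustration in Python. It is not SAVE's actual encoding, which this guide does not describe. The word-to-letter hashing, the 20-letter alphabet and the names `encode_word` and `to_fasta` are all assumptions made for the example.

```python
import hashlib
import re

# The 20 standard amino-acid letters, the usual alphabet of protein FASTA files.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def encode_word(word):
    """Map a word to one letter; many words share a letter, so it cannot be inverted."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return ALPHABET[digest[0] % len(ALPHABET)]


def to_fasta(name, text, width=60):
    """Return a FASTA-style record: a '>' header line followed by wrapped sequence lines."""
    sequence = "".join(encode_word(w) for w in re.findall(r"\w+", text))
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + name] + lines)
```

In this toy scheme, identical passages always produce identical runs of letters, so two encoded documents can still be compared, while a single letter does not reveal which word produced it.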
Just as in non-private mode, you can submit a single .fasta
file, copy and paste the FASTA-formatted string into the SAVE webpage, or
submit a zipped file containing all the .fasta files to be checked.
1.2 Different Modes of Detecting Plagiarism
The 2 different modes of detecting plagiarism are:
* Internet mode
* Pairwise mode
In Internet mode, the submitted documents are compared
with content on the Internet to see whether there is any possible plagiarism.
In Pairwise mode, the submitted documents are compared
with each other to see whether they copied each other. Since the comparison
is only between the submitted documents, Pairwise mode is only meaningful
if you submit a zipped file containing multiple documents.
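The idea of a pairwise comparison can be sketched in Python with the standard `difflib` module. This is only an illustration of comparing every document with every other one, not the algorithm SAVE uses; the function names and the word-level matching are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def longest_shared_run(text_a, text_b):
    """Return the longest run of consecutive words that appears in both texts."""
    words_a, words_b = text_a.split(), text_b.split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    match = matcher.find_longest_match(0, len(words_a), 0, len(words_b))
    return words_a[match.a:match.a + match.size]


def compare_pairwise(documents):
    """Compare every pair of documents; `documents` maps a file name to its text."""
    results = {}
    for name_a, name_b in combinations(sorted(documents), 2):
        results[(name_a, name_b)] = longest_shared_run(documents[name_a], documents[name_b])
    return results
```

With a single document there are no pairs to compare, so `compare_pairwise` returns an empty result, which is why a submission in Pairwise mode needs at least two documents.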
1.3 Which Mode Should You Use
There are 2 modes of submitting files and 2 modes of detecting
plagiarism, so there are 4 different combinations in total.
* non-private + Internet mode:
compare clear text documents with content on the Internet
* non-private + Pairwise mode:
compare clear text documents with each other
* private + Internet mode:
compare FASTA-encoded documents with content on the Internet (currently
restricted to Wikipedia and PubMed)
* private + Pairwise mode:
compare FASTA-encoded documents with each other
For the modes of submitting files:
If you don't mind your clear text documents
being submitted to the SAVE service and the service knowing the content of
your documents, you can use non-private mode.
If you do care about protecting your documents and
keeping their content confidential, first use SAVE app to one-way encrypt
your documents and then submit the encoded files using private mode.
For the modes of detecting plagiarism:
If you'd like to check whether the submitted
documents copied content from the Internet, you should use Internet mode.
If you'd like to check whether the submitted
documents copied each other, you should use Pairwise mode.
1.4 Further Support/Help
If you require additional help (or information that is not
available in this document), please feel free to send an email to lwyang@life.nthu.edu.tw or Mr. Yuan-Yu Chang.
2.0 SAVE app
In private mode, the submitted documents must be
FASTA-encoded. Use our SAVE app to encode a clear text document into the
FASTA-encoded format.
2.1 Download SAVE App
(1) Click the “Download Local Software” button on the front page.
(2) Download the SAVE app for your OS.
(3) Uncompress the downloaded zipped file and run SAVE app.
2.2 Use SAVE App to Encode a Single Document
(1) Select the “Encode one file” option.
(2) In this case we'd like to encode the file named “wiki clear text.docx”.
The content of “wiki clear text.docx” looks like this:
(3) The console shows the processing status:
SAVE app creates a folder for the encoded file:
In this folder you can see the encoded file:
The encoded file is just a plain text file that looks like this:
2.3 Use SAVE App to Encode a Zipped File Containing Multiple Documents
(1) Select the “Encode a ZIP file containing multiple documents” option.
(2) In this case we'd like to encode the file named “nonPrivate+pairwise.zip”,
which contains 2 clear text files, “ex2.docx” and “ex3.pdf”:
The content of “ex2.docx” looks like this:
The content of “ex3.pdf” looks like this:
(3) The console shows the processing status:
SAVE app creates a folder that holds a zipped file of the encoded files and temporarily stores the original clear text files:
In this case, 2019_26_06_16_16_10.zip is the file that should be submitted to the SAVE service later:
The 2019_26_06_16_16_10.zip file contains the FASTA-encoded files:
These .fasta files are just plain text files that look like this:
3.0 Examples
The examples below show how to use SAVE service with different modes.
3.1 Non-private + Internet Mode
(1) Select the clear text file, or the zipped file containing multiple clear text files, that you’d like to check, and then click submit.
(2) Wait for the job to finish.
(3) Once the job is done, the status is updated automatically, so there is no need to refresh the webpage yourself.
(4) Click the job for more details. You can see the statistics of the longest verbatim copy, copy coverage and word count of the document here:
(5) Click the document to see the comparison result:
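To make the three statistics in step (4) more concrete, the Python sketch below computes them for one document against one candidate source. The definitions are assumptions for illustration only: the guide does not state how SAVE computes them. Here the longest verbatim copy is the longest shared run of words, and copy coverage is the fraction of the document's words that fall inside shared runs of at least `min_run` words.

```python
from difflib import SequenceMatcher


def copy_statistics(document, source, min_run=3):
    """Return word count, longest verbatim copy (in words) and copy coverage of `document`."""
    doc_words, src_words = document.split(), source.split()
    matcher = SequenceMatcher(None, doc_words, src_words, autojunk=False)
    # Keep only matching runs of at least `min_run` words; shorter ones are likely chance.
    runs = [block.size for block in matcher.get_matching_blocks() if block.size >= min_run]
    word_count = len(doc_words)
    return {
        "word_count": word_count,
        "longest_verbatim_copy": max(runs, default=0),
        "copy_coverage": sum(runs) / word_count if word_count else 0.0,
    }
```

The `min_run` threshold is a design choice of this sketch: without it, isolated common words would count as copied text and inflate the coverage.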
3.2 Non-private + Pairwise Mode
(1) Select the zipped file containing the clear text files you’d like to check and then click submit.
(2) Wait for the job to finish.
(3) Once the job is done, the status is updated automatically, so there is no need to refresh the webpage yourself.
(4) Click the job for more details. You can see the statistics of the longest verbatim copy, copy coverage and word count of each document in the submitted zipped file here:
(5) Click one of these documents to see the comparison result, in this case ex1.rtf. As you can see, ex1.rtf and ex3.pdf copied each other:
3.3 Private + Internet Mode
(1) Use SAVE app to encode the document into FASTA-encoded format first. Please see Section 2.0 for more details.
(2) Select the FASTA-encoded file, or the zipped file containing multiple FASTA-encoded files, that you’d like to check, and then click submit:
(3) Wait for the job to finish.
(4) Once the job is done, the status is updated automatically, so there is no need to refresh the webpage yourself.
(5) Click the job for more details. You can see the statistics of the longest verbatim copy, copy coverage and word count of the document here:
(6) Click the document to see the comparison result. The FASTA-encoded file you submitted is on the right-hand side, and the result the SAVE service found is on the left-hand side. The result shows that the submitted file copied the content of “Dynamic programming” from Wikipedia:
3.4 Private + Pairwise Mode
(1) Use SAVE app to encode a zipped file containing multiple clear text documents into FASTA-encoded format first. Please see Section 2.3 for more details.
(2) Select the zipped file of FASTA-encoded files that SAVE app just generated and then click submit.
(3) Wait for the job to finish.
(4) Once the job is done, the status is updated automatically, so there is no need to refresh the webpage yourself.
(5) Click the job for more details. You can see the statistics of the longest verbatim copy, copy coverage and word count of each document in the submitted zipped file here. Notice the “Download the Result for Local Software” button. To let SAVE app reveal the exact plagiarized locations in the original documents later, click this button and download the result file (with a .json extension).
(6) Click one of these documents to see the comparison result, in this case ex3.pdf. As you can see, ex3.pdf and ex2.docx copied each other, but since these documents are FASTA-encoded, the exact plagiarized locations and contents are concealed.
(7) Open SAVE app again and select “Process pairwise comparison results”:
(8) Select the result .json file you just downloaded:
(9) The console shows the processing status:
(10) SAVE app generates some .html files in the original folder:
(11) Click index.html to see the links to the comparison results for the different documents:
(12) Click ex3.pdf, for example, to see the exact plagiarized locations and contents in the original clear text document: