From fed6ed193f3b1cd5613942d002d998757491d04f Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Sat, 13 Jan 2024 09:20:58 -0500
Subject: [PATCH 01/10] Update ReadMe file

---
 README.md | 69 +------------------------------------------------------
 1 file changed, 1 insertion(+), 68 deletions(-)

diff --git a/README.md b/README.md
index 3076250..a0236f4 100644
--- a/README.md
+++ b/README.md
@@ -1,68 +1 @@
-# Plagiarism-checker-Python
-
-This repo consists of a source code of a Python script which detects plagiarism in a textual document using **cosine similarity**.
-
-[![Become a patron](pictures/become_a_patron_button.png)](https://www.patreon.com/kalebujordan)
-
-## How is it Done?
-
-You might be wondering how plagiarism detection on textual data is done, well it ain't as complicated as you may think.
-
-We all know that computers are good with numbers; so in order to compute the similarity between two text documents, the textual raw data is transformed into vectors => arrays of numbers and from that, we make use of basic knowledge of vectors to compute the similarity between them.
-
-This repo contains a basic example on how to do that.
-
-
-## Getting Started
-
-To get started with the code on this repo, you need to either *clone* or *download* this repo into your machine as shown below;
-
-```bash
-git clone https://github.com/Kalebu/Plagiarism-checker-Python
-```
-
-## Dependencies
-
-Before you begin playing with the source code, you might need to install dependencies just as shown below;
-
-```bash
-pip3 install -r requirements.txt
-```
-
-## Running the App
-
-To run this code you need to have your textual documents in your project directory with the **.txt** extension. When you run the script, it will automatically load all the documents with that extension and then compute the similarities between them as shown below;
-
-```bash
-$-> cd Plagiarism-checker-Python
-$ Plagiarism-checker-Python-> python3 app.py
-('john.txt', 'juma.txt', 0.5465972177348937)
-('fatma.txt', 'john.txt', 0.14806887549598566)
-('fatma.txt', 'juma.txt', 0.18643448370323362)
-
-```
-
-## A Python Library?
-
-Would you like to use a Python library instead to help you compare strings and documents without spending time writing the vectorizers by yourself, then take a look at [Pysimilar](https://github.com/Kalebu/pysimilar).
-
-## Explore it
-
-Explore it and twist it to your own use case. In case of any questions feel free to reach me directly at *isaackeinstein@gmail.com*.
-
-## Issues
-
-In case you have any difficulties or issues while trying to run the script
-you can raise an issue.
-
-## Pull Requests
-
-If you have something to add, I welcome pull requests on improvement; your helpful contribution will be merged as soon as possible.
-
-## Give it a Star
-
-If you find this repo useful, give it a star so that many people can get to know it.
-
-## Credits
-
-All the credit goes to [kalebu](https://github.com/kalebu).
+test delete
\ No newline at end of file

From 53e9d6641d31245063b682763322284be34ad260 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Sat, 13 Jan 2024 09:21:46 -0500
Subject: [PATCH 02/10] Fix comma in john.txt

---
 john.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/john.txt b/john.txt
index 2d32910..c3d2942 100644
--- a/john.txt
+++ b/john.txt
@@ -1,2 +1,2 @@
 Life is all about finding money and spending on luxury stuffs
-Coz this life is kinda short , trust
\ No newline at end of file
+Coz this life is kinda short, trust
\ No newline at end of file

From c9fd365cd983a7f01794fbb03264ba2c0287be32 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Sat, 13 Jan 2024 09:38:46 -0500
Subject: [PATCH 03/10] Revert "Update ReadMe file"

This reverts commit fed6ed193f3b1cd5613942d002d998757491d04f.
:wq
---
 README.md | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a0236f4..3076250 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,68 @@
-test delete
\ No newline at end of file
+# Plagiarism-checker-Python
+
+This repo consists of a source code of a Python script which detects plagiarism in a textual document using **cosine similarity**.
+
+[![Become a patron](pictures/become_a_patron_button.png)](https://www.patreon.com/kalebujordan)
+
+## How is it Done?
+
+You might be wondering how plagiarism detection on textual data is done, well it ain't as complicated as you may think.
+
+We all know that computers are good with numbers; so in order to compute the similarity between two text documents, the textual raw data is transformed into vectors => arrays of numbers and from that, we make use of basic knowledge of vectors to compute the similarity between them.
+
+This repo contains a basic example on how to do that.
+
+
+## Getting Started
+
+To get started with the code on this repo, you need to either *clone* or *download* this repo into your machine as shown below;
+
+```bash
+git clone https://github.com/Kalebu/Plagiarism-checker-Python
+```
+
+## Dependencies
+
+Before you begin playing with the source code, you might need to install dependencies just as shown below;
+
+```bash
+pip3 install -r requirements.txt
+```
+
+## Running the App
+
+To run this code you need to have your textual documents in your project directory with the **.txt** extension. When you run the script, it will automatically load all the documents with that extension and then compute the similarities between them as shown below;
+
+```bash
+$-> cd Plagiarism-checker-Python
+$ Plagiarism-checker-Python-> python3 app.py
+('john.txt', 'juma.txt', 0.5465972177348937)
+('fatma.txt', 'john.txt', 0.14806887549598566)
+('fatma.txt', 'juma.txt', 0.18643448370323362)
+
+```
+
+## A Python Library?
+
+Would you like to use a Python library instead to help you compare strings and documents without spending time writing the vectorizers by yourself, then take a look at [Pysimilar](https://github.com/Kalebu/pysimilar).
+
+## Explore it
+
+Explore it and twist it to your own use case. In case of any questions feel free to reach me directly at *isaackeinstein@gmail.com*.
+
+## Issues
+
+In case you have any difficulties or issues while trying to run the script
+you can raise an issue.
+
+## Pull Requests
+
+If you have something to add, I welcome pull requests on improvement; your helpful contribution will be merged as soon as possible.
+
+## Give it a Star
+
+If you find this repo useful, give it a star so that many people can get to know it.
+
+## Credits
+
+All the credit goes to [kalebu](https://github.com/kalebu).

From a373ff2d13a8b710a1ba8ea332663c08ffb5e207 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Mon, 15 Jan 2024 09:10:11 -0500
Subject: [PATCH 04/10] update in app.py

---
 README.md | 2 +-
 app.py    | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3076250..d573519 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ This repo consists of a source code of a Python script which detects plagiarism
 
 [![Become a patron](pictures/become_a_patron_button.png)](https://www.patreon.com/kalebujordan)
 
-## How is it Done?
+## How is it
 
 You might be wondering how plagiarism detection on textual data is done, well it ain't as complicated as you may think.
 
diff --git a/app.py b/app.py
index 7a6f452..540e431 100644
--- a/app.py
+++ b/app.py
@@ -32,3 +32,4 @@ def check_plagiarism():
 
 for data in check_plagiarism():
     print(data)
+    print ("check for plagiarism")

From 21c6a6b95e0d3d4b7d2056f9db3755aea7d965e6 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Mon, 15 Jan 2024 09:43:50 -0500
Subject: [PATCH 05/10] updates app.py

---
 README.md | 2 +-
 app.py    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index d573519..3076250 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ This repo consists of a source code of a Python script which detects plagiarism
 
 [![Become a patron](pictures/become_a_patron_button.png)](https://www.patreon.com/kalebujordan)
 
-## How is it
+## How is it Done?
 
 You might be wondering how plagiarism detection on textual data is done, well it ain't as complicated as you may think.
 
diff --git a/app.py b/app.py
index 540e431..f021970 100644
--- a/app.py
+++ b/app.py
@@ -32,4 +32,4 @@ def check_plagiarism():
 
 for data in check_plagiarism():
     print(data)
-    print ("check for plagiarism")
+print ("checked plagiarism")
\ No newline at end of file

From 03f23a9b318a43028f4d939ee0a97ea881f0d03b Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Mon, 15 Jan 2024 10:00:17 -0500
Subject: [PATCH 06/10] completed the sentence in john.txt

---
 john.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/john.txt b/john.txt
index c3d2942..6ed9675 100644
--- a/john.txt
+++ b/john.txt
@@ -1,2 +1,2 @@
 Life is all about finding money and spending on luxury stuffs
-Coz this life is kinda short, trust
\ No newline at end of file
+Coz this life is kinda short, trust the process !
\ No newline at end of file

From 1a3b48048176f1db23bbf6a6f8cbacac4f545282 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Mon, 15 Jan 2024 13:23:31 -0500
Subject: [PATCH 07/10] saved files

---
 README.md | 1 +
 app.py    | 2 +-
 john.txt  | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 3076250..1a6a097 100644
--- a/README.md
+++ b/README.md
@@ -40,6 +40,7 @@ $ Plagiarism-checker-Python-> python3 app.py
 ('fatma.txt', 'john.txt', 0.14806887549598566)
 ('fatma.txt', 'juma.txt', 0.18643448370323362)
 
+
 ```
 
 ## A Python Library?
diff --git a/app.py b/app.py
index f021970..d9937c9 100644
--- a/app.py
+++ b/app.py
@@ -32,4 +32,4 @@ def check_plagiarism():
 
 for data in check_plagiarism():
     print(data)
-print ("checked plagiarism")
\ No newline at end of file
+print ("checked plagiarism")
diff --git a/john.txt b/john.txt
index 6ed9675..5dc9a39 100644
--- a/john.txt
+++ b/john.txt
@@ -1,2 +1,2 @@
 Life is all about finding money and spending on luxury stuffs
-Coz this life is kinda short, trust the process !
\ No newline at end of file
+Coz this life is kinda short, trust the process !

From fdf2722c2dd64424020dd831a2605d5cbb4c08de Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Mon, 15 Jan 2024 13:25:09 -0500
Subject: [PATCH 08/10] cherry-pick

---
 app.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app.py b/app.py
index d9937c9..4ae9b1b 100644
--- a/app.py
+++ b/app.py
@@ -33,3 +33,4 @@ def check_plagiarism():
 for data in check_plagiarism():
     print(data)
 print ("checked plagiarism")
+print ("practising cherry-pick")
\ No newline at end of file

From d7a654ef44da45f9213720324297eb531456bb44 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Mon, 15 Jan 2024 16:27:49 -0500
Subject: [PATCH 09/10] cherry pick practice

---
 app.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app.py b/app.py
index 4ae9b1b..9e0ad27 100644
--- a/app.py
+++ b/app.py
@@ -33,4 +33,4 @@ def check_plagiarism():
 for data in check_plagiarism():
     print(data)
 print ("checked plagiarism")
-print ("practising cherry-pick")
\ No newline at end of file
+print ("practising cherry-pick")

From dff3161bd8852a27ebc578a1cf14fa21b15ee386 Mon Sep 17 00:00:00 2001
From: Aishwarya Shukla
Date: Tue, 16 Jan 2024 10:40:04 -0500
Subject: [PATCH 10/10] gfg

---
 app.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app.py b/app.py
index 9e0ad27..c484ea2 100644
--- a/app.py
+++ b/app.py
@@ -34,3 +34,4 @@ def check_plagiarism():
     print(data)
 print ("checked plagiarism")
 print ("practising cherry-pick")
+hftdgh
\ No newline at end of file