A Python package to detect plagiarism in documents
- Python == 3.8
- scikit-learn
Using pip
pip install git+https://github.com/u2rafi/python-plagiarism.git
Using source
git clone https://github.com/u2rafi/python-plagiarism.git
cd python-plagiarism
python setup.py install
>>> from plagiarism.core import Plagiarism
>>> from plagiarism.source import DataSetSource
>>> src = DataSetSource('plagiarism/dataset')
>>> plg = Plagiarism(source=src)
# get the similarity percentage as a number (float)
>>> plg.compare('Big bang').get()
# get matches against the dataset (multiple files)
>>> plg.compare('Big bang').getlist()
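For intuition about what a similarity percentage between an input text and a set of files can mean, here is a minimal standard-library sketch. It is not the package's implementation: the bag-of-words cosine measure is an assumption chosen for illustration, and the names `tokenize`, `cosine_similarity_percent`, and `compare_to_corpus` are hypothetical helpers that do not exist in python-plagiarism.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity_percent(text_a, text_b):
    """Return the cosine similarity of two texts' word counts, scaled to 0-100."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Dot product over the words that appear in both texts.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return 100.0 * dot / (norm_a * norm_b)


def compare_to_corpus(text, corpus):
    """Score `text` against each named document, highest score first.

    `corpus` is a hypothetical dict mapping a file name to its contents.
    """
    scores = [
        (name, cosine_similarity_percent(text, body))
        for name, body in corpus.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

In this sketch, `cosine_similarity_percent` plays a role loosely comparable to a single float score, and `compare_to_corpus` to a per-file list of scores. The actual return values of `get()` and `getlist()` may differ.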
python -m plagiarism.cli runserver -h 0.0.0.0 -p 5000
# docker
cd python-plagiarism
docker build -t plagiarism .
docker run -p 5000:5000 plagiarism:latest
# docker compose
docker-compose up -d --build
kubectl apply -f kubernetes-deployment.yml -n default
plagiarize compare --file test_input.txt
plagiarize runserver
Test cases are in the tests directory
python3 -m pytest tests
pytest --cov=plagiarism .
A python-plagiarism app has been deployed on Heroku using Docker and can be accessed using this link