miniproject: viral epidemics and zoonoses
Which viral zoonoses lead to viral epidemics?
SANA SAIFI
Zoonoses are diseases transmissible from animals to humans. Both new and old viral zoonoses play an important part in the emerging and re-emerging viral diseases that lead to epidemics. Scientists estimate that more than 6 out of every 10 known infectious diseases in people can be spread from animals, and that 3 out of every 4 new or emerging infectious diseases in people come from animals.
OBJECTIVE
This mini project aims to find out how, and which, zoonotic diseases lead to viral epidemics.
METHODOLOGY
- Using the communal corpus Viral Epidemic, 50 articles were downloaded using getpapers. 🟩 FINISHED
- Binary classification of the 50 articles into true positives and false positives, i.e., whether or not the articles are about viral epidemics. 🟩 FINISHED
- Using ami search to find whether or not the articles mention any comorbidity in a viral epidemic, annotating with dictionaries to create ami DataTables. 🟩 FINISHED
- Sectioning the articles using ami section, which splits a document in a CTree into sections based on tags from JATS, etc. 🟩 FINISHED
- Re-running the query to get a corpus of 950 articles on _Viral Epidemics and Zoonoses_. 🟩 FINISHED
- Scrutinizing the 950 articles for true positives and false positives and creating a spreadsheet. 🟨 STARTED
- Using ami search to create DataTables and ami section to section the 950 articles. 🟩 FINISHED
- Creating a dictionary specifically related to the mini project. 🟪 IN PROGRESS
- Sectioning the papers on the basis of the diseases related to animals. 🟨 STARTED
- Using relevant machine learning techniques to classify the papers by whether they are related to viral epidemics and by which viral zoonotic diseases were reported. 🟨 STARTED
- Displaying the results using R/KNIME. 🟥 NOT STARTED
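The manual true positive / false positive classification in the steps above can be summarised as a single precision figure for the query. The following is a minimal sketch, assuming the spreadsheet is exported as CSV with a hypothetical column named `classification` that holds `TP` or `FP`. The column name, the labels and the sample rows are assumptions for illustration, not the project's actual spreadsheet layout or data.

```python
import csv
import io


def query_precision(csv_text, column="classification"):
    """Return (true_positives, false_positives, precision) for a manual classification.

    `column` is a hypothetical column name; each row is expected to hold "TP" or "FP".
    Rows with any other value are ignored.
    """
    tp = fp = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row[column].strip().upper()
        if label == "TP":
            tp += 1
        elif label == "FP":
            fp += 1
    total = tp + fp
    # Precision = share of downloaded articles that are genuinely on topic
    precision = tp / total if total else 0.0
    return tp, fp, precision


# Illustrative rows only, not real project data.
sample = "paper,classification\npaper_1,TP\npaper_2,FP\npaper_3,TP\npaper_4,TP\n"
print(query_precision(sample))
```

A low precision would suggest that the getpapers query is too broad and should be refined before the corpus is sectioned and searched.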
PROGRESS
◾ Spreadsheet of 50 articles classified into the subcategories of viruses, funders, countries, year of publication, testing and tracing, and type of paper. 🟩 FINISHED
◾ Sectioning of the 950 papers using ami section 🟩FINISHED
◾ Downloaded a corpus of 950 articles on viral epidemics and zoonoses using getpapers. 🟩 FINISHED
◾ Created a dictionary with 135 entries on zoonotic disease using ami dict. 🟩 FINISHED
◾ Created a Dictionary using Wikidata Query Service and SPARQL.🟩FINISHED
◾ Run ami search on corpus 950. 🟩FINISHED
◾ Release corpus 950 using GitHub Desktop. 🟩 FINISHED
◾ Installation of Anaconda for installing various tools, e.g., Jupyter. 🟩 FINISHED
◾ Initially, the communal corpus of 50 articles on viral epidemics:
getpapers -q "viral epidemics" -k 950 -o viral epidemics -x -p
◾ Next, a new corpus of 950 articles was created using the Zoonoses dictionary.
◾ Downloaded the corpus of 950 articles using getpapers with the command:
getpapers -q "Zoonoses in Viral epidemics" -k 950 -o viral epidemics -x -p
◾ This corpus was classified, searched and sectioned.
There are three methods to upload the corpus.
- Through Visual Studio Code.
See @Ambreen's Page for the instructions
- Through the Command Prompt
Prerequisite: the openVirus repository on your PC. If you do not have it, clone it with the following command:
git clone https://github.com/petermr/openVirus.git
Then run the following commands:
C:\Users\admin>cd openVirus
C:\Users\admin\openVirus> cd miniproject
C:\Users\admin\openVirus\miniproject> cd zoonoses
C:\Users\admin\openVirus\miniproject\zoonoses>git status
C:\Users\admin\openVirus\miniproject\zoonoses>dir
C:\Users\admin\openVirus\miniproject\zoonoses>git add *
C:\Users\admin\openVirus\miniproject\zoonoses>git status
C:\Users\admin\openVirus\miniproject\zoonoses>git commit -am "first commit all corpus"
C:\Users\admin\openVirus\miniproject\zoonoses>git pull
C:\Users\admin\openVirus\miniproject\zoonoses>git push
- Through GitHub Desktop
Prerequisites: GitHub Desktop (install from here) and a cloned openVirus repository.
- Open the folder where the repository was cloned. Open your files in the CProject.
- Copy the files and paste them into the folder in the openVirus repository (remote repository) where you want to commit them.
- Open GitHub Desktop.
- Go to 'File', then 'Add Local Repository'.
- Now, choose the openVirus repository from your system.
- Add a commit message and go to 'Commit to master'.
- After committing, go to 'Push to origin'.
- After completion of pushing the repository, your uploaded files can be viewed on the Github repository.
- How to create a dictionary?
(https://github.com/petermr/openVirus/wiki/Dictionary:-Zoonosis#how-i-created-)
- The test dictionary created using amidict.
- The dictionary created using the SPARQL Query Service from Wikidata.
Results
- The test dictionary created using amidict was made manually and lacked synonyms, hosts, variable names, descriptions, Wikidata links, Wikipedia links, etc.
- The dictionary created using SPARQL had descriptions, links, some synonyms, labels and IDs. However, the returned results were _scientific articles and journals_. This needs refining, as we want the IDs of zoonotic diseases/viruses.
As PMR suggested, this zoonotic disease dictionary has to be created manually. This work is in progress.
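One possible way to refine the SPARQL output before manual curation is to drop rows that describe publications rather than diseases or viruses. The sketch below assumes each result row has been loaded as a dict with the hypothetical keys `id`, `label` and `instance_of`, and that the Wikidata class for "scholarly article" is `Q13442814` (worth verifying before use). It is an illustration of the filtering idea, not the method used in this project.

```python
# Wikidata class for "scholarly article"; treat as an assumption and verify before use.
SCHOLARLY_ARTICLE = "Q13442814"


def drop_publications(rows, excluded_classes=(SCHOLARLY_ARTICLE,)):
    """Keep only rows whose `instance_of` value is not a publication class.

    Each row is a dict with the hypothetical keys `id`, `label` and `instance_of`.
    """
    excluded = set(excluded_classes)
    return [row for row in rows if row.get("instance_of") not in excluded]


# Placeholder rows for illustration; the IDs are not real Wikidata items.
rows = [
    {"id": "Q_PLACEHOLDER_1", "label": "example zoonotic disease",
     "instance_of": "Q_PLACEHOLDER_DISEASE"},
    {"id": "Q_PLACEHOLDER_2", "label": "example article about a zoonosis",
     "instance_of": SCHOLARLY_ARTICLE},
]
for row in drop_publications(rows):
    print(row["id"], row["label"])
```

Further classes, such as journals, could be passed through `excluded_classes` once their Wikidata IDs have been checked.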
- nodejs, nvm for installing getpapers
- getpapers for retrieving 950 articles from EuPMC
- AMI for sectioning and searching
- SPARQL and amidict for creating dictionaries
- KNIME for displaying results
AMI SECTIONING:
Sectioning of the dataset is usually done for greater precision.
- Downloaded the corpus of 950 papers using getpapers in XML, PDF and JSON formats.
getpapers -q "Zoonoses in Viral epidemics" -k 950 -o viral epidemics -x -p
- To ease the process, the corpus was divided into 5 subfolders of up to 200 papers each.
- To divide the content of the papers into front, body, back and float groups, open the Command Prompt again and run the command:
ami -p <name of directory> section
- This creates a sections subfolder in the folder of each scientific paper in your directory.
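The manual step of dividing the corpus into subfolders can also be scripted. The following is a minimal sketch using only the Python standard library. The function name, the `batch_` folder prefix and the default batch size are assumptions made for this example, and it treats every subfolder of the project directory as one paper.

```python
import shutil
from pathlib import Path


def split_into_batches(project_dir, batch_size=200, prefix="batch_"):
    """Move each paper folder in `project_dir` into numbered batch subfolders.

    Returns a dict mapping each batch folder name to the paper folders it received.
    The `prefix` naming scheme is an assumption made for this sketch.
    """
    project = Path(project_dir)
    # Only paper folders are moved; loose files and existing batch folders are skipped
    papers = sorted(
        p for p in project.iterdir()
        if p.is_dir() and not p.name.startswith(prefix)
    )
    layout = {}
    for index, paper in enumerate(papers):
        batch_name = f"{prefix}{index // batch_size + 1}"
        batch_dir = project / batch_name
        batch_dir.mkdir(exist_ok=True)
        shutil.move(str(paper), str(batch_dir / paper.name))
        layout.setdefault(batch_name, []).append(paper.name)
    return layout
```

Calling `split_into_batches("<name of directory>")` on a corpus of 950 papers would produce four folders of 200 papers and one of 150, each of which can then be passed to `ami -p` in turn.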
AMI SEARCH
- Downloaded the corpus of 950 papers using the same command as above, in XML, PDF and JSON formats.
- To search with the country, drugs, funders and diseases dictionaries, open the Command Prompt and run the command:
ami -p <name of directory> search --dictionary country drugs funders diseases
- Open the directory; at the end of the folder you will find various HTML documents.
AMI VALIDATION
Open the Command Prompt and type:
cd ami3
git pull
mvn clean install -Dmaven.test.skip=true
Wait! ... BUILD SUCCESS!
NOT STARTED: KNIME, Keras, R
STARTED: dictionary
BLOCKED: .
FINISHED: downloading and installing getpapers, manual classification, list of zoonotic diseases, installing ami, maven and jdk, sectioning of corpus 950, ami search of corpus 950.