miniproject: viral epidemics and zoonoses
SANA SAIFI
Zoonoses are diseases transmissible from animals to humans. Both new and old viral zoonoses are important in emerging and re-emerging viral diseases that can lead to an epidemic. Scientists estimate that more than 6 out of every 10 known infectious diseases in people can be spread from animals, and 3 out of every 4 new or emerging infectious diseases in people come from animals.
OBJECTIVE
This mini project sets out to find how, and which, zoonotic diseases lead to viral epidemics.
METHODOLOGY
◾ Using the communal corpus of 50 articles on viral epidemics.
◾ Binary classification based on various parameters: related to a viral epidemic or not, funders named or not, country mentioned or not, and so on.
◾ Re-running the query to get a corpus of 950 articles on the same topic.
◾ Creating a dictionary of zoonotic diseases, specific to the mini project.
◾ Sectioning the papers on the basis of the diseases related to animals.
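The binary classification step above can be sketched in Python as simple term matching. This is only an illustrative sketch, not the method used for the spreadsheet (which was classified manually): the term lists `ZOONOSES`, `COUNTRIES` and `FUNDER_CUES`, and the helper names `mentions_any` and `classify`, are hypothetical and stand in for the project's own dictionary.

```python
import re

# Hypothetical term lists for illustration only; the project's real
# dictionary of zoonotic diseases is not reproduced here.
ZOONOSES = ["rabies", "ebola", "avian influenza", "nipah", "mers"]
COUNTRIES = ["india", "china", "brazil", "nigeria"]
FUNDER_CUES = ["funded by", "funding", "grant"]


def mentions_any(text, terms):
    """Return True if any term occurs in text as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms
    )


def classify(text):
    """Return yes/no (True/False) flags for one article's text."""
    return {
        "zoonosis_mentioned": mentions_any(text, ZOONOSES),
        "country_mentioned": mentions_any(text, COUNTRIES),
        "funder_named": mentions_any(text, FUNDER_CUES),
    }


if __name__ == "__main__":
    sample = "This study of Nipah virus outbreaks in India was funded by a trust."
    print(classify(sample))
```

Each flag corresponds to one yes/no column of the classification spreadsheet; whole-word matching keeps a term such as "grant" from matching inside "plant".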
PROGRESS
◾ Spreadsheet of 50 articles classified into the subcategories of viruses, funders, countries, year of publication, testing and tracing, and type of paper is done.
◾ Initially, used the communal corpus of 50 articles on viral epidemics.
getpapers -q "viral epidemics" -k 950 -o "viral epidemics" -x -p
◾ Next, created a new corpus of 950 articles using the dictionary Zoonoses.
◾ Downloaded the corpus of 950 articles using getpapers with the syntax:
getpapers -q "Zoonoses in Viral epidemics" -k 950 -o "viral epidemics" -x -p
◾ nodejs, nvm for installing getpapers.
◾ getpapers for retrieving papers in new corpora.
◾ ami for sectioning and searching.
SECTIONING : Sectioning of the dataset is usually done for greater precision.
- Downloaded the corpus of 950 papers using getpapers in XML, PDF and JSON formats.
getpapers -q "Zoonoses in Viral epidemics" -k 950 -o "viral epidemics" -x -p
- To ease the process, split the corpus into 5 subfolders of up to 200 papers each.
- To divide the content of the papers into sections of front, body, back and float groups, open the Command Prompt again and run:
ami -p <name of directory> section
- This creates a sections subfolder inside the folder of each scientific paper in your directory.
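The subfolder split used before sectioning can be automated. The following is a minimal Python sketch, assuming each downloaded paper sits in its own subfolder of the corpus directory; the function name `split_corpus` and the `part_N` folder names are hypothetical choices made for illustration, not something the project prescribes.

```python
import shutil
from pathlib import Path


def split_corpus(corpus_dir, chunk_size=200):
    """Move paper folders into part_1, part_2, ... of at most chunk_size each."""
    corpus = Path(corpus_dir)
    # Assumption: every paper is stored in its own subfolder of corpus_dir.
    papers = sorted(p for p in corpus.iterdir() if p.is_dir())
    created = []
    for start in range(0, len(papers), chunk_size):
        part = corpus / f"part_{start // chunk_size + 1}"
        part.mkdir()  # raises FileExistsError if the split was already done
        for paper in papers[start:start + chunk_size]:
            shutil.move(str(paper), str(part / paper.name))
        created.append(part)
    return created
```

With 950 paper folders and the default `chunk_size` of 200 this yields five subfolders (four of 200 papers and one of 150), and the section command can then be run on each subfolder in turn.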
◾ amidict will be used for creating dictionaries.
◾ R / KNIME for data analysis.
NOT STARTED : KNIME, Keras, R
STARTED : dictionary
BLOCKED : .
FINISHED : downloading and installing getpapers, manual classification, list of zoonotic diseases, installing ami, maven, jdk, sectioning of the 950-article corpus