Skip to content

jranks123/women-in-media-article-entity-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

49 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Setup

You will need to add your aws credentials in a file located at ~/.aws/credentials (create it if it does not exist. Add the following to the file:

[bechdel]
aws_access_key_id = AKIAJTRXJSBJADGLOBAQ
aws_secret_access_key = <secret key>


How to run program


1) Open command prompt
2) paste the following command and press enter:  cd C:\Users\w1515247\golang\src\women-in-media-article-entity-analysis\cmd\cli
2.5) paste the following command and press enter:  cd C:\Users\w1515247\golang\src\women-in-media-article-entity-analysis\cmd\git pull to check for updates
2.6) If you get a message along the lines of "Please commit your changes or stash them before you merge. Aborting", as long as you haven't made any code changes, you should be safe to run
"git stash" (which will wipe your changes) followed by git pull. WARNING: this will delete any files containing analysis that have ended up in the directory, so make sure you have copied this
somewhere else if you want to keep it going forwards.
3) make sure that the query condition in cmd/query_condition.sql returns the articles that you want to analyse
4) Click "File -> Save all" in Atom
4) paste the following command and press enter:  go build && ./cli -runType=<run type> -manuallyCorrectGender=<true or false>

in the above statement, you will need to replace <run type> with ENTITIES_AND_GENDER, JUST_ENTITIES or JUST_GENDER, and <true or false> with either true or false

This will analyse all articles that match the query.

To see the results (for now), these commands will be useful


/* see all the entities */
select * from public.article_entities
where <inset conditions>

/* see all the entities with their gender */
SELECT *, n.gender
from public.article_entities a
join public.names n
on a.text = n.name
where  <inset conditions>

/* see the gender count */
select n.gender, count(*)
from public.article_entities a
join public.names n
on a.text = n.name
where  <inset conditions>
group by 1

/* see the entities without a gender */
select * from
public.article_entities a
join public.names n
on a.text = n.name
where  <inset conditions>
and n.gender = ''

/* see summary of all the articles, their counts and their entities */
SELECT article.id,
author.name as journlist_name,
n2.gender as journalist_gender,
count(*) FILTER (WHERE n.gender = 'Male') as men_count,
    count(*) FILTER (WHERE n.gender = 'Female') as female_count,
array_agg(json_build_object('text', coalesce(text, ''), 'gender', n.gender, 'nextWord', nextWord, 'score', coalesce(score, 0)))
    FROM article article
    LEFT join article_entities ae
    ON article.id = ae.article_id
    LEFT join names n
    ON ae.text = n.name
    left join author_attr aa
    ON article.id = aa.article_id
    LEFT join author
    ON aa.author_id = author.id
    LEFT join names n2
    ON author.name = n2.name
where  <inset conditions>


/* see oveall summary of the analysis */
with analysis as (SELECT article.id,
	author.name as journlist_name,
	n2.gender as journalist_gender,
	count(*) FILTER (WHERE n.gender = 'Female') as number_of_women,
	count(*) FILTER (WHERE n.gender = 'Male') as number_of_men,
	array_agg(json_build_object('text', coalesce(text, ''), 'gender', n.gender, 'nextWord', nextWord, 'score', coalesce(score, 0)))
		FROM article article
		LEFT join article_entities ae
		ON article.id = ae.article_id
		LEFT join names n
		ON ae.text = n.name
		left join author_attr aa
		ON article.id = aa.article_id
		LEFT join author
		ON aa.author_id = author.id
		LEFT join names n2
		ON author.name = n2.name
WHERE <insert condition>
group by 1,2,3)

select count(*) as number_of_articles,
count(*) FILTER (WHERE journalist_gender = 'Female') as female_journalise_count,
count(*) FILTER (WHERE journalist_gender = 'Male') as male_journalise_count,
sum(number_of_men) as overall_men_count,
sum(number_of_women) as overall_women_count
from analysis

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published