Creating an eHRAF 'society set' in D-PLACE #247

Open
kirbykat opened this issue May 9, 2019 · 2 comments
@kirbykat
Collaborator

kirbykat commented May 9, 2019

Rationale for including eHRAF as a society set in D-PLACE:

Both D-PLACE 1.0 and 2.0 were linked to eHRAF where possible. However, we felt eHRAF should be included in D-PLACE as a stand-alone 'society set' for a number of reasons:

  • eHRAF is an incredible resource: digitized, searchable ethnographies in which every paragraph has been tagged with one or more subjects and with a large amount of metadata (the focal year to which the paragraph refers, the author of the paragraph, the author's qualifications (anthropologist? missionary?), etc.). Yet not all eHRAF societies are currently represented in D-PLACE

  • even when eHRAF societies are included in an existing D-PLACE dataset (e.g., the Ethnographic Atlas, or Binford Forager dataset), users cannot quickly 'eyeball' the spatial distribution of all eHRAF societies on the D-PLACE global map, or filter the main D-PLACE 'societies' table to look only at eHRAF societies

  • as with many cross-cultural datasets, eHRAF's 'cultural units' (cases) do not always correspond 1:1 with 'cultural units' in other datasets. For example, a single eHRAF case may include documents describing a number of closely related 'cultures', each of which has received its own ID and codes in datasets like the EA and SCCS. The advantage of eHRAF is that its "data" is simply raw text, i.e., cultural observations in eHRAF are not reduced or summarized as "codes". By identifying which eHRAF texts (documents) refer to which subgroups, it is therefore possible to de-aggregate eHRAF 'data' and to help users identify which texts are likely to best complement coded data for particular 'societies' in other D-PLACE datasets.

Steps taken to link eHRAF to D-PLACE

Much of the work linking HRAF to D-PLACE datasets was done by the HRAF team, and in publications such as Ember (2007). However, this early work did not (generally) attempt to de-aggregate HRAF cases that described multiple language or cultural groups. eHRAF cases were also not linked to glottocodes or to latitude/longitude coordinates, which made it harder to combine eHRAF with the linguistic phylogenies and environmental data that are cornerstones of D-PLACE. To prepare eHRAF for inclusion in D-PLACE we thus:

  • split eHRAF cases that correspond to more than one distinct D-PLACE society into "subcases"; named these new subcases and modified their HRAF OWC IDs to include a suffix (e.g., AN01 became AN01a and AN01b), so that each subcase can be referred to by a unique identifier.

  • identified the language- (or dialect-) level glottocode(s) for each eHRAF case/subcase

  • identified latitude-longitude coordinates for each subcase, using a combination of Glottolog language coordinates, geographic coordinates for other D-PLACE societies, and web-based triangulation

  • identified, where possible, the HRAF document numbers that should be linked to a particular subcase (ongoing!)

  • decided on a set of eHRAF cases to exclude until more information can be collected on the focal population, geographic location, and years to which the descriptions refer. This includes many of eHRAF's 'North American Regional Ethnic Group' cases (e.g., Italian Americans, Hmong Americans), with the exception of those that are strongly linked to both a language and a geographic location (e.g., Sea Islanders; Appalachians).

  • We also excluded HRAF microfiche cases that are not on HRAF's "planned list" of cases to be digitized.
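
The subcase-splitting step can be illustrated with a minimal sketch. This is not the code used for D-PLACE; the `Subcase` record, its field names, and the `split_case` helper are hypothetical, and the subgroup names in the usage example are placeholders. It only shows the ID convention described above, where a case such as AN01 is split into AN01a, AN01b, and so on, with room to attach glottocodes, coordinates, and document numbers later.

```python
from dataclasses import dataclass, field
from string import ascii_lowercase
from typing import List, Optional, Tuple


@dataclass
class Subcase:
    """Hypothetical record for one eHRAF case or subcase."""
    owc_id: str                                   # e.g. "AN01a"
    name: str                                     # label for the subgroup
    glottocodes: List[str] = field(default_factory=list)
    coords: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    document_ids: List[str] = field(default_factory=list)


def split_case(owc_id: str, names: List[str]) -> List[Subcase]:
    """Split one eHRAF case into subcases with lettered ID suffixes.

    A case with a single subgroup keeps its original OWC ID; a case
    with several subgroups gets the suffixes a, b, c, ... in order.
    """
    if not names:
        raise ValueError("at least one subgroup name is required")
    if len(names) > len(ascii_lowercase):
        raise ValueError("more subgroups than single-letter suffixes")
    if len(names) == 1:
        return [Subcase(owc_id, names[0])]
    return [
        Subcase(owc_id + suffix, name)
        for suffix, name in zip(ascii_lowercase, names)
    ]


# Placeholder subgroup names, for illustration only.
subcases = split_case("AN01", ["Subgroup 1", "Subgroup 2"])
print([s.owc_id for s in subcases])
```

Keeping the original OWC ID as a prefix means every subcase can still be traced back to its parent eHRAF case by dropping the suffix.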

DATA COMING SOON...

References
Ember, C. 2007. Using the HRAF Collection of Ethnography in Conjunction with the Standard Cross-Cultural Sample and the Ethnographic Atlas. Cross-Cultural Research 41: 396.

@SimonGreenhill
Collaborator

@kirbykat - what would this take to complete?

@kirbykat
Collaborator Author

I estimate 2.5 days. It is a case of "the last 2%" of the work. It is very close to done.

SimonGreenhill modified the milestone: v1.2 Oct 1, 2020