Omaiyo Language Resources Wiki

This work is licensed under a Creative Commons Attribution 4.0 International License. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License

Please visit the project on OSF for all related data files. Machine-readable text files for the project are hosted here on GitHub.

1. Introduction

This project is a collection of data from a variety of different sources, all related to the unclassified language of northern Tanzania known as Omaiyo or Omaio. As of the last data collection period in 2018, this language was moribund and spoken by only three semi-speakers:

  • Nairuko Siruwananga: Female. The speaker who remembers the most, sister of Leperia. Hard of hearing.
  • Leperia Mbogo: Male, ~75 years old. Knows a few words, brother of Nairuko. Very frail and unable to see well.
  • Langaseki Olepaw: Male, ~80 years old. Learned some Omaiyo from the other speakers but was not raised as an Omaiyo speaker.

A photo of the three remaining speakers of Omaiyo. From right to left: the local chairman, Langaseki Olepaw, Leperia Mbogo, Daudi Peterson, Nairuko Siruwananga, and Alina Redka.

Other people self-identify as members of the Omaiyo community but do not speak the language at all, and it is remotely possible that there are other speakers who have not been contacted. The three speakers who have been recorded have not actively spoken the language since childhood and thus exhibit the effects of language attrition.

According to interviews with the speakers, the Omaiyo once lived in the Serengeti before being expelled in the 1950s together with the Maasai. They now live near the border between the Ngorongoro Conservation Area and the Maswa Game Reserve, near the village of Makao.

2. Description of the data sources

Currently, there are five data sources, collected over a period of six years. All data were collected in or around Makao village. Most of the transcriptions were produced by untrained transcribers using an orthographic system similar to that of Swahili. In total, there are approximately 360 distinct form-meaning pairs, although different pairs occasionally overlap in meaning or form.
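The notion of distinct form-meaning pairs, and of overlap between them, can be sketched in a few lines of Python. This is only an illustration: the column names `form` and `meaning` are assumptions rather than the actual headers of the project's .csv files, and the rows are placeholders, not Omaiyo data.

```python
import csv
import io
from collections import defaultdict


def distinct_pairs(rows, form_key="form", meaning_key="meaning"):
    """Collect distinct (form, meaning) pairs, ignoring case and outer whitespace."""
    pairs = set()
    for row in rows:
        form = row[form_key].strip().lower()
        meaning = row[meaning_key].strip().lower()
        if form and meaning:  # skip rows with an empty form or meaning
            pairs.add((form, meaning))
    return pairs


def forms_by_meaning(pairs):
    """Group forms under each meaning to reveal overlapping pairs."""
    grouped = defaultdict(set)
    for form, meaning in pairs:
        grouped[meaning].add(form)
    return grouped


# Placeholder rows with hypothetical column names (not real Omaiyo data).
sample = io.StringIO(
    "form,meaning\n"
    "aaa,water\n"
    "AAA,water\n"
    "bbb,water\n"
    "aaa,river\n"
)
rows = list(csv.DictReader(sample))
pairs = distinct_pairs(rows)
overlap = forms_by_meaning(pairs)

print(len(pairs))                # 3
print(sorted(overlap["water"]))  # ['aaa', 'bbb']
```

Normalizing case and whitespace before comparing is a design choice made for this sketch; transcriptions from different transcribers may call for a looser or stricter comparison.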

Source #1 Daudi Peterson (2012)

File types: .pdf, .csv

Description: Transcriptions by Daudi only, no audio. Transcriptions were later sent around to linguists, but no clear analysis was produced. The .csv file was manually created by Richard Griscom following the contents of the .pdf file. Two male speakers.

Source #2 Daudi Peterson (2014)

File types: .m4a, .PDF, .jpg, .csv

Description: During a visit to the Hadzabe in the Dunduhia area, Daudi visited the Omaiyo again. Transcriptions were made by Daudi and two others, and audio was recorded with an iPhone. Bonny Sands later created transcriptions based on the iPhone recordings. There are two .csv files: one with vertical orientation and without transcriptions from Sands, and one with horizontal orientation and with transcriptions and notes from Sands. Two male speakers and one female speaker.

Source #3 Lazaro R. Ole wanga (2017)

File types: .jpg, .csv

Description: Karsten Legère sent a Tanzanian colleague of his to meet with the Omaiyo speakers. Transcriptions only. The .csv file was manually created by Richard Griscom following the contents of the .pdf file. Five speaker names are listed, but none corresponds exactly to the names given during other data collection sessions.

Source #4 Daudi Peterson (2018)

File types: .csv

Description: Daudi invited the speakers to the Mwiba lodge for more data collection. Transcriptions only, made by Daudi, Paulo Parana, and Alina from Mwiba lodge. Two male speakers and one female speaker.

Source #5 Richard Griscom (2018)

File types: .wav, .pdf, .eaf, .TextGrid, .tsv

Description: Recordings made with an audio recorder (and a headset microphone for two recordings). Transcriptions by Richard Griscom and Yonah Ndege, a speaker of Asimjeeg Datooga. The .pdf file contains the original handwritten transcriptions. Four recordings, all created on the same day:

  • 2018-07-26_Omaiyo_1: Two male speakers giving speaker metadata and a few words
  • 2018-07-26_Omaiyo_2: Female speaker giving speaker metadata and a few words
  • 2018-07-26_Omaiyo_3: Female speaker, sometimes aided by the two male speakers, giving words
  • 2018-07-26_Omaiyo_4: Female speaker, sometimes aided by the two male speakers, giving words
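The recording names above follow a date, language, index pattern, so they can be split into their parts programmatically. The following is a minimal sketch; the pattern is inferred from the four names listed here, and the `Recording` type and `parse_recording_name` function are hypothetical helpers, not part of the project.

```python
import re
from datetime import date
from typing import NamedTuple


class Recording(NamedTuple):
    recorded_on: date
    language: str
    index: int


# Pattern inferred from names such as "2018-07-26_Omaiyo_1" (an assumption).
NAME_PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d{2})_([A-Za-z]+)_(\d+)$")


def parse_recording_name(name):
    """Split a recording name (without file extension) into date, language, and index."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected recording name: {name!r}")
    year, month, day, language, index = match.groups()
    return Recording(date(int(year), int(month), int(day)), language, int(index))


names = [f"2018-07-26_Omaiyo_{i}" for i in range(1, 5)]
recordings = [parse_recording_name(n) for n in names]

print(recordings[0].recorded_on.isoformat())  # 2018-07-26
print([r.index for r in recordings])          # [1, 2, 3, 4]
```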

The .eaf files are for use with ELAN, and the .TextGrid files are for use with Praat. The .TextGrid and .tsv files were exported from ELAN, and all time-aligned annotations were created by Richard Griscom. Additional post-hoc transcriptions were added using the IPA.
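For readers who want to work with the .tsv exports without ELAN, a tab-delimited file can be read with Python's standard `csv` module. The sketch below assumes a four-column layout (tier name, start time in milliseconds, end time in milliseconds, annotation text) with no header row. The actual columns depend on the options chosen when exporting from ELAN, so check a file before relying on this layout. The tier names and annotation values are placeholders.

```python
import csv
import io
from typing import NamedTuple


class Annotation(NamedTuple):
    tier: str
    start_ms: int
    end_ms: int
    text: str


def read_annotations(handle):
    """Read time-aligned annotations from a tab-delimited export (assumed 4-column layout)."""
    annotations = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or short lines
        tier, start, end, text = row[:4]
        annotations.append(Annotation(tier, int(start), int(end), text))
    return annotations


# Placeholder content with hypothetical tier names (not real project data).
sample = io.StringIO(
    "transcription\t1200\t1850\tplaceholder-1\n"
    "translation\t1200\t1850\tgloss-1\n"
    "transcription\t2400\t3100\tplaceholder-2\n"
)
annotations = read_annotations(sample)
durations = [a.end_ms - a.start_ms for a in annotations if a.tier == "transcription"]

print(len(annotations))  # 3
print(durations)         # [650, 700]
```

To read a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline="", encoding="utf-8")`.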

Compiled data

In addition to the original data sources, there is a "compiled data" .csv file, which brings together transcriptions and translations from all five data sources.

3. Data organization

Machine-readable text data are organized into folders using GitHub for version control and collaborative development. Resource data, including audio, images, and documents, are organized into folders using Dropbox. All of the data are organized into folders by the source of origin.