Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 7b37e89, commit 51b56d0
Showing 5 changed files with 323 additions and 52 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
<a name="ds-cldfmetadatajson"> </a> | ||
|
||
# Wordlist CLDF dataset derived from Chacon et al.'s "Diversity of Arawakan Languages" from 2019 | ||
|
||
**CLDF Metadata**: [cldf-metadata.json](./cldf-metadata.json) | ||
|
||
**Sources**: [sources.bib](./sources.bib) | ||
|
||
property | value | ||
--- | --- | ||
[dc:bibliographicCitation](http://purl.org/dc/terms/bibliographicCitation) | Chacon, T. C.; Gonçalves, A. G.; and da Silva, L. F. (2019): A diversidade linguística Aruák no Alto Rio Negro em gravações da década de 1950 [The diversity of Arawakan languages from the upper Rio Negro in recordings from the 1950s]. Forma y Función, 32.2, 41-67. DOI: 10.15446/fyf.v32n2.80814. | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF Wordlist](http://cldf.clld.org/v1.0/terms.rdf#Wordlist) | ||
[dc:format](http://purl.org/dc/terms/format) | <ol><li>http://concepticon.clld.org/contributions/Chacon-2019-220</li></ol> | ||
[dc:identifier](http://purl.org/dc/terms/identifier) | https://github.com/lexibank/chaconbaniwa/ | ||
[dc:license](http://purl.org/dc/terms/license) | https://creativecommons.org/licenses/by/4.0/ | ||
[dcat:accessURL](http://www.w3.org/ns/dcat#accessURL) | https://github.com/lexibank/chaconbaniwa | ||
[prov:wasDerivedFrom](http://www.w3.org/ns/prov#wasDerivedFrom) | <ol><li><a href="https://github.com/lexibank/chaconbaniwa/tree/7b37e89">lexibank/chaconbaniwa v1.0.1-25-g7b37e89</a></li><li><a href="https://github.com/glottolog/glottolog/tree/v4.4">Glottolog v4.4</a></li><li><a href="https://github.com/concepticon/concepticon-data/tree/v2.5.0">Concepticon v2.5.0</a></li><li><a href="https://github.com/cldf-clts/clts/tree/v2.1.0">CLTS v2.1.0</a></li></ol> | ||
[prov:wasGeneratedBy](http://www.w3.org/ns/prov#wasGeneratedBy) | <ol><li><strong>lingpy-rcParams</strong>: <a href="./lingpy-rcParams.json">lingpy-rcParams.json</a></li><li><strong>python</strong>: 3.8.10</li><li><strong>python-packages</strong>: <a href="./requirements.txt">requirements.txt</a></li></ol> | ||
[rdf:ID](http://www.w3.org/1999/02/22-rdf-syntax-ns#ID) | chaconbaniwa | ||
[rdf:type](http://www.w3.org/1999/02/22-rdf-syntax-ns#type) | http://www.w3.org/ns/dcat#Distribution | ||
|
||
|
||
## <a name="table-formscsv"></a>Table [forms.csv](./forms.csv) | ||
|
||
|
||
Raw lexical data item as it can be pulled out of the original datasets. | ||
|
||
This is the basis for creating rows in CLDF representations of the data by | ||
- splitting the lexical item into forms | ||
- cleaning the forms | ||
- potentially tokenizing the form | ||
|
||
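The three steps listed above can be sketched in plain Python. This is only an illustration of the idea, not the pipeline that produced this dataset: the helper names (`split_item`, `clean_form`, `tokenize`), the separator set, the grapheme inventory, and the sample string are all assumptions made up for the example.

```python
import re
import unicodedata


def split_item(value, separators=",;/"):
    """Split a raw lexical item into candidate forms (separator set is assumed)."""
    parts = re.split("[" + re.escape(separators) + "]", value)
    return [p.strip() for p in parts if p.strip()]


def clean_form(form):
    """Drop parenthesised remarks, normalise Unicode to NFC, collapse whitespace."""
    form = re.sub(r"\([^)]*\)", "", form)
    form = unicodedata.normalize("NFC", form)
    return " ".join(form.split())


def tokenize(form, graphemes):
    """Greedy longest-match segmentation against a hypothetical grapheme inventory."""
    segments, i = [], 0
    longest = max(len(g) for g in graphemes)
    while i < len(form):
        for size in range(min(longest, len(form) - i), 0, -1):
            chunk = form[i:i + size]
            if chunk in graphemes:
                segments.append(chunk)
                i += size
                break
        else:
            # No grapheme matched at this position: mark it and move on.
            segments.append("?")
            i += 1
    return segments


# Made-up raw value, used only to exercise the helpers.
raw = "tsinu (dog), tʃinu"
forms = [clean_form(f) for f in split_item(raw)]
segments = [tokenize(f, {"ts", "tʃ", "i", "n", "u"}) for f in forms]
```

In this sketch `forms` would correspond to the `Form` column and each entry of `segments`, joined with spaces, to the `Segments` column.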
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF FormTable](http://cldf.clld.org/v1.0/terms.rdf#FormTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 2354 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string` | Primary key | ||
[Local_ID](http://purl.org/dc/terms/identifier) | `string` | | ||
[Language_ID](http://cldf.clld.org/v1.0/terms.rdf#languageReference) | `string` | References [languages.csv::ID](#table-languagescsv) | ||
[Parameter_ID](http://cldf.clld.org/v1.0/terms.rdf#parameterReference) | `string` | References [parameters.csv::ID](#table-parameterscsv) | ||
[Value](http://cldf.clld.org/v1.0/terms.rdf#value) | `string` | | ||
[Form](http://cldf.clld.org/v1.0/terms.rdf#form) | `string` | | ||
[Segments](http://cldf.clld.org/v1.0/terms.rdf#segments) | list of `string` (separated by ` `) | | ||
[Comment](http://cldf.clld.org/v1.0/terms.rdf#comment) | `string` | | ||
[Source](http://cldf.clld.org/v1.0/terms.rdf#source) | list of `string` (separated by `;`) | References [sources.bib::BibTeX-key](./sources.bib) | ||
`Cognacy` | `string` | | ||
`Loan` | `boolean` | | ||
`Graphemes` | `string` | | ||
`Profile` | `string` | | ||
|
||
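As a rough sketch of how the column datatypes above could be honoured when reading `forms.csv` with only the standard library: `Segments` is split on spaces and `Source` on `;`. The sample row, its identifiers, the source key, and the `true`/`false` spelling of `Loan` are assumptions for illustration, not values taken from the dataset.

```python
import csv
import io

# Hypothetical one-row excerpt using the column names from the table above.
SAMPLE = (
    "ID,Local_ID,Language_ID,Parameter_ID,Value,Form,Segments,"
    "Comment,Source,Cognacy,Loan,Graphemes,Profile\n"
    "lang1-dog-1,,lang1,dog,tsinu,tsinu,ts i n u,,Chacon2019,1,false,,\n"
)


def parse_form_row(row):
    """Convert the list-valued and boolean columns of one forms.csv row."""
    parsed = dict(row)
    parsed["Segments"] = row["Segments"].split()  # separated by " "
    parsed["Source"] = [s for s in row["Source"].split(";") if s]  # separated by ";"
    # Assumes booleans are written as "true"/"false"; anything else becomes None.
    parsed["Loan"] = {"true": True, "false": False}.get(row["Loan"].lower())
    return parsed


rows = [parse_form_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

To read the real file, the `io.StringIO(SAMPLE)` object could be replaced with `open("forms.csv", encoding="utf-8", newline="")`.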
## <a name="table-languagescsv"></a>Table [languages.csv](./languages.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF LanguageTable](http://cldf.clld.org/v1.0/terms.rdf#LanguageTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 14 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string` | Primary key | ||
[Name](http://cldf.clld.org/v1.0/terms.rdf#name) | `string` | | ||
[Glottocode](http://cldf.clld.org/v1.0/terms.rdf#glottocode) | `string` | | ||
`Glottolog_Name` | `string` | | ||
[ISO639P3code](http://cldf.clld.org/v1.0/terms.rdf#iso639P3code) | `string` | | ||
[Macroarea](http://cldf.clld.org/v1.0/terms.rdf#macroarea) | `string` | | ||
[Latitude](http://cldf.clld.org/v1.0/terms.rdf#latitude) | `decimal` | | ||
[Longitude](http://cldf.clld.org/v1.0/terms.rdf#longitude) | `decimal` | | ||
`Family` | `string` | | ||
|
||
## <a name="table-parameterscsv"></a>Table [parameters.csv](./parameters.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF ParameterTable](http://cldf.clld.org/v1.0/terms.rdf#ParameterTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 243 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string` | Primary key | ||
[Name](http://cldf.clld.org/v1.0/terms.rdf#name) | `string` | | ||
[Concepticon_ID](http://cldf.clld.org/v1.0/terms.rdf#concepticonReference) | `string` | | ||
`Concepticon_Gloss` | `string` | | ||
`Portuguese_Gloss` | `string` | | ||
|
||
## <a name="table-cognatescsv"></a>Table [cognates.csv](./cognates.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF CognateTable](http://cldf.clld.org/v1.0/terms.rdf#CognateTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 2354 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string` | Primary key | ||
[Form_ID](http://cldf.clld.org/v1.0/terms.rdf#formReference) | `string` | References [forms.csv::ID](#table-formscsv) | ||
[Form](http://linguistics-ontology.org/gold/2010/FormUnit) | `string` | | ||
[Cognateset_ID](http://cldf.clld.org/v1.0/terms.rdf#cognatesetReference) | `string` | | ||
`Doubt` | `boolean` | | ||
`Cognate_Detection_Method` | `string` | | ||
[Source](http://cldf.clld.org/v1.0/terms.rdf#source) | list of `string` (separated by `;`) | References [sources.bib::BibTeX-key](./sources.bib) | ||
[Alignment](http://cldf.clld.org/v1.0/terms.rdf#alignment) | list of `string` (separated by ` `) | | ||
`Alignment_Method` | `string` | | ||
`Alignment_Source` | `string` | | ||
|
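Since `Form_ID` references `forms.csv::ID`, cognate judgements can be joined back to their forms and grouped by `Cognateset_ID`. The following is a minimal sketch of that join; the in-memory rows, identifiers, and the helper name `group_by_cognateset` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows keyed and named after the columns documented above.
forms = {
    "f1": {"Language_ID": "lang1", "Form": "tsinu"},
    "f2": {"Language_ID": "lang2", "Form": "tʃinu"},
    "f3": {"Language_ID": "lang1", "Form": "kapi"},
}
cognates = [
    {"ID": "c1", "Form_ID": "f1", "Cognateset_ID": "dog-1", "Alignment": "ts i n u"},
    {"ID": "c2", "Form_ID": "f2", "Cognateset_ID": "dog-1", "Alignment": "tʃ i n u"},
    {"ID": "c3", "Form_ID": "f3", "Cognateset_ID": "hand-1", "Alignment": "k a p i"},
]


def group_by_cognateset(cognates, forms):
    """Map each Cognateset_ID to (Language_ID, aligned segments) pairs."""
    sets = defaultdict(list)
    for cog in cognates:
        form = forms[cog["Form_ID"]]  # resolve the foreign key into forms.csv
        sets[cog["Cognateset_ID"]].append(
            (form["Language_ID"], cog["Alignment"].split())
        )
    return dict(sets)


grouped = group_by_cognateset(cognates, forms)
```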
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
{ | ||
"_color": "Model: color\nInfo: Model for colored sound class output based on Dolgopolsky (1986)\nSource: Dolgopolsky (1986)\nCompiler: Johann-Mattis List\nDate: 2012-03", | ||
"align_classes": true, | ||
"align_factor": 0.3, | ||
"align_gap_weight": 0.5, | ||
"align_gop": -2, | ||
"align_mode": "global", | ||
"align_modes": [ | ||
[ | ||
"global", | ||
-2, | ||
0.5 | ||
], | ||
[ | ||
"local", | ||
-1, | ||
0.5 | ||
] | ||
], | ||
"align_notransform": { | ||
"A": 1, | ||
"B": 1, | ||
"C": 1, | ||
"L": 1, | ||
"M": 1, | ||
"N": 1, | ||
"T": 1, | ||
"X": 1, | ||
"Y": 1, | ||
"Z": 1, | ||
"_": 1 | ||
}, | ||
"align_scale": 0.5, | ||
"align_scorer": {}, | ||
"align_sonar": true, | ||
"align_stamp": "# MSA\n# dataset : {0}\n# collection : {1}\n# aligned by : LingPy Version {2} <www.lingpy.org>\n# created on : {3}\n# parameters : {4}\n", | ||
"align_transform": { | ||
"A": 1.6, | ||
"B": 1.3, | ||
"C": 1.2, | ||
"L": 1.1, | ||
"M": 1.1, | ||
"N": 0.5, | ||
"T": 1.0, | ||
"X": 3.0, | ||
"Y": 3.0, | ||
"Z": 0.7, | ||
"_": 0.0 | ||
}, | ||
"align_tree_calc": "neighbor", | ||
"art": "Model: art\nInfo: Specific sound-class model for the creation of prosodic strings.\nSource: List (2012)\nCompiler: Johann-Mattis List\nDate: 2012", | ||
"asjp": "Model: asjp\nInfo: Sound-Class model following Brown et al. (2008) and Brown et al. (2011)\nSource: Brown et al (2008), Brown et al. (2011)\nCompiler: Johann-Mattis List\nDate: 2011", | ||
"basic_orthography": "fuzzy", | ||
"breaks": ".-", | ||
"classes": true, | ||
"cmodules": false, | ||
"combiners": "\u0361\u035c", | ||
"comment": "#", | ||
"cv": "Model: cv\nInfo: Specific sound-class model for the creation of consonant vowel templates.\nSource: None\nCompiler: Johann-Mattis List\nDate: 2015", | ||
"diacritics": "!:|\u00af\u02b0\u02b1\u02b2\u02b3\u02b4\u02b5\u02b6\u02b7\u02b8\u02b9\u02ba\u02bb\u02bc\u02bd\u02be\u02bf\u02c0\u02c0 \u02c1\u02c2\u02c3\u02c4\u02c5\u02c6\u02c8\u02c9\u02ca\u02cb\u02cc\u02cd\u02ce\u02cf\u02d0\u02d1\u02d2\u02d3\u02d4\u02d5\u02d6\u02d7\u02de\u02df\u02e0\u02e1\u02e2\u02e3\u02e4\u02ec\u02ed\u02ee\u02ef\u02f0\u02f1\u02f2\u02f3\u02f4\u02f5\u02f6\u02f7\u02f8\u02f9\u02fa\u02fb\u02fc\u02fd\u02fe\u02ff\u0300\u0301\u0302\u0303\u0304\u0305\u0306\u0307\u0308\u0309\u030a\u030b\u030c\u030d\u030e\u030f\u0310\u0311\u0312\u0313\u0314\u0315\u0316\u0317\u0318\u0319\u031a\u031b\u031c\u031d\u031e\u031f\u0320\u0321\u0322\u0323\u0324\u0325\u0326\u0327\u0328\u0329\u032a\u032b\u032c\u032d\u032e\u032f\u0330\u0331\u0332\u0333\u0334\u0335\u0336\u0337\u0338\u0339\u033a\u033b\u033c\u033d\u033e\u033f\u0300\u0301\u0342\u0313\u0308\u0301\u0345\u0346\u0347\u0348\u0349\u034a\u034b\u034c\u034d\u034e\u034f\u0350\u0351\u0352\u0353\u0354\u0355\u0356\u0357\u0358\u0359\u035a\u035b\u035d\u035e\u035f\u0360\u0362\u0363\u0364\u0365\u0366\u0367\u0368\u0369\u036a\u036b\u036c\u036d\u036e\u036f\u0483\u0484\u0485\u0486\u0487\u0488\u0489\u0559\u0656\u0670\u0711\u07eb\u07ec\u07ed\u07ee\u07ef\u07f0\u07f1\u07f2\u07f3\u1d2c\u1d2d\u1d2e\u1d2f\u1d30\u1d31\u1d32\u1d33\u1d34\u1d35\u1d36\u1d37\u1d38\u1d39\u1d3a\u1d3b\u1d3c\u1d3d\u1d3e\u1d3f\u1d40\u1d41\u1d42\u1d43\u1d44\u1d45\u1d46\u1d47\u1d48\u1d49\u1d4a\u1d4b\u1d4c\u1d4d\u1d4e\u1d4f\u1d50\u1d51\u1d52\u1d53\u1d54\u1d55\u1d56\u1d57\u1d58\u1d59\u1d5a\u1d5b\u1d5c\u1d5d\u1d5e\u1d5f\u1d60\u1d61\u1d62\u1d63\u1d64\u1d65\u1d66\u1d67\u1d68\u1d69\u1d6a\u1d78\u1d9b\u1d9c\u1d9d\u1d9e\u1d9f\u1da0\u1da1\u1da2\u1da3\u1da4\u1da5\u1da6\u1da7\u1da8\u1da9\u1daa\u1dab\u1dac\u1dad\u1dae\u1daf\u1db0\u1db1\u1db2\u1db3\u1db4\u1db5\u1db6\u1db7\u1db8\u1db9\u1dba\u1dbb\u1dbc\u1dbd\u1dbe\u1dbf\u1dc0\u1dc1\u1dc2\u1dc3\u1dc4\u1dc5\u1dc6\u1dc7\u1dc8\u1dc9\u1dca\u1dcb\u1dcc\u1dcd\u1dce\u1dcf\u1dd3\u1dd4\u1dd5\u1dd6\u1dd7\u1dd8\u1dd9\u1dda\u1ddb\u1ddc\u1ddd\u1dde\u1ddf\u1de0\u1de1\u1de2\u1de3\u1de4\u1de5\u1de6\u1dfc\u1dfd\u1dfe\u1dff\u2071\u207a\u207b\u207c\u207d\u207e\u207f\u208a\u208b\u208c\u208d\u208e\u2090\u2091\u2092\u2093\u2094\u2095\u2096\u2097\u2098\u2099\u209a\u209b\u209c\u20d0\u20d1\u20d2\u20d3\u20d4\u20d5\u20d6\u20d7\u20d8\u20d9\u20da\u20db\u20dc\u20e5\u20e6\u20e7\u20e8\u20e9\u20ea\u20eb\u20ec\u20ed\u20ee\u20ef\u20f0\u2192\u21d2\u2a27\u2c7c\u2c7d\u2d6f\u2de0\u2de1\u2de2\u2de3\u2de4\u2de5\u2de6\u2de7\u2de8\u2de9\u2dea\u2deb\u2dec\u2ded\u2dee\u2def\u2df0\u2df1\u2df2\u2df3\u2df4\u2df5\u2df6\u2df7\u2df8\u2df9\u2dfa\u2dfb\u2dfc\u2dfd\u2dfe\u2dff\u3099\u309a\ua66f\ua67c\ua67d\ua69c\ua69d\ua71b\ua71c\ua71d\ua71e\ua71f\ua788\ua789\ua78a\ua8e0\ua8e1\ua8e2\ua8e3\ua8e4\ua8e5\ua8e6\ua8e7\ua8e8\ua8e9\ua8ea\ua8eb\ua8ec\ua8ed\ua8ee\ua8ef\ua8f0\ua8f1\uaa70\uab5c\uab5e\ufe20\ufe21\ufe22\ufe23\ufe24\ufe25\ufe26\uf1af\u0332", | ||
"dolgo": "Model: dolgo\nInfo: Sound-Class model based on Dolgopolsky (1986)\nSource: Dolgopolsky (1986)\nCompiler: Johann-Mattis List\nDate: 2012-03", | ||
"factor": 0.3, | ||
"figsize": [ | ||
10, | ||
10 | ||
], | ||
"filename": "lingpy-2021-07-21", | ||
"gap_symbol": "-", | ||
"gap_weight": 0.5, | ||
"gop": -2, | ||
"internal_morpheme_separator": "_", | ||
"jaeger": "Model: jaeger\nInfo: Sound-Class model based on PMI scores calculated for ASJP data.\nSource: Jaeger (2015)\nCompiler: unknown\nDate: 2016-03-29", | ||
"lexstat_bad_chars_limit": 0.1, | ||
"lexstat_cluster_method": "upgma", | ||
"lexstat_limit": 10000, | ||
"lexstat_modes": [ | ||
[ | ||
"global", | ||
-2, | ||
0.5 | ||
], | ||
[ | ||
"local", | ||
-1, | ||
0.5 | ||
] | ||
], | ||
"lexstat_preprocessing_method": "sca", | ||
"lexstat_preprocessing_threshold": 0.7, | ||
"lexstat_rands": 1000, | ||
"lexstat_ratio": [ | ||
2, | ||
1 | ||
], | ||
"lexstat_runs": 1000, | ||
"lexstat_scoring_method": "shuffle", | ||
"lexstat_scoring_threshold": 0.7, | ||
"lexstat_threshold": 0.45, | ||
"lexstat_transform": { | ||
"A": "C", | ||
"B": "C", | ||
"C": "C", | ||
"L": "c", | ||
"M": "c", | ||
"N": "c", | ||
"T": "T", | ||
"X": "V", | ||
"Y": "V", | ||
"Z": "V", | ||
"_": "_" | ||
}, | ||
"lexstat_vscale": 1.0, | ||
"merge_vowels": true, | ||
"model": "Model: sca\nInfo: Extended sound class model based on Dolgopolsky (1986)\nSource: List (2012)\nCompiler: Johann-Mattis List\nDate: 2012-03", | ||
"morpheme_separator": "+", | ||
"morpheme_separators": "\u25e6+\u2192\u2190", | ||
"nasal_placeholder": "\u223c", | ||
"ref": "cogid", | ||
"restricted_chars": "_T", | ||
"sca": "Model: sca\nInfo: Extended sound class model based on Dolgopolsky (1986)\nSource: List (2012)\nCompiler: Johann-Mattis List\nDate: 2012-03", | ||
"scale": 0.5, | ||
"schema": "qlc", | ||
"scorer": {}, | ||
"sonar": true, | ||
"stress": "\u02c8\u02cc'", | ||
"timestamp": "2021-07-21 10:56", | ||
"tones": "\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079\u2070\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089\u20800123456789\u02e5\u02e6\u02e7\u02e8\u02e9\u02ea\u02eb-\ua708-\ua709-\ua70a-\ua70b-\ua70c-\ua70d-\ua70e-\ua70f-\ua710-\ua711-\ua712-\ua713-\ua714-\ua715-\ua716-\ua717-\ua718-\ua719-\ua71a-\ua700-\ua701-\ua702-\ua703-\ua704-\ua705-\ua706-\ua707", | ||
"tree_calc": "neighbor", | ||
"unique_sequences": true, | ||
"vowels": "\u1e4d\u02af\u03b5aeiouy\u00e1\u00e3\u00e6\u00ed\u00f5\u00f8\u00fa\u0129\u0131\u0153\u0169\u016b\u01d2\u01dd\u0207\u0217\u0250\u0251\u0252\u0254\u0258\u0259\u025a\u025b\u025c\u025e\u0264\u0268\u026a\u026f\u0275\u0276\u0277\u027f\u0285\u0289\u028a\u028c\u028f\u1d00\u1d07\u1d1c\u1ebd\u1ef9\u1e73", | ||
"word_separator": "_", | ||
"word_separators": "_#" | ||
} |
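A small sketch of how settings recorded in this file might be inspected with the standard `json` module. The excerpt below copies a few keys and values from the JSON above. Reading each `align_modes` entry as (mode, gap opening penalty, scale) is an assumption, suggested by the `global` entry matching the neighbouring `align_gop` and `align_scale` values. The helper `mode_settings` is hypothetical and not part of LingPy.

```python
import json

# Excerpt of lingpy-rcParams.json; values copied from the file shown above.
EXCERPT = """
{
  "align_mode": "global",
  "align_gop": -2,
  "align_scale": 0.5,
  "align_modes": [["global", -2, 0.5], ["local", -1, 0.5]],
  "lexstat_cluster_method": "upgma",
  "lexstat_threshold": 0.45
}
"""

params = json.loads(EXCERPT)


def mode_settings(params, mode):
    """Return the settings stored for one alignment mode.

    Assumes each align_modes entry is (mode, gap opening penalty, scale).
    """
    for name, gop, scale in params["align_modes"]:
        if name == mode:
            return {"gop": gop, "scale": scale}
    raise KeyError(mode)


default_mode = mode_settings(params, params["align_mode"])
```

For the full file, `json.loads(EXCERPT)` could be swapped for `json.load(open("lingpy-rcParams.json", encoding="utf-8"))`.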