---
layout: default
title: childes-db Releases
---
<div id="wrapper">
<div class="container">
<h2>
Releases
</h2>
<div class="hero-unit">
<div class="row">
<p><code>childes-db</code> is updated every year with the most recent parse of the CHILDES database to reflect new corpora and revisions to existing annotations (more details can be found in the <a href="docs.html">documentation</a>). These releases may also include new features (e.g., 2020.1 introduces phonological transcriptions for datasets in PhonBank). Visualizations and the <code>childesr</code> package always use the most recent database version by default. The R API <code>childesr</code> and the Python API <code>childespy</code> can be directed to use any recent database version via the <code>version</code> parameter.</p>
</div>
<center>
<div class="row">
<table style="width:60%">
<tr>
<td><h4 style="color:red;">2018.1 </h4><a name="2018.1"></a></td>
<td> </td>
<td>Initial release</td>
</tr>
<tr>
<td><h4 style="color:red;">2019.1 </h4><a name="2019.1"></a></td>
<td> </td>
<td>Re-parsed to reflect 2019 changes in CHILDES. Note that this release excludes key datasets, such as Providence, that were moved to PhonBank.</td>
</tr>
<tr>
<td><h4 style="color:red;">2020.1 </h4><a name="2020.1"></a></td>
<td> </td>
<td>Re-parsed to reflect 2020 changes in CHILDES, as well as the 2020 version of PhonBank.</td>
</tr>
<tr>
<td><h4 style="color:red;">2021.1 </h4><a name="2021.1"></a></td>
<td> </td>
<td>Re-parsed to reflect 2021 changes in CHILDES and PhonBank, using a new version of the corpus processing code and a better set of tests.</td>
</tr>
</table>
</div>
</center>
</div>
<h2>Using childes-db Locally</h2>
<div class="hero-unit">
<div class="row">
<p>
For intensive use cases, e.g., repeatedly transferring more than 5 GB of data, users may wish to download one or more
yearly releases of the database for installation on a local MySQL server (either on their own machine or on a machine on their local network). The release databases can be downloaded with the <code>mysqldump</code> command:
</p>
<p><code>mysqldump -v -u $USER_FROM_JSON -p$PASSWORD_FROM_JSON -h $HOST_FROM_JSON --single-transaction --no-tablespaces -C --quick --databases $DATABASES | mysql -u $LOCAL_USER -p$LOCAL_PASSWORD</code></p>
Depending on your <code>mysqldump</code> client version, you might have to add the <code>--column-statistics=0</code> option.
<p>
The first part of this command (<code>mysqldump</code>) outputs the content of the database as a text stream of SQL statements. The second part (<code>mysql</code>) reads that stream into the end user's local MySQL server. Each yearly release is around 40 GB in size.
Replace the variables above (such as <code>$HOST_FROM_JSON</code>) with the correct values from <a href="childes-db.json">childes-db.json</a>, the JSON file that the R and Python APIs use to coordinate and authorize MySQL access. The correspondence between variables is as follows:
</p>
</div>
<center>
<div class="row">
<table style="width:60%">
<tr>
<td>$HOST_FROM_JSON</td><td>"host" field in JSON</td>
</tr>
<tr>
<td>$USER_FROM_JSON</td><td>"user" field in JSON</td>
</tr>
<tr>
<td>$PASSWORD_FROM_JSON</td><td>"password" field in JSON</td>
</tr>
<tr>
<td>$DATABASES</td><td>{2020.1, 2019.1, 2018.1}</td>
</tr>
<tr>
<td>$LOCAL_USER</td><td>Local MySQL user (possibly <i>root</i>)</td>
</tr>
<tr>
<td>$LOCAL_PASSWORD</td><td>Local MySQL password for user </td>
</tr>
</table>
<br /><br />
</div>
</center>
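<div class="row">
<p>As a convenience, the substitution described in the table above can be scripted. The following is a minimal Python sketch, not part of <code>childesr</code> or <code>childespy</code>: the function name <code>build_dump_command</code> is hypothetical, and it assumes the JSON file has been parsed into a flat mapping with the "host", "user", and "password" fields listed above. The values in the example call are placeholders, not real credentials.</p>

```python
import shlex


def build_dump_command(config, databases, local_user, local_password,
                       column_statistics_off=False):
    """Return the mysqldump-to-mysql pipeline as one shell command string.

    config is assumed to be a dict holding the "host", "user", and
    "password" fields read from childes-db.json (an assumption about
    the file layout, based on the table above).
    """
    dump = [
        "mysqldump", "-v",
        "-u", config["user"],
        "-p" + config["password"],
        "-h", config["host"],
        "--single-transaction", "--no-tablespaces", "-C", "--quick",
    ]
    if column_statistics_off:
        # Only needed for some client versions, as noted above.
        dump.append("--column-statistics=0")
    dump += ["--databases"] + list(databases)

    load = ["mysql", "-u", local_user, "-p" + local_password]

    # Quote every argument so unusual characters survive the shell.
    return " | ".join(
        " ".join(shlex.quote(part) for part in cmd) for cmd in (dump, load)
    )


# Placeholder values for illustration only.
example_config = {"host": "db.example.org", "user": "reader", "password": "secret"}
print(build_dump_command(example_config, ["2020.1"], "root", "local-password"))
```

<p>The printed string can be pasted into a terminal. Building the command as a string, rather than running it directly, makes it easy to inspect before starting a transfer of roughly 40 GB per release.</p>
</div>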
<div class="row">
<p>Once you have a local MySQL installation, refer to the documentation for <code>childesr</code> or <code>childespy</code> for instructions on using a local database server. For most use cases, using the API with the default remote server (hosted on Amazon EC2) should be sufficient.</p>
</div>
</div>
<!--end: Row-->
</div>
<!--end: Container-->
</div>
{% include footer.html %}