Commit

hennyu committed Feb 23, 2024
2 parents 4e6e68c + 7ac0bd6 commit c47b764
Showing 21 changed files with 474 additions and 303 deletions.
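
Most of the rendered changes in this commit normalize software annotations: `rs` elements that used `target` now use `ref`, `ref` values gain their leading `#`, `ptr` targets gain their leading `#`, and a reused `xml:id` is replaced. The following is a minimal sketch of a check that could flag such problems before committing. It is not part of the repository; the function name `check_software_refs` and the exact rules are assumptions inferred from the diffs on this page.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def check_software_refs(xml_text):
    """Return a list of problem descriptions (hypothetical rules, see lead-in)."""
    root = ET.fromstring(xml_text)
    problems = []
    ids = set()

    # First pass: collect xml:id values of <ptr type="software"> elements.
    for el in root.iter():
        if local_name(el.tag) == "ptr" and el.get("type") == "software":
            xml_id = el.get(XML_ID)
            if xml_id in ids:
                problems.append(f"duplicate xml:id {xml_id!r}")
            ids.add(xml_id)
            target = el.get("target", "")
            if not target.startswith("#"):
                problems.append(f"ptr {xml_id!r}: target {target!r} lacks '#'")

    # Second pass: every <rs type="soft.*"> should point to a ptr via ref="#ID".
    for el in root.iter():
        if local_name(el.tag) == "rs" and el.get("type", "").startswith("soft."):
            if "target" in el.attrib:
                problems.append(f"rs {el.get('type')!r}: uses target instead of ref")
            ref = el.get("ref")
            if ref is None:
                continue
            if not ref.startswith("#"):
                problems.append(f"rs {el.get('type')!r}: ref {ref!r} lacks '#'")
            elif ref[1:] not in ids:
                problems.append(f"rs {el.get('type')!r}: ref {ref!r} matches no ptr")

    return problems
```

Called on the text of one source file, the function returns an empty list when none of these patterns occur. It does not validate `type` values such as `soft.url`, so misspellings like the ones corrected in this commit would need a separate check.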
257 changes: 142 additions & 115 deletions data/JTEI/12_2019-20/jtei-cc-ra-flanders-176-source.xml

Large diffs are not rendered by default.

155 changes: 95 additions & 60 deletions data/JTEI/13_2020-22/jtei-cc-pn-kuhry-188-source.xml

Large diffs are not rendered by default.

47 changes: 26 additions & 21 deletions data/JTEI/13_2020-22/jtei-cc-ra-parisse-182-source.xml
@@ -234,16 +234,16 @@
<p>Many software packages dedicated to editing spoken language transcription contain
utilities that can convert many formats: for example, <ptr type="software" xml:id="R15"
target="#exmaralda"/>
<rs type="soft.name" ref="#R15">EXMARaLDA</rs> ( <rs type="soft.Bib.Ref" target="#R15"
<rs type="soft.name" ref="#R15">EXMARaLDA</rs> ( <rs type="soft.bib.ref" ref="#R15"
><ref type="bibl" target="#schmidt2004">Schmidt 2004</ref>
</rs>; see <rs type="soft.url" target="#R15"><ptr target="https://exmaralda.org"
</rs>; see <rs type="soft.url" ref="#R15"><ptr target="https://exmaralda.org"
/></rs>), <ptr type="software" xml:id="R16" target="#anvil"/>
<rs type="soft.name" ref="#R16">Anvil</rs> (<rs type="soft.Bib.Ref" target="#R16">
<rs type="soft.name" ref="#R16">Anvil</rs> (<rs type="soft.bib.ref" ref="#R16">
<ref type="bibl" target="#kipp2001">Kipp 2001</ref></rs>; see <rs type="soft.url"
target="#R16"><ptr target="https://www.anvil-software.org"/></rs>), and <ptr
ref="#R16"><ptr target="https://www.anvil-software.org"/></rs>), and <ptr
type="software" xml:id="R17" target="#elan"/><rs type="soft.name" ref="#R17">ELAN</rs>
(<rs type="soft.bib.ref" target="#R17"><ref type="bibl" target="#wittenburg2006"
>Wittenburg et al. 2006</ref></rs>; see <rs type="soft.url" target="#R17">
(<rs type="soft.bib.ref" ref="#R17"><ref type="bibl" target="#wittenburg2006"
>Wittenburg et al. 2006</ref></rs>; see <rs type="soft.url" ref="#R17">
<ptr target="https://archive.mpi.nl/tla/elan"/></rs>). However, in all cases, the
conversions are limited to the features implemented in the tool itself—for example, with
a limited set of metadata—and they cannot always be used to prepare data to be used by
@@ -260,9 +260,9 @@
tools missing in the <ptr type="software" xml:id="R18" target="#teicorpo"/>
<rs type="soft.name" ref="#R18">TEICORPO</rs> approach are <ptr type="software"
xml:id="R19" target="#exmaralda"/><rs type="soft.name" ref="#R19">EXMARaLDA</rs> and
<ptr type="software" xml:id="R19" target="#folker"/>FOLKER (<rs type="soft.bib.ref"
target="#R19"><ref type="bibl" target="#schmidts2010">Schmidt and Schütte
2010</ref></rs>; see <rs type="soft.url" target="#R19"><ptr
<ptr type="software" xml:id="R241" target="#folker"/>FOLKER (<rs type="soft.bib.ref"
ref="#R19"><ref type="bibl" target="#schmidts2010">Schmidt and Schütte
2010</ref></rs>; see <rs type="soft.url" ref="#R241"><ptr
target="https://exmaralda.org/en/folker-en/"/></rs>), but this was only because the
conversion tools from and to <ptr type="software" xml:id="R20" target="#EXMARaLDA"/><rs
type="soft.name" ref="#R20">EXMARaLDA</rs>, <ptr type="software" xml:id="R21"
@@ -279,22 +279,22 @@
<ptr type="software" xml:id="R25" target="#folker"/>
<rs type="soft.name" ref="#R25">FOLKER</rs> software fit within the process chain of
<ptr type="software" xml:id="R26" target="#teicorpo"/><rs type="soft.name"
target="#R26"> TEICORPO</rs>. This demonstrates the usefulness of a well-known and
ref="#R26"> TEICORPO</rs>. This demonstrates the usefulness of a well-known and
efficient format such as TEI.</p>
<p>There are, however, differences between the two projects that make them nonredundant
but complementary, each project having specificities that can be useful or damaging
depending on the user’s needs. One minor difference is that the <ptr type="software"
xml:id="R27" ref="#teicorpo"/>
<rs type="soft.name" target="#R27">TEICORPO</rs> project is not a functionality of an
xml:id="R27" target="#teicorpo"/>
<rs type="soft.name" ref="#R27">TEICORPO</rs> project is not a functionality of an
editing tool, but is a standalone tool for converting data between one format and
another. This had certain effects on the user interface and explains some of the choices
made in the development of the two tools.</p>
<p>There are two major differences between <ptr type="software" xml:id="R28"
target="#teicorpo"/>
<rs type="soft.name" target="#R28">TEICORPO</rs> and Schmidt’s approach, which affected
<rs type="soft.name" ref="#R28">TEICORPO</rs> and Schmidt’s approach, which affected
both the design of the tools and how they can be used. The first difference is that in
developing <ptr type="software" xml:id="R29" ref="#teicorpo"/><rs type="soft.name"
target="#R29">TEICORPO</rs>, it was decided that the conversion between the original
developing <ptr type="software" xml:id="R29" target="#teicorpo"/><rs type="soft.name"
ref="#R29">TEICORPO</rs>, it was decided that the conversion between the original
formats and TEI had to be lossless (or as lossless as possible) because we wanted to
offer a means to store the research data for long-term conservation and dissemination in
a standard XML format instead of in proprietary formats such as those used by <ptr
@@ -1004,7 +1004,7 @@
<rs type="soft.name" ref="#R117">TEICONVERT</rs> makes spoken language data available
for <ptr type="software" xml:id="R118" target="#txm"/><rs type="soft.name" ref="#R118"
>TXM</rs> (<rs type="soft.bib.ref" ref="#R118"><ref type="bibl" target="#heiden2010"
>Heiden 2010</ref></rs>; see <rs type="soft.turl" ref="#R118"><ptr
>Heiden 2010</ref></rs>; see <rs type="soft.url" ref="#R118"><ptr
target="http://textometrie.ens-lyon.fr"/></rs>), <ptr type="software" xml:id="R119"
target="#letrameur"/>
<rs type="soft.name" ref="#R119">Le Trameur</rs> (<rs type="soft.bib.ref" ref="#R119"
@@ -1149,11 +1149,13 @@
<rs type="soft.name" ref="#R144">TEICORPO</rs> includes the ability to use any
syntactic model. For French data, we used the PERCEO model (<ref type="bibl"
target="#benzitoun2012">Benzitoun, Fort, and Sagot 2012</ref>).</p>
<p>The command line to be used is: <code>java -cp <ptr type="software" xml:id="R208"
<p>The command line to be used is: <ptr type="software" xml:id="R240" target="#java"/><code>
<rs type="soft.name" ref="#R240">java</rs> -cp <ptr type="software" xml:id="R208"
target="#teicorpo"/>
<rs type="soft.name" ref="#R208">TEICORPO</rs>.jar fr.ortolang.<ptr type="software"
xml:id="R209" target="#teicorpo"/>
<rs type="soft.name" ref="#R209">TEICORPO</rs>.TeiTreeTagger filenames...</code>
<rs type="soft.name" ref="#R209">TEICORPO</rs>.<ptr type="software" xml:id="R239" target="#treetagger"/>
Tei <rs type="soft.name" ref="#R239">TreeTagger</rs> filenames...</code>
with additional parameters:</p>
<table xml:id="table2">
<row role="label">
@@ -1329,7 +1331,10 @@
<rs type="soft.name" ref="#R153">TreeTagger</rs> . The -model and -syntaxformat
parameters can be used in a similar way to specify the grammatical model to be used
and the output format. A command line example is:</p>
<p><code>java -cp "teicorpo.jar:directory_for_SNLP/*" fr.ortolang.teicorpo.TeiSNLP
<p><code><ptr type="software" xml:id="R236" target="#java"/>
<rs type="soft.name" ref="#R236">java</rs> -cp "<ptr type="software" xml:id="R237" target="#teicorpo"/>
<rs type="soft.name" ref="#R237">teicorpo</rs>.jar:directory_for_SNLP/*" fr.ortolang.<ptr type="software" xml:id="R238" target="#teicorpo"/>
<rs type="soft.name" ref="#R238">teicorpo</rs>.TeiSNLP
-syntaxformat svalue -model filename.tei_corpo.xml</code></p>
<p>The <term>directory_for_SNLP</term> is the name of the location on a computer where
all the <ptr type="software" xml:id="R212" target="#stanfordcorenlp"/>
@@ -1392,7 +1397,7 @@
<p>Export can be done from TEI into a format used by textometric software (see <ptr
target="#example_code_11" type="crossref"/>). This is the case for <ptr
type="software" xml:id="R160" target="#txm"/><rs type="soft.name" ref="#R160">TXM</rs>,<note>
<p>See the Textométrie website, last updated June 29, 2020, <rs type="soft.ulr" ref="#R160"
<p>See the Textométrie website, last updated June 29, 2020, <rs type="soft.url" ref="#R160"
><ptr target="http://textometrie.ens-lyon.fr/?lang=en"/></rs>.</p>
</note> a textometric software application. In this case, instead of using a partition
representation, the information from the grammatical analysis is inserted at the word
@@ -1591,7 +1596,7 @@
target="https://www.fon.hum.uva.nl/paul/papers/speakUnspeakPraat_glot2001.pdf"
/>.</bibl>
</rs>
<ptr type="software" xml:id="R226" target="#teimata"/>
<ptr type="software" xml:id="R226" target="#teimeta"/>
<rs type="soft.bib.ref" ref="#R226">
<bibl xml:id="etienne"><rs type="soft.agent" ref="#R226"><author>Etienne,
Carole</author></rs>, <rs type="soft.agent" ref="#R226"><author>Loïc
8 changes: 4 additions & 4 deletions data/JTEI/13_2020-22/jtei-cc-ra-winslow-186-source.xml
@@ -530,8 +530,8 @@
favoring one period or type of document over another, a generic element seems both
desirable and advisable. The proposed element, as implemented in the TEI_CEI
ODD,<note>See the <ref xml:id="ref13" type="bibl" target="#winslowetal2019"
>CEI2TEI <ptr type="software" xml:id="GitHub" target="#GitHub"/><rs
type="soft.name" ref="#GitHub">GitHub</rs> repository, accessed June 25,
>CEI2TEI <ptr type="software" xml:id="R1" target="#github"/><rs
type="soft.name" ref="#R1">GitHub</rs> repository, accessed June 25,
2021</ref>, <ptr
target="https://github.com/GVogeler/CEI2TEI/blob/master/tei_cei.odd"/>.</note>
is simple (<ident>attList</ident> items suppressed for brevity: they follow the
@@ -576,8 +576,8 @@
proposed vocabulary, provided in SKOS (Simple Knowledge Organization System) format
(as part of the project’s <ref
target="https://github.com/GVogeler/CEI2TEI/blob/master/Authentication/authen.skos.ttl">
<ptr type="software" xml:id="GitHub" target="#GitHub"/><rs type="soft.name"
ref="#GitHub">GitHub</rs> repository</ref>,<note>Accessed July 13, 2021, <ptr
<ptr type="software" xml:id="R2" target="#github"/><rs type="soft.name"
ref="#R2">GitHub</rs> repository</ref>,<note>Accessed July 13, 2021, <ptr
target="https://github.com/GVogeler/CEI2TEI/blob/master/Authentication/authen.skos.ttl"
/>.</note> at <ptr
target="https://github.com/GVogeler/CEI2TEI/blob/master/Authentication/authen.skos.ttl"
14 changes: 7 additions & 7 deletions data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml
@@ -261,11 +261,11 @@
text are presented in such a way that the reader is granted more informed access
to them.</p>
<p>The edition will be published online using a specifically tailored version of <ptr
type="software" xml:id="R29" target="#evt"/><rs type="soft.name" ref="R29"
type="software" xml:id="R29" target="#evt"/><rs type="soft.name" ref="#R29"
>EVT</rs> (Edition Visualization Technology<note><quote source="#quote1">A
light-weight, open source tool specifically designed to create digital
editions from XML-encoded texts</quote>
<rs type="soft.bib" ref="R20">(<ref type="bibl" target="#delturco2013"
<rs type="soft.bib" ref="#R20">(<ref type="bibl" target="#delturco2013"
xml:id="quote1">Rosselli Del Turco et al. 2013</ref>)</rs>.</note>) and
will present, on the one hand, each witness in its continuum from facsimile to
multiple levels of normalization and, on the other hand, the three main witnesses
@@ -790,7 +790,7 @@
with no manual intervention on the resulting files.</item>
<item>The generated editions files will conform to the TEI subset understood by
<ptr type="software" xml:id="R30" target="#evt"/><rs type="soft.name"
ref="R30">EVT</rs>.</item>
ref="#R30">EVT</rs>.</item>
</list>
<p>Some of these desiderata clash with each other. For instance, the desire to
directly edit the XML file makes it hard and error-prone to keep in a single
@@ -855,7 +855,7 @@
target="#delturcond">Roberto Rosselli Del Turco (n.d.)</ref>: here two
levels of edition are offered, a diplomatic and a more interpretative one. The
user can compare the two editions visualizing them synoptically in the <ptr
type="software" xml:id="R31" target="#evt"/><rs type="soft.name" ref="R31"
type="software" xml:id="R31" target="#evt"/><rs type="soft.name" ref="#R31"
>EVT</rs> software used for the edition.</p>
</div>
</div>
@@ -1474,10 +1474,10 @@
version</biblScope>. Accessed <date>October 22, 2021</date>. <ptr
target="http://vbd.humnet.unipi.it/beta2/"/>.</bibl>
<bibl xml:id="delturco2013"><ptr type="software" xml:id="R28" target="#evt"
/><author><rs type="soft.agent" ref="R28">Rosselli Del Turco,
Roberto</rs></author>, et al. <rs type="soft.bib" ref="R28">
/><author><rs type="soft.agent" ref="#R28">Rosselli Del Turco,
Roberto</rs></author>, et al. <rs type="soft.bib" ref="#R28">
<date>2013</date>. <title level="m">Edition Visualization Technology</title>.
</rs> Accessed <date>April 19, 2021</date>.<rs type="soft.url" ref="R28"><ptr
</rs> Accessed <date>April 19, 2021</date>.<rs type="soft.url" ref="#R28"><ptr
target="http://evt.labcd.unipi.it/"/></rs>.</bibl>
<bibl xml:id="stella2020"><editor>Stella, Francesco</editor>, ed. <date>2020</date>.
<title level="m">Corpus Rhythmorum Musicum.</title> Last modified <date>July
11 changes: 6 additions & 5 deletions data/JTEI/14_2021-23/jtei-cc-pn-erjavec-195-source.xml
@@ -326,9 +326,9 @@
<head>Presentation of Parla-CLARIN</head>
<p>Like the TEI Guidelines, the Parla-CLARIN recommendations are available on <ref
target="https://github.com/clarin-eric/parla-clarin/"><ptr type="software" xml:id="R1"
target="#GitHub"/><rs type="soft.name" ref="#R1">GitHub</rs></ref>, as a
target="#github"/><rs type="soft.name" ref="#R1">GitHub</rs></ref>, as a
project<note>Tomaž Erjavec and Andrej Pančur, Parla-CLARIN project <ptr
type="software" xml:id="R2" target="#GitHub"/><rs type="soft.name" ref="#R2"
type="software" xml:id="R2" target="#github"/><rs type="soft.name" ref="#R2"
>GitHub</rs> site, last updated March 17, 2021, <ptr type="software" xml:id="R9"
target="#parlaclarinscripts"/><rs type="soft.url" ref="#R9"><ptr
target="https://github.com/clarin-eric/parla-clarin/"/></rs>.</note> of the CLARIN
@@ -581,7 +581,7 @@
and into developing the <ptr type="software" xml:id="R12" target="#parlaclarinscripts"
/><rs type="soft.name" ref="#R12">conversion from Akoma Ntoso to Parla-CLARIN</rs>. We
have not included examples of the encoding, as these are readily available on the <ptr
type="software" xml:id="R3" target="#GitHub"/><rs type="soft.name" ref="#R3">GitHub</rs>
type="software" xml:id="R3" target="#github"/><rs type="soft.name" ref="#R3">GitHub</rs>
documentation page of the project, and large Parla-CLARIN encoded corpora are openly
available.</p>
<p>Apart from the siParl 2.0 corpus mentioned above (<ptr type="crossref"
@@ -632,7 +632,7 @@
specification from the default ones in the TEI Guidelines to ones taken or adapted from
the collected parliamentary corpora.</p>
<p>Second, as we have already done for ParlaMint, we plan to add to the <ptr type="software"
xml:id="R4" target="#GitHub"/><rs type="soft.name" ref="#R4">GitHub</rs> Parla-CLARIN
xml:id="R4" target="#github"/><rs type="soft.name" ref="#R4">GitHub</rs> Parla-CLARIN
project more down-conversion scripts with which we would increase the usability of the
Parla-CLARIN corpora. As mentioned, work also needs to be done to develop a conversion to
RDF.</p>
@@ -803,7 +803,8 @@
<bibl xml:id="kilgarriff14"><author>Kilgarriff, Adam</author>, <author>Vít Baisa</author>,
<author>Jan Bušta</author>, <author>Miloš Jakubíček</author>, <author>Vojtěch
Kovář</author>, <author>Jan Michelfeit</author>, <author>Pavel Rychlý</author>, and
<author>Vít Suchomel</author>. <rs type="soft.bib.ref" ref="ewfew"><date>2014</date>.
<author>Vít Suchomel</author>. <ptr type="software" xml:id="R30"
target="#sketchengine"/><rs type="soft.name soft.bib.ref" ref="#R30"><date>2014</date>.
<title level="a">The Sketch Engine: Ten Years On.</title></rs>
<title level="j">Lexicography: Journal of ASIALEX</title>
<biblScope unit="volume">1</biblScope> (<biblScope unit="issue">1</biblScope>):
4 changes: 2 additions & 2 deletions data/JTEI/8_2014-15/jtei-8-boschetti-source.xml
@@ -240,7 +240,7 @@
</list>. The continuous integration and release are supported by open source Integrated
Development Environments (IDEs) like <ptr type="software" xml:id="R10" target="#eclipse"/>
<rs type="soft.name" ref="#R10">Eclipse</rs> or <ptr type="software" xml:id="R11"
target="netbeans"/><rs type="soft.name" ref="#R11"> NetBeans</rs> and by a software
target="#netbeans"/> <rs type="soft.name" ref="#R11">NetBeans</rs> and by a software
configuration management tool such as <ptr type="software" xml:id="R13" target="#svn"/>
<rs type="soft.name" ref="#R13">SVN</rs> or <ptr type="software" xml:id="R12"
target="#git"/>
@@ -722,7 +722,7 @@
Environment: Metadata, Vocabularies and Techniques in the Digital Humanities</title>,
article no. 11. <pubPlace>New York</pubPlace>: <publisher>ACM</publisher>. doi:<idno
type="doi">10.1145/2517978.2517990</idno>.</bibl>
<ptr type="software" xml:id="R39" target="g2a"/>
<ptr type="software" xml:id="R39" target="#g2a"/>
<rs type="soft.ref.bib" ref="#R39">
<bibl xml:id="bozzi13"><author><rs type="soft.agent" ref="#R39">Bozzi,
Andrea</rs></author>. <date>2013</date>. <title level="a">G2A: A Web Application to