Commit
update to rev34; support BCP47 language variants
Showing 11 changed files with 438 additions and 229 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
- XDXF format - Sergei Snegov, Leonid Soshinskiy [https://github.com/soshial] | ||
+ XDXF format specification - Sergei Snegov, Leonid Soshinskiy [https://github.com/soshial] | ||
  makedict (Deprecated) - Evgeniy Dushistov <[email protected]>, kubtek [https://github.com/kubtek] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
### Changelog (rev. 34) 2022-01-20 | ||
* since rev. 34 the format is purely semantic and can no longer store presentational or visual data | ||
* the language code restriction is removed: any language tag that is valid under the BCP47 standard is supported (use http://schneegans.de/lv/?tags=hy-Latn-IT-arevela for validation) | ||
* multilingual dictionaries are now supported: a dictionary may have multiple source and target languages. `<k>` and `<def>` tags may also be marked with `xml:lang` | ||
* description supports line breaks | ||
* transcription info can now be placed directly inside the `<def>` tag | ||
|
||
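The `xml:lang` marking described above can be read back with a standard XML parser. The following is a minimal Python sketch, not part of the commit: the article fragment, the language codes, and the function name `languages_by_tag` are assumptions made for illustration, since the rev. 34 schema itself is not shown here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical rev. 34 article; the exact layout is an assumption.
SAMPLE = """
<ar>
  <k xml:lang="en">home</k>
  <def>
    <def xml:lang="ru"><deftext>дом</deftext></def>
    <def xml:lang="de"><deftext>Zuhause</deftext></def>
  </def>
</ar>
"""

def languages_by_tag(article_xml):
    """Return (tag, language, text) triples for every element carrying xml:lang."""
    root = ET.fromstring(article_xml.strip())
    result = []
    for elem in root.iter():
        lang = elem.get(XML_LANG)
        if lang is not None:
            # itertext() also collects text from nested elements such as <deftext>.
            text = "".join(elem.itertext()).strip()
            result.append((elem.tag, lang, text))
    return result

for tag, lang, text in languages_by_tag(SAMPLE):
    print(tag, lang, text)
```

The sketch only reads the attribute; it does not check that a value is a well-formed BCP47 tag.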
### Changelog (rev. 33) | ||
* `<deftext>` introduced to fix multiple errors in the DTD schema | ||
* `<rref>` tag: added `lctn` and `type` attributes; links are no longer stored inside the tag | ||
* `<kref>` tag: `idref` attribute introduced | ||
* `<c>` tag: added the required hash sign `#` to the attribute value | ||
* `<categ>` tag is now a list of `<kref>` tags | ||
* `<etm>` may now contain `<mrkd>` to mark etymological ancestors/cognates | ||
* `<dtrn>` may now contain `<kref>` tag(s) | ||
* `<u>` tag introduced for underlined text | ||
* `<br/>` tag introduced for newlines inside articles | ||
* `<ex>` may now contain an `<iref>` tag | ||
* `<ex>`, `<tr>`, `<co>` tags may now have user-set attribute values |
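As an illustration of the `idref` attribute on `<kref>`, the Python sketch below resolves each reference to the headword whose `<k>` carries the matching `id`. It is a sketch under stated assumptions rather than reference code: the two-article lexicon, the id value `a1`, and the function name `resolve_krefs` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-article lexicon; ids and headwords are illustrative only.
SAMPLE = """
<lexicon>
  <ar>
    <k>home</k>
    <def>
      <deftext>One's own dwelling place.</deftext>
      <categ><kref idref="a1">Society</kref></categ>
    </def>
  </ar>
  <ar>
    <k id="a1">Society</k>
    <def><deftext>A community of people.</deftext></def>
  </ar>
</lexicon>
"""

def resolve_krefs(lexicon_xml):
    """Map each idref used by a <kref> to the headword text of the <k> with that id."""
    root = ET.fromstring(lexicon_xml.strip())
    # Index headwords by their id attribute.
    by_id = {k.get("id"): "".join(k.itertext())
             for k in root.iter("k") if k.get("id")}
    resolved = {}
    for kref in root.iter("kref"):
        idref = kref.get("idref")
        if idref is not None:
            # A value of None marks a dangling reference.
            resolved[idref] = by_id.get(idref)
    return resolved

print(resolve_krefs(SAMPLE))
```

Linking by `idref` rather than by headword text lets a reference survive a headword being renamed or duplicated.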
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
<!ELEMENT xdxf (meta_info,lexicon)> | ||
<!ATTLIST xdxf format (visual|logical) "logical"> | ||
<!ATTLIST xdxf revision CDATA #REQUIRED> | ||
|
||
<!ELEMENT meta_info (title,full_title,description,publisher?,authors?,file_ver,creation_date,last_edited_date,dict_edition?,publishing_date?,dict_src_url?,abbreviations?)> | ||
<!ELEMENT title (#PCDATA)> | ||
<!ELEMENT full_title (#PCDATA)> | ||
<!ELEMENT description (#PCDATA)> | ||
<!ELEMENT publisher (#PCDATA)> | ||
<!ELEMENT authors (author+)> | ||
<!ELEMENT author (#PCDATA)> | ||
<!ATTLIST author role CDATA #IMPLIED> | ||
<!ELEMENT file_ver (#PCDATA)> | ||
<!ELEMENT creation_date (#PCDATA)> | ||
<!ELEMENT last_edited_date (#PCDATA)> | ||
<!ELEMENT dict_edition (#PCDATA)> | ||
<!ELEMENT publishing_date (#PCDATA)> | ||
<!ELEMENT dict_src_url (#PCDATA)> | ||
<!ELEMENT abbreviations (abbr_def+)> | ||
<!ELEMENT abbr_def (abbr_k+,abbr_v)> | ||
<!ATTLIST abbr_def type (stl|grm|aux|knl|oth) #IMPLIED> | ||
<!ELEMENT abbr_k (#PCDATA)> | ||
<!ELEMENT abbr_v (#PCDATA)> | ||
|
||
<!ELEMENT lexicon (ar+)> | ||
<!ELEMENT ar (k+,def)> | ||
<!ATTLIST ar f (v|l) "l"> | ||
<!ELEMENT k (#PCDATA|opt|sup|sub)*> | ||
<!ATTLIST k id ID #IMPLIED> | ||
<!ELEMENT opt (#PCDATA|sup|sub)*> | ||
<!ENTITY % style "c|sup|sub|i|b|u"> | ||
<!ENTITY % ref "kref|rref|iref"> | ||
<!ELEMENT def (gr?,co*,(def+|deftext),ex*,sr?,etm?,categ*)> | ||
<!ELEMENT deftext (#PCDATA|tr|dtrn|abbr|co|di|%ref;|%style;|br)*> | ||
<!ATTLIST def id ID #IMPLIED> | ||
<!ATTLIST def cmt CDATA #IMPLIED> | ||
<!ATTLIST def freq CDATA #IMPLIED> | ||
<!ELEMENT sr (kref+)> | ||
<!ELEMENT etm (#PCDATA|tr|abbr|co|di|mrkd|%ref;|%style;|br)*> | ||
<!ELEMENT categ (kref+)> | ||
<!ELEMENT gr (#PCDATA|tr|abbr|co|di|%ref;|%style;)*> | ||
<!ELEMENT tr (#PCDATA)> | ||
<!ATTLIST tr format (IPA|X-SAMPA|erkIPA|CDATA) "IPA"> | ||
<!ELEMENT dtrn (#PCDATA|kref)*> | ||
<!ELEMENT kref (#PCDATA|%style;)*> | ||
<!ATTLIST kref idref IDREF #IMPLIED> | ||
<!ATTLIST kref type (syn|ant|hpr|hpn|par|spv|mer|hol|ent|rel|etm) #IMPLIED> | ||
<!ATTLIST kref kcmt CDATA #IMPLIED> | ||
<!ELEMENT rref (#PCDATA)> | ||
<!ATTLIST rref start CDATA "0"> | ||
<!ATTLIST rref size CDATA #IMPLIED> | ||
<!ATTLIST rref lctn CDATA #IMPLIED> | ||
<!ATTLIST rref type CDATA #IMPLIED> | ||
<!ELEMENT iref (#PCDATA|%style;)*> | ||
<!ATTLIST iref href CDATA #REQUIRED> | ||
<!ELEMENT abbr (#PCDATA)> | ||
<!ELEMENT ex (ex_orig+,ex_tran*,iref*)> | ||
<!ATTLIST ex type (exm|phr|prv|oth|PCDATA) "exm"> | ||
<!ATTLIST ex source CDATA #IMPLIED> | ||
<!ATTLIST ex author CDATA #IMPLIED> | ||
<!ELEMENT ex_orig (#PCDATA|mrkd|co|%ref;|%style;|br)*> | ||
<!ELEMENT ex_tran (#PCDATA|mrkd|co|%ref;|%style;|br)*> | ||
<!ELEMENT mrkd (#PCDATA|kref|%style;)*> | ||
<!ELEMENT co (#PCDATA|co|tr|abbr|di|%ref;|%style;|br)*> | ||
<!ATTLIST co type CDATA #IMPLIED> | ||
<!ELEMENT i (#PCDATA|%style;)*> | ||
<!ELEMENT b (#PCDATA|%style;)*> | ||
<!ELEMENT u (#PCDATA|%style;)*> | ||
<!ELEMENT c (#PCDATA|%style;)*> | ||
<!ATTLIST c c CDATA #IMPLIED> | ||
<!ELEMENT sup (#PCDATA)> | ||
<!ELEMENT sub (#PCDATA)> | ||
<!ELEMENT di (#PCDATA)> | ||
<!ELEMENT br EMPTY> | ||
|
||
|
||
<!ATTLIST xdxf lang_from ( | ||
AAR|ABK|ACE|ACH|ADA|ADY|AFA|AFH|AFR|AIN|AKA|AKK|ALB|ALE|ALG|ALT|AMH|ANG|APA|ARA|ARC|ARG|ARM|ARN| | ||
ARP|ART|ARW|ASM|AST|ATH|AUS|AVA|AVE|AWA|AYM|AZE|BAD|BAI|BAK|BAL|BAM|BAN|BAQ|BAS|BAT|BEJ|BEL|BEM| | ||
BEN|BER|BHO|BIH|BIK|BIN|BIS|BLA|BNT|BOS|BRA|BRE|BTK|BUA|BUG|BUL|BUR|BYN|CAD|CAI|CAR|CAT|CAU|CEB| | ||
CEL|CHA|CHB|CHE|CHG|CHI|CHK|CHM|CHN|CHO|CHP|CHR|CHU|CHV|CHY|CMC|COP|COR|COS|CPE|CPF|CPP|CRE|CRH| | ||
CRP|CSB|CUS|CZE|DAK|DAN|DAR|DAY|DEL|DEN|DGR|DIN|DIV|DOI|DRA|DSB|DUA|DUM|DUT|DYU|DZO|EFI|EGY|EKA| | ||
ELX|ENG|ENM|EPO|EST|EWE|EWO|FAN|FAO|FAT|FIJ|FIL|FIN|FIU|FON|FRE|FRM|FRO|FRY|FUL|FUR|GAA|GAY|GBA| | ||
GEM|GEO|GER|GEZ|GIL|GLA|GLE|GLG|GLV|GMH|GOH|GON|GOR|GOT|GRB|GRC|GRE|GRN|GUJ|GWI|HAI|HAT|HAU|HAW| | ||
HEB|HER|HIL|HIM|HIN|HIT|HMN|HMO|HSB|HUN|HUP|IBA|IBO|ICE|IDO|III|IJO|IKU|ILE|ILO|INA|INC|IND|INE| | ||
INH|IPK|IRA|IRO|ITA|JAV|JBO|JPN|JPR|JRB|KAA|KAB|KAC|KAL|KAM|KAN|KAR|KAS|KAU|KAW|KAZ|KBD|KHA|KHI| | ||
KHM|KHO|KIK|KIN|KIR|KMB|KOK|KOM|KON|KOR|KOS|KPE|KRC|KRO|KRU|KUA|KUM|KUR|KUT|LAD|LAH|LAM|LAO|LAT| | ||
LAV|LEZ|LIM|LIN|LIT|LOL|LOZ|LTZ|LUA|LUB|LUG|LUI|LUN|LUO|LUS|MAC|MAD|MAG|MAH|MAI|MAK|MAL|MAN|MAO| | ||
MAP|MAR|MAS|MAY|MDF|MDR|MEN|MGA|MIC|MIN|MIS|MKH|MLG|MLT|MNC|MNI|MNO|MOH|MOL|MON|MOS|MUL|MUN|MUS| | ||
MWL|MWR|MYN|MYV|NAH|NAI|NAP|NAU|NAV|NBL|NDE|NDO|NDS|NEP|NEW|NIA|NIC|NIU|NNO|NOB|NOG|NON|NOR|NSO| | ||
NUB|NWC|NYA|NYM|NYN|NYO|NZI|OCI|OJI|ORI|ORM|OSA|OSS|OTA|OTO|PAA|PAG|PAL|PAM|PAN|PAP|PAU|PEO|PER| | ||
PHI|PHN|PLI|POL|PON|POR|PRA|PRO|PUS|QAA-QUE|RAJ|RAP|RAR|ROA|ROH|ROM|RUM|RUN|RUP|RUS|SAD|SAG|SAH| | ||
SAI|SAL|SAM|SAN|SAS|SAT|SCC|SCN|SCO|SCR|SEL|SEM|SGA|SGN|SHN|SID|SIN|SIO|SIT|SLA|SLO|SLV|SMA|SME| | ||
SMI|SMJ|SMN|SMO|SMS|SNA|SND|SNK|SOG|SOM|SON|SOT|SPA|SRD|SRR|SSA|SSW|SUK|SUN|SUS|SUX|SWA|SWE|SYR| | ||
TAH|TAI|TAM|TAT|TEL|TEM|TER|TET|TGK|TGL|THA|TIB|TIG|TIR|TIV|TKL|TLH|TLI|TMH|TOG|TON|TPI|TSI|TSN| | ||
TSO|TUK|TUM|TUP|TUR|TUT|TVL|TWI|TYV|UDM|UGA|UIG|UKR|UMB|UND|URD|UZB|VAI|VEN|VIE|VOL|VOT|WAK|WAL| | ||
WAR|WAS|WEL|WEN|WLN|WOL|XAL|XHO|YAO|YAP|YID|YOR|YPK|ZAP|ZEN|ZHA|ZND|ZUL|ZUN) #REQUIRED> | ||
|
||
<!ATTLIST xdxf lang_to ( | ||
AAR|ABK|ACE|ACH|ADA|ADY|AFA|AFH|AFR|AIN|AKA|AKK|ALB|ALE|ALG|ALT|AMH|ANG|APA|ARA|ARC|ARG|ARM|ARN| | ||
ARP|ART|ARW|ASM|AST|ATH|AUS|AVA|AVE|AWA|AYM|AZE|BAD|BAI|BAK|BAL|BAM|BAN|BAQ|BAS|BAT|BEJ|BEL|BEM| | ||
BEN|BER|BHO|BIH|BIK|BIN|BIS|BLA|BNT|BOS|BRA|BRE|BTK|BUA|BUG|BUL|BUR|BYN|CAD|CAI|CAR|CAT|CAU|CEB| | ||
CEL|CHA|CHB|CHE|CHG|CHI|CHK|CHM|CHN|CHO|CHP|CHR|CHU|CHV|CHY|CMC|COP|COR|COS|CPE|CPF|CPP|CRE|CRH| | ||
CRP|CSB|CUS|CZE|DAK|DAN|DAR|DAY|DEL|DEN|DGR|DIN|DIV|DOI|DRA|DSB|DUA|DUM|DUT|DYU|DZO|EFI|EGY|EKA| | ||
ELX|ENG|ENM|EPO|EST|EWE|EWO|FAN|FAO|FAT|FIJ|FIL|FIN|FIU|FON|FRE|FRM|FRO|FRY|FUL|FUR|GAA|GAY|GBA| | ||
GEM|GEO|GER|GEZ|GIL|GLA|GLE|GLG|GLV|GMH|GOH|GON|GOR|GOT|GRB|GRC|GRE|GRN|GUJ|GWI|HAI|HAT|HAU|HAW| | ||
HEB|HER|HIL|HIM|HIN|HIT|HMN|HMO|HSB|HUN|HUP|IBA|IBO|ICE|IDO|III|IJO|IKU|ILE|ILO|INA|INC|IND|INE| | ||
INH|IPK|IRA|IRO|ITA|JAV|JBO|JPN|JPR|JRB|KAA|KAB|KAC|KAL|KAM|KAN|KAR|KAS|KAU|KAW|KAZ|KBD|KHA|KHI| | ||
KHM|KHO|KIK|KIN|KIR|KMB|KOK|KOM|KON|KOR|KOS|KPE|KRC|KRO|KRU|KUA|KUM|KUR|KUT|LAD|LAH|LAM|LAO|LAT| | ||
LAV|LEZ|LIM|LIN|LIT|LOL|LOZ|LTZ|LUA|LUB|LUG|LUI|LUN|LUO|LUS|MAC|MAD|MAG|MAH|MAI|MAK|MAL|MAN|MAO| | ||
MAP|MAR|MAS|MAY|MDF|MDR|MEN|MGA|MIC|MIN|MIS|MKH|MLG|MLT|MNC|MNI|MNO|MOH|MOL|MON|MOS|MUL|MUN|MUS| | ||
MWL|MWR|MYN|MYV|NAH|NAI|NAP|NAU|NAV|NBL|NDE|NDO|NDS|NEP|NEW|NIA|NIC|NIU|NNO|NOB|NOG|NON|NOR|NSO| | ||
NUB|NWC|NYA|NYM|NYN|NYO|NZI|OCI|OJI|ORI|ORM|OSA|OSS|OTA|OTO|PAA|PAG|PAL|PAM|PAN|PAP|PAU|PEO|PER| | ||
PHI|PHN|PLI|POL|PON|POR|PRA|PRO|PUS|QAA-QUE|RAJ|RAP|RAR|ROA|ROH|ROM|RUM|RUN|RUP|RUS|SAD|SAG|SAH| | ||
SAI|SAL|SAM|SAN|SAS|SAT|SCC|SCN|SCO|SCR|SEL|SEM|SGA|SGN|SHN|SID|SIN|SIO|SIT|SLA|SLO|SLV|SMA|SME| | ||
SMI|SMJ|SMN|SMO|SMS|SNA|SND|SNK|SOG|SOM|SON|SOT|SPA|SRD|SRR|SSA|SSW|SUK|SUN|SUS|SUX|SWA|SWE|SYR| | ||
TAH|TAI|TAM|TAT|TEL|TEM|TER|TET|TGK|TGL|THA|TIB|TIG|TIR|TIV|TKL|TLH|TLI|TMH|TOG|TON|TPI|TSI|TSN| | ||
TSO|TUK|TUM|TUP|TUR|TUT|TVL|TWI|TYV|UDM|UGA|UIG|UKR|UMB|UND|URD|UZB|VAI|VEN|VIE|VOL|VOT|WAK|WAL| | ||
WAR|WAS|WEL|WEN|WLN|WOL|XAL|XHO|YAO|YAP|YID|YOR|YPK|ZAP|ZEN|ZHA|ZND|ZUL|ZUN) #REQUIRED> |
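To make one rule of the DTD above concrete, the Python sketch below checks the content model `<!ELEMENT ar (k+,def)>` by hand: one or more `<k>` children followed by exactly one `<def>`. It covers only this single rule, and the sample lexicon and function names are hypothetical. A DTD-validating parser would be the appropriate tool for checking a whole file.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon: the first article is valid, the other two are not.
SAMPLE = """
<lexicon>
  <ar><k>disc</k><k>disk</k><def><deftext>A flat, circular plate.</deftext></def></ar>
  <ar><k>orphan</k></ar>
  <ar><def><deftext>Definition first.</deftext></def><k>late</k></ar>
</lexicon>
"""

def article_matches_model(ar):
    """Check the children of <ar> against the content model (k+,def)."""
    tags = [child.tag for child in ar]
    # At least one <k>, then exactly one <def>, and nothing else.
    return (len(tags) >= 2
            and tags[-1] == "def"
            and all(tag == "k" for tag in tags[:-1]))

def invalid_articles(lexicon_xml):
    """Return the zero-based positions of articles that violate (k+,def)."""
    root = ET.fromstring(lexicon_xml.strip())
    return [i for i, ar in enumerate(root.iter("ar"))
            if not article_matches_model(ar)]

print(invalid_articles(SAMPLE))
```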
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
<?xml version="1.0" encoding="UTF-8" ?> | ||
<!DOCTYPE xdxf SYSTEM "xdxf_old_schema_rev33.dtd"> | ||
<xdxf lang_from="ENG" lang_to="ENG" format="logical" revision="033"> | ||
<meta_info> | ||
<title>Webster's Dictionary</title> | ||
<full_title>Webster's Unabridged Dictionary</full_title> | ||
<description>Webster's Unabridged Dictionary published 1913 by the Webster Institute</description> | ||
<file_ver>001</file_ver> | ||
<creation_date>07-04-2013</creation_date> | ||
<last_edited_date>13-10-2017</last_edited_date> | ||
<abbreviations> | ||
<abbr_def><abbr_k>n.</abbr_k><abbr_v>noun</abbr_v></abbr_def> | ||
<abbr_def><abbr_k>v.</abbr_k><abbr_v>verb</abbr_v></abbr_def> | ||
<abbr_def><abbr_k>Av.</abbr_k><abbr_k>Ave.</abbr_k><abbr_v>Avenue</abbr_v> </abbr_def> | ||
</abbreviations> | ||
</meta_info> | ||
<lexicon> | ||
<ar> | ||
<k>home</k> | ||
<def> | ||
<gr><tr>'həum</tr><abbr>n.</abbr> <rref start="16384" size="512" lctn="sounds_of_words.ogg"/></gr> | ||
<co>XDXF <iref href="http://xdxf.sourceforge.net"><b>Home</b> page</iref></co> | ||
<def><deftext>One's own dwelling place; the house in which one lives.</deftext></def> | ||
<def><deftext>One's native land; the place or country in which one dwells.</deftext></def> | ||
<def> | ||
<deftext>The abiding place of the affections.</deftext> | ||
<ex><ex_orig>For without hearts there is no home.</ex_orig></ex> | ||
</def> | ||
<def> | ||
<deftext> | ||
<dtrn>дом</dtrn>, at home - дома, у себя; | ||
</deftext> | ||
<ex> | ||
<ex_orig>make yourself at <mrkd>home</mrkd></ex_orig> | ||
<ex_tran>будьте как <mrkd>дома</mrkd></ex_tran> | ||
</ex> | ||
<categ><kref idref="fb982hk">Society</kref></categ> | ||
</def> | ||
<sr><kref type="rel">home-made</kref></sr> | ||
</def> | ||
</ar> | ||
<ar f="v"> | ||
<k id="fb982hk">Society</k> | ||
<def> | ||
<deftext>Plural form of word <kref>index</kref>.</deftext> | ||
</def> | ||
</ar> | ||
<ar> | ||
<k>disc</k> | ||
<k>disk</k> | ||
<def> | ||
<gr><abbr>n.</abbr></gr> | ||
<deftext>A flat, circular plate; as, a disk of metal or paper.</deftext> | ||
</def> | ||
</ar> | ||
<ar> | ||
<k>CO<sub>2</sub></k> | ||
<def> | ||
<deftext>Carbon dioxide (CO<sub>2</sub>) - a heavy odorless gas formed during respiration.</deftext> | ||
</def> | ||
</ar> | ||
</lexicon> | ||
</xdxf> |
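A consumer of a file like the sample above would typically build a headword index and an abbreviation table. The Python sketch below shows one possible way to do that with the standard library. The embedded document is an abridged copy of the sample (its `<meta_info>` omits elements the DTD requires), and the function names `load_abbreviations` and `headword_index` are assumptions.

```python
import xml.etree.ElementTree as ET

# Abridged from the sample above; not a complete, DTD-valid file.
SAMPLE = """
<xdxf lang_from="ENG" lang_to="ENG" format="logical" revision="033">
  <meta_info>
    <abbreviations>
      <abbr_def><abbr_k>n.</abbr_k><abbr_v>noun</abbr_v></abbr_def>
      <abbr_def><abbr_k>Av.</abbr_k><abbr_k>Ave.</abbr_k><abbr_v>Avenue</abbr_v></abbr_def>
    </abbreviations>
  </meta_info>
  <lexicon>
    <ar>
      <k>disc</k>
      <k>disk</k>
      <def><gr><abbr>n.</abbr></gr><deftext>A flat, circular plate.</deftext></def>
    </ar>
    <ar>
      <k>CO<sub>2</sub></k>
      <def><deftext>Carbon dioxide.</deftext></def>
    </ar>
  </lexicon>
</xdxf>
"""

def load_abbreviations(root):
    """Map every <abbr_k> spelling to the <abbr_v> expansion of its <abbr_def>."""
    table = {}
    for abbr_def in root.iter("abbr_def"):
        value = abbr_def.findtext("abbr_v")
        for key in abbr_def.findall("abbr_k"):
            table[key.text] = value
    return table

def headword_index(root):
    """Map each headword to its <ar> element; an article may have several <k>."""
    index = {}
    for ar in root.iter("ar"):
        for k in ar.findall("k"):
            # itertext() flattens inline markup such as <sub> into plain text.
            index["".join(k.itertext())] = ar
    return index

root = ET.fromstring(SAMPLE.strip())
abbreviations = load_abbreviations(root)
index = headword_index(root)
print(sorted(index))
print(abbreviations)
```

Flattening `<k>CO<sub>2</sub></k>` to `CO2` is a design choice of this sketch. A renderer that has to preserve the subscript would keep the element tree instead.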