diff --git a/AboutGiellaLT.md b/AboutGiellaLT.md index 034f5378..24e8b406 100644 --- a/AboutGiellaLT.md +++ b/AboutGiellaLT.md @@ -6,33 +6,25 @@ It is an open source website providing analysers and tools for [a wide range of languages](LanguageModels.html), as well as [a ready-made setup](infra/HowToAddANewLanguage.md) for adding more languages. - - - -# The possibility to make computer tools for your language - +## The possibility to make computer tools for your language Computer tools supported by our infrastructure include: - - linguistic analysers (morphology, syntax) - spell checkers and grammar checkers - morphologically enabled e-dictionaries - machine translation - -# ... by using the following linguistic technology +## ... by using the following linguistic technology We write our morphologies as [finite state transducers](https://en.wikipedia.org/wiki/Finite_state_transducer) -in the formalisms *lexc*, *twolc* and *xfst rewrite rules*, and compile them into computer programs for language analysis with the compilers [xfst](http://fsmbook.com), +in the formalisms _lexc_, _twolc_ and _xfst rewrite rules_, and compile them into computer programs for language analysis with the compilers [xfst](http://fsmbook.com), [hfst](http://www.ling.helsinki.fi/kieliteknologia/tutkimus/hfst/) or [foma](https://github.com/mhulden/foma). Our syntaxes we write in [constraint grammar](https://en.wikipedia.org/wiki/Constraint_grammar), and we compile our constraint grammars with [vislcg3](http://beta.visl.sdu.dk/cg3.html). The installation of these compilers is documented on the [Getting Started](infra/GettingStarted.html) page. - -# Source code, licensing and cooperation - +## Source code, licensing and cooperation All our resources, infrastructure and linguistic content alike, are available under dual licenses, CC-by-SA and GPL. You may thus take whatever resource you find useful with you and go, as long as you refer to us when you use it. 
diff --git a/CorpusResources.md b/CorpusResources.md index 8bd10238..2ca9f78b 100644 --- a/CorpusResources.md +++ b/CorpusResources.md @@ -1,77 +1,77 @@ # Corpus Resources ![Warning](images/Warning.svg) -__*Under construction.*__ +**_Under construction._** This page contains a dynamically built list of all corpus repositories. Private repositories are not listed. -# Overview +## Overview {% assign lang_repos = site.github.public_repositories|jsonify %}
-# Grouped according to geography +## Grouped according to geography -## Languages of the Nordic countries +### Languages of the Nordic countries
-## Languages of Russia +### Languages of Russia
-## Other European languages +### Other European languages
-## Languages in North America +### Languages in North America
-## Languages in Africa +### Languages in Africa
-## Languages in other parts of the world +### Languages in other parts of the world
-## Languages with no geography tag +### Languages with no geography tag
-# Grouped according to language family +## Grouped according to language family -## Uralic Languages +### Uralic Languages
-## Eskimo-Aleut Languages +### Eskimo-Aleut Languages
-## Algic Languages +### Algic Languages
-## Indoeuropean languages +### Indoeuropean languages
-## Niger-Congo Languages +### Niger-Congo Languages
-## Turkic Languages +### Turkic Languages
-## Languages of other language families, isolates, artificial languages +### Languages of other language families, isolates, artificial languages
-## Languages with no language family tag +### Languages with no language family tag
diff --git a/DocumentationGuide.md b/DocumentationGuide.md index 34835f80..0add34f2 100644 --- a/DocumentationGuide.md +++ b/DocumentationGuide.md @@ -3,12 +3,12 @@ The documentation is organised as follows: - language specific documentation is organised in separate subdomains, links to each language can be found as follows: - - [keyboards and locales](keyboards/KeyboardLayouts.md) - - [morphology, syntax, text processing, proofing tools](LanguageModels.md) - - [speech technology resources](SpeechTechnologyResources.md) + - [keyboards and locales](keyboards/KeyboardLayouts.md) + - [morphology, syntax, text processing, proofing tools](LanguageModels.md) + - [speech technology resources](SpeechTechnologyResources.md) - general technical & language independent documentation: [this site](/index.md) - Documentation specific to Divvun, Giellatekno and Tromsø: - - [old site](https://giellalt.uit.no) - - [new site](https://divvungiellatekno.github.io/giellalt.uit.no/) (will be moved to the old site URL when it is fully converted) + - [old site](https://giellalt.uit.no) + - [new site](https://divvungiellatekno.github.io/giellalt.uit.no/) (will be moved to the old site URL when it is fully converted) -Documentation on how to *write* and *publish* documentation [can be found here](infra/docinfra.md). +Documentation on how to _write_ and _publish_ documentation [can be found here](infra/docinfra.md). diff --git a/Games.md b/Games.md index 3a913814..b46b6be9 100644 --- a/Games.md +++ b/Games.md @@ -4,15 +4,16 @@ The languages are grouped according to game. {% assign games_repos = site.github.public_repositories|jsonify %} -# Word guessing game +## Word guessing game -Simple word guessing game in the tradition of [MasterMind](https://en.wikipedia.org/wiki/Mastermind_(board_game)). For more information on the source code, see [this repo](https://github.com/giellalt/template-wordguess-und). - -
+Simple word guessing game in the tradition of [MasterMind](https://en.wikipedia.org/wiki/Mastermind_(board_game)). For more information on the source code, see [this repo](https://github.com/giellalt/template-wordguess-und). +```html +
+``` diff --git a/KeyboardLayouts.md b/KeyboardLayouts.md index 8fa4bacc..13358456 100644 --- a/KeyboardLayouts.md +++ b/KeyboardLayouts.md @@ -2,93 +2,93 @@ Beware that the documentation pages for most Experimental repos have little or no content, and that documentation for other keyboards probably is out-of-date. Writing documentation is an ongoing effort, and part of the development process. Automatically generated SVG layouts is presently not working. -The languages are grouped in three different ways, according to *maturity, geography* and *language family*. [Private repositories](https://github.com/divvun/private-registry) are not listed. +The languages are grouped in three different ways, according to _maturity, geography_ and _language family_. [Private repositories](https://github.com/divvun/private-registry) are not listed. -# Grouped according to maturity of the keyboards +## Grouped according to maturity of the keyboards Being in the **Production** group does not necessarily mean it is in production for both mobile and desktop, it can be only one of them. We don't differentiate between the two categories, as soon as a keyboard is released for the general audience for at least one platform, it is in the **Production** category. See the documentation for each keyboard for further details. {% assign keyb_repos = site.github.public_repositories|jsonify %} -## Production keyboard layouts [![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)](/MaturityClassification.html) +### Production keyboard layouts [![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)](/MaturityClassification.html)
-## Beta keyboard layouts [![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg)](/MaturityClassification.html) +### Beta keyboard layouts [![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg)](/MaturityClassification.html)
-## Alpha keyboard layouts [![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg)](/MaturityClassification.html) +### Alpha keyboard layouts [![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg)](/MaturityClassification.html)
-## Experimental keyboard layouts [![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)](/MaturityClassification.html) +### Experimental keyboard layouts [![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)](/MaturityClassification.html) Initial experiments and student exercises.
-## Keyboard layouts of undefined maturity [![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)](/MaturityClassification.html) +### Keyboard layouts of undefined maturity [![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)](/MaturityClassification.html)
-# Grouped according to geography +## Grouped according to geography -## Languages of the Nordic countries +### Languages of the Nordic countries
-## Languages of Russia +### Languages of Russia
-## Other European languages +### Other European languages
-## Languages in North America +### Languages in North America
-## Languages in Africa +### Languages in Africa
-## Languages in other parts of the world +### Languages in other parts of the world
-## Languages without geography tag +### Languages without geography tag
-# Grouped according to language family +## Grouped according to language family -## Uralic Languages +### Uralic Languages
-## Eskimo-Aleut Languages +### Eskimo-Aleut Languages
-## Algic Languages +### Algic Languages
-## Indoeuropean languages +### Indoeuropean languages
-## Niger-Congo Languages +### Niger-Congo Languages
-## Languages of other language families, isolates, artificial languages +### Languages of other language families, isolates, artificial languages
-## Languages with no language family tag +### Languages with no language family tag
diff --git a/LanguageModels.md b/LanguageModels.md index fc5b4966..3e041a78 100644 --- a/LanguageModels.md +++ b/LanguageModels.md @@ -2,97 +2,97 @@ Beware that the documentation pages for most Experimental repos have little or no content, and that documentation for other languages probably is out-of-date. Writing documentation for each language repository is an ongoing effort, and part of the development process. -The languages are grouped in three different ways, according to *maturity, geography* and *language family*. [Private repositories](https://github.com/divvun/private-registry) are not listed. +The languages are grouped in three different ways, according to _maturity, geography_ and _language family_. [Private repositories](https://github.com/divvun/private-registry) are not listed. -# Grouped according to maturity of the resources +## Grouped according to maturity of the resources -The [maturity levels](MaturityClassification.md) are *production, beta, alpha* and *experimental*. Some of the beta language models are used in practical applications. +The [maturity levels](MaturityClassification.md) are _production, beta, alpha_ and _experimental_. Some of the beta language models are used in practical applications. Being in the **Production** group does not necessarily mean a language model is in production for all purposes, it could be for one only. See the documentation for each language for further details. {% assign lang_repos = site.github.public_repositories|jsonify %} -## [![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)](MaturityClassification.html) Production language resources +### [![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)](MaturityClassification.html) Production language resources
-## [![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg)](MaturityClassification.html) Beta language resources +### [![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg)](MaturityClassification.html) Beta language resources
-## [![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg)](MaturityClassification.html) Alpha language resources +### [![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg)](MaturityClassification.html) Alpha language resources
-## [![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)](MaturityClassification.html) Experimental language resources +### [![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)](MaturityClassification.html) Experimental language resources
-## [![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)](MaturityClassification.html) Language resources of undefined maturity +### [![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)](MaturityClassification.html) Language resources of undefined maturity
-# Grouped according to geography +## Grouped according to geography -## Languages of the Nordic countries +### Languages of the Nordic countries
-## Languages of Russia +### Languages of Russia
-## Other European languages +### Other European languages
-## Languages in North America +### Languages in North America
-## Languages in Africa +### Languages in Africa
-## Languages in other parts of the world +### Languages in other parts of the world
-## Languages with no geography tag +### Languages with no geography tag
-# Grouped according to language family +## Grouped according to language family -## Uralic Languages +### Uralic Languages
-## Eskimo-Aleut Languages +### Eskimo-Aleut Languages
-## Algic Languages +### Algic Languages
-## Indoeuropean languages +### Indoeuropean languages
-## Niger-Congo Languages +### Niger-Congo Languages
-## Turkic Languages +### Turkic Languages
-## Languages of other language families, isolates, artificial languages +### Languages of other language families, isolates, artificial languages
-## Languages with no language family tag +### Languages with no language family tag
diff --git a/MaturityClassification.md b/MaturityClassification.md index 32c9c418..e4718655 100644 --- a/MaturityClassification.md +++ b/MaturityClassification.md @@ -1,168 +1,179 @@ # Language resource maturity classification -This page *presents* and *defines* the maturity classification system of this site. At the bottom of the page comes a description of how to add and change maturity tags. +This page _presents_ and _defines_ the maturity classification system of this site. At the bottom of the page comes a description of how to add and change maturity tags. - -# Maturity classes +## Maturity classes In the GiellaLT infrastructure we use a five-step classification to broadly describe the quality and development level of various linguistic resources. These categories are used as labels in README files, on the documentation front page for each resource, as well as in the overview pages for [language models](LanguageModels.md), [dictionaries](https://giellalt.github.io/dicts/DictionarySources.html), [keyboards](KeyboardLayouts.md) and [spell checkers](proof/index.md) (the maturity level of grammar checkers, machine translation applications and speech technology are still undefined). The labels look like the following: -| No. | Label | Type | Colour | -| --- |:----- |:---- | ------ | -| 1.| ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)| Production | green | -| 2.| ![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg) | Beta | yellow | -| 3.| ![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg) | Alpha | red | -| 4.| ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg) | Experiment / student exercise | black | -| 5.| ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) | Undefined | grey | - - -# Maturity class definitions (in reverse order) +| No. 
| Label | Type | Colour | +| --- | :---------------------------------------------------------------------------------------- | :---------------------------- | ------ | +| 1. | ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg) | Production | green | +| 2. | ![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg) | Beta | yellow | +| 3. | ![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg) | Alpha | red | +| 4. | ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg) | Experiment / student exercise | black | +| 5. | ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) | Undefined | grey | +## Maturity class definitions (in reverse order) Some of the criteria for the various levels are common for all resource pages and listed under **General criteria**. Other criteria are application specific: -## Undefined ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) +### Undefined ![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) Used when the maturity is not definable, or has not yet been defined/tagged. -## Experiment ![Maturity: Production](https://img.shields.io/badge/Maturity-Experiment-black.svg) +### Experiment ![Maturity: Production](https://img.shields.io/badge/Maturity-Experiment-black.svg) This category also covers student exercises (published with permission). The point of such exercises is not to make a working system, but to explore the possibilities for language technology. Such work can of course be extended and in the end result in a fully working, production tool. 
-### General criteria +#### General criteria + +- license not required, but is nice +- may not build at all +- Divvun Manager: + - might not be available + - if available: only available in the nightly channel +- rule of thumb: may not work at all + +#### Application specific criteria + +##### Language model + +- fragmentary grammar +- less than 1k lexical entries + +##### Dictionary + +- less than 1k lexical entries + +##### Keyboard + +- all letters may not be included +- layout experimental, will change + +##### Spell checker + +- see language model above +- no adaptation of error model +- no weighting corpus + +### Alpha ![Maturity: Production](https://img.shields.io/badge/Maturity-Alpha-red.svg) + +#### General criteria + +- license highly recommended +- Divvun Manager: + - is available + - only available in the nightly channel +- rule of thumb: it can be built locally and used for something + +#### Application specific criteria + +##### Language model -* license not required, but is nice -* may not build at all -* Divvun Manager: - * might not be available - * if available: only available in the nightly channel -* rule of thumb: may not work at all +- grammar model mostly complete +- lexicon between 1k and 10k entries -### Application specific criteria +##### Dictionary -#### Language model -* fragmentary grammar -* less than 1k lexical entries +- entries from different parts of speech +- lexicon between 1k and 10k entries -#### Dictionary -* less than 1k lexical entries +##### Keyboard -#### Keyboard -* all letters may not be included -* layout experimental, will change +- layout mostly done, may still change +- all letters in alphabet included -#### Spell checker -* see language model above -* no adaptation of error model -* no weighting corpus +##### Spell checker -## Alpha ![Maturity: Production](https://img.shields.io/badge/Maturity-Alpha-red.svg) +- Program works, corrects text, and is of some use -### General criteria +### Beta ![Maturity: 
Production](https://img.shields.io/badge/Maturity-Beta-yellow.svg) -* license highly recommended -* Divvun Manager: - * is available - * only available in the nightly channel -* rule of thumb: it can be built locally and used for something +#### General criteria -### Application specific criteria +- there **should** be a proper license +- CI/CD working for the tools being provided +- Divvun Manager: + - is available + - is available in the stable channel + - **NOT** visible on the front page, only via the `All languages` view +- rule of thumb: it can easily be installed via Divvun Manager - it must be testable by the user community -#### Language model -* grammar model mostly complete -* lexicon between 1k and 10k entries +#### Application specific criteria -#### Dictionary -* entries from different parts of speech -* lexicon between 1k and 10k entries +##### Language model -#### Keyboard -* layout mostly done, may still change -* all letters in alphabet included +- grammar model complete +- lexicon has more than 10k entries +- running text coverage above 80 % -#### Spell checker -* Program works, corrects text, and is of some use +##### Dictionary -## Beta ![Maturity: Production](https://img.shields.io/badge/Maturity-Beta-yellow.svg) +- different parts of speech treated differently +- lexicon has more than 10k entries -### General criteria -* there **should** be a proper license -* CI/CD working for the tools being provided -* Divvun Manager: - * is available - * is available in the stable channel - * **NOT** visible on the front page, only via the `All languages` view -* rule of thumb: it can easily be installed via Divvun Manager - it must be testable by the user community +##### Keyboard -### Application specific criteria +- layout complete for all levels and input methods -#### Language model -* grammar model complete -* lexicon has more than 10k entries -* running text coverage above 80 % +##### Spell checker -#### Dictionary -* different parts of speech 
treated differently -* lexicon has more than 10k entries +- The number of false positives is below 20 % +- Correction mechanism gives relevant connection in top-5 in most cases -#### Keyboard -* layout complete for all levels and input methods +### Production ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg) -#### Spell checker -* The number of false positives is below 20 % -* Correction mechanism gives relevant connection in top-5 in most cases - -## Production ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg) +#### General criteria -### General criteria -* there **must** be a proper license -* at least one contact person in the language community that is willing to or being paid to be a first line support person and language resource maintainer, public contact email or other contact info -* CI/CD working for the tools being provided -* Divvun Manager: - * is available - * is available in the stable channel - * **IS** visible on the front page -* Release `1.0.0` or higher of either speller or analyser/`giella-XXX` package -* rule of thumb: it is easily installable via the One-click installer or Divvun Manager front page +- there **must** be a proper license +- at least one contact person in the language community that is willing to or being paid to be a first line support person and language resource maintainer, public contact email or other contact info +- CI/CD working for the tools being provided +- Divvun Manager: + - is available + - is available in the stable channel + - **IS** visible on the front page +- Release `1.0.0` or higher of either speller or analyser/`giella-XXX` package +- rule of thumb: it is easily installable via the One-click installer or Divvun Manager front page -### Application specific criteria +#### Application specific criteria -#### Language model +##### Language model -* grammar/model/layout complete -* lexicon has more than 30k entries (but subject to 
realworld realities & limits) -* running text coverage above 90 % +- grammar/model/layout complete +- lexicon has more than 30k entries (but subject to realworld realities & limits) +- running text coverage above 90 % -#### Dictionary -* lexicon has more than 20k entries -* lemma articles are structured according to lemma type +##### Dictionary -#### Keyboard -* layout complete and evaluated for all levels and input methods +- lexicon has more than 20k entries +- lemma articles are structured according to lemma type -#### Spell checker -* The number of false positives is below 5 % -* Correction mechanism gives relevant connection in top-5 in almost all cases, in top position in most cases +##### Keyboard +- layout complete and evaluated for all levels and input methods +##### Spell checker +- The number of false positives is below 5 % +- Correction mechanism gives relevant connection in top-5 in almost all cases, in top position in most cases -# Registering maturity +## Registering maturity The maturity classification is done using GitHub topics. Maturity badges in README's, documentation and elsewhere are generated automatically from these topics, and they are also used in the [keyboard](keyboards/KeyboardLayouts.md) and [language resource](LanguageModels.md) lists to group the repos automatically. -## Adding maturity topic tags +### Adding maturity topic tags Adding maturity tags is done via [GitHub topics](https://docs.github.com/en/github/administering-a-repository/managing-repository-settings/classifying-your-repository-with-topics), and can only be done by repo or organisation owners or admins. It is also possible to use [`gut`](https://giellalt.github.io/infra/GutUsageExamples.html#task-9-manage-topics-info) to set the topics from the command line if they do not exist, but presently it is not possible to remove or change GitHub topics. 
The topic tags corresponding to the labels above are as follows: -* `maturity-prod` - ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg) -* `maturity-beta` - ![Maturity: Beta ](https://img.shields.io/badge/Maturity-Beta-yellow.svg) -* `maturity-alpha` - ![Maturity: Alpha ](https://img.shields.io/badge/Maturity-Alpha-red.svg) -* `maturity-exper` - ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg) +- `maturity-prod` - ![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg) +- `maturity-beta` - ![Maturity: Beta ](https://img.shields.io/badge/Maturity-Beta-yellow.svg) +- `maturity-alpha` - ![Maturity: Alpha ](https://img.shields.io/badge/Maturity-Alpha-red.svg) +- `maturity-exper` - ![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg) The ![Maturity: Undefined ](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg) category does of course not have a topic - that is the definition of the category. In the lists and tables linked to above it should ideally be empty, but it is listed in any case to easily spot repositories that do not yet have a defined maturity class. diff --git a/Personvern.md b/Personvern.md index d9a72c6d..51f2550a 100644 --- a/Personvern.md +++ b/Personvern.md @@ -1,8 +1,8 @@ # Privacy and GitHub -# Discussion and issues +## Discussion and issues -## Privacy +### Privacy There can be reasons not to make public the people behind our work, and instead let the community stand together behind everything we do. This applies especially when one works on contested questions in small communities. @@ -10,7 +10,7 @@ The problem is that GitHub is built to connect people who work with code, It is possible to make oneself fairly invisible (see the practical tips further down), but for linguists in small language communities it is entirely impossible - one sees at once which language a GitHub account works with, and nothing more is needed. 
-## Data protection +### Data protection We work with both correct language data and data containing errors, including poor and outright incorrect language. We have to, both because we make tools meant to correct such errors, and because studying such errors helps us make better tools more generally. @@ -18,23 +18,23 @@ If data is publicly available, incl. data on language errors, and discussions about standardisation - Moreover, there are always errors in our code - as in all other software. We of course strive to have as few errors as possible, but the code is too complex to be kept entirely error-free. And in some cases it is unclear what is right and wrong, or the standard is undefined. +## Guidelines -# Guidelines - -# Practical tips +## Practical tips About GitHub: + - all personal accounts are public, but one can keep minimal info there, and nothing that identifies oneself. - create a GitHub name that cannot be linked to you - - it is possible [to change the username of an existing account](https://github.com/settings/admin) - "Change username" (but see the page on possible negative consequences) + - it is possible [to change the username of an existing account](https://github.com/settings/admin) - "Change username" (but see the page on possible negative consequences) - on the [profile page](https://github.com/settings/profile): - - do not use a profile picture that can be linked to you - - do not specify where you live - - no name - - no URL - - no employer - - do not have a visible email address - "Public email" (leave the field unspecified) + - do not use a profile picture that can be linked to you + - do not specify where you live + - no name + - no URL + - no employer + - do not have a visible email address - "Public email" (leave the field unspecified) - [keep your email address private](https://github.com/settings/emails) - "Keep my email addresses private" - [do not show that you are a member of an 
"organisation"](https://docs.github.com/en/free-pro-team@latest/github/setting-up-and-managing-your-github-user-account/publicizing-or-hiding-organization-membership), e.g. giellalt or divvun (GitHub organisations are like an umbrella over all the repositories that belong together) diff --git a/README.md b/README.md index affbaca5..1b506e2c 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -## GiellaLT documentation +# GiellaLT documentation This is the repository for the site [giellalt.github.io](https://giellalt.github.io). It contains technical and developer documentation for everything related to the GiellaLT infrastructure, linguistic work using this infrastructure, keyboard, proofing tools and machine translation development, and much more. diff --git a/SharedResources.md b/SharedResources.md index e21cc376..d645045b 100644 --- a/SharedResources.md +++ b/SharedResources.md @@ -13,21 +13,21 @@ Finally there is a section listing all template repositories. These are used par as a starting point for new repositories, partly to update all existing repositories with new features or general improvements. -# List of repos with shared resources +## List of repos with shared resources {% assign shared_repos = site.github.public_repositories | where_exp: "repository", "repository.name contains 'shared-'" | jsonify %}
-# Core repository +## Core repository {% assign core_repos = site.github.public_repositories | where_exp: "repository", "repository.name contains 'giella-'" | jsonify %}
-# Templates +## Templates {% assign template_repos = site.github.public_repositories | where_exp: "repository", "repository.name contains 'template-'" | jsonify %} diff --git a/TeamsPartners.md b/TeamsPartners.md index 90875a1c..ae19d8bf 100644 --- a/TeamsPartners.md +++ b/TeamsPartners.md @@ -1,53 +1,55 @@ # Maintainers, Developers, Teams and Partners -# Core developers +## Core developers The GiellaLT infrastructure was initially built for the Sámi languages, and even today, the teams behind Sámi language technology are the core maintainers of the infrastructure: -| Logo & link | Name & description | -|:-----------:|:------------------ | -| [![Divvn logo](images/logos/divvun-logo-m-tekst-utan-uit.png)](https://divvun.no/en) | **The Divvun group**
The Divvun group was founded in 2004 at the Norwegian Sámi Parliament, and moved to UiT in 2011. The main purpose of the group is to develop language technology tools for the Sámi language communities. -| [![Giellatekno](images/logos/GT-logo.png)](https://giellatekno.uit.no/index.eng.html) | **Giellatekno**
The Giellatekno research group was founded by UiT in the early 2000's to develop and do research on Sámi language technology. -| [![UiT logo](images/logos/UiT_Segl_Sam_Svart_960px.png)](https://en.uit.no) | **UiT The Arctic University of Norway**
UiT is the world's northernmost university, and the home for both the Divvun and Giellatekno groups. It also provides GiellaLT with Enterprise GitHub services. +| | | +| :-----------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| [![Divvn logo](images/logos/divvun-logo-m-tekst-utan-uit.png)](https://divvun.no/en) | **The Divvun group**
The Divvun group was founded in 2004 at the Norwegian Sámi Parliament, and moved to UiT in 2011. The main purpose of the group is to develop language technology tools for the Sámi language communities. | +| [![Giellatekno](images/logos/GT-logo.png)](https://giellatekno.uit.no/index.eng.html) | **Giellatekno**
The Giellatekno research group was founded by UiT in the early 2000s to develop and do research on Sámi language technology. | +| [![UiT logo](images/logos/UiT_Segl_Sam_Svart_960px.png)](https://en.uit.no) | **UiT The Arctic University of Norway**
UiT is the world's northernmost university, and the home for both the Divvun and Giellatekno groups. It also provides GiellaLT with Enterprise GitHub services. | -# Sámi partners +## Sámi partners -| Logo & link | Name & description | -|:-----------:|:------------------ | -| [![SD logo](images/logos/SD-logo.png)](http://samediggi.no/) | **Sámediggi**
The Norwegian Sámi Parliament founded the Divvun group in 2004, together with the Norwegian government. -| [![GG logo](images/logos/Giellagaldu.svg)](http://www.giella.org) | **Sámi Giellagáldu**
Term development and normativity questions for all Sámi languages in the Nordic countries. -| | **Ávvir**
The only daily newspaper in a Sámi language. They use the Divvun tools, and provides their texts to the Sámi corpus. -| [![Oulu logo](images/logos/Oulun_yliopisto_logo_text_fi.png)](http://www.oulu.fi/giellagasinstituutti/) | **Giellagas-instituutti**
Cooperation covers a.o. Inari Sámi proofing tools and analysers, dictionaries, speech resources. -| [![Aajege logo](images/logos/Aajege_logo_svart_no.png)](http://aajege.no) | **Aajege**
The language learning app Gïelese +| | | +| :---------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------- | +| [![SD logo](images/logos/SD-logo.png)](http://samediggi.no/) | **Sámediggi**
The Norwegian Sámi Parliament founded the Divvun group in 2004, together with the Norwegian government. | +| [![GG logo](images/logos/Giellagaldu.svg)](http://www.giella.org) | **Sámi Giellagáldu**
Term development and normativity questions for all Sámi languages in the Nordic countries. | +| | **Ávvir**
The only daily newspaper in a Sámi language. They use the Divvun tools and provide their texts to the Sámi corpus. | +| [![Oulu logo](images/logos/Oulun_yliopisto_logo_text_fi.png)](http://www.oulu.fi/giellagasinstituutti/) | **Giellagas-instituutti**
Cooperation covers, among other things, Inari Sámi proofing tools and analysers, dictionaries, and speech resources. | +| [![Aajege logo](images/logos/Aajege_logo_svart_no.png)](http://aajege.no) | **Aajege**
The language learning app Gïelese -# Other indigenous and minority language teams and developers +## Other indigenous and minority language teams and developers Although initially built for the Sámi languages, there has from the very beginning been cooperation with other indigenous and minority language communities, starting with Greenlandic. The GiellaLT infrastructure is open to everyone, and gives a tremendous head start for language communities wanting support for their language when writing, reading and using the language in a digital world. -| Logo & link | Name & description | -|:-----------:|:------------------ | -| [![](images/logos/AltLab.png)](http://altlab.artsrn.ualberta.ca) | **AltLAB**
Language technology for Indigenous languages in Canada -| | **Oqaasileriffik — The Language Secretariat of Greenland**
Language technology for Greenlandic. -| [![](images/logos/Frodskaparsetur-logo-runt-MTD.png)](https://www.setur.fo/en/the-university/faculties/faculty-of-faroese-language-and-literature/the-centre-for-language-technology#:~:text=The%20Centre%20for%20Language%20Technology%20carries%20out%20research%20and%20development,Department%20of%20Science%20and%20Technology) | **University of the Faroe Islands, The Centre for Language Technology**
Carries out research and development of Faroese language technology. -| [![](images/logos/s200_jack.rueter.jpg)](https://researchportal.helsinki.fi/en/persons/jack-rueter) | **Jack Rueter**
Skolt Sámi and Uralic languages in Russia and the Baltic countries. -| [![](images/logos/VInst-logo-150702.png)](https://wi.ee/en/) | **Võro Instituut**
Language technology for the Võro language in Estonia. -| [![](images/logos/lu-libiesu-instituts-logo-en@2x.png)](https://www.livonian.lv/en/home/) | **University of Latvia Livonian Institute**
Language resources and tools for Latvia’s indigenous Livonian language. -| [![](images/logos/Contributors.jpg)](https://github.com/orgs/giellalt/people) | **Many individual contributors**
The GiellaLT infrastructure is open source, and we welcome external contributions, both directly (ask for push access) or via [Pull Recuests](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). - -# Technology, maintenance and academic partners - -| Logo & link | Name & description | -|:-----------:|:------------------ | -| [![](images/logos/HU-logo.gif)](https://hfst.github.io) | **University of Helsinki — HFST**
Finite state transducer technology, used for morphological analysis and generation, tokenisation, spelling checkers and more. -| [![](images/logos/HU-logo.gif)](https://www.helsinki.fi/en/faculty-arts/research/disciplines/digital-humanities/phonetics) | **University of Helsinki — phonetics lab**
Essential support for the GiellaLT speech synthesis infrastructure. -| [![](images/logos/GrammarSoftApS.jpg)](https://edu.visl.dk) | **VISL**
The home of VISLCG3, which is the tool and formalism used for all language processing after morphological analysis in the GiellaLT framework. -| [![](images/logos/TinoDidriksen.jpg)](https://tinodidriksen.com/curriculum-vitae/) | **Tino Didriksen**
Windows and MS Office integration until about 2021, Greenlandic LT, and VISLCG3 development and support. -| [![](images/logos/BrendanMolloy.jpg)](https://github.com/bbqsrc) | **Brendan Molloy**
Morphology testing framework, mobile keyboards and keyboard generation, web speller, MacDivvun, and much more. -| [![](images/logos/Necessary.png)](https://github.com/necessary-nu) | **Necessary Innovation**
Advanced language technnology integration. -| | **The Techno Creatives**
Support and maintenance work in the GiellaLT infrastructure and Divvun technology components 2019-2024. -| [![](images/logos/Apertium.png)](http://wiki.apertium.org/wiki/Main_Page) | **Apertium**
Free and open MT for many languages. -| [![](images/logos/Clarin_typeB_Frame_middle.png)](https://www.kielipankki.fi/safmoril/) | **CLARIN**
GiellaLT is part of the SAFMORIL research network in CLARIN. -| [![](images/logos/Zulip-org-logo.svg.png)](https://zulip.com) | **Zulip**
An open-source modern team chat app designed to keep both live and asynchronous conversations organized. Used in the GiellaLT infra for team chat and community communication. Log in and join us using your GitHub account! -| [![](images/logos/github-mark.png)](https://github.com) | **GitHub**
Source code repos, automatic builds, infrastructure support. -| [![](images/logos/voikko-icon.png)](https://voikko.puimula.org) | **Voikko**
Speller integration with LibreOffice until around 2022. -| [![](images/logos/TriGram.png)](https://unhammer.org/k/) | **Trigram AS / Kevin Unhammer**
Free and open source language technology. +| | | +| :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| [![](images/logos/AltLab.png)](http://altlab.artsrn.ualberta.ca) | **AltLAB**
Language technology for Indigenous languages in Canada | +| | **ISOF**
Language resources and tools for Meänkieli and Romani. +| +|
| **Oqaasileriffik — The Language Secretariat of Greenland**
Language technology for Greenlandic. | +| [![](images/logos/Frodskaparsetur-logo-runt-MTD.png)](https://www.setur.fo/en/the-university/faculties/faculty-of-faroese-language-and-literature/the-centre-for-language-technology#:~:text=The%20Centre%20for%20Language%20Technology%20carries%20out%20research%20and%20development,Department%20of%20Science%20and%20Technology) | **University of the Faroe Islands, The Centre for Language Technology**
Carries out research and development of Faroese language technology. | +| [![](images/logos/s200_jack.rueter.jpg)](https://researchportal.helsinki.fi/en/persons/jack-rueter) | **Jack Rueter**
Skolt Sámi and Uralic languages in Russia and the Baltic countries. | +| [![](images/logos/VInst-logo-150702.png)](https://wi.ee/en/) | **Võro Instituut**
Language technology for the Võro language in Estonia. | +| [![](images/logos/lu-libiesu-instituts-logo-en@2x.png)](https://www.livonian.lv/en/home/) | **University of Latvia Livonian Institute**
Language resources and tools for Latvia’s indigenous Livonian language. | +| [![](images/logos/Contributors.jpg)](https://github.com/orgs/giellalt/people) | **Many individual contributors**
The GiellaLT infrastructure is open source, and we welcome external contributions, either directly (ask for push access) or via [Pull Requests](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). | + +## Technology, maintenance and academic partners + +| | | +| :------------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| [![](images/logos/HU-logo.gif)](https://hfst.github.io) | **University of Helsinki — HFST**
Finite state transducer technology, used for morphological analysis and generation, tokenisation, spelling checkers and more. | +| [![](images/logos/HU-logo.gif)](https://www.helsinki.fi/en/faculty-arts/research/disciplines/digital-humanities/phonetics) | **University of Helsinki — phonetics lab**
Essential support for the GiellaLT speech synthesis infrastructure. | +| [![](images/logos/GrammarSoftApS.jpg)](https://edu.visl.dk) | **VISL**
The home of VISLCG3, which is the tool and formalism used for all language processing after morphological analysis in the GiellaLT framework. | +| [![](images/logos/TinoDidriksen.jpg)](https://tinodidriksen.com/curriculum-vitae/) | **Tino Didriksen**
Windows and MS Office integration until about 2021, Greenlandic LT, and VISLCG3 development and support. | +| [![](images/logos/BrendanMolloy.jpg)](https://github.com/bbqsrc) | **Brendan Molloy**
Morphology testing framework, mobile keyboards and keyboard generation, web speller, MacDivvun, and much more. | +| [![](images/logos/Necessary.png)](https://github.com/necessary-nu) | **Necessary Innovation**
Advanced language technology integration. | +| | **The Techno Creatives**
Support and maintenance work in the GiellaLT infrastructure and Divvun technology components 2019-2024. | +| [![](images/logos/Apertium.png)](http://wiki.apertium.org/wiki/Main_Page) | **Apertium**
Free and open MT for many languages. | +| [![](images/logos/Clarin_typeB_Frame_middle.png)](https://www.kielipankki.fi/safmoril/) | **CLARIN**
GiellaLT is part of the SAFMORIL research network in CLARIN. | +| [![](images/logos/Zulip-org-logo.svg.png)](https://zulip.com) | **Zulip**
An open-source modern team chat app designed to keep both live and asynchronous conversations organized. Used in the GiellaLT infra for team chat and community communication. Log in and join us using your GitHub account! | +| [![](images/logos/github-mark.png)](https://github.com) | **GitHub**
Source code repos, automatic builds, infrastructure support. | +| [![](images/logos/voikko-icon.png)](https://voikko.puimula.org) | **Voikko**
Speller integration with LibreOffice until around 2022. | +| [![](images/logos/TriGram.png)](https://unhammer.org/k/) | **Trigram AS / Kevin Unhammer**
Free and open source language technology. | diff --git a/_config.yml b/_config.yml index 96d71b5a..be74e9b0 100644 --- a/_config.yml +++ b/_config.yml @@ -1,5 +1,8 @@ theme: jekyll-theme-minimal + title: GiellaLT + description: GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology. Read more about Why. See also How to get started and our Privacy document. + plugins: - - jemoji \ No newline at end of file + - jemoji diff --git a/_includes/toc.html b/_includes/toc.html index 85f3f623..59ea536d 100644 --- a/_includes/toc.html +++ b/_includes/toc.html @@ -24,7 +24,7 @@ OTHER DEALINGS IN THE SOFTWARE. {% endcomment %} {% comment %} - Version 1.1.0 + Version 1.2.1 https://github.com/allejo/jekyll-toc "...like all things liquid - where there's a will, and ~36 hours to spare, there's usually a/some way" ~jaybe @@ -47,6 +47,7 @@ * base_url (string) : '' - add a base url to the TOC links for when your TOC is on another page than the actual content * anchor_class (string) : '' - add custom class(es) for each anchor element * skip_no_ids (bool) : false - skip headers that do not have an `id` attribute + * flat_toc (bool) : false - when set to true, the TOC will be a single level list Output: An ordered or unordered list representing the table of contents of a markdown block. 
This snippet will only @@ -69,6 +70,7 @@ {% capture jekyll_toc %}{% endcapture %} {% assign orderedList = include.ordered | default: false %} + {% assign flatToc = include.flat_toc | default: false %} {% assign baseURL = include.base_url | default: include.baseurl | default: '' %} {% assign skipNoIDs = include.skip_no_ids | default: include.skipNoIDs | default: false %} {% assign minHeader = include.h_min | default: 1 %} @@ -138,9 +140,9 @@ {% capture listItem %}{{ anchorBody }}{% endcapture %} {% endif %} - {% if currLevel > lastLevel %} + {% if currLevel > lastLevel and flatToc == false %} {% capture jekyll_toc %}{{ jekyll_toc }}<{{ listModifier }}{{ subMenuClass }}>{% endcapture %} - {% elsif currLevel < lastLevel %} + {% elsif currLevel < lastLevel and flatToc == false %} {% assign repeatCount = lastLevel | minus: currLevel %} {% for i in (1..repeatCount) %} @@ -158,8 +160,13 @@ {% assign firstHeader = false %} {% endfor %} - {% assign repeatCount = minHeader | minus: 1 %} - {% assign repeatCount = lastLevel | minus: repeatCount %} + {% if flatToc == true %} + {% assign repeatCount = 1 %} + {% else %} + {% assign repeatCount = minHeader | minus: 1 %} + {% assign repeatCount = lastLevel | minus: repeatCount %} + {% endif %} + {% for i in (1..repeatCount) %} {% capture jekyll_toc %}{{ jekyll_toc }}{% endcapture %} {% endfor %} diff --git a/_layouts/default.html b/_layouts/default.html index f69b5c19..481694e1 100644 --- a/_layouts/default.html +++ b/_layouts/default.html @@ -43,8 +43,10 @@

{{ site.title | default: site.github.repo
  • View On GitHub
  • {% endif %} -

    Page Content

    - {% include toc.html html=content sanitize=true class="left_toc" id="left_toc" %} +
    +

    Page Content

    + {% include toc.html html=content sanitize=true class="left_toc" id="left_toc" h_min=2 h_max=6 %} +
    diff --git a/apps/AppLocalisation.md b/apps/AppLocalisation.md index b708d397..c63aef25 100644 --- a/apps/AppLocalisation.md +++ b/apps/AppLocalisation.md @@ -1,6 +1,6 @@ # Localising apps and webapps related to the GiellaLT infrastructure -# App overview +## App overview The following apps & webapps need localisation: @@ -13,7 +13,7 @@ The following apps & webapps need localisation: Localisation of each of them is described below. -# borealium.org +## borealium.org [borealium.org](https://github.com/borealium/borealium.org) is now always localised using [our Pontoon instance](https://divvun-pontoon-vm.norwayeast.cloudapp.azure.com/projects/borealium/). Log in using your GitHub account, it should be automatic. @@ -25,10 +25,11 @@ It is still possible to also do localisations directly in GitHub, and those chan All pathnames in this section are relative to the root of the [borealium.org](https://github.com/borealium/borealium.org) repository. -## Languages and fallback mechanisms +### Languages and fallback mechanisms `data/languages.ts` contains a list of all languages covered by the site. It has four sections: +```xml
    regions
    Defines all BCP-47 compatible area codes used in the portal, with their localised names. Regions are used to cover linguistic or orthographic variation following the regions, or just to ensure a most useful fallback list depending on region: SME in Finland should fall back to Finnish, then English, while SME in Sweden should fall back to Swedish, then English.
    @@ -41,14 +42,15 @@ All pathnames in this section are relative to the root of the [borealium.org](ht
    excludeFromUi
    Languages for which we do not have any translated content, but for which we still want them listed in the tools list. That is, skip these languages in the language selection drop down for the site, but list them in the overview of resources for languages.
    +``` -## Categories +### Categories `data/categories.ts` contains localised names and descriptions of categories. It is seen on top of each category page. One gets to these pages when clicking on a category label. -## Resources +### Resources `data/resources/` contains the definition of all resources described on the site. Except for the file `mod.ts`, all files contain strings that should be localised. The strings are the following: @@ -57,7 +59,7 @@ It is seen on top of each category page. One gets to these pages when clicking o - `moreInfo` - a longer description of the resource, if wanted - `links:text` - text to appear on the link button. Often this can just use the English text, but sometimes a translation will work better -## Content files +### Content files Most of the content for the portal lives in `src/`. All localisable text is placed in `.flt` files, including in subdirs. At present, the following dirs contain `.flt` files to be localised: @@ -75,9 +77,9 @@ src/ └── privacy ``` -# Divvun Manager +## Divvun Manager -## macOS +### macOS See [the README](https://github.com/divvun/divvun-manager-macos). But it boils down to the following: @@ -90,24 +92,24 @@ strut-icu-generate swift Support/LocalisationResources/base.yaml \ -o . ``` -## Windows +### Windows See [the README](https://github.com/divvun/divvun-manager-windows). But it boils down to the following: - edit the relevant file in `DivvunInstaller/Strings.[your_lang].resx` (add a new for new languages) - add your new language tag in `Divvun.Installer/UI/Settings/SettingsWindow.xaml.cs` -## Both +### Both - language names: [make PR here](https://github.com/divvun/iso639-databases) -# DM One-click installer +## DM One-click installer (Windows only) TBW - [support and documentation missing](https://github.com/divvun/divvun-manager-windows/issues/59) -# Package names and descriptions +## Package names and descriptions Packages are what we distribute to users, such as speller and keyboard packages. 
They are listed in various places, always with a name, and often with a corresponding description, both of which can be localised. @@ -123,8 +125,8 @@ Package names & descriptions are stored and localised in the following files: - keyboards: add entries in `keyboard-XXX/XXX.kbdgen/project.yaml` - spellers: add entries in `lang-XXX/manifest.toml.in` , **but**: - - English and Native speller names and descriptions are stored in `lang-XXX/configure.ac`, and automatically added to `lang-XXX/manifest.toml` - - Localisations for other languges should be added to `lang-XXX/manifest.toml.in` + - English and Native speller names and descriptions are stored in `lang-XXX/configure.ac`, and automatically added to `lang-XXX/manifest.toml` + - Localisations for other languages should be added to `lang-XXX/manifest.toml.in` As a general rule, the minimum localisation should be: @@ -143,9 +145,9 @@ The names and descriptions will be propagated in two steps: Both steps are automatic and happen regularly, so on average, new package descriptions will be available pretty soon after they have been committed and pushed. -# Páhkat +## Páhkat -## Categories and channel labels +### Categories and channel labels Category and channel labels are defined in `.toml` files in `main/strings/`. The following is the content of the `en.toml` (English) file: @@ -169,11 +171,11 @@ and the channel labels show up in the settings in Divvun Manager: ![Pahkatkategori_i_borealium.png](../images/Pahkattekst_i_DM_settings.png) -## Other strings +### Other strings There are a couple of other strings as well that could or should be translated. 
-### Language listing heading +#### Language listing heading This is defined in `main/index.toml`: @@ -194,7 +196,7 @@ This text is found several places: ![Pahkattekst_i_DM_settings2.png](../images/Pahkattekst_i_DM_meny.png) -### More strings +#### More strings The following files and directories contain localisable strings: @@ -203,6 +205,6 @@ The following files and directories contain localisable strings: It is unclear whether these strings are displayed, if at all. -# satni.org +## satni.org TBW diff --git a/apps/gielese.md b/apps/gielese.md index 68e5febe..ebcfcdc7 100644 --- a/apps/gielese.md +++ b/apps/gielese.md @@ -1,28 +1,26 @@ -# Gïelese documentation - +# Gïelese documentation Gïelese is a media rich language learning application written in Python, and [CoffeeScript](http://coffeescript.org/)/JavaScript/HTML5. Platform specific apps are built using [PhoneGap](http://phonegap.com). +## Technical documentation + +## [Project overview](gielese/ProjectOverview.html) -## Technical documentation +## [Building the Gïelese apps (PhoneGap)](gielese/BuildingTheGieleseApps.html) +## [Client Development](gielese/ClientDevelopment.html) -# [Project overview](gielese/ProjectOverview.html) -# [Building the Gïelese apps (PhoneGap)](gielese/BuildingTheGieleseApps.html) -# [Client Development](gielese/ClientDevelopment.html) -# [Server Development](gielese/ServerDevelopment.html) -# [Maintenance tasks](gielese/GieleseRestarting.html) +## [Server Development](gielese/ServerDevelopment.html) +## [Maintenance tasks](gielese/GieleseRestarting.html) NB: developers working on Gïelese can also refer to Python docstrings in each file, and a good amount of commenting in the client code. 
+### Project documentation -## Project documentation - - -* [meeting overview](doc/admin/index.html) -* [graphical specification 1](doc/app_design_og_layout.pdf) -* [graphical specification 2](doc/app_design_og_layout_2.pdf) +- [meeting overview](doc/admin/index.html) +- [graphical specification 1](doc/app_design_og_layout.pdf) +- [graphical specification 2](doc/app_design_og_layout_2.pdf) diff --git a/apps/gielese/BuildingTheGieleseApps.md b/apps/gielese/BuildingTheGieleseApps.md index f9137e4b..c6eb13c4 100644 --- a/apps/gielese/BuildingTheGieleseApps.md +++ b/apps/gielese/BuildingTheGieleseApps.md @@ -1,115 +1,84 @@ Gïelese has a client and a remote server for data storage. These steps focus on the compilation of the client. - These are the steps required to build the three apps presently supported by the Gïelese source code: - -* (1) JavaScript web app -* (2) Client-side media files -* (3) PhoneGap Android app -* (4) PhoneGap iOS app - +- (1) JavaScript web app +- (2) Client-side media files +- (3) PhoneGap Android app +- (4) PhoneGap iOS app # Preparations - 1. cd $GTHOME/apps/aajege/src/sma-client 1. sudo port install nodejs npm 1. npm install - -## Preparing the JavaScript environment - +### Preparing the JavaScript environment ??? 
+### Preparing the media server -## Preparing the media server - - -* Create a virtualenv, run it and initialize from requirements.txt -* `python -c "import os ; print os.urandom(24)" > secret_key` -* `python manage.py init_db` -* `python manage.py install_media -f ../data/sma_media.xml` -* `python manage.py append_lexical_data -f ../data/n_smanob_test.xml` - +- Create a virtualenv, run it and initialize from requirements.txt +- `python -c "import os ; print os.urandom(24)" > secret_key` +- `python manage.py init_db` +- `python manage.py install_media -f ../data/sma_media.xml` +- `python manage.py append_lexical_data -f ../data/n_smanob_test.xml` The latter only installs/updates definitions for existing words from the first step, if you want to just install everything, use: - -* `python manage.py install_lexicon -f ../data/n_smanob.xml` - +- `python manage.py install_lexicon -f ../data/n_smanob.xml` Prepare JSON files. +- `python manage.py prepare_json` -* `python manage.py prepare_json` - - -## Building the internationalisation - +### Building the internationalisation Extracting is a little tricky. Mind the dot at the end, as we need the current directory too. +`pybabel compile -d translations` -```pybabel compile -d translations``` - - -## Building the media directories for phonegap - +### Building the media directories for phonegap In the main directory (~/$GTHOME/apps/aajege/src/) run the following command: - -```make prepare-for-phonegap``` - +`make prepare-for-phonegap` This will take a little while. It does the following: - 1. Extracts database information to JSON 1. Copies JSON 1. Copies media files, and trims them depending on various parameters (target device) +## Phonegap dependencies -# Phonegap dependencies - - -PhoneGap system dependencies: - - -*For building with Android* +PhoneGap system dependencies: +_For building with Android_ 1. Android SDK (standalone tools): http://developer.android.com/sdk/installing/index.html?pkg=tools 1. 
ant (via homebrew) - PhoneGap dependencies must be installed using 'npm', globally. NB: you may need sudo permissions for this. - ``` $ npm install -g phonegap@3.6.3-0.22.6 $ npm install -g cordova@3.6.3-0.2.13 $ npm install -g ios-sim@3.1.1 ``` - These dependencies are checked in sma-client/phonegap/package.json - -# Building the apps - +## Building the apps 1. `./node_modules/.bin/brunch build --production` => webapp 1. `???` => Android app 1. `???` => iOS app - -# Building for Android - +## Building for Android ``` phonegap build android @@ -117,197 +86,141 @@ These dependencies are checked in sma-client/phonegap/package.json ant release ``` - NB: you will be prompted for the keystore password (twice). This is in priv too. - The file will be generated in `bin`. This should be enough for building Android .apk files for release, but but if this is not enough see further steps in the following document. - Other relevant docs: - -* [http://developer.android.com/tools/device.html] -* [http://developer.android.com/tools/publishing/app-signing.html] -* [http://stackoverflow.com/questions/17316910/phonegap-run-from-cli-with-release-and-self-signed-app-requires-me-to-patch-co] - +- [http://developer.android.com/tools/device.html] +- [http://developer.android.com/tools/publishing/app-signing.html] +- [http://stackoverflow.com/questions/17316910/phonegap-run-from-cli-with-release-and-self-signed-app-requires-me-to-patch-co] NB: before uploading a new release to the android app store, be sure to update the versionCode in the AndroidManifest.xml file. +### Debugging -## Debugging - - -If you encounter errors in the process of the *phonegap build android* +If you encounter errors in the process of the _phonegap build android_ command, run the following instead: - -```cordova build android``` - +`cordova build android` This will return much more useful feedback. 
+### Known issues -## Known issues - - -* saxon9.jar available in the library path in ~/lib/ may cause problems in the build. If you see errors about XPath transforms, move it out of your java path - - -# Android Submission +- saxon9.jar available in the library path in ~/lib/ may cause problems in the build. If you see errors about XPath transforms, move it out of your java path +## Android Submission ? +## Beta testers -# Beta testers - - -## iOS - +### iOS ? - -## Android - +### Android For beta testing on android, users must be a member of the Google group (gielese-tester-community), where they must follow the development link in order to get permission to download any beta versions. After the user is a member of the group, this is automatic. - Invites must be managed within the Group, as it is not listed as public. - -### Creating a beta testing group restriction - +#### Creating a beta testing group restriction 1. Follow the procedure on http://groups.google.com -1. Once complete, copy the group e-mail address (*@googlegroups.com) +1. Once complete, copy the group e-mail address (\*@googlegroups.com) 1. In the Play admin page, select APK -1. After uploading a beta, click 'Manage list of testers' in the Beta Testers section +1. After uploading a beta, click 'Manage list of testers' in the Beta Testers section 1. Copy & Paste the group e-mail address into the field, and keep track of the k below 1. Share the link with potential beta testers. The easiest means here is to just add the link to the Google Group info, so testers will see this as the first thing +#### Beta tester enrollment -### Beta tester enrollment - - -As a beta tester, follow this procedure: - +As a beta tester, follow this procedure: 1. Join the group 1. If necessary, make sure an admin can confirm you are enrolled 1. Follow the link provided 1. 
Follow the instructions to download the beta - NB: Since it may take a few hours for APKs to be deployed to all of google Play's servers, it may be so that you will be able to enroll in the beta -program, but not access the beta. If this is so, try back in a couple hours. - - - - -# Deploying an iOS app +program, but not access the beta. If this is so, try back in a couple hours. +## Deploying an iOS app In the Apple Developer Member Center, you must have the following things: - -* a Provisioning profile -** The deploying user must have access to the production distribution profile -* a Deployment certificate -** After a distribution provisioning profile is created, the user must create - a production certificate - +- a Provisioning profile + \*\* The deploying user must have access to the production distribution profile +- a Deployment certificate + \*\* After a distribution provisioning profile is created, the user must create + a production certificate The wizards used to create these will explain in good detail how to generate certificates. - The end result will be that you will need to install the Deployment Certificate on your own machine through the Keychain Access app. - - - -## Xcode configuration - +### Xcode configuration 1. Add your Apple Dev Center account (which could be different from iTunes Store) 1. Account must be an admin in order to have access to provisioning profiles 1. In Xcode preferences, look at Accounts tab -1. Add the account, or if the account is already added, click on Details, and then click the refresh icon. - - -## Uploading +1. Add the account, or if the account is already added, click on Details, and then click the refresh icon. +### Uploading 1. Build the project with destination iOS Device, and Build for Running. 1. Troubleshoot any errors that occur in the build process. -1. Once successful... Select from the *Product* menu: *Archive* +1. Once successful... Select from the _Product_ menu: _Archive_ 1. 
Once the archiving process completes, you will see a window with the latest archive -1. Select this, and choose *Submit* +1. Select this, and choose _Submit_ 1. Through the following dialog boxes, choose the correct project and team 1. Wait for archive to be signed (you may have to confirm some accesses to the Key Ring) 1. Upload it - NB: the Bundle ID must be the same for the upload to work. You can set this on Apple's side within iTunes Connect. +- Exporting is also possible, but haven't identified a need for this yet -* Exporting is also possible, but haven't identified a need for this yet - - - - -# Upgrading cordova +## Upgrading cordova +Follow the documentation, but also make sure that you: -Follow the documentation, but also make sure that you: - - - * cordova platform upgrade ios - +- cordova platform upgrade ios And then check the installed plugins and, remove and add them all individually +## Known issuess - - -# Known issuess - - -## Apple complains of aps-environment entitlement - +### Apple complains of aps-environment entitlement Review process returns: - ``` Missing Push Notification Entitlement - Your app appears to include API used to register with the Apple Push Notification service, but the app signature's - entitlements do not include the "aps-environment" entitlement. + entitlements do not include the "aps-environment" entitlement. ``` - -But, we don't use push notifications. - +But, we don't use push notifications. 1. https://github.com/meteor/meteor/issues/2974 1. http://forum.ionicframework.com/t/missing-push-notification-entitlement/5436/4 - -*Problem*: Phonegap 3.5 seems to automatically include API calls to set up +_Problem_: Phonegap 3.5 seems to automatically include API calls to set up the push notification API, even though the phonegap project is not configured to use a push notification plugin... Building the XCode project from PhoneGap results in the inclusion of the following lines in -*Gielese/Classes/AppDelegate.m*. 
Apple is warning that the certificate does +_Gielese/Classes/AppDelegate.m_. Apple is warning that the certificate does not include these entitlements. One way would be to regenerate a provisioning -profile with these entitlements, but keeping the app simpler seems to be a better idea. - +profile with these entitlements, but keeping the app simpler seems to be a better idea. ``` // repost all remote and local notification using the default NSNotificationCenter so multiple plugins may respond @@ -341,8 +254,5 @@ profile with these entitlements, but keeping the app simpler seems to be a bette } ``` - Commenting these out, rebuilding, and resubmitting, is supposed to fix the problem (currently waiting for proof from the approval process). - - diff --git a/apps/gielese/ClientDevelopment.md b/apps/gielese/ClientDevelopment.md index 4092c2c3..92b695e4 100644 --- a/apps/gielese/ClientDevelopment.md +++ b/apps/gielese/ClientDevelopment.md @@ -1,139 +1,96 @@ -# Gïelese client development - +# Gïelese client development **For information on Phonegap, see phonegap/README.md** +## Getting started notes -# Getting started notes - - -The frontend client uses *Node.js*'s environment, and specifically *Brunch.io* +The frontend client uses _Node.js_'s environment, and specifically _Brunch.io_ for compilation and project structure management. In order to prepare the -development environment, first install *Node.js* and *npm* (Node Package +development environment, first install _Node.js_ and _npm_ (Node Package Manager), then: +1.) In _~/main/apps/aajege/src/sma-client/_ run _npm install_. +2.) As a convenience, add _node_modules/.bin/_ to your _$PATH_ variable - 1.) In *~/main/apps/aajege/src/sma-client/* run *npm install*. - 2.) 
As a convenience, add *node_modules/.bin/* to your *$PATH* variable - - -Familiarize yourself a little with *Brunch.io*, but generally speaking you'll +Familiarize yourself a little with _Brunch.io_, but generally speaking you'll be most interested in: - brunch watch --server - Buildin for release on the other hand will require: - brunch build --production - This will minify everything to prepare it for web or inclusion in apps. +## Languages required -# Languages required - - - * Coffeescript and Literate Coffeescript - * CSS is written in Stylus: http://learnboost.github.io/stylus/ - * Templates are in eco: https://github.com/sstephenson/eco. Someone could - change this if they feel the need, because eco may be on the way out. - - -# Project structure +- Coffeescript and Literate Coffeescript +- CSS is written in Stylus: +- Templates are in eco: . Someone could + change this if they feel the need, because eco may be on the way out. +## Project structure This is just a short overview to the most important files and structure. For details, look at any source file For details, look at any source file - -* *config.coffee*: handles brunch configuration, file concatenation order +- _config.coffee_: handles brunch configuration, file concatenation order overrides, minimization, and managing source directories. +- _package.json_: build dependencies and installation configuration for _npm_. -* *package.json*: build dependencies and installation configuration for *npm*. - - -* *app/*: Source! +- _app/_: Source! +## app/ structure -# app/ structure - - -* *application.coffee*: the main application file, handles initialization of +- _application.coffee_: the main application file, handles initialization of all the basic stuff, routers, client-side database models and server/client synchronization. - -* *routers/routers.coffee*: URL routing and view processing. New views need to +- _routers/routers.coffee_: URL routing and view processing. New views need to be set up here. 
- -* *views/*: The views directory should include one folder per view, each - containing its own *templates/* and *styles/* folder. The build process +- _views/_: The views directory should include one folder per view, each + containing its own _templates/_ and _styles/_ folder. The build process automatically finds where to include new templates and styles, and there is no need to include them anywhere. Any general view documentation should be included in the main view file in each directory. - -* *models/*: one file per model or collection. Models should be well +- _models/_: one file per model or collection. Models should be well documented, if they aren't, they need to be. - -# Data structure / Models intro - +## Data structure / Models intro The database structure data is fetched from the server on app initialization and stored locally. (This means, word information, word relations to media files, etc.). Some user data is more or less always live, and data on user activity is synced automatically when a connection is available. - Backbone.js handles data storage, and generates models and collections for searching htesse. - **General note on Collections and data fetching** - TODO: @fetch method; @server.offline_media vs. @server.path +### Concepts -## Concepts - - -*Concept* is a general term for learning information. It may be a word, an +_Concept_ is a general term for learning information. It may be a word, an image or an audio file, but the data is all heavily cross-linked so that it is easy to find a word for an image, or a related sound file. - -## Categories / CategoryList - +### Categories / CategoryList Maintains the main screen category list, as well as is used for an organizational tool for question construction. - -## Question - +### Question Several defaults are provided which happen to line up with the progression in Gïelese, however question types are possible to be defined on the server side. These are then fetched by the client for gameplay. 
+### UserProgression -## UserProgression - - - - - - -# The complexity of rendering exercises... - - - - - +## The complexity of rendering exercises... diff --git a/apps/gielese/GieleseRestarting.md b/apps/gielese/GieleseRestarting.md index 4c819bc8..a47f8b0f 100644 --- a/apps/gielese/GieleseRestarting.md +++ b/apps/gielese/GieleseRestarting.md @@ -1,53 +1,39 @@ -# Overview - +# Overview Running the [Gïelese](http://gielese.no/play/) process depends on the following -things on the *gtweb* server: - +things on the _gtweb_ server: - nginx, the HTTP server, which connects to Gïelese processes - mongodb, which stores user data, points, and such. - Gïelese python processes, served via gunicorn - Nginx may be started whenever, and ideally will be running already. Mongodb must be running first, so that the Python processes can connect. - -# Starting the service - +## Starting the service Do this as your regular user account, thus the sudo password will be your usual sudo password. - -``` +```sh sudo service gielese-mongodb start ``` +2.) Then if all is good... -2.) Then if all is good... - - -``` +```sh sudo service gielese start ``` - - - NB: commands accepted by these processes are also stop, and restart; however, make sure to start mongodb first, otherwise the gielese process will not start. - -# Restarting the services - +## Restarting the services The order to restart these is such that the web service is not running without mongo, thus: - -``` +```sh sudo service gielese stop sudo service gielese-mongodb stop sudo service gielese-mongodb start diff --git a/apps/gielese/ProjectOverview.md b/apps/gielese/ProjectOverview.md index cea15a68..e234f73b 100644 --- a/apps/gielese/ProjectOverview.md +++ b/apps/gielese/ProjectOverview.md @@ -1,68 +1,51 @@ -# Following is a quick technical overview of the project. - +# Following is a quick technical overview of the project. Gïelese is split into two major parts, the client and the server. 
The server is very simple, and mainly serves media, but also maintains account information and tracks user progression and grading, in addition it maintains some of the application configuration information. - The Gïelese client is more complex. It: - 1. renders exercises for users 1. downloads and syncs the media and exercise databases, allowing for offline play 1. tracks user progression, and chooses exercises based on existing progression +## Stack -# Stack - - -## Server - +### Server The server uses Python, with the Flask web framework, and data is stored in mongodb. Linguistic data is stored in XML, with references to media files and rendered into JSON. The media is either served directly, or packaged with the client for mobile apps. - In addition to using mongodb for user account information, Sqlite is used temporarily during the media install process to package media information into JSON. - We use Gunicorn to handle serving FastCGI data to the web server, which in this -case is nginx. - - -Dependencies are tracked in *requirements.txt*, with which you should use -*virtualenv* and *pip* to install and manage a local environment. +case is nginx. +Dependencies are tracked in _requirements.txt_, with which you should use +_virtualenv_ and _pip_ to install and manage a local environment. See [Server Development](ServerDevelopment.html) for more information. - -## Client - +### Client The client is built in Coffeescript, a superset of JavaScript which is compiled using node.js. Templates render into HTML5. The client uses Backbone.js to manage views, and Brunch.io to handle building, JS compression, and other tasks. - -Dependencies are tracked in *package.json*, with which you should use *npm* +Dependencies are tracked in _package.json_, with which you should use _npm_ (a part of Node.js) to install and manage the environment. - See [Client Development](ClientDevelopment.html) for more information. 
+### Client + PhoneGap -## Client + PhoneGap - - -Phonegap is used to manage the build process. - +Phonegap is used to manage the build process. See [Building the Gielese Apps](BuildingTheGieleseApps.html) for more information. diff --git a/apps/gielese/ServerDevelopment.md b/apps/gielese/ServerDevelopment.md index cb48987c..3570db26 100644 --- a/apps/gielese/ServerDevelopment.md +++ b/apps/gielese/ServerDevelopment.md @@ -1,217 +1,165 @@ - - -# Important packages to know about - +# Important packages to know about These, along with necessary version numbers are documented in -*requirements.txt*, however, a subset of these are important for developers +_requirements.txt_, however, a subset of these are important for developers to be aware of and work with: - 1. [flask](http://flask.pocoo.org/) - a web framework 1. [itsdangerous](http://pythonhosted.org/itsdangerous/) - a library for encoding data for transport 1. [schematics](https://schematics.readthedocs.org/en/latest/) - A JSON encoding, decoding, and validation library 1. [babel](http://babel.pocoo.org/) - i18n library -1. [transifex]() - not entirely a package, but worth knowing about here-- this is - used for maintaining translations that non-technical users can have access to - and translate. A python library is used to fetch these and install them in the server. +1. [transifex](transifex.com) - not entirely a package, but worth knowing about here-- this is + used for maintaining translations that non-technical users can have access to + and translate. A python library is used to fetch these and install them in the server. 1. [sqlalchemy](http://sqlalchemy.org/) - a database ORM for managing database models, querying, etc. 1. [pymongo](http://api.mongodb.org/python/current/) - a database library for mongodb 1. [lxml](http://lxml.de/) - an XML parsing library 1. 
[gunicorn](http://gunicorn.org/) - a wsgi fcgi server library - There are of course, other important libraries, but those listed above are the most critical. They are also very easy for developers unfamiliar with them to pick up. - -Of secondary interest are the Flask modules: - +Of secondary interest are the Flask modules: 1. [flask marrow mailer](http://flask-marrowmailer.readthedocs.org/en/latest/) - - a library for generating and sending emails. + a library for generating and sending emails. 1. [flask babel](https://pythonhosted.org/Flask-Babel/) - an interface to babel from flask +## Preparing the Development Environment -# Preparing the Development Environment - - -## Checking out biggies - - -## Initializing media server +### Checking out biggies +### Initializing media server Create a virtualenv, run it and initialize from requirements.txt - Create a secret key - -``` +```sh python -c "import os ; print os.urandom(24)" > secret_key` ``` - Initialize and install the database. - -``` +```sh python manage.py init_db python manage.py install_media -f ../data/sma_media.xml python manage.py append_lexical_data -f ../data/n_smanob_test.xml ``` - The latter only installs/updates definitions for existing words from the first step, if you want to just install everything, use: - -``` +```sh python manage.py install_lexicon -f ../data/n_smanob.xml ``` +### Prepare JSON files -## Prepare JSON files - - -``` +```sh python manage.py prepare_json ``` - -# Internationalisation - +## Internationalisation Extracting is a little tricky. Mind the dot at the end, as we need the current directory too. - -``` +```sh pybabel extract -F babel.cfg -o translations/messages.pot ../sma-client/ . 
``` +### initialising translations -## initialising translations - - -``` +```sh pybabel init -i translations/messages.pot -d translations -l sma pybabel init -i translations/messages.pot -d translations -l no pybabel init -i translations/messages.pot -d translations -l sv etc ``` +### updating -## updating - - -``` +```sh pybabel extract -F babel.cfg -o translations/messages.pot ../sma-client/ . pybabel update -i translations/messages.pot -d translations ``` +### compiling -## compiling - - -``` +```sh pybabel compile -d translations ``` - -## Updating from transifex - +### Updating from transifex In order to use the transifex client, you need two things: - - * the gïelese virtual environment enabled - * a user-specific configuration file for transifex in your own home - directory: ~/.transifexrc, otherwise, the - project-specific configuration is already checked in in - `src/media-serv/.tx/config` - +- the gïelese virtual environment enabled +- a user-specific configuration file for transifex in your own home + directory: ~/.transifexrc, otherwise, the + project-specific configuration is already checked in in + `src/media-serv/.tx/config` Transifex Documentation: http://support.transifex.com/customer/portal/articles/1000855-configuring-the-client - -## user-specific file: ~/.transifexrc - +### user-specific file: ~/.transifexrc The short of it is to copy all this, and replace the password. If more is necessary, refer to docs. Token must be left blank. - -``` +```sh [https://www.transifex.com] hostname = https://www.transifex.com password = yourpasswordgoeshere! - token = + token = username = aajegebot ``` - -## Basic operations - +### Basic operations Once the virtualenv is enabled properly, this should mean that the transifex command line client is available to use. 
Typically, all you should need to be concerned with for fetching new translations is: - -``` +```sh tx pull ``` +A specific language can be specified also: -A specific language can be specified also: - - -``` +```sh tx pull -l sma tx pull --language sma ``` - After updating translation strings in messages.pot, send them to the server for translators to start working: - -``` +```sh tx push --source ``` - If you have made modifications locally to any of the translation files, you will need to include the `--translations` flag. - Further documentation on the command line tool's various options is here: +`http://support.transifex.com/customer/portal/articles/960804-overview` -http://support.transifex.com/customer/portal/articles/960804-overview - - - - -## Additional docs: +### Additional docs - -``` +```sh * http://support.transifex.com/customer/portal/topics/440187-transifex-client/articles * `tx --help` ``` - -# Management scripts - +## Management scripts This is somewhat of a TODO:. There are several managment scripts for various tasks that will need to be unified at some point. Currently: - 1. manage.py - For executing Flask-Actions, as well as database installation operations from lexicon files. 2. fabfile.py - for managing deployment tasks, compiling localization strings @@ -219,11 +167,8 @@ tasks that will need to be unified at some point. Currently: 3. 
read_media_directory.py - Specifically for managing the media directory structures +## Data structure overview -# Data structure overview - - -# Concepts (media, or phonetic/orthographic content) - See documentation in - *lexicon_models.py* - +## Concepts (media, or phonetic/orthographic content) - See documentation in +_lexicon_models.py_ diff --git a/apps/iosapps/iOSAppDevelopmentAndDebugging.md b/apps/iosapps/iOSAppDevelopmentAndDebugging.md index 3d31168e..7f8d0d52 100644 --- a/apps/iosapps/iOSAppDevelopmentAndDebugging.md +++ b/apps/iosapps/iOSAppDevelopmentAndDebugging.md @@ -1,76 +1,55 @@ -iOS App Development And Debugging -========== - -# Debugging +# iOS App Development And Debugging +## Debugging Apple has some general debugging tips [here](https://developer.apple.com/library/ios/qa/qa1747/_index.html). The most important info is: +- sync your phone with your computer using iTunes +- after syncing, you can find crash logs in: + ** `~/Library/Logs/CrashReporter/MobileDevice/` (Mac OS X) + ** `C:\Documents and Settings\\Application Data\Apple Computer\Logs\CrashReporter\MobileDevice\` (Windows XP) + \*\* `C:\Users\\AppData\Roaming\Apple Computer\Logs\CrashReporter\MobileDevice\` (Windows Vista, 7+) -* sync your phone with your computer using iTunes -* after syncing, you can find crash logs in: -** `~/Library/Logs/CrashReporter/MobileDevice/` (Mac OS X) -** `C:\Documents and Settings\\Application Data\Apple Computer\Logs\CrashReporter\MobileDevice\` (Windows XP) -** `C:\Users\\AppData\Roaming\Apple Computer\Logs\CrashReporter\MobileDevice\` (Windows Vista, 7+) - - -# Apple dev user & iTunes Connect user - +## Apple dev user & iTunes Connect user These two are different entities, but for simplicity's sake, use the same user account for both. Also, make sure your user account is a valid e-mail address. A non-e-mail address is not accepted by iTunes Connect. 
+## Building and submitting -# Building and submitting - +### Building -## Building - - -* open your xcode project file in XCode -* Change the build setting from Debug to Release in Product > Scheme > Edit Scheme ... -* if you want to test on a phone, attach the phone with a USB cable, and select Product > Run (the phone must be in developer mode) - - -## Submitting +- open your xcode project file in XCode +- Change the build setting from Debug to Release in Product > Scheme > Edit Scheme ... +- if you want to test on a phone, attach the phone with a USB cable, and select Product > Run (the phone must be in developer mode) +### Submitting When the project is ready to be submitted, do as follows: +- build an archive: Product > Archive +- open the archive Window: Windows > Organizer +- submit using one of to alternatives: + \*\* **either** + **\* click on the submit button in the Archive windows (assumes your developer ID + is the same as your Apple ID / iTunes Connect ID - if not, use the next + alternative) + ** **or** + **_ export archive (click the export button) + _** open Application Loader, and log in with your iTunes Connect user account + \*\*\* click Deliver Your App, and select the exported file from two steps above -* build an archive: Product > Archive -* open the archive Window: Windows > Organizer -* submit using one of to alternatives: -** **either** -*** click on the submit button in the Archive windows (assumes your developer ID - is the same as your Apple ID / iTunes Connect ID - if not, use the next - alternative) -** **or** -*** export archive (click the export button) -*** open Application Loader, and log in with your iTunes Connect user account -*** click Deliver Your App, and select the exported file from two steps above - +## Icons +### Tools - -# Icons - - -## Tools - - -## Sizes - +### Sizes [Apple's documentation](https://developer.apple.com/library/ios/documentation/userexperience/conceptual/mobilehig/IconMatrix.html) - og: 
[https://developer.apple.com/library/ios/documentation/userexperience/conceptual/MobileHIG/AppIcons.html] +## Screenshots for the store pictures -# Screenshots for the store pictures - - -## iOS - +### iOS [https://developer.apple.com/library/ios/documentation/LanguagesUtilities/Conceptual/iTunesConnect_Guide/Appendices/Properties.html] diff --git a/apps/satni/RESTEndPoints.md b/apps/satni/RESTEndPoints.md index 9ac59424..50da178f 100644 --- a/apps/satni/RESTEndPoints.md +++ b/apps/satni/RESTEndPoints.md @@ -1,10 +1,12 @@ # REST API -REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/components/satni/satni.rest.js +REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/components/satni/satni.rest.js ## Dictionaries + [https://satni.uit.no/satnibackend/dictionaries] returns list of dictionaries and terminilogies in satni database. It also returns localized names for diciotnaries and terminologies in north sami, julev sami, south sami, swedish and norwegian. Below is the response given today: -``` + +```json [ { "id": "smnsme", @@ -14,10 +16,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, "description": { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Ordboka er utvikla av Giellatekno, som del av\n prosjektet maskinoversetting mellom samiske språk. Grunnlaget for ordboka\n var Giellateknos Nordsamisk-finske ordbok (som bygger bla. på ordsamlinga\n Álgu) og Marja-Liisa Olthuis og Taarna Valtonens finsk-enaresamiske ordbok\n (finansiert av det finske Sametinget)." }, "copyright": "Uspesifisert" @@ -30,10 +29,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, "description": { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Ordboka er utvikla av Giellatekno, som del av\n prosjektet maskinoversetting mellom samiske språk. 
Grunnlaget for ordboka\n var Giellateknos Nordsamisk-finske ordbok (som bygger bla. på ordsamlinga\n Álgu) og Marja-Liisa Olthuis og Taarna Valtonens finsk-enaresamiske ordbok\n (finansiert av det finske Sametinget)." }, "copyright": "Uspesifisert" @@ -65,10 +61,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ "Ordboka byggjer på ordlistematerialet utarbeida av Albert Jåma og Tove\n Brustad som finst på ", @@ -82,10 +75,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ "Sátnegirjji vuođđun lea Albert Jåma ja Tove Brustad sátnelistui, guđe lea\n gávdnomis ", @@ -99,10 +89,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "sma", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ "Dan baakoegærjan sïsvege båata Albert Jåman jïh Tove\n Brustaden baakoelæstojste, mah leah ", @@ -116,10 +103,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "smj", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ "Albert Jåma ja Tove Brustada báhkolistatjoakkáldahka l vuodon dán báhkogirjjáj. 
Báhkolistatjoakkáldahka l sadjihin ", @@ -161,11 +145,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n\n ", - "\n " - ], + "#text": ["\n ", "\n\n ", "\n "], "p": [ { "#text": [ @@ -189,11 +169,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "se", - "#text": [ - "\n ", - "\n\n ", - "\n " - ], + "#text": ["\n ", "\n\n ", "\n "], "p": [ { "#text": [ @@ -245,11 +221,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n\n ", - "\n " - ], + "#text": ["\n ", "\n\n ", "\n "], "p": [ { "#text": [ @@ -273,11 +245,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "se", - "#text": [ - "\n ", - "\n\n ", - "\n " - ], + "#text": ["\n ", "\n\n ", "\n "], "p": [ { "#text": [ @@ -329,18 +297,12 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Ordboka byggjer på ordlistematerialet utarbeida av Albert Jåma og Tove\n Brustad som finst på sine nettsider. I tillegg er dei vanlegaste orda frå sørsamiske\n tekstar lagt til, og om lag 1250 sørsamiske stadnamn er henta frå Statens\n Kartverk og det svenske Sametingets internettsider. Alle verba i Verbh! er\n lagt inn, med svensk omsetjing. I alt inneheld ordboka omtrent 8750 norske\n lemma. Vær obs på at ordboka er blitt til ved å snu sørsamisk-norsk\n ordbok, og at ordboka dermed vil mangle en del vanlige norske ord." 
}, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ "Sátnegirjji vuođđun lea Albert Jåma ja Tove Brustad sátnelistui, guđe lea\n gávdnomis ", @@ -354,10 +316,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "sma", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ "Dan baakoegærjan sisvege båata Albert Jåman jïh Tove Brustaden baakoelæstojste, mah leah", @@ -371,10 +330,7 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon }, { "xml:lang": "smj", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": { "#text": [ " Albert Jåma ja Tove Brustada báhkolistatjoakkáldahka l vuodon dán báhkogirjjáj. Báhkolistatjoakkáldahka l sadjihin ", @@ -416,18 +372,12 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Enaresamisk → finsk og Finsk → Enaresamisk ordbok bygger på Valtonen og\n Olthuis si Inarinsaame-suomi-inarinsaame-ordbok (ca 20000 ordpar).\n Materialet blir kontinuerlig utvida av Giellatekno-gruppa ved UiT." }, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Enaresamisk → finsk og Finsk → Enaresamisk ordbok bygger på Valtonen og\n Olthuis si Inarinsaame-suomi-inarinsaame-ordbok (ca 20000 ordpar).\n Materialet blir kontinuerlig utvida av Giellatekno-gruppa ved UiT." } ], @@ -460,18 +410,12 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Enaresamisk → finsk og Finsk → Enaresamisk ordbok bygger på Valtonen og\n Olthuis si Inarinsaame-suomi-inarinsaame-ordbok (ca 20000 ordpar).\n Materialet blir kontinuerlig utvida av Giellatekno-gruppa ved UiT." 
}, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Enaresamisk → finsk og Finsk → Enaresamisk ordbok bygger på Valtonen og\n Olthuis si Inarinsaame-suomi-inarinsaame-ordbok (ca 20000 ordpar).\n Materialet blir kontinuerlig utvida av Giellatekno-gruppa ved UiT." } ], @@ -500,34 +444,22 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Ordboka er utvikla av Giellatekno og Divvun ved UiT Norges arktiske\n universitet, og er basert på Nordsamisk-norsk ordbok." }, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Sátnegirjji leat Giellatekno ja Divvun buvttadan UiT Norgga árktalaš\n universitehtas ja dat vuođđuduvva Davvisámi-dáru sátnegirjái." }, { "xml:lang": "sma", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Giellatekno jïh Divvun Norgga árktalaš univeristehtesne lea dam baakoegærjam dorjeme! Noerhtesaemien-daaroen baakoegærja lij jarngense daeenie barkosne." }, { "xml:lang": "smj", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Báhkogirjje l åvddånahtedum Giellateknos ja Divvunis Vuona arktalasj Universitehtan, Nuorttasáme-dárro báhkogirjje l vuodon dán bargguj." } ], @@ -560,34 +492,22 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Ordboka er utvikla av Giellatekno og Divvun ved UiT Norges arktiske\n universitet, med utgangspunkt i Nils Jernslettens ordbok (med forfatterens\n tillatelse). Den er senere blitt utvidet gjennom flere prosjekter." 
}, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Sátnegirjji leat Giellatekno ja Divvun buvttadan UiT Norgga árktalaš\n universitehtas ja dat vuođđuduvva Nils Jernsletten sátnegirjái (čálli lobiin). Dan leat\n maŋŋelot viiddidan eará prošeavttain." }, { "xml:lang": "sma", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Giellatekno jïh Divvun UIT Norgga árktalaš univeristehtesne leah dam baakoegærjam dorjeme. Baakoegærjan aalkoe lij Nils Jernsletten baakoegærja (Tjaelije lea jïjtje luhpiem vadteme). Dan mænngan lea baakoegærjam ovmessie prosjeekti tjïrrh vijriedovveme ." }, { "xml:lang": "smj", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Báhkogirjje åvddånahtedum Giellateknos ja Divvunis Vuona arktalasj universitehtan, vuodon dán bargguj la Nils Jernslettena báhkogirjje (tjálle dåhkkidimijn). Dán maŋŋela la báhkogirjje vijdeduvvam ietjá prosjevtaj baktu." } ], @@ -616,50 +536,30 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon "description": [ { "xml:lang": "no", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Innholdet i Termwikien er basert på Sametingets termsamling. Sametingets\n termsamling ble overført til Termwikien høsten 2013, og det har blitt lagt\n inn noe ny terminologi etter det. Sátni.org vil regelmessig bli oppdatert\n med det nyeste innholdet på Termwikien." }, { "xml:lang": "se", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Tearbmawiki vuođđun lea Sámadikki tearbmačoakkáldat. Sámedikki tearbmačoakkáldat\n sirdojuvvui Tearbmawikii čakčat 2013 ja dasa lea lasihuvvon tearpmat dan maŋŋá.\n Lasihuvvon tearpmat ihtet Sátni.org siidduide jeavddalaččat." }, { "xml:lang": "sma", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Termwikijen sïsvegisnie leah Saemiedigkien teermelæstoeh. 
Saemiedigkien teermelæstoeh tjaktjen 2013 Termwikijasse sirtesovvin, jïh dan mænngan leah vielie teermh læssanamme. Daamhtetje sijhtieh orre baakoeh jïh teermh Sátni.org:ese lissiehtidh" }, { "xml:lang": "smj", - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "p": "Sámedikke tærmmatjoakkáldahka l vuodon Tærmmawikijij. Sámedikke tærmmatjoakkáldahka sirdeduváj Tærmmawikijij tjavtjan 2013, ja ådå terminologija l laseduvvam dan maŋŋela. Báhko.org sjaddá juovnnát ådåstuhteduvvat ådåsamos sisanojn Tærmmawikijis." } ], "copyright": "Sametinget, Giellagáldu, UiT Norgga árktalaš universitehta", "editors": { - "#text": [ - "\n ", - "\n " - ], + "#text": ["\n ", "\n "], "editor": { - "#text": [ - "\n ", - "\n ", - "\n ", - "\n " - ], + "#text": ["\n ", "\n ", "\n ", "\n "], "name": "Giellagáldu", "url": "http://www.giella.org", "email": "info@giella.org" @@ -669,17 +569,17 @@ REST points are called in file at $GTHOME/apps/risten2/frontend/assets/js/compon ] ``` - ## Search + https://satni.uit.no/satnibackend/search returns search result for queries that are three character long or longer. There are two query parameters for search: -*query= Query string sent to the database, which compiles a regex "^*" -*dict= Optional parameter to specify which dictionary or terminiology to query. If no dictionary is specified, all dictionaries are queried. +_query= Query string sent to the database, which compiles a regex "^_" +\*dict= Optional parameter to specify which dictionary or terminiology to query. If no dictionary is specified, all dictionaries are queried. 
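The two query parameters described above can be combined as in the following minimal sketch. It only builds the request URL and sends nothing over the network. The names `SEARCH_ENDPOINT` and `build_search_url` are illustrative assumptions and not part of the satni code base; the endpoint address, the `query` and `dict` parameters, and the three-character minimum are taken from the description above.

```python
from urllib.parse import urlencode

# Endpoint address as given in the text; the constant name is hypothetical.
SEARCH_ENDPOINT = "https://satni.uit.no/satnibackend/search"


def build_search_url(query, dictionary=None):
    """Build a search URL for the satni backend (hypothetical helper).

    query      -- search string; the text says it must be three characters or longer
    dictionary -- optional dictionary or terminology name; when omitted,
                  the `dict` parameter is left out of the URL
    """
    if len(query) < 3:
        raise ValueError("query must be at least three characters long")
    params = {"query": query}
    if dictionary is not None:
        params["dict"] = dictionary
    # urlencode percent-encodes non-ASCII characters such as "á" as UTF-8.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


print(build_search_url("muna"))
print(build_search_url("lin", "smanob"))
```

The two printed URLs should correspond to the example requests listed in this section. Rejecting short queries on the client side is a design choice of the sketch, so that a query the backend would not answer is never sent.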
- -*Example: +\*Example: [https://satni.uit.no/satnibackend/search?query=muna] -``` + +```json [ { "term": "munakoiso", @@ -724,9 +624,9 @@ There are two query parameters for search: ] ``` - -*Example with dictionaries specified +\*Example with dictionaries specified [https://satni.uit.no/satnibackend/search?query=lin&dict=smanob] + ``` { { @@ -741,9 +641,9 @@ There are two query parameters for search: } ``` - [https://satni.uit.no/satnibackend/search?query=lin&dict=nobsma] -``` + +```json [ { "term": "Lindsetdalen", @@ -832,9 +732,9 @@ There are two query parameters for search: ] ``` - [https://satni.uit.no/satnibackend/search?query=lin&dict=smenob] -``` + +```json [ { "term": "Lina", @@ -948,10 +848,7 @@ There are two query parameters for search: "term": "linnjábiila", "dict": "smenob", "lang": "sme", - "langs": [ - "nob", - "nob" - ] + "langs": ["nob", "nob"] }, { "term": "linnjáhuksen", @@ -974,9 +871,9 @@ There are two query parameters for search: ] ``` - [https://satni.uit.no/satnibackend/search?query=lin&dict=nobsme] -``` + +```json [ { "term": "lin", @@ -1054,10 +951,7 @@ There are two query parameters for search: "term": "linje", "dict": "nobsme", "lang": "nob", - "langs": [ - "sme", - "sme" - ] + "langs": ["sme", "sme"] }, { "term": "linjebygging", @@ -1104,21 +998,15 @@ There are two query parameters for search: ] ``` - [https://satni.uit.no/satnibackend/search?query=linj&dict=termwiki] -``` + +```json [ { "term": "linja", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi" - ] + "langs": ["fi", "nb", "se", "sv", "fi"] }, { "term": "linja-auto", @@ -1142,329 +1030,165 @@ There are two query parameters for search: "term": "linja-auto ja taksikaista", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linja-autokaista", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { 
"term": "linja-autokatu", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linja-autopysäkin levennys", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linja-autopysäkki", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjal", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi", - "fi", - "en", - "smn" - ] + "langs": ["fi", "nb", "se", "sv", "fi", "fi", "en", "smn"] }, { "term": "linjal", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi", - "fi", - "en", - "smn", - "se", - "fi", - "nb" - ] + "langs": ["fi", "nb", "se", "sv", "fi", "fi", "en", "smn", "se", "fi", "nb"] }, { "term": "linjanjako", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjatuomari", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "smj", - "sv", - "fi", - "se" - ] + "langs": ["fi", "nb", "smj", "sv", "fi", "se"] }, { "term": "linjaverkko", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjašauto", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sma", - "smj", - "sv", - "smn" - ] + "langs": ["fi", "nb", "se", "sma", "smj", "sv", "smn"] }, { "term": "linje", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi" - ] + "langs": ["fi", "nb", "se", "sv", "fi"] }, { "term": "linje", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi" - ] + "langs": ["fi", "nb", "se", "sv", "fi"] }, { "term": "linjedeling", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", 
"se", "sv"] }, { "term": "linjedelning", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjediagram", "dict": "termwiki", "lang": null, - "langs": [ - "nb", - "se", - "fi" - ] + "langs": ["nb", "se", "fi"] }, { "term": "linjedomare", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "smj", - "sv", - "fi", - "se" - ] + "langs": ["fi", "nb", "smj", "sv", "fi", "se"] }, { "term": "linjedommer", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "smj", - "sv", - "fi", - "se" - ] + "langs": ["fi", "nb", "smj", "sv", "fi", "se"] }, { "term": "linjenett", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjenät", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjeskriver", "dict": "termwiki", "lang": null, - "langs": [ - "nb", - "se" - ] + "langs": ["nb", "se"] }, { "term": "linjetrafik", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi", - "nb", - "se", - "sv", - "sv" - ] + "langs": ["fi", "nb", "se", "sv", "fi", "nb", "se", "sv", "sv"] }, { "term": "linjetrafikk", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv" - ] + "langs": ["fi", "nb", "se", "sv"] }, { "term": "linjáduopmár", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "smj", - "sv", - "fi", - "se" - ] + "langs": ["fi", "nb", "smj", "sv", "fi", "se"] }, { "term": "linjála", "dict": "termwiki", "lang": null, - "langs": [ - "fi", - "nb", - "se", - "sv", - "fi", - "fi", - "en", - "smn", - "se", - "fi", - "nb" - ] + "langs": ["fi", "nb", "se", "sv", "fi", "fi", "en", "smn", "se", "fi", "nb"] }, { "term": "linjärmetod", "dict": "termwiki", "lang": null, - "langs": [ - "nb", - "se", - "sv" - ] + "langs": ["nb", "se", "sv"] } ] ``` - - - -*Example with all 
dictionaries +\*Example with all dictionaries [https://satni.uit.no/satnibackend/search?query=hele&dict=all] -``` + +```json [ { "term": "Helena", @@ -1488,10 +1212,7 @@ There are two query parameters for search: "term": "hele og fulle sannhet", "dict": "termwiki", "lang": null, - "langs": [ - "nb", - "se" - ] + "langs": ["nb", "se"] }, { "term": "hele tiden", @@ -1503,21 +1224,13 @@ There are two query parameters for search: "term": "heler", "dict": "termwiki", "lang": null, - "langs": [ - "smj", - "nb", - "sv" - ] + "langs": ["smj", "nb", "sv"] }, { "term": "heleri", "dict": "termwiki", "lang": null, - "langs": [ - "smj", - "nb", - "sv" - ] + "langs": ["smj", "nb", "sv"] }, { "term": "heleys", @@ -1534,14 +1247,14 @@ There are two query parameters for search: ] ``` - ## Article search -https://satni.uit.no/satnibackend/article/
    returns article or articles. +`https://satni.uit.no/satnibackend/article/
    ` returns article or articles. -*Example +\*Example [https://satni.uit.no/satnibackend/article/munanleikkuri] -``` + +```json { { "term": "munanleikkuri", @@ -1613,16 +1326,14 @@ https://satni.uit.no/satnibackend/article/
    returns article or articles. } ``` - - - ## Dictionary metadata -Fourth REST point returns metadata information about a dictionary or terminology. Difference is that this REST address returns an XML fragment. +Fourth REST point returns metadata information about a dictionary or terminology. Difference is that this REST address returns an XML fragment. -*Example +\*Example [https://satni.uit.no/satnibackend/dictionary/termwiki] -``` + +```xml Uspesifisert diff --git a/apps/satni/Setup.md b/apps/satni/Setup.md index f06215dd..fb414833 100644 --- a/apps/satni/Setup.md +++ b/apps/satni/Setup.md @@ -1,84 +1,82 @@ # Strategies and methods for satni.org development - Technologies used: -* XQuery/eXist (Java) -* JS/Mithril/Polythene (UI komponenter, for Mithril) -* Babel for kompilering av JS (pakkar all JS til ei stor js-fil, og lagar bakoverkompatibel kode) -* [Gulp/gulp-exist](https://github.com/olvidalo/gulp-exist) +- XQuery/eXist (Java) +- JS/Mithril/Polythene (UI komponenter, for Mithril) +- Babel for kompilering av JS (pakkar all JS til ei stor js-fil, og lagar bakoverkompatibel kode) +- [Gulp/gulp-exist](https://github.com/olvidalo/gulp-exist) Maskiner/servarar: -* Development: tomi si maskin -* test deployment server: har ikkje, bør ta i bruk gtlab for dette (med identisk oppsett jf med gtweb) -* deployment server: gtweb (ny server på veg) +- Development: tomi si maskin +- test deployment server: har ikkje, bør ta i bruk gtlab for dette (med identisk oppsett jf med gtweb) +- deployment server: gtweb (ny server på veg) Ved oppdateringar/serveromstart: - I utgangspunktet: -* service som startar eXist m.m. automatisk +- service som startar eXist m.m. automatisk Dersom ting går gale: -* installer eXist (frå kor, korleis, kva for versjon?) -** http://exist-db.org/exist/apps/doc/advanced-installation.xml -* køyr skript som installerer alt på nytt -** ikkje dokumentert enno - Tomi skriv +- installer eXist (frå kor, korleis, kva for versjon?) 
+ \*\* http://exist-db.org/exist/apps/doc/advanced-installation.xml +- køyr skript som installerer alt på nytt + \*\* ikkje dokumentert enno - Tomi skriv To run as the eXist user: -``` + +```sh sudo -s su exist cd ``` - At the moment the user id's are assigned to different users in gtweb.uit.no and satni.uit.no. eXist is installed and run by sudo. +- run on the desired server -* run on the desired server -** run in desired directory (/home/exist/eXist) - `sudo java -jar /home/exist/installer/eXist-db-setup-3.0-acd0c14.jar -console` -** give amount of memory (2048 MB) (cache 256 default) -** password (ask tomi, sjur, børre) -** if service is not installed, run - `sudo tools/wrapper/bin/exist.sh install` + - run in desired directory (/home/exist/eXist) + `sudo java -jar /home/exist/installer/eXist-db-setup-3.0-acd0c14.jar -console` + - give amount of memory (2048 MB) (cache 256 default) + - password (ask tomi, sjur, børre) + - if service is not installed, run + `sudo tools/wrapper/bin/exist.sh install` +- run anywhere + - deploy satni.org app -* run anywhere - * deploy satni.org app - -``` +```sh cd $GTHOME/apps/risten2/backend gulp deploy --passwd (--host gtlab.uit.no) gulp reindex --passwd (--host gtlab.uit.no) ``` -** sometimes the restxq registry doesn't register rest endpoints, then do -*** open eXide -*** open /db/apps/satni/modules/SatniResource.xqm -*** delete a line / create a line / whatever change -*** save the file + +** sometimes the restxq registry doesn't register rest endpoints, then do \*** open eXide +**_ open /db/apps/satni/modules/SatniResource.xqm +_** delete a line / create a line / whatever change +**\* save the file ** store xml files +```sh cd $GTHOME/words gulp store --passwd (--host gtlab.uit.no) ``` - Cron ping task: -* now checks that something is running on port 8080 (eXist), but not what is returned -* improvement: verify that the REST request returns expected data structures +- now checks that something is running on port 8080 (eXist), 
but not what is returned +- improvement: verify that the REST request returns expected data structures We also need a `make check` like gulp(?) command: -* a single command that will run a set of predefined tests to verify code integrity -* it should be easy to add new tests and test cases -* gulp-exist for xquery testing -* XXX for JS testing (UI testing) - (eg. http://nightwatchjs.org/ - nodejs tool using Selenium/WebDriver) +- a single command that will run a set of predefined tests to verify code integrity +- it should be easy to add new tests and test cases +- gulp-exist for xquery testing +- XXX for JS testing (UI testing) - (eg. - nodejs tool using Selenium/WebDriver) Load testing: -* already documented, requires `ab` (Apache Benchmarking) + +- already documented, requires `ab` (Apache Benchmarking) diff --git a/apps/satni/StressTesting.md b/apps/satni/StressTesting.md index a2f1acf6..9035b2f1 100644 --- a/apps/satni/StressTesting.md +++ b/apps/satni/StressTesting.md @@ -1,18 +1,15 @@ # Stress testing satni.org - To test satni.org for high load and to see if it stays up we have used apache http server benchmarking tool [ab](http://httpd.apache.org/docs/2.2/programs/ab.html). Commands and their results from 17/11/2016 and 18/11/2016: - `ab -c 10 -n 1500 http://gtweb.uit.no:8080/exist/restxq/satni/search?query=dep&dict=all` where -c is number of concurrent workers, and -n is number of requests sent. - Example result: -``` +```text This is ApacheBench, Version 2.3 <$Revision: 1706008 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/ @@ -75,10 +72,8 @@ Percentage of the requests served within a certain time (ms) 100% 1546 (longest request) ``` - The server has been experiencing out of memory crashes. Now there is Oracle Java installed on gtweb which is used now. This seems to be more stable, but this needs more time and more testing to be sure of. 
- To see how the server handles the stress testing, you can also use the [monitoring webapp that comes with eXist](http://gtweb.uit.no:8080/exist/apps/monex/index.html). **NB!** User name and password required, as Sjur or Tomi! diff --git a/apps/satni/info.md b/apps/satni/info.md index 1687d3fd..b78774a1 100644 --- a/apps/satni/info.md +++ b/apps/satni/info.md @@ -1,68 +1,48 @@ -satni.org -======= +# satni.org [satni.org](https://satni.org) is a service running on `satni.uit.no`. - The webapp and its files are found in `/home/exist/eXist` - It has been started by running this command: -``` +```sh sudo service eXist-db start #EXIST_HOME=/home/exist/eXist $EXIST_HOME/bin/startup.sh ``` +## Apps -# Apps - - -## Backend - +### Backend Serves as a restful interface to dictionaries and term collections. - An obsolete version seems to live in `$GTHOME/apps/risten2/backend`, the current version is found in `/home/exist/eXist/webapp/WEB-INF/data/fs/db/apps/satni/`. The obsolete code is correct and added to the latter folder by eXist when uploading files. - -## Frontend - +### Frontend Serves as the search interface to dictionaries and term collections found in the satni backend. - An obsolete version seems to live in `$GTHOME/apps/risten2/frontend/index.js`, the current version is found in `/home/exist/eXist/webapp/WEB-INF/data/fs/db/apps/satni/index.js` The obsolete version is compiled from `assets/` files and added to the latter folder by eXist when uploading files. - To get started with frontend development run `npm install` in `frontend/`. This will install all dependencies. - `main.js` is the main code that loads satni component. satni is divided into smaller components, dictlist, search and articles. Each of them are in the satni interface in a row. They are called by the main satni component. Additionally there are rest component, util component, and translation component. They are dependent on `mithril` js framework, which is located in `mithril.js.org`. 
Also Bootstrap is used, but this is included in the backend by eXist shared resources. CSS file is located in `assets/css/` folder. - -# Poking the backend app and get .json answers - +## Poking the backend app and get .json answers To list all dictionaries: [http://satni.uit.no:8080/exist/restxq/satni/dictionaries] - To search for all lemmas in all dictionaries starting starting with "juol": [http://satni.uit.no:8080/exist/restxq/satni/search?query=juol] - To lookup the article on "juolgebáddi" [http://satni.uit.no:8080/exist/restxq/satni/article/juolgeb%C3%A1ddi] +## Meetings -# Meetings [Meetings](../../dicts/satni.org/index.html) - - - - diff --git a/apps/termwiki/UpdatingTermwikiToSatni.md b/apps/termwiki/UpdatingTermwikiToSatni.md index 8a3b0036..55fcb5af 100644 --- a/apps/termwiki/UpdatingTermwikiToSatni.md +++ b/apps/termwiki/UpdatingTermwikiToSatni.md @@ -1,33 +1,27 @@ -Updating Termwiki To Satni -======== +# Updating Termwiki To Satni To run automatic termwiki database dump, and update eXist and svn: - In the same machine where termwiki is: first run: -``` +```sh php /var/www/termwiki/maintenance/dumpBackup.php --conf=/var/www/termwiki/LocalSettings.php --current --filter=namespace:1362,1274,1354,1298,1210,1330,1338,1242,1096,1364,1102,1322,1346,1282,1234,1250,1118,1258,1314,1098,1202,1226,1266,1306,1218,1290 > /home/tomi/dump.xml ``` - - - The machine where termwiki is it doesn't have svn working. So you need to change machine. 
- then convert it: -``` + +```sh cd langtech/words/terms/termwiki ant run ``` - and finally store it to exist (same dir as previous step): -``` + +```sh ant store ``` - also commit the files to svn from `terms/` dir diff --git a/assets/css/giellalt-site-global.css b/assets/css/giellalt-site-global.css index 417865b0..42842228 100644 --- a/assets/css/giellalt-site-global.css +++ b/assets/css/giellalt-site-global.css @@ -40,6 +40,11 @@ header { height: calc(100% - 10em); } +/* Hide TOC if empty: */ +header > #toc:not(:has(:nth-child(2))) { + display: none +} + header h2.tocheader { margin: 0; font-size: 1em; @@ -63,11 +68,6 @@ header #left_toc { overflow: auto; } -/* Hide the first TOC entry - it is the page title: */ -header #left_toc > li:first-child { - display: none; -} - header #left_toc li::before { content: "- "; } @@ -196,7 +196,7 @@ https://stackoverflow.com/questions/9333379/check-if-an-elements-content-is-over display:inline } - header #left_toc { + header > #toc { margin: 1em 3em 2em 0; border-style: none solid none none; border-color: #e0e0e0; @@ -229,13 +229,9 @@ https://stackoverflow.com/questions/9333379/check-if-an-elements-content-is-over /* --- Mobile phone screen widths: --- */ @media print, screen and (max-width: 720px) { - header #left_toc { - position: static; - float: none; - border-style: none; - margin: 0; - padding: 0; - background: #fff; +/* Hide TOC on phone screens:: */ + header > #toc { + display: none } header { diff --git a/courses/index.md b/courses/index.md index 04002c70..6a6886a8 100644 --- a/courses/index.md +++ b/courses/index.md @@ -1,7 +1,3 @@ - -Courses and arrangements related to the *giellalt* infrastructure -=============================================================== - +# Courses and arrangements related to the _giellalt_ infrastructure [Installation och användning av språkteknologiska program för nordiska minoritetsspråk](nordisk/program-bruk.html) (29.8.-2.9.22) - diff --git a/courses/nordisk/index.md 
b/courses/nordisk/index.md index 26c48666..94497ade 100644 --- a/courses/nordisk/index.md +++ b/courses/nordisk/index.md @@ -1,8 +1,6 @@ - - - # Kurs i installation och användning av språkteknologiska program för nordiska minoritetsspråk + Tromsö, 29.8-2.9 2022 - [Praktiske opplysningar](praktisk.md) -- [Program](program-bruk.md) \ No newline at end of file +- [Program](program-bruk.md) diff --git a/courses/nordisk/praktisk.md b/courses/nordisk/praktisk.md index 7d3913a4..df9dd5b5 100644 --- a/courses/nordisk/praktisk.md +++ b/courses/nordisk/praktisk.md @@ -1,34 +1,32 @@ - # Praktiske opplysningar -for * Kurs i installation och användning av språkteknologiska program för nordiska minoritetsspråk* +for _Kurs i installation och användning av språkteknologiska program för nordiska minoritetsspråk_ Tromsö, 29.8-2.9 2022 ## Lokaltransport flyplass - sentrum - universitetet ### Billett til bybussen -Det enklaste er å kjøpe billett via ein app, **Troms Billett** (Android og Iphone). +Det enklaste er å kjøpe billett via ein app, **Troms Billett** (Android og Iphone). -Pris: +Pris: -- 1 reise (1 time 30 min, med overgang): NOK 39,- -- 24 timar: NOK 110,- +- 1 reise (1 time 30 min, med overgang): NOK 39,- +- 24 timar: NOK 110,- - 7 dagar: NOK 270,- Det billegaste er å bestille ein periodebillett for 7 dagar, då er det også mogleg å ta bussen til og frå flyplassen. -Det er også mogleg å betale bussbilletten (ei reise) via SMS (med å sende meldinga BUSS til telefonnummer 2002. +Det er også mogleg å betale bussbilletten (ei reise) via SMS (med å sende meldinga BUSS til telefonnummer 2002. Det er **ikkje** mogleg å kjøpe billett på bussen, verken med kontantar eller med kredittkort. ### Bussruter -- Buss mellom flyplassen og sentrum: buss **42** og buss **24**. -- Buss frå sentrum til UiT: buss **20, 21, 34**. -- Buss frå UiT til sentrum: buss **20, 21, 33**. +- Buss mellom flyplassen og sentrum: buss **42** og buss **24**. +- Buss frå sentrum til UiT: buss **20, 21, 34**. 
+- Buss frå UiT til sentrum: buss **20, 21, 33**. - Buss frå flyplassen til UiT: buss **24 + 33** eller **42 + 33** (bytte i Giæverbukta) -- Buss frå UiT til flyplassen: buss **34 + 24** (mot Flyplassen) eller **34 + 42** (mot Eidkjosen) (bytte i Giæverbukta) +- Buss frå UiT til flyplassen: buss **34 + 24** (mot Flyplassen) eller **34 + 42** (mot Eidkjosen) (bytte i Giæverbukta) Rutetider finst i appen **Troms Reise** (Android og Iphone). - diff --git a/courses/nordisk/program-bruk.md b/courses/nordisk/program-bruk.md index 93e1b31d..c23c3e50 100644 --- a/courses/nordisk/program-bruk.md +++ b/courses/nordisk/program-bruk.md @@ -1,71 +1,69 @@ +# Program for _Kurs i installation och användning av språkteknologiska program för nordiska minoritetsspråk_ -# Program for *Kurs i installation och användning av språkteknologiska program för nordiska minoritetsspråk* Tromsö, 29.8-2.9 2022 - - - ## Måndag 29.8. 12.15 - 16.00 + [Rom: Teorifagbygget hus 1 rom 1.317](https://use.mazemap.com/#v=1&zlevel=3¢er=18.971927,69.681322&zoom=18&campusid=5&sharepoitype=poi&sharepoi=176209) - + - Teknologin bakom språkteknologi för minoritetsspråk - - Olika sätt att göra språkteknologi - - Källkoden bakom språkteknologin för nordiska minoritetsspråk + - Olika sätt att göra språkteknologi + - Källkoden bakom språkteknologin för nordiska minoritetsspråk - Hur installera program i olika operativsystem: Divvun Installer - Tangentbord (tastatur) för datorer - - Hur installera tangentborden på olika operativsystem - + - Hur installera tangentborden på olika operativsystem + ## Tisdag 30.8. 
09.15 - 16.00 + [Rom: Teorifagbygget hus 1 rom 1.317](https://use.mazemap.com/#v=1&zlevel=3¢er=18.971927,69.681322&zoom=18&campusid=5&sharepoitype=poi&sharepoi=176209) - Rättstavningsprogram - - Hur och var använda rättstavningsprogram - - De grammatiska modellerna i rättstavningsprogrammet - - Principen bakom förslagsmekanismerna + - Hur och var använda rättstavningsprogram + - De grammatiska modellerna i rättstavningsprogrammet + - Principen bakom förslagsmekanismerna - Github - - Hur följa med på arbetet med språkprogrammen - + - Hur följa med på arbetet med språkprogrammen ## Onsdag 31.8. 09.15 - 16.00 + [Rom: Teorifagbygget hus 1 rom 1.425](https://use.mazemap.com/#v=1&zlevel=4¢er=18.972168,69.681349&zoom=18&campusid=5&sharepoitype=poi&sharepoi=176214) - Elektroniska ordböcker - - Olika sätt att använda samma ordbok - - Ordbokstypologi: Varför ser ordböckerna ut så som de gör? + - Olika sätt att använda samma ordbok + - Ordbokstypologi: Varför ser ordböckerna ut så som de gör? - Program för grammatisk analys - - Grammatisk analys och generering av ordformer - - Morfologisk och syntaktisk textanalys - - Principen bakom dependensanalysen + - Grammatisk analys och generering av ordformer + - Morfologisk och syntaktisk textanalys + - Principen bakom dependensanalysen - Program för språkgranskning (Grammatikkontroll) - - Skillnaden mellan rättstavningsprogram och program för språkgranskning - - Gränssnitt för grammatikkontrollprogram - + - Skillnaden mellan rättstavningsprogram och program för språkgranskning + - Gränssnitt för grammatikkontrollprogram ## Torsdag 1.9. 
09.15 - 16.00 + [Rom: Teorifagbygget hus 3 rom 3.416](https://use.mazemap.com/#v=1&zlevel=4¢er=18.970928,69.681773&zoom=18&campusid=5&sharepoitype=poi&sharepoi=176644) - Elektroniska textsamlingar - - Textsamlingarna för nordiska minoritetsspråk - - Användningen av korpus som skrivstöd + - Textsamlingarna för nordiska minoritetsspråk + - Användningen av korpus som skrivstöd - Maskinöversättning - - Hur använda maskinöversättning som verktyg i skrivprocessen + - Hur använda maskinöversättning som verktyg i skrivprocessen - Skrivstöd för översättare - - översättningsminne och översättningsverktyg -- Program för språkundervisning - - Programmet Oahpa och tillsvarande program + - översättningsminne och översättningsverktyg +- Program för språkundervisning + - Programmet Oahpa och tillsvarande program - Tangentbord (tastatur) för mobiltelefon - - Hur installera tangentborden på mobiltelefonen - - Principen bakom språkhjälpen i tangentborden för mobiltelefoner - + - Hur installera tangentborden på mobiltelefonen + - Principen bakom språkhjälpen i tangentborden för mobiltelefoner ## Fredag 2.9. 09.15 - 12.00 + [Rom: Teorifagbygget hus 4 rom 4.262](https://use.mazemap.com/#v=1&zlevel=2¢er=18.969613,69.681562&zoom=18&campusid=5&sharepoitype=poi&sharepoi=177083) - Ljudteknologi - - Vad finns, vad finns inte, vad er perspektiven framåt + - Vad finns, vad finns inte, vad er perspektiven framåt - Sammanfatning av veckan - - Vad tar vi med oss från kursen? - - Vilka prioriteringar bör genomföras inom språkteknologin för minoritetsspråk? - - Hur arbetar vi framöver? - - Andra frågor? Vad fattades på kursen? - + - Vad tar vi med oss från kursen? + - Vilka prioriteringar bör genomföras inom språkteknologin för minoritetsspråk? + - Hur arbetar vi framöver? + - Andra frågor? Vad fattades på kursen? 
diff --git a/dicts/10000.md b/dicts/10000.md index b925860b..0dddd8fe 100644 --- a/dicts/10000.md +++ b/dicts/10000.md @@ -1,5 +1,4 @@ -The first 10000 FO words -======================== +# The first 10000 FO words The words should have been sorted after column 4, and thereafter after column 3. Instead, columd 4 are somewhat garbeled, but sorted in groups. diff --git a/dicts/2000.md b/dicts/2000.md index f406a8ed..5d88a11f 100644 --- a/dicts/2000.md +++ b/dicts/2000.md @@ -1,5 +1,4 @@ -Top-2000 words from first run -============================= +# Top-2000 words from first run These are the words from the first FAD parallel text run. diff --git a/dicts/DictionaryManipulation.md b/dicts/DictionaryManipulation.md index ac08f24f..dbc81199 100644 --- a/dicts/DictionaryManipulation.md +++ b/dicts/DictionaryManipulation.md @@ -1,24 +1,21 @@ - # Dictionary manipulation - Compilation is documented elsewhere, for [Interactive dictionaries](InteractiveDictionaryCompilation.html) and [Web dictionaries](WebdictCompilation.html). -# Dictionary scripts +## Dictionary scripts - General dtds and scripts are in `main/words/dicts/scripts` (yes, in scripts) - Dictionary specific dtds are in `main/words/dicts/LANG1LANG2/dtd` - Dictionary-specific scripts are in `main/words/dicts/LANG1LANG2/scripts` +## Changing dictionary direction - -# Changing dictionary direction Changig from LANG1LANG2 to LANG2LANG1: Script is in `main/words/dicts/upside2down/` -1. Collect all LANG1 files into one with `main/words/dicts/scripts/collect-dict-parts.xsl` +1. Collect all LANG1 files into one with `main/words/dicts/scripts/collect-dict-parts.xsl` 1. 
Run the conversion with the scommand below (exchange sjdrus with whatever) -``` +```sh java -Xmx2048m net.sf.saxon.Transform -it:main gt_sd2td.xsl inFile=all-merged-pos_sjdrus.xml java -Xmx2048m net.sf.saxon.Transform -it:main gt_mergeEntry_pos_td.xsl inFile=outDir/all-merged-pos_sjdrus_rus.xml ``` diff --git a/dicts/DictionarySources.md b/dicts/DictionarySources.md index 960ff7c8..23d4fbf4 100644 --- a/dicts/DictionarySources.md +++ b/dicts/DictionarySources.md @@ -1,99 +1,99 @@ # Dictionary Sources ![Warning](../images/Warning.svg) -__*Under construction.*__ +**_Under construction._** This page contains a dynamically built list of all dictionary repositories. Private repositories are not listed. Dictionary sources are grouped according to the **source** language, **_NOT_** the target language(s). -# Grouped according to maturity of the resources +## Grouped according to maturity of the resources -The [maturity levels](../MaturityClassification.md) are *production, beta, alpha* and *experimental*. +The [maturity levels](../MaturityClassification.md) are _production, beta, alpha_ and _experimental_. {% assign lang_repos = site.github.public_repositories|jsonify %} -## [![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)](../MaturityClassification.html) Production dictionary resources +### [![Maturity: Production](https://img.shields.io/badge/Maturity-Production-brightgreen.svg)](../MaturityClassification.html) Production dictionary resources
    -## [![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg)](../MaturityClassification.html) Beta dictionary resources +### [![Maturity: Beta](https://img.shields.io/badge/Maturity-Beta-yellow.svg)](../MaturityClassification.html) Beta dictionary resources
    -## [![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg)](../MaturityClassification.html) Alpha dictionary resources +### [![Maturity: Alpha](https://img.shields.io/badge/Maturity-Alpha-red.svg)](../MaturityClassification.html) Alpha dictionary resources
    -## [![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)](../MaturityClassification.html) Experimental dictionary resources +### [![Maturity: Experiment](https://img.shields.io/badge/Maturity-Experiment-black.svg)](../MaturityClassification.html) Experimental dictionary resources
    -## [![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)](../MaturityClassification.html) Dictionary resources of undefined maturity +### [![Maturity: Undefined](https://img.shields.io/badge/Maturity-Undefined-lightgrey.svg)](../MaturityClassification.html) Dictionary resources of undefined maturity
    -# Grouped according to geography +## Grouped according to geography -## Languages of the Nordic countries +### Languages of the Nordic countries
    -## Languages of Russia +### Languages of Russia
    -## Other European languages +### Other European languages
    -## Languages in North America +### Languages in North America
    -## Languages in Africa +### Languages in Africa
    -## Languages in other parts of the world +### Languages in other parts of the world
    -## Languages with no geography tag +### Languages with no geography tag
    -# Grouped according to language family +## Grouped according to language family -## Uralic Languages +### Uralic Languages
    -## Eskimo-Aleut Languages +### Eskimo-Aleut Languages
    -## Algic Languages +### Algic Languages
    -## Indoeuropean languages +### Indoeuropean languages
    -## Niger-Congo Languages +### Niger-Congo Languages
    -## Turkic Languages +### Turkic Languages
    -## Languages of other language families, isolates, artificial languages +### Languages of other language families, isolates, artificial languages
    -## Languages with no language family tag +### Languages with no language family tag
    diff --git a/dicts/GammelKompilereNettordbok.md b/dicts/GammelKompilereNettordbok.md index 931ab5e7..1a25376a 100644 --- a/dicts/GammelKompilereNettordbok.md +++ b/dicts/GammelKompilereNettordbok.md @@ -1,87 +1,69 @@ -# Obsolete (old infra) -Se denne siden med nye tagger: [FSTer i nyinfra](/lang//sme/KompilereFST.html). +Use/NVD er nå +Use/NGminip - +# Obsolete (old infra) -# Ordbok-fst-er +Se denne siden med nye tagger: [FSTer i nyinfra](/lang//sme/KompilereFST.html). +Use/NVD er nå +Use/NGminip +## Ordbok-fst-er Til dict-sma/sme.fst brukes sma.fst og sme.fst med noen modifiseringer av tagger. `dict-sma/sme.fst` er ikke definerte i Makefile og derfor kompilerer man en `sma.fst/sme.fst` som man skifter navn på når man kopierer over i /opt/ - -# Spesielt for sma - +## Spesielt for sma isma-dict.fst er kompilert med følgende prosedyre: - i gt/sma - svn up -r 59359 - deretter i denne fila: - `common/src/tag-not-save-but-oahpa.regex` - kommenter ut noen tagger slik at det blir slik: - %0 (<-) %+Hom1, %0 (<-) %+Hom2, - -# Felles for sme og sma - +## Felles for sme og sma Til `isma-dict.fst` og `dict-isme-norm.fst` brukes `isma-norm.fst` og `isme-norm.fst` med noen modifiseringer av tagger. `dict-isme-norm.fst` er definert i Makefile, men man må sikre taggene manuelt. For sørsamisk kompilerer man isma-norm.fst med endring av tagger og endrer så navnet til isma-dict.fst når man kopierer til /opt/ - - - Både `dict-sma/sme.fst` og `dict-isma/isma.fst` skal bli kompilert med tagger som identifiserer homonymer og varianter. Dessuten vil vi unnlate å presentere visse former i miniparadigmet. 
- Her er en liste over taggene det gjelder, og filene de er i (utropstegnet viser hvilke tagger du skal kommentere ut): - ``` tag-not-save.regex - -## 0 (<-) %+Use%/NVD, +### 0 (<-) %+Use%/NVD, remove-variant-homonym-tags.regex -## 0 <- %+Hom1, -## 0 <- %+Hom2, -## 0 <- %+v1, -## 0 <- %+v2, -## 0 <- %+v3, -## 0 <- %+v4, -## 0 <- %+v5, +### 0 <- %+Hom1, +### 0 <- %+Hom2, +### 0 <- %+v1, +### 0 <- %+v2, +### 0 <- %+v3, +### 0 <- %+v4, +### 0 <- %+v5, 0 <- %+Use%/NVD, 0 <- %+Allegro; make-variant-homonym-tags-optional.regex - 0 (<-) %+v5; # <== NB Flytt semikolon hit! -## 0 (<-) %+Use%/NVD, -## 0 (<-) %+Allegro ; + 0 (<-) %+v5; ## <== NB Flytt semikolon hit! +### 0 (<-) %+Use%/NVD, +### 0 (<-) %+Allegro ; tag-not-save-but-oahpa.regex -## 0 (<-) %+Hom1, -## 0 (<-) %+Hom2, -## +N (<-) %+N %+NomAg, +### 0 (<-) %+Hom1, +### 0 (<-) %+Hom2, +### +N (<-) %+N %+NomAg, 0 (<-) %+South , -## 0 (<-) %+G3 , +### 0 (<-) %+G3 , 0 (<-) %+G7 ; ``` - Taggene nedenfor skal være med i output fra `dict-sma/sme.fst` for å få riktig bøyningsparadigme til homonyme lemmaer. De to øverste gjelder sma, de nederste sme: - ``` +Hom1 +Hom2 @@ -96,90 +78,59 @@ Taggene nedenfor skal være med i output fra `dict-sma/sme.fst` for å få rikti +v5 ``` - Disse to taggene skal være med i `dict-isme-norm.fst` fordi vi ikke ønsker dem i miniparadigmet: - ``` +Allegro +Use/NVD ``` - Alle de nevnte taggene skal være utkommentert i - `gt/common/src/tag-not-save.regex` - - - - - Deretter kompileres i gt: - ``` make GTLANG=sma make GTLANG=sme ``` - - - -i bin endres navnet på `isma-norm.fst` til `isma-dict.fst` slik at man også har en +i bin endres navnet på `isma-norm.fst` til `isma-dict.fst` slik at man også har en vanlig `isma-norm.fst` for anna bruk. - - - -# some-ordbok (for sosiale media) - +## some-ordbok (for sosiale media) I tillegg, og med samme taggoppsett, kompileres en egen some-sme.fst for #SoMe-ordboka. 
Den blir kompilert slik: - Samme taggoppsett som ovafor, men i tillegg kommenteres de første 20 linjene i `gt/common/src/spellrelax.regex` inn (iPhone keyboard relax og capital for Saami letter..) - Resultatet blir et sett med fst-er som tolererer assi for ášši. Døp om `sme.fst` til `some-sme.fst` og kopier over til `/opt/smi/sme/bin`. Kommenter deretter ut de 20 linjene og kompiler sme på nytt (så du får tilbake normale fst-er). - - - - - -# Oppsummering, kommandoer for å kopiere på plass - - - +## Oppsummering, kommandoer for å kopiere på plass Når alt er sjekka kopierer vi over til opt: - sma: - ``` sudo cp sma/bin/sma.fst /opt/smi/sma/bin/dict-sma.fst sudo cp sma/bin/isma-norm.fst /opt/smi/sma/bin/isma-dict.fst ``` - sme, vanlig ordbok (kompilert med vanlig spellrelax): - ``` sudo cp sme/bin/sme.fst /opt/smi/sme/bin/dict-sme.fst sudo cp sme/bin/dict-isme-norm.fst /opt/smi/sme/bin/dict-isme-norm.fst ``` - sme, some-ordbok (kompilert med some-spellrelax): + ``` sudo cp sme/bin/sme.fst /opt/smi/sme/bin/some-sme.fst ``` diff --git a/dicts/GrammaticalDictionary.md b/dicts/GrammaticalDictionary.md index 619001e4..46ab40f2 100644 --- a/dicts/GrammaticalDictionary.md +++ b/dicts/GrammaticalDictionary.md @@ -1,19 +1,15 @@ +# Plan for å lage ei grammatisk ordbok -Plan for å lage ei grammatisk ordbok -=========== -*NB! Dette dokumentet er frå 2012, men ideen er framleis like god* +_NB!
Dette dokumentet er frå 2012, men ideen er framleis like god_ -Ideen er å lage ei grammatisk ordbok, generert av *lexc*-filene: +Ideen er å lage ei grammatisk ordbok, generert av _lexc_-filene: +## Bruksområde +- som en del av Oahpa +- som app for nedlastning til mobil, lesebrett osv (hvis vi bare finner en løsning for de sme-bokstaver, for sma er det ikke noe problem) vi kan ikkje skrive ŋŧ, men dei andre -# Bruksområde -* som en del av Oahpa -* som app for nedlastning til mobil, lesebrett osv (hvis vi bare finner en løsning for de sme-bokstaver, for sma er det ikke noe problem) vi kan ikkje skrive ŋŧ, men dei andre - - -# Innhald - +## Innhald Brukeren kan skrive inn et ord, f.eks. "girku" og så kan hun velge: @@ -25,22 +21,13 @@ Brukeren kan skrive inn et ord, f.eks. "girku" og så kan hun velge: Andre døme: -* Ordbogi (app for iPhone) er bra (laga av eit spin-off frå Center for Sprogteknologi i Kbh.) -* Vi har samisk WebDict... - - -Vi tenker videre på dette, ikke minst før møte med FAD 20.4.12. - +- Ordbogi (app for iPhone) er bra (laga av eit spin-off frå Center for Sprogteknologi i Kbh.) +- Vi har samisk WebDict... +Vi tenker videre på dette, ikke minst før møte med FAD 20.4.12. -# Ryan sine tilleggskommentarar - - -Ein tanke kunne vera å lage ein service som står utafor Oahpa men kan då nyttast i Oahpa eller ein mobil app eller kor som helst (eg tenkjer ogso litt som ein erstatning til webdict, men med meir features). Men dersom ein app bør fungera på ein mobil utan nett tilkopling, hadde ein sånn API ikkje vore ein god idé. Ein offline app må ogso innehalda alle paradigmer, som kan vera ganske stor, avhengig av kor mange ord er inne. +## Ryan sine tilleggskommentarar +Ein tanke kunne vera å lage ein service som står utafor Oahpa men kan då nyttast i Oahpa eller ein mobil app eller kor som helst (eg tenkjer ogso litt som ein erstatning til webdict, men med meir features).
Men dersom ein app bør fungera på ein mobil utan nett tilkopling, hadde ein sånn API ikkje vore ein god idé. Ein offline app må ogso innehalda alle paradigmer, som kan vera ganske stor, avhengig av kor mange ord er inne. Ein kul feature kunne vera at ordboka nyttar ein FST ogso for å leite i ordboka, eller om nokon skriv inn ei ordform som ikkje er grunnformen, hadde det vore mogleg enno å finne ordet. Eg såg at nob-ordbok fungerer sånn, og eg hev laga ein liten somalisk ordbok app som hev dette kapabilitetet :) - - - - diff --git a/dicts/InteractiveDictionaryCompilation.md b/dicts/InteractiveDictionaryCompilation.md index 705d4769..b610f78c 100644 --- a/dicts/InteractiveDictionaryCompilation.md +++ b/dicts/InteractiveDictionaryCompilation.md @@ -1,19 +1,14 @@ -Interactive Dictionary Compilation -================================== +# Interactive Dictionary Compilation The Vuosttaš Digisánit was replaced by Neahttadigisánit, and we do not at present have a pipeline for it. This is unfortunate, as VD filled a niche for offline, integrated dictionaries. - In order to present Vuosttaš digisánit to the user, we used the Macintosh Dictionary framework for Mac OS 10.5 and newer. Earlier we used [StarDict](http://stardict.com) for other platforms, now we are sort of looking for an alternative to StarDict. Documentation was never written. Here are the principles we followed: - - -The dictionary content is based upon the dictionary files in our -``words/dicts/smenob/src/`` catalogues (see the source files, and replace -"smenob" with the language pair of your choice). - +The dictionary content is based upon the dictionary files in our +`words/dicts/smenob/src/` catalogues (see the source files, and replace +"smenob" with the language pair of your choice). For each lemma, we generate the corresponding wordforms. The makefile for generating the dictionary paradigms was found in $LANG/testing/ in the Giellatekno old infra, it is probably still somewhere.
@@ -21,13 +16,3 @@ The makefile for generating the dictionary paradigms was found in $LANG/testing/ Each wordform was then linked via its lemma to the dictionary explanation. We compiled dictionaries for North and South Sami, and for Kven (but Kven without morphology). - - - - - - - - - - diff --git a/dicts/KotusResources.md b/dicts/KotusResources.md index fffa3c1c..d2c2048e 100644 --- a/dicts/KotusResources.md +++ b/dicts/KotusResources.md @@ -1,9 +1 @@ - - - - -* [Documentation of the files (in Finnish)](tietokanta8.html) - - - - +- [Documentation of the files (in Finnish)](tietokanta8.html) diff --git a/dicts/LexicalisingNorwegian.md b/dicts/LexicalisingNorwegian.md index 7af61b61..155174f1 100644 --- a/dicts/LexicalisingNorwegian.md +++ b/dicts/LexicalisingNorwegian.md @@ -1,17 +1,12 @@ - - - - -For analysis of Norwegian we may use either the Oslo-Bergen tagger -(obt) or the nob finite state transducer (nob.fst) from Giellatekno. -The Giellatekno fst is based -upon a wordform list and contains approximately 2000 unclassified +For analysis of Norwegian we may use either the Oslo-Bergen tagger +(obt) or the nob finite state transducer (nob.fst) from Giellatekno. +The Giellatekno fst is based +upon a wordform list and contains approximately 2000 unclassified verbs and 2700 unclassified nouns. At the outset, the obt pipeline is thus better. On the positive side, the gt fst is flexible. For Neahttadigisánit we use the gt fst, and therefore we lexicalise all compounds found in the dictionary. - The gt fst is found in `$GTHOME/langs/nob`, and is thus part of the new infrastructure, with the stems in `src/morphology/stems`. The nouns, verbs and adjectives are given the continuation lexica @@ -19,38 +14,23 @@ found in [Bokmålsordboka](http://bokmalsordboka.html), the inflection code system is also found at the top of the files in both the `stems/` and the `affixes/` catalogues.
- # Lexicalisation - The nob.fst may be set up to include or exclude dynamic compounds. -To check today's behaviour, check for the words *hybelkanin* -(lexicalised) and *hybelhest* (not lexicalised). If both are accepted, +To check today's behaviour, check for the words _hybelkanin_ +(lexicalised) and _hybelhest_ (not lexicalised). If both are accepted, dynamic compounding is ON, if only the former is accepted, it is OFF. The behaviour is regulated by commenting in and out 3 lines of the lexicon R in `src/morphology/root.lexc`. - -Turn dynamic compounding off (if needed), and find unknown verbs +Turn dynamic compounding off (if needed), and find unknown verbs for example as follows: - ``` cat file|preprocess|rev|sort|rev|uniq|unob|grep '?'|cut -f1 ``` - Add words to the files in `src/morphology/stems/` by following -the pattern indicated on the top of each file. When words may be -both masculine and feminine (like *boka* vs. *boken*), choose +the pattern indicated on the top of each file. When words may be +both masculine and feminine (like _boka_ vs. _boken_), choose feminine. The analyser treats all feminines as potential masculines. - - - - - - - - - - diff --git a/dicts/Maskinlesbar.md b/dicts/Maskinlesbar.md index 9b76df1b..9dfbcc9a 100644 --- a/dicts/Maskinlesbar.md +++ b/dicts/Maskinlesbar.md @@ -1,12 +1,9 @@ -Er ordboksmanuset mitt maskinlesbart? -===================================== +# Er ordboksmanuset mitt maskinlesbart? Ordboksmanus må vere tilgjengeleg i digital form, som datafiler. Men det i seg sjølv gjer dei ikkje maskinlesbare. - For å vere maskinlesbart må eit ordboksmanus vere strukturert slik at det for kvar lemmaartikkel (ordboksartikkel) er mogleg å identifisere alle delar automatisk. Eit døme kan vere: - ``` lemma: spasere ordkl: v @@ -20,34 +17,19 @@ ordkl: v overs: swim ``` - Her er kvar lemmaartikkel identifisert med nylinje, og kvar type informasjon i lemmaartikkelen er identifisert med forklaring til venstre for kolon. 
- Ein annan type kan vere ein tabell, t.d. i eit rekneark: +| lemma | ordkl | overs | eks | eksovers | +| ------- | ----- | ----- | -------------------- | --------------------- | +| spasere | v | walk | Vi spaserte i parken | We walked in the park | +| svømme | v | swim |   |   | -| lemma | ordkl | overs | eks | eksovers -| --- | --- | --- | --- | --- -| spasere | v | walk | Vi spaserte i parken | We walked in the park -| svømme | v | swim |   |   - - -Maskinlesbare data kan vere strukturert på mange måtar, det viktige prinsippet er *det som gjeld for eit tilfelle, gjeld for alle*. Viss innhaldet i kolonne 4 er *eksempel* må det **alltid** vere eksempel (der det finst eit eksempel). Viss det ikkje finst eksempel kan vi ikkje t.d. legge til fleire engelske omsetjingar for å spare plass. Vi kan heller ikkje ha grunnform i kolonna til venstre i eitt tilfelle, men t.d. ei bøyingsform der i eit anna tilfelle. Gjer vi det, er ikkje basen vår lenger maskinlesbar. - - -Maskinlesbare data bør ikkje innehalde formattering som *kursiv* eller **halvfeit**. Dette er slikt som gjer det lettare å lese for menneske, men ikkje lettare å lese for maskiner. Dessutan veit vi ikkje kva *kursiv* betyr. Kanskje står både ordklassemarkering og eksempel i kursiv? Korleis veit maskina så kva som er kva? Av dette følgjer det at vi helst ikkje bør bruke program som AbiWord, Microsoft Word eller OpenOffice Writer til ordboksmanus. Derimot er rekneark som Eccel, Lotus, OpenOffice Calc eller Numbers godt eigna til jobben, *så lenge vi klarer å ha ein og same kategori i kvar kolonne*. - +Maskinlesbare data kan vere strukturert på mange måtar, det viktige prinsippet er _det som gjeld for eit tilfelle, gjeld for alle_. Viss innhaldet i kolonne 4 er _eksempel_ må det **alltid** vere eksempel (der det finst eit eksempel). Viss det ikkje finst eksempel kan vi ikkje t.d. legge til fleire engelske omsetjingar for å spare plass. 
Vi kan heller ikkje ha grunnform i kolonna til venstre i eitt tilfelle, men t.d. ei bøyingsform der i eit anna tilfelle. Gjer vi det, er ikkje basen vår lenger maskinlesbar. -Det er også mogleg å bruke XML (t.d. med XML-redigeringsprogram), eller eigne ordboksredigeringsprogram. +Maskinlesbare data bør ikkje innehalde formatering som _kursiv_ eller **halvfeit**. Dette er slikt som gjer det lettare å lese for menneske, men ikkje lettare å lese for maskiner. Dessutan veit vi ikkje kva _kursiv_ betyr. Kanskje står både ordklassemarkering og eksempel i kursiv? Korleis veit maskina så kva som er kva? Av dette følgjer det at vi helst ikkje bør bruke program som AbiWord, Microsoft Word eller OpenOffice Writer til ordboksmanus. Derimot er rekneark som Excel, Lotus, OpenOffice Calc eller Numbers godt eigna til jobben, _så lenge vi klarer å ha ein og same kategori i kvar kolonne_. +Det er også mogleg å bruke XML (t.d. med XML-redigeringsprogram), eller eigne ordboksredigeringsprogram. [Liste over ordboksredigeringsprogram](http://en.wikipedia.org/wiki/Dictionary_writing_system) - - - - - - - - diff --git a/dicts/Meetings/2017-12-06.md b/dicts/Meetings/2017-12-06.md index fe67c34f..91ca3821 100644 --- a/dicts/Meetings/2017-12-06.md +++ b/dicts/Meetings/2017-12-06.md @@ -1,49 +1,34 @@ -Møte om ordbøker, terminologi og REST (el. GraphQL) 6.12.2017 - +# Møte om ordbøker, terminologi og REST (el.
GraphQL) 6.12.2017 Til stades: Børre, Chiara, Ciprian, Lene, Sjur, Trond - "Ei ordbok per språkpar per retning, og alt innhald skal inn dit" - Forvirring for brukarane - [Folkets lexicon](http://folkets-lexikon.csc.kth.se/folkets/) - [https://en.wiktionary.org/wiki/Category:Northern_Sami_lemmas] - -# Dei vanlegaste nob-sme-orda - +## Dei vanlegaste nob-sme-orda adapter afghaner akselerasjon akseptabel aktivere aktor angi angivelse anlegge anoreksi anvendt arrest atelier avhjelpe avhøre avkall avkrefte avskaffelse avspark avvente avvike banner barneverntjeneste befaring begjæring begå belegg bemanne bemanning bensinmotor besitte beslag bestikkelse bestride bevisbyrde bidragsyter bilag bistå blåskjell borettslag bot brannmann brite brukar byttedyr bølgelengde bøtelegge diskurs disponering distribuere domene drivhuseffekt droppe drøfting dynamitt dynamo dødlinje enebolig enkeltstående enkeltvis erfaringsmessig erverve etterforske etterleve etterretning ettersyn fagbevegelse fastlegge fengsling filt fjerning forbudt fordre forenlig foreta forfall forlovede forskudd fortrinnsvis fortrolig forutgående forvaring forveksling fossil framdrift framsette fraværende fremkalle fremlegge fremmøte fremstå frifinne fruktose fukt funksjonshemmet fyr gjengivelse gjennomgående gjenoppta gjenvinning gluten gods gunst hallusinasjon hands heading henholdsvis henlegge heving himling holder hyre håndverker idømme ikon ilegge importere inder injeksjon innbrudd innlegge innleggelse innrede innsatt innsending innstille innta inntre intensitet iraker israeler japaner jekk karbohydrat kausjon kineser kjønnslemlestelse klargjøre klarlegge kloster kolesterol kompleksitet komprimere krenkelse kriminalomsorg kriminell krympe kvadratmeter loddtrekning lovendring løslate løslatelse marbakke medhold megle - diskurs <== ? 
+## Dei sjeldnaste nob-sme-orda -# Dei sjeldnaste nob-sme-orda - - -beramme beramme bensinmotor bend benåde bemyndige bemyndige bemanning bemanning bemanne bemanne beiterettshaver beitekonsulent beinbrudd beholdningsendring begunstigelse begunstigelse begerlav begå befordringsmiddel befordring befaring bebude bebude bebreide bebreide bebreide bearbeidelse bøtestraff bøteregister båtdrag båtbord båtbord basert barnevernloven barnesete banner banner bangladesher båndlegge balansekonto balansekonto - - -* bakslag <== -* bakkestjerne <== -* bakaksel <== - - -bahrainer badematte badematte øyenvitne østtimoreser østerriker ørebein økonomisjef økonomisjef økonomisjef ødeåker avvente avvike avvike avveining avveining avstikking avstikking avstandsskilt avsone avsone avsone avskjæring avpasse avpasse avlytte avlytte avlufting avlufting avløpsrør avløpspumpe avløpsanlegg avkryssingsoppgave avkreftelse avkrefte avkjøringsfelt avkjøringsfelt avkall avisholder avisholder avhente avhøre avhøre avgjørelsesmyndighet avgjørelsesmyndighet autovern atelier åstedsbefaring åsted æreskrenkelsessak ærbarhet arvelodd arvelodd arvelodd arvelodd arveforhold arveforhold arveforhold arveanlegg arrestkrav arresthaver armstøtte armstøtte årmann armener armener arksamler arkmater argentiner arbeidsslag arbeidersamfunn åpningsprosedyre antennelse anstalt anneksjonsskilt anleggsmaskin ankerplass ankerett ankeprøving ankegrunn ankegjenstand ankeforhandling ankedomstol angolaner angolaner angivelse allmennbegrep allmennbegrep allé allé alkoholforbrenning akvakulturanlegg aktivitetsområde akseptabel akselavstand akkar åkerrein åkerrein åkerrein åkerrein åkerrein åkergråurt afghaner adresseringsmaskin adresselinje tf-hsl-m0016:termwiki ttr000$ - +beramme beramme bensinmotor bend benåde bemyndige bemyndige bemanning bemanning bemanne bemanne beiterettshaver beitekonsulent beinbrudd beholdningsendring begunstigelse begunstigelse begerlav begå befordringsmiddel befordring befaring bebude bebude bebreide 
bebreide bebreide bearbeidelse bøtestraff bøteregister båtdrag båtbord båtbord basert barnevernloven barnesete banner banner bangladesher båndlegge balansekonto balansekonto -# Samtlege nob-sma-ord +- bakslag <== +- bakkestjerne <== +- bakaksel <== +bahrainer badematte badematte øyenvitne østtimoreser østerriker ørebein økonomisjef økonomisjef økonomisjef ødeåker avvente avvike avvike avveining avveining avstikking avstikking avstandsskilt avsone avsone avsone avskjæring avpasse avpasse avlytte avlytte avlufting avlufting avløpsrør avløpspumpe avløpsanlegg avkryssingsoppgave avkreftelse avkrefte avkjøringsfelt avkjøringsfelt avkall avisholder avisholder avhente avhøre avhøre avgjørelsesmyndighet avgjørelsesmyndighet autovern atelier åstedsbefaring åsted æreskrenkelsessak ærbarhet arvelodd arvelodd arvelodd arvelodd arveforhold arveforhold arveforhold arveanlegg arrestkrav arresthaver armstøtte armstøtte årmann armener armener arksamler arkmater argentiner arbeidsslag arbeidersamfunn åpningsprosedyre antennelse anstalt anneksjonsskilt anleggsmaskin ankerplass ankerett ankeprøving ankegrunn ankegjenstand ankeforhandling ankedomstol angolaner angolaner angivelse allmennbegrep allmennbegrep allé allé alkoholforbrenning akvakulturanlegg aktivitetsområde akseptabel akselavstand akkar åkerrein åkerrein åkerrein åkerrein åkerrein åkergråurt afghaner adresseringsmaskin adresselinje tf-hsl-m0016:termwiki ttr000$ -arbeidskrevende digitale eneveldig folkevalgt grufull grunnleggende homofil kjedelig kommersiell konservativ loddrett lokal magisk negativ oppfinnsom protestantisk radikal ren rosa sammenhengende tilfeldig ugyldig sammenhengende president albue allemannseie alpinanlegg annonsekostnad arbeidsgiver arbeidskrevende arbeidsledighetstrygd arbeidsseminar arbeidstaker arkitekt astma atom atomkraft avisredaksjon badekar badestrand bankkonto barnehjem befolkningsgruppe befolkningsvekst bergart beskrivelse betalingsplan bidrag bildemontasje bilist bilkolonne billedkunst 
bokanmeldelse bokhandel bokmerke bokmål bosetningsområde brennmerke bukser bulldoser butikksenter byplanlegger bytterett bønneskrift collage dagsorden dalbunn dampmaskin dampskip datamaskin dataoverføring debatt diplomati diplomatlosje dokumentvisning driftstilskudd drosje ekspert elvebunn etableringsstipend fagforening fagkomité fakkeltog faktum fangeleir ferievane finansminister fjellkart flertallsregjering flypassasjer flypassasjer flyreise flyttsame folkerett folkeskole folketelling forbruk formynder forurensing fossekraft foto fremmedord fremmedordbok fritidsutstyr fryser førsteside gatehjørne gatenett gjeng gjennomsnittsfamilie gjødsel grammofon granskog grunnfag grunnfagsstudium grunnleggende grunnlegger gruppepress gruvegang gymnastikksal halvtime halvårsenhet hand havavsetning havbunn helseforetak helsesøster historiker hjemmelaget homofil homofili hovedkandidat hurtigbåt høring høyblokk høydeforskjell industriby infobank innlandsfylke inntektsgrunnlag invasjon isbrefront jernkonstruksjon jernmalm jernverk jettegryte jobbintervju jordklode jordmor jordskorpeplate kalender kalori keramiker kino kirkebok kjøretøy kobberverk kommuneordfører kommunikasjon kongeskip kullgruve kultursenter kunstgjødsel kunstner kuvøse kvegfarm kvittering kvote landform landskapskart legevitenskap leilending loddrett lokomotiv lydfil læringsnett læringssenter lønnsmottaker lønnstrinn løsrive løvskog maktkamp mappesystem marine -markplanlegger massakre matbutikk medhjelper mediaklipp meglingsfrist melodi merkeklær midnattssol miljøproblem miljøteam minister minnepinne modell montasje moteklær motorkjøretøy motstandsgruppe månedsmagasin naturmedisin naturminne naturreservat nedbør nedbørfelt nettvettregler notat næringsgruppe næringsoppgave observatorium omgangsskole oppfinner oppvekstsjef oppvekstvilkår pakketur park part partigruppe pengebeløp pengesekk pensjonist personale personalrom plantegift porto postordre presselosje propaganda rasjoneringskort reindriftsforvaltning 
reindriftsorganisasjon rekkefølge reklamasjonsfrist reklame reklameavtale reklamefilm restaurering rettssikkerhet returporto riksrettssak rock rosa saksgang salgssjef samarbeidsoppgave savanne selvbilde sjørøver skipsfart skipsverft sklie skolegård skyllerom slep smaktilsetning smeltvann småbarnsmor snøbrøyting spillautomat sprøytemiddel statsborger statsborgerskap stattholder stemmelokale stereo stumfilm syden symmetri søring talerstol tariffoppgjør tast telegraf telegrafnett terrorist tidevann tilskuer tilværelse transportmiddel transportnæring trener trikk tukthus turisthytte tvil tøfler union unionsmerke utenriksminister utstilling utstillingsdukke valgresultat vindkraftverk vogn volleyball ørkenlandskap havn heltid idrettsstadion lokalpolitiker nettleser posesuppe rengjøringsassistent rengjøringsassistent returpakning sagbruk stortingsrepresentant svømmehall tannhelsetjeneste utbygging utbygging veinett Fremskrittspartiet Høyesterett ILO-konvensjonen Reindriftsloven Sameloven Utenriksdepartementet Venstre begrense eksaminere forenkle gjenbruke hevde høste innvandre installere megle omfatte oppdatere overleve privatisere produsere publisere utnevne virvle +## Samtlege nob-sma-ord +arbeidskrevende digitale eneveldig folkevalgt grufull grunnleggende homofil kjedelig kommersiell konservativ loddrett lokal magisk negativ oppfinnsom protestantisk radikal ren rosa sammenhengende tilfeldig ugyldig sammenhengende president albue allemannseie alpinanlegg annonsekostnad arbeidsgiver arbeidskrevende arbeidsledighetstrygd arbeidsseminar arbeidstaker arkitekt astma atom atomkraft avisredaksjon badekar badestrand bankkonto barnehjem befolkningsgruppe befolkningsvekst bergart beskrivelse betalingsplan bidrag bildemontasje bilist bilkolonne billedkunst bokanmeldelse bokhandel bokmerke bokmål bosetningsområde brennmerke bukser bulldoser butikksenter byplanlegger bytterett bønneskrift collage dagsorden dalbunn dampmaskin dampskip datamaskin dataoverføring debatt diplomati 
diplomatlosje dokumentvisning driftstilskudd drosje ekspert elvebunn etableringsstipend fagforening fagkomité fakkeltog faktum fangeleir ferievane finansminister fjellkart flertallsregjering flypassasjer flypassasjer flyreise flyttsame folkerett folkeskole folketelling forbruk formynder forurensing fossekraft foto fremmedord fremmedordbok fritidsutstyr fryser førsteside gatehjørne gatenett gjeng gjennomsnittsfamilie gjødsel grammofon granskog grunnfag grunnfagsstudium grunnleggende grunnlegger gruppepress gruvegang gymnastikksal halvtime halvårsenhet hand havavsetning havbunn helseforetak helsesøster historiker hjemmelaget homofil homofili hovedkandidat hurtigbåt høring høyblokk høydeforskjell industriby infobank innlandsfylke inntektsgrunnlag invasjon isbrefront jernkonstruksjon jernmalm jernverk jettegryte jobbintervju jordklode jordmor jordskorpeplate kalender kalori keramiker kino kirkebok kjøretøy kobberverk kommuneordfører kommunikasjon kongeskip kullgruve kultursenter kunstgjødsel kunstner kuvøse kvegfarm kvittering kvote landform landskapskart legevitenskap leilending loddrett lokomotiv lydfil læringsnett læringssenter lønnsmottaker lønnstrinn løsrive løvskog maktkamp mappesystem marine +markplanlegger massakre matbutikk medhjelper mediaklipp meglingsfrist melodi merkeklær midnattssol miljøproblem miljøteam minister minnepinne modell montasje moteklær motorkjøretøy motstandsgruppe månedsmagasin naturmedisin naturminne naturreservat nedbør nedbørfelt nettvettregler notat næringsgruppe næringsoppgave observatorium omgangsskole oppfinner oppvekstsjef oppvekstvilkår pakketur park part partigruppe pengebeløp pengesekk pensjonist personale personalrom plantegift porto postordre presselosje propaganda rasjoneringskort reindriftsforvaltning reindriftsorganisasjon rekkefølge reklamasjonsfrist reklame reklameavtale reklamefilm restaurering rettssikkerhet returporto riksrettssak rock rosa saksgang salgssjef samarbeidsoppgave savanne selvbilde sjørøver skipsfart 
skipsverft sklie skolegård skyllerom slep smaktilsetning smeltvann småbarnsmor snøbrøyting spillautomat sprøytemiddel statsborger statsborgerskap stattholder stemmelokale stereo stumfilm syden symmetri søring talerstol tariffoppgjør tast telegraf telegrafnett terrorist tidevann tilskuer tilværelse transportmiddel transportnæring trener trikk tukthus turisthytte tvil tøfler union unionsmerke utenriksminister utstilling utstillingsdukke valgresultat vindkraftverk vogn volleyball ørkenlandskap havn heltid idrettsstadion lokalpolitiker nettleser posesuppe rengjøringsassistent rengjøringsassistent returpakning sagbruk stortingsrepresentant svømmehall tannhelsetjeneste utbygging utbygging veinett Fremskrittspartiet Høyesterett ILO-konvensjonen Reindriftsloven Sameloven Utenriksdepartementet Venstre begrense eksaminere forenkle gjenbruke hevde høste innvandre installere megle omfatte oppdatere overleve privatisere produsere publisere utnevne virvle [Ord frå Snåasen tjïelte](https://satni.uit.no/termwiki/index.php/Collection:Sn%C3%A5asen_tj%C3%AFelte_2017-10) diff --git a/dicts/NewFeatures.md b/dicts/NewFeatures.md index f4846eaa..587cbcf2 100644 --- a/dicts/NewFeatures.md +++ b/dicts/NewFeatures.md @@ -2,23 +2,25 @@ Denne sida listar opp ting vi vil forbetre, legge til eller utprøve i NDS. -## Bøying av talord i smnfin, finX +### Bøying av talord i smnfin, finX -Dette fungerer i smenob (*vihtta, guhtta*) og i fkvnob (*viisi, kuusi*), smanob (*vijhte, govhte*) men ikkje for smnfin (*vittâ, kuttâ*) eller finsmn, finnob (*viisi, kuusi*). +Dette fungerer i smenob (_vihtta, guhtta_) og i fkvnob (_viisi, kuusi_), smanob (_vijhte, govhte_) men ikkje for smnfin (_vittâ, kuttâ_) eller finsmn, finnob (_viisi, kuusi_). TODO: Legge til paradigmer. 
-## Bøying av pronomen i smnfin +### Bøying av pronomen i smnfin - Blir bøygd: - - Personlege pronomen + - Personlege pronomen - Blir ikkje bøygd: - - Alle dei andre, ser det ut til + - Alle dei andre, ser det ut til + +### Diminutiv og forklaring på det -## Diminutiv og forklaring på det ... finst no for sanit, men ikkje for andre språk (t.d. smn) -## Lenkje til bokmerke i mobiltelefon Brukarane vil ha "ein app", +### Lenkje til bokmerke i mobiltelefon Brukarane vil ha "ein app", + dvs. NDS på mobiltelefonen. Det er planar om nedlastbar NDS, men mens vi ventar på det kan vi lage ei lenkje på sida som gjer det mogleg å lage (viser korleis ein kan lage) eit bokmerkesymbol på skrivebordet @@ -26,42 +28,42 @@ på telefonen. Ei mogleg løysing: https://github.com/docluv/add-to-homescreen -## Syntetisk tale (TTS) for nordsamisk +### Syntetisk tale (TTS) for nordsamisk Klikk på eit symbol <| og få TTS til å lese opp ordet Status: TODO. Alle komponentane eksisterer. -## IPA +### IPA Klikk på eit symbol og få translitterasjon TODO: Gå attende i svn-historia og få fram IPA-fst-en (i dag er den endra til eit ortografisk output). Som alternativ kunne vi vurdere Wiktionary sitt skript. -## Ny logo (og mindre logo for Reader) +### Ny logo (og mindre logo for Reader) Status: TODO -## Synleggjera alternative skrivemåtar +### Synleggjera alternative skrivemåtar t.d. diftoŋga/diftoŋŋa, tomáhtá/tomáhtta, tunealla/tunnealla Vi legg variantane til i xml-fila under lg (l_var?) med ein attributt som viser kva variant det er for generering (v2, v3, ...). Så vert det generert ulikt paradigme avhengig av kva variant ein trykkar på. Vi legg berre til variantar av lemma, ikkje av omsetjingar. -## Legge til stavekontroll i framleggsvindauget +### Legge til stavekontroll i framleggsvindauget Vi kan t.d. gjere slik: -1. For dei tilfella der vi **ikkje** finn ordet i ordboka xxxyyy: - 1. **Send ordet til analysatoren for yyy**, sjekk for treff. Viss "ja", foreslå å bytte retning. Viss nei: - 2. 
**Send ordet til stavekontrollen for xxx**, og gjer framlegg om **det første** rettingsframlegget. +1. For dei tilfella der vi **ikkje** finn ordet i ordboka xxxyyy: + 1. **Send ordet til analysatoren for yyy**, sjekk for treff. Viss "ja", foreslå å bytte retning. Viss nei: + 2. **Send ordet til stavekontrollen for xxx**, og gjer framlegg om **det første** rettingsframlegget. -## DONE +### DONE Her kjem ting vi allereie har gjort. -### Etymologi +#### Etymologi Klikk på eit symbol ETYM og få lenkje til Kotus (den finske etymologiske databasen) @@ -75,21 +77,21 @@ http://kaino.kotus.fi/algu/index.php?t=haku&o=hae&l=1&valinta=1&valintaryhma=1&k ... der målspråkkoden er: -* kieli=45 = sme -* kieli=41 = sma -* kieli=46 = sma -* kieli=47 = sms -* kieli=29 = fin -* kieli=29 = fin -* kieli=30 = izh -* kieli=36 = liv -* kieli=60 = myv -* kieli=61 = mjd -* kieli=62 = mhr -* kieli=63 = mrj -* kieli=20 = kom -* kieli=21 = udm -* kieli=3 = yrk +- kieli=45 = sme +- kieli=41 = sma +- kieli=46 = sma +- kieli=47 = sms +- kieli=29 = fin +- kieli=29 = fin +- kieli=30 = izh +- kieli=36 = liv +- kieli=60 = myv +- kieli=61 = mjd +- kieli=62 = mhr +- kieli=63 = mrj +- kieli=20 = kom +- kieli=21 = udm +- kieli=3 = yrk ... og grensesnittspråk er: @@ -99,57 +101,63 @@ http://kaino.kotus.fi/algu/index.php?t=haku&o=hae&l=1&valinta=1&valintaryhma=1&k http://kaino.kotus.fi/algu/index.php?t=haku&o=hae&l=1&valinta=1&valintaryhma=1&kieli=45&hakusana=sátni&kkieli=en ``` -### Forbedre presentasjon av sammensatte ord +#### Forbedre presentasjon av sammensatte ord Se "Sammensetninger i nds" i [180926](/admin/giellatekno/180926.html) -### Bug 2406 (egentlig ikke new feature) +#### Bug 2406 (egentlig ikke new feature) Kontekst som dette, blir ikke presentert. 
+ - entry_context: "mun" tag_context: "V+Ind+Prs+Sg1" - template: "(mun) ` word_form `" + template: "(mun) `word_form`" + +#### Oppdatere bokmerke -### Oppdatere bokmerke Det er sjekket inn korrigert feedback-adresse (giellatekno@hum.uit.no > giellatekno@uit.no) i fila apps/dicts/nds/src/neahtta/static/js/bookmarklet.js Denne skal oppdateres i grensesnittet. -### Linker til Korp fra finsmn og nobsma +#### Linker til Korp fra finsmn og nobsma + link til tospråklig korpus på samme måte som fra nobsme -### Forbedret presentasjon av derivasjoner (behandle derivasjoner som sammensatte ord) +#### Forbedret presentasjon av derivasjoner (behandle derivasjoner som sammensatte ord) Sammensatte ord fungerer slik idag: -* viessohaddi viessu+N+Cmp/SgNom+Cmp#haddi+N+Sg+Nom -* Både viessu og (#)haddi sendes til ordboka og vi får oversettelse av begge: - - viessu = hus, haddi = pris -* Hvis viessohaddi også er i ordboka, så blir også denne presentert med oversetting, øverst + +- viessohaddi viessu+N+Cmp/SgNom+Cmp#haddi+N+Sg+Nom +- Både viessu og (#)haddi sendes til ordboka og vi får oversettelse av begge: + - viessu = hus, haddi = pris +- Hvis viessohaddi også er i ordboka, så blir også denne presentert med oversetting, øverst Derivasjoner (alle som starter med Der/ ): -* borralit borrat+V+TV+Der/l+V+Inf -* Vi ønsker at både borrat og Der/l skal sendes til ordboka, hvor det skal være en entry med forklaringer: - - borrat = spise, Der/l = gjøre noe raskt, eller starte en bevegelse + +- borralit borrat+V+TV+Der/l+V+Inf +- Vi ønsker at både borrat og Der/l skal sendes til ordboka, hvor det skal være en entry med forklaringer: + - borrat = spise, Der/l = gjøre noe raskt, eller starte en bevegelse Dette skal også gjelde noen infinitte verbformer, f.eks. 
-* borakeahttá borrat+V+TV+VAbess -* Også VAbess skal sendes til ordboka, hvor det skal være en entry med forklaringer for denne - - borrat = spise, VAbess = uten å gjøre det -Vi trenger entrier for verdier som 'Der/l' eller 'VAbess' i ordboken. +- borakeahttá borrat+V+TV+VAbess +- Også VAbess skal sendes til ordboka, hvor det skal være en entry med forklaringer for denne + - borrat = spise, VAbess = uten å gjøre det -### Legge til l_ref feature også i NDS +Vi trenger entrier for verdier som 'Der/l' eller 'VAbess' i ordboken. + +#### Legge til l_ref feature også i NDS Se [om l_ref](dictionarywork.html#Bruk+av+l_ref+i+xml++%28gjelder+bare+VD%29) -### Legge til informasjon om stammetype +#### Legge til informasjon om stammetype substantiv, verb, adjektiv -* 2syll = likestavelsesstamme -* 3syll = ulikestavelsesstamme -* Csyll = kontrakt stamme +- 2syll = likestavelsesstamme +- 3syll = ulikestavelsesstamme +- Csyll = kontrakt stamme info hentes fra kontinuasjonsleksikonene i main/langs/sme/src/morphology/stems/nouns.lexc @@ -166,10 +174,10 @@ GOAHTI-A er for 2syll. Liste over kontinuasjonsleksikonene vs. 
stem type er i \\ trunk/words/dicts/smenob/scripts/nouns_stemtypes.txt -### Behandling av derivasjoner med flere analyser +#### Behandling av derivasjoner med flere analyser -* 1: Når samme lemma + Der-tagg med og uten Err/Orth: -Oppslag og høyremarg: vis bare den uten Err/Orth +- 1: Når samme lemma + Der-tagg med og uten Err/Orth: + Oppslag og høyremarg: vis bare den uten Err/Orth ``` "skuvla+N+Der/Dimin" @@ -180,8 +188,8 @@ skuvllaš skuvla+N+Der/Dimin+N+Sg+Gen+Err/Orth-nom-gen skuvllaš skuvla+N+Der/Dimin+N+Sg+Acc+Err/Orth-nom-acc ``` -* 2: Når samme lemma + Der-tagg uten Err/Orth: -Oppslag: vis bare en gang - Høyremarg: vis alle analyser +- 2: Når samme lemma + Der-tagg uten Err/Orth: + Oppslag: vis bare en gang - Høyremarg: vis alle analyser ``` "skuvla+N+Der/Dimin" @@ -191,8 +199,8 @@ skuvllažiid skuvla+N+Der/Dimin+N+Pl+Gen skuvllažiid skuvla+N+Der/Dimin+N+Pl+Acc ``` -* 3: Når alle lemma + Der-tagg er med Err/Orth: -Oppslag: vis bare en gang - Høyremarg: vis alle analyser +- 3: Når alle lemma + Der-tagg er med Err/Orth: + Oppslag: vis bare en gang - Høyremarg: vis alle analyser ``` kántuvrraš kantuvra+Err/Orth-a-á+N+Der/Dimin+N+Sg+Nom @@ -201,16 +209,16 @@ kántuvrraš kantuvra+Err/Orth-a-á+N+Der/Dimin+N+Sg+Acc+Err/Orth-nom-acc ``` -* 4: Når det er både lemma og lemma + Der-tagg: -Oppslag: vis både leksikalisert lemma (øverst) og lemma med Der-tagg +- 4: Når det er både lemma og lemma + Der-tagg: + Oppslag: vis både leksikalisert lemma (øverst) og lemma med Der-tagg ``` geavahit geavahit+V+TV+Inf <= geavahit geavvat+V+IV+Der/h+V+TV+Inf <= ``` -* 5: Når det lemma + Der-tagg og lemma + Der-tagg + Der-tagg : -Oppslag: vis bare lemma med færrest dertagger - Høyremarg: vis bare lemma med færrest Der-tagg ? Vet ikke +- 5: Når det lemma + Der-tagg og lemma + Der-tagg + Der-tagg : + Oppslag: vis bare lemma med færrest dertagger - Høyremarg: vis bare lemma med færrest Der-tagg ? 
Vet ikke ``` geavahuvvot geavvat+V+IV+Der/h+V+TV+Der/PassL+V+IV+Inf @@ -220,7 +228,7 @@ geavahuvvogoahtit geavvat+V+IV+Der/h+V+TV+Der/PassL+V+IV+Der/InchL+V+Inf geavahuvvogoahtit geavahit+V+TV+Der/PassL+V+IV+Der/InchL+V+Inf <= ``` -## Forbedre etymologi +### Forbedre etymologi Det hadde vore betre å lenkje direkte til artikkelen, men for å få til det må vi hente sanue_id-nummeret frå databasen. Vi kan t.d. legge det inn som ein id i kjeldekoden: @@ -228,11 +236,12 @@ Det hadde vore betre å lenkje direkte til artikkelen, men for å få til det m Status: gjort -## Flytte re-node framfor omsetjing +### Flytte re-node framfor omsetjing + +### Ordbok for nordsamisk-spansk -## Ordbok for nordsamisk-spansk Status: Demoversjon ligg ute -## Fjerne korp-lenkjer frå paradime +### Fjerne korp-lenkjer frå paradime -Då vi har lagt til lenkjer frå adjektivparadigme til smi.cgi, vert det forvirrande å ha lenkjer frå verbparadigme til Korp. Vi fjerner Korp-lenkjene, sidan det likevel finst lenkje til Korp i analyseboken til høgre. \ No newline at end of file +Då vi har lagt til lenkjer frå adjektivparadigme til smi.cgi, vert det forvirrande å ha lenkjer frå verbparadigme til Korp. Vi fjerner Korp-lenkjene, sidan det likevel finst lenkje til Korp i analyseboken til høgre. diff --git a/dicts/NyeKandidater.md b/dicts/NyeKandidater.md index 3de5df2c..9a1c7acc 100644 --- a/dicts/NyeKandidater.md +++ b/dicts/NyeKandidater.md @@ -1,29 +1,22 @@ -# Arbeidsmåte, +# Arbeidsmåte, eksempel nobsme +## Arbeid i inc/kandidatar.csv -# Arbeid i inc/kandidatar.csv - - -Legg til PoS, restriksjon, oversetting, eksempelsetning, oversetting av eksempelsetning. +Legg til PoS, restriksjon, oversetting, eksempelsetning, oversetting av eksempelsetning. Viktig at alle disse linjene har fem underscore. - Restriksjon skrives i parantes for å gjøre csv-lista mer lesbar. - Eksempelsetning bør være en evt. forkorta versjon av setning funnet i Korp eller på internett. - Man kan godt hoppe over ord i lista. 
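The rule above, that every completed candidate line carries five underscores, can be checked mechanically before the file is processed further. The following is a minimal sketch, not part of the repository: `check_candidates` is a hypothetical helper name, and it assumes that lines containing `#` are comments and that candidates not yet worked on contain no underscore at all.

```shell
# Hypothetical helper (an assumption, not a script from the repository):
# print "lineno: line" for every non-comment line that has at least one
# underscore but not exactly five.
check_candidates() {
  awk '!/#/ {
         n = gsub(/_/, "_")          # gsub returns the number of underscores
         if (n > 0 && n != 5) printf "%d: %s\n", NR, $0
       }' "$1"
}
```

Running `check_candidates inc/kandidatar.csv` from the dictionary directory would then list only the half-finished lines; empty output suggests every started line has all its fields.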
- -# Lage xml-fil med de nye ordene og fjern dem fra inc/kandidatar.csv +## Lage xml-fil med de nye ordene og fjern dem fra inc/kandidatar.csv Med utgangspunkt i dicts/nobsme/ katalogen, gjør disse kommandoene: - ``` grep '_.*_.*_.*_.*_' inc/kandidatar.csv |grep -v '#' > inc/nyeord.csv @@ -39,8 +32,7 @@ cat inc/nyeord.csv | perl scripts/csv2xml_with_re_xg.pl >> inc/nyeord.xml Merk at perlfila i `scripts/` kan ha andre navn, f.eks. `c2x.pl` eller lignende - -# Rediger inc/nyeord.xml +## Rediger inc/nyeord.xml see `inc/nyeord.xml` (eller tilsvarende fil med kandidater) @@ -76,10 +68,8 @@ Samme lemma med ny oversetting: rediger slik at det blir en `` med to `` ``` - **Resultatet blir slik:** - ``` @@ -100,9 +90,3 @@ Samme lemma med ny oversetting: rediger slik at det blir en `` med to `` ``` - - - - - - diff --git a/dicts/PrinsippForOrdbokssnuing.md b/dicts/PrinsippForOrdbokssnuing.md index 3364cce0..27657fc3 100644 --- a/dicts/PrinsippForOrdbokssnuing.md +++ b/dicts/PrinsippForOrdbokssnuing.md @@ -1,81 +1,61 @@ - - - - Dette dokumentet diskuterer prinsipp for korleis vi kan redigere ordbøker som har vorte snudd, td. frå sme-nob til nob-sme. Problema som blir tatt opp er dei vi støyter på når ei norsk forklaring til eit samisk, kvensk, ... ord plutseleg dukkar opp som norsk oppslagsord. - # Overordna spørsmål uansett ordklasse - **Valg av ord i kildespråket** - Poenget må være å legge ordet under kildespråk-ordet som brukeren vil leite etter: -1. Finnes det et ord på kildespråket som dekker innholdet? +1. Finnes det et ord på kildespråket som dekker innholdet? 1. Vil det være naturlig for brukeren å leite etter dette ordet? 
\\ -Eksempel *sykne = buohccát* som er korrekt, men verbet *sykne* er ikke vanlig i norsk, annet enn i fast uttrykk som *sykne hen*, derfor bør ordet også legge under adjektivet: *syk A = buohcci A, (å bli syk) buohccát V* + Eksempel _sykne = buohccát_ som er korrekt, men verbet _sykne_ er ikke vanlig i norsk, annet enn i fast uttrykk som _sykne hen_, derfor bør ordet også legge under adjektivet: _syk A = buohcci A, (å bli syk) buohccát V_ 1. Flerordsuttrykk? - 1. ja: men grensesnittet er avgjørende for hvor mange man kan vises - 1. nei: vil gi større artikler, viktig at det blir leselige (f.eks. i forhold til annen info. Evt fordel ved klikk-i-tekst - hvis mulig å få dem opp) + 1. ja: men grensesnittet er avgjørende for hvor mange man kan vises + 1. nei: vil gi større artikler, viktig at det blir leselige (f.eks. i forhold til annen info. Evt fordel ved klikk-i-tekst - hvis mulig å få dem opp) 1. Gjøre oppmerksom på at det på målspråket skilles mellom betydninger som det ikke skilles mellom på kildespråket \\ -Eksempel *grå A = ránis A, (om hår, skjegg, fjær) čuorgat A* + Eksempel _grå A = ránis A, (om hår, skjegg, fjær) čuorgat A_ 1. Gjøre oppmerksom at på målspråket bruker man f.eks. et verb for å uttrykke noe som man på kildespråket f.eks. bruker et flerordsuttrykk for. \\ -Eksempel *grå A = ránis A*, men også *(se grå ut) rádnát V* -1. Gjøre oppmerksom på at det finnes flere ord å velge mellom \\ -Eksempel *laks N = luossa N*, men også *(voksen hunnlaks) duovvi N, (voksen hannlaks) goadjin N, (unglask) diddi N* osv. *unglaks N = diddi N* kan også være et eget oppslagsord, hvis unglaks er et naturlig ord å lete etter på kildespråket. \\ -Men for *ball* vil det ikke være nødvendig å legge *fotball*, fordi det vil være naturlig ord å lete etter på kildespråket. - + Eksempel _grå A = ránis A_, men også _(se grå ut) rádnát V_ +1. 
Gjøre oppmerksom på at det finnes flere ord å velge mellom \\ + Eksempel _laks N = luossa N_, men også _(voksen hunnlaks) duovvi N, (voksen hannlaks) goadjin N, (unglask) diddi N_ osv. _unglaks N = diddi N_ kan også være et eget oppslagsord, hvis unglaks er et naturlig ord å lete etter på kildespråket. \\ + Men for _ball_ vil det ikke være nødvendig å legge _fotball_, fordi det vil være naturlig ord å lete etter på kildespråket. **Kildehenvisninger** - 1. kildehenvisninger som finnes (som ikke lenger er konsistente) - 1. fad: ordparet kommer fra korpus - 1. nj: ordparet kommer fra N Jernsletten + 1. fad: ordparet kommer fra korpus + 1. nj: ordparet kommer fra N Jernsletten 1. hva gjør vi med kildehenvisninga når vi legger til en oversetting, forbedrer ortografi, legger til en osv? - 1. Eit mogleg svar: la stå heilt til endringa er substansiell, deretter endre/fjerne + 1. Eit mogleg svar: la stå heilt til endringa er substansiell, deretter endre/fjerne 1. skal kildehenvisninga vises i grensesnittet? - 1. i tilfelle bør vi tilby to grensesnitt: eitt med og eitt utan + 1. i tilfelle bør vi tilby to grensesnitt: eitt med og eitt utan 1. kan kildehenvisninga brukes av Giellagáldu eller andre? 1. legge til informasjon i xml-formatet (signere ved endringer osv)? - - - - - # Verb ## Overordna spørsmål ### Fleirordsuttrykk eller ikkje som lemma? - Jf. eit tilfelle som vokse vs. vokse opp: - -``` +```text "vokse" (med eksempel vokse opp) "vokse opp" som eige oppslag ``` - Vi har tre alternativ: - 1. Berre einskildverb som oppslagsord (med partiklar som underoppslag) 1. Partikkelverb som separate oppslag 1. Begge delar - Konklusjon: ... vil variere frå tilfelle til tilfelle, sjå lenger ned. - ## Verbtyper ### Partikkelverb @@ -84,11 +64,9 @@ Konklusjon: ... vil variere frå tilfelle til tilfelle, sjå lenger ned. 
- bære hen - dale ned - -Dette er moglege kandidatar for å ha fleirordsuttrykk (her: *dale ned*) +Dette er moglege kandidatar for å ha fleirordsuttrykk (her: _dale ned_) som lemma. - ### verb + preposisjon Her er det tilfelle som @@ -96,207 +74,146 @@ Her er det tilfelle som - fokusere på - folde ut -her vil vi ha *fokusere* som oppslagsord, med to mønster: - -* fokusere ei kameralinsa, fokusere blikket (Acc) -* fokusere på ei sak (Ill?) - +her vil vi ha _fokusere_ som oppslagsord, med to mønster: +- fokusere ei kameralinsa, fokusere blikket (Acc) +- fokusere på ei sak (Ill?) ### verb + adverb/adverbial - Dette er tilfelle som - -``` +```text verb + ...fort, hardt, litt, plutseleg, ... fort + verb + ... (sma) ``` - - - - - -Når målspråkordet er resultat av eit modifisert +Når målspråkordet er resultat av eit modifisert uttrykk (hoppe til, plutselig hoppe til, ..). -Desse må under hovudverbet *hoppe* - - - - -*anse for å være dyr - divrrašit V* skal under *dyr* - - +Desse må under hovudverbet _hoppe_ +_anse for å være dyr - divrrašit V_ skal under _dyr_ ### verb + objekt - **Type 1**: objektet er nært knytt til verbet +- bake brød +- avlegge ed +- flekke fisk -* bake brød -* avlegge ed -* flekke fisk - - -=> legges under både verbet og også under objektet dersom tilsvarende verb ikke finnes på kildespråket, eller ikke ville være naturlig å lete etter. Eksempler: - -* *molte N = luomi N, (plukke molter) lubmet V* -* *kake N = gáhkku N, (bake kake) gáhkket V* - - +=> legges under både verbet og også under objektet dersom tilsvarende verb ikke finnes på kildespråket, eller ikke ville være naturlig å lete etter. 
Eksempler: +- _molte N = luomi N, (plukke molter) lubmet V_ +- _kake N = gáhkku N, (bake kake) gáhkket V_ **Type 2**: Objektet er ikkje like nært knytt til verbet -* ta et skritt = lávket V -* få posefasong = goarvanit V - +- ta et skritt = lávket V +- få posefasong = goarvanit V => legges under objektet -* *skritt N = lávki N, (ta et skritt) lávket V* -* *posefasong N = (få posefasong) goarvanit V* - +- _skritt N = lávki N, (ta et skritt) lávket V_ +- _posefasong N = (få posefasong) goarvanit V_ slike "oppslagsord" som "få posefasong" er styrt av målspråket, og ikke av kildespråket. - Man legger til en ny i entryen for å gjøre oppmerksom på at når man skal bruke det norske ordet "skritt" i sammenhengen "ta et skritt", så skal man bruke verbet "lávket" istedenfor substantivet "lávki" - - ### verb (+ objekt) + PP Dei mest komplekse av desse bør vi berre fjerne. Dei andre bør inn under ordet med mest semantisk vekt. T.d. heller under "ruse" enn under "fisk". - -* fange fisk med ruse => ruse V -* flette sennegressknipper => sennegressknippe N -* drive reinflokken ned fra fjellet til et bestemt sted => drive V -* forårsake at noen blir ille omtalt => baksnakke V, omtale V ? -* legge seg mer eller mindre flatt i vannet => fjerne? - +- fange fisk med ruse => ruse V +- flette sennegressknipper => sennegressknippe N +- drive reinflokken ned fra fjellet til et bestemt sted => drive V +- forårsake at noen blir ille omtalt => baksnakke V, omtale V ? +- legge seg mer eller mindre flatt i vannet => fjerne? ### Forklaring som oppslag +- "avgi skjærende lyd", +- "bruke mange norske ord når man snakker samisk" -* "avgi skjærende lyd", -* "bruke mange norske ord når man snakker samisk" - - -Viss vi ikkje finn måtar å gjere om desse til norske lemma +Viss vi ikkje finn måtar å gjere om desse til norske lemma (ein- eller fleirordsuttrykk), men berre blir ståande med forklaringar, fjernar vi dei frå nob-X-ordboka. 
- ### refleksive verb - -* barbere seg -* gifte seg - +- barbere seg +- gifte seg Refleksive verb har som regel også ikkje-refleksiv bruk: -*barbere snøkanten, gifte bort dattera si*. Refleksiven -bør vere i re-feltet, for å vise at det faktisk er refleksivt +_barbere snøkanten, gifte bort dattera si_. Refleksiven +bør vere i re-feltet, for å vise at det faktisk er refleksivt på norsk, og for å skilje mellom refleksiv og ikkje-refleksiv bruk. I nokre tilfelle kan verbet ta refleksivt pronomen også på samisk (ráhket iežas), i andre tilfelle ikkje (náitalit). Dette må gå fram av oppslaget. - ### Inkoative verb -* begynne å fortelle => fortelle V - +- begynne å fortelle => fortelle V ### Durative verb -* drive på å farte => farte V - +- drive på å farte => farte V ### Kausative verb - -* få noen til å gråte => gråte V -* få noen til å hoppe => hoppe V - - - +- få noen til å gråte => gråte V +- få noen til å hoppe => hoppe V ### Passive verb med bli + V - -* bli bundet -* bli drept - +- bli bundet +- bli drept Viss det er mogleg bruker vi s-passiv-infinitiven som lemma (bindes, drepes). På den måten kjem oppslagsorda som regel -attmed kvarandre (drepe, drepes). - +attmed kvarandre (drepe, drepes). For verb som er resultat av vanlig passivavledning, bør det lages oppslag bare hvis kildespråket krever det. - - - ### verb + adjektiv +- farge blå -* farge blå - - -Verbet *å farge* bør ha eitt eksempel, slik at brukaren +Verbet _å farge_ bør ha eitt eksempel, slik at brukaren ser at målspråket bruker eit verb avleidd av adjektivet. -Deretter bør *å farge gul* osb. vere eksempel under +Deretter bør _å farge gul_ osb. vere eksempel under gul, osb. - Det tilsvarande gjeld for andre verb avleidd av adjektiv. -I nokre tilfelle finst det slike norske verb -(*forstørre, forminske*), då bruker vi dei. - - - +I nokre tilfelle finst det slike norske verb +(_forstørre, forminske_), då bruker vi dei. 
### bli + A - -* bli døv => døv A -* bli edru => edru A -* bli blekgul => blekgul A - - - +- bli døv => døv A +- bli edru => edru A +- bli blekgul => blekgul A ### Verb med restriksjon i lemmaet - -* dryppe fra sår => dryppe V - - +- dryppe fra sår => dryppe V ## Sentrale verb - ### ta - -``` -nobsme: +```text +nobsme: ta = váldit ta det med ro ta en snartur ta en tur ta et skritt = lávket ta et tak -ta forbehold +ta forbehold ta fort ta fram ta hensyn til @@ -309,11 +226,9 @@ ta slutt ta stilling til ``` - Nynorskordboka - -``` +```text ta 1 gripe 2 røre @@ -335,17 +250,4 @@ ta bladet fra munnen ta feil ``` - (lista her har omtrent ingen fleirordsuttrykk til felles med nobsme). - - - - - - - - - - - - diff --git a/dicts/SkoltSaami2X.md b/dicts/SkoltSaami2X.md index 7b6946cb..886e4ee6 100644 --- a/dicts/SkoltSaami2X.md +++ b/dicts/SkoltSaami2X.md @@ -2,97 +2,73 @@ This page documents the Skolt Saami dictionary projects at Giellatekno. - The backbone of all dictionary projects, incl. the contlex files for the morphological analyzer, the Skolt Saami Oahpa and different user dictionaries are the lexical data stored in a database called sms2X. - - - ## User dictionaries we are are working on at present -* [Neahttadigisánit: saan.oahpa.no](http://saan.oahpa.no) -* [Stem-based webdict](http://gtweb.uit.no/webdict/index_sms-eng.html) - - +- [Neahttadigisánit: saan.oahpa.no](http://saan.oahpa.no) +- [Stem-based webdict](http://gtweb.uit.no/webdict/index_sms-eng.html) ## Other applications linked to the lexical database -* [Oahpa!-nuõrti](http://oahpa.no/sms/) -* [FST](/lang/sms/j-sms.html) - - +- [Oahpa!-nuõrti](http://oahpa.no/sms/) +- [FST](/lang/sms/j-sms.html) ## The sms2X lexical data backbone + The aim with this common dictionary database is to create a rich structure in single lexicon. We are working on a lexicographic structure which later allows exporting data for different applications: e.g. 
descriptive dictionaries, bilingual learner dictionaries, Oahpa!-nuõrti, etc. Thus "sms2X" means both "to-X-languages" and "to-X-products".
-
 The database is the result of collaborative work carried out at Østsamisk museum Neiden, Freiburg Research Group in Saami Studies, Giellatekno, and members of the Skolt Saami language communities.
+### Using XML with the NDS dictionary
-### Using XML with the NDS dictionary
-
-
-* [Documentation for dictionary work with NDS](sms/SkoltSaamiDictionaryFeatures.html)
-
+- [Documentation for dictionary work with NDS](sms/SkoltSaamiDictionaryFeatures.html)
 ### Database
-The dictionary database sms2X is devided into several single files, each representing one of the
+The dictionary database sms2X is devided into several single files, each representing one of the
 #### Underived parts-of-speech
-
-* [a - adjective](SkoltSaami2X/Adjectives.html)
-* [adp - adposition](SkoltSaami2X/Adpositions.html)
-* adv - adverb
-* cc - conjunction
-* cs - subjunction
-* det - determiner
-* i - interjection
-* n - noun
-* [num - numeral](SkoltSaami2X/Numerals.html)
-* pcle - particle
-* pro - pronoun
-* prop - proper noun
-* v - verb
-
-
-#### Derived parts-of-speech
-
+- [a - adjective](SkoltSaami2X/Adjectives.html)
+- [adp - adposition](SkoltSaami2X/Adpositions.html)
+- adv - adverb
+- cc - conjunction
+- cs - subjunction
+- det - determiner
+- i - interjection
+- n - noun
+- [num - numeral](SkoltSaami2X/Numerals.html)
+- pcle - particle
+- pro - pronoun
+- prop - proper noun
+- v - verb
+
+#### Derived parts-of-speech
 Since most derivations are formed by means of regular/productive morphology and do not represent own lemmas they are stored in separate files for derived PoS's with the link to the respective root as a variable.
 For different kinds of dictionaries, we will later handle derivations differently:
-
-* Oahpa!-nuõrti includes derivations similar to other lemmas (if these derivations are tagged for oahpa)
-* saan.oahpa.no analyses derivations if they are written in the FST
-* in a future printed dictionary some derivations will be listed under root lemmas
-* contlex lexica do not include productive derivations
-
+- Oahpa!-nuõrti includes derivations similar to other lemmas (if these derivations are tagged for oahpa)
+- saan.oahpa.no analyses derivations if they are written in the FST
+- in a future printed dictionary some derivations will be listed under root lemmas
+- contlex lexica do not include productive derivations
 A PROBLEM: what are the productive (non/lexicalized) derivations and how do we tag them?
-
 These are the files for derived parts-of-speech:
+- [der_a - derived adjectives](SkoltSaami2X/Adjectives.html)
+- der_adv
+- der_det
+- der_n
+- [der_num - derived numerals](SkoltSaami2X/Numerals.html)
+- der_pro
+- der_v
-* [der_a - derived adjectives](SkoltSaami2X/Adjectives.html)
-* der_adv
-* der_det
-* der_n
-* [der_num - derived numerals](SkoltSaami2X/Numerals.html)
-* der_pro
-* der_v
-
-
-#### Other
-
-
-* [abbr - abbreviations](SkoltSaami2X/Abbreviations.html)
-* mwe - multiword expressions (listed as lemmas, e.g. for Oahpa!-nuõrti)
-* [inf - inflected forms](SkoltSaami2X/Inflections.html)
-* [var - variants](SkoltSaami2X/Variants.html)
-
-
-
+#### Other
+- [abbr - abbreviations](SkoltSaami2X/Abbreviations.html)
+- mwe - multiword expressions (listed as lemmas, e.g.
for Oahpa!-nuõrti)
+- [inf - inflected forms](SkoltSaami2X/Inflections.html)
+- [var - variants](SkoltSaami2X/Variants.html)
diff --git a/dicts/SkoltSaami2X/Abbreviations.md b/dicts/SkoltSaami2X/Abbreviations.md
index 28a7bea2..b5697239 100644
--- a/dicts/SkoltSaami2X/Abbreviations.md
+++ b/dicts/SkoltSaami2X/Abbreviations.md
@@ -1,21 +1,9 @@
 This page is part of the documentation of the [Skolt Saami dictionary projects at Giellatekno](../SkoltSaami2X.html).
-
-
-
 Abbreviations are listed as lemmas of its own with translations into Skolt Saami and other languages (which are again abbreviations) and explanations into Skolt Saami and other languages (which present the non-abbreviated full forms).
-
-
-
 # The file abbr_sms2X
-
-
-
 # Open question
-
 Is it usefull to handle abbreviations like this?
-
-
diff --git a/dicts/SkoltSaami2X/Adjectives.md b/dicts/SkoltSaami2X/Adjectives.md
index b77e3df1..18fa5869 100644
--- a/dicts/SkoltSaami2X/Adjectives.md
+++ b/dicts/SkoltSaami2X/Adjectives.md
@@ -1,26 +1,15 @@
 This page is part of the documentation of the [Skolt Saami dictionary projects at Giellatekno](../SkoltSaami2X.html).
-
 Adjectivs are an open class in Skolt Saami. Adjectives are included in two different source files listed below.
-
 # The file a_sms2X
-
 underived adjective stems
-
-
-
 # The file der_a_sms2X
-
 derived adjective stems
-
-
-
 # Open question
-
 how to deal with attributive and predicative adjectives which are not derivable from each other
diff --git a/dicts/SkoltSaami2X/Adpositions.md b/dicts/SkoltSaami2X/Adpositions.md
index 56bdd0d5..d511086a 100644
--- a/dicts/SkoltSaami2X/Adpositions.md
+++ b/dicts/SkoltSaami2X/Adpositions.md
@@ -1,21 +1,9 @@
 This page is part of the documentation of the [Skolt Saami dictionary projects at Giellatekno](../SkoltSaami2X.html).
-
-
-
 Adpositions are non-inflecting closed class in Skolt Saami. There are mostly postpositions and few prepositions.
-
-
-
 # The file adp_sms2X
-
-
-
 # Open question
-
 Should we rather use separate files for pr and po?
-
-
diff --git a/dicts/SkoltSaami2X/Inflections.md b/dicts/SkoltSaami2X/Inflections.md
index 3281e0c1..efdaf625 100644
--- a/dicts/SkoltSaami2X/Inflections.md
+++ b/dicts/SkoltSaami2X/Inflections.md
@@ -1,21 +1,9 @@
 This page is part of the documentation of the [Skolt Saami dictionary projects at Giellatekno](../SkoltSaami2X.html).
-
-
-
 Inflected forms are listed as lemmas of its own (with translations) if they occur in teaching materials or other sources and need to be compiled for oahpa and other dictionaries where users might search for them. For saan.oahpa.org these data is not used (because the FST should find inflected forms).
-
-
-
 # The file inf_sms2X
-
-
-
 # Open question
-
-Is it useful to handle inflected forms like this?
-
-
+Is it useful to handle inflected forms like this?
diff --git a/dicts/SkoltSaami2X/Numerals.md b/dicts/SkoltSaami2X/Numerals.md
index dccfa74c..902d1eb7 100644
--- a/dicts/SkoltSaami2X/Numerals.md
+++ b/dicts/SkoltSaami2X/Numerals.md
@@ -1,18 +1,7 @@
 This page is part of the documentation of the [Skolt Saami dictionary projects at Giellatekno](../SkoltSaami2X.html).
-
-
-
 So far, the db-file includes both cardinals, ordinals, and other quantifiers; syntactically these are different PoS's though
-
-
-
 # The file num_sms2X
-
-
-
 # Open question
-
-
diff --git a/dicts/SkoltSaami2X/Variants.md b/dicts/SkoltSaami2X/Variants.md
index 3c191b35..ed69eddd 100644
--- a/dicts/SkoltSaami2X/Variants.md
+++ b/dicts/SkoltSaami2X/Variants.md
@@ -1,21 +1,9 @@
 This page is part of the documentation of the [Skolt Saami dictionary projects at Giellatekno](../SkoltSaami2X.html).
-
-
-
-Variants (dialectal, etc.) are stored in a separate file with the link to the respective main lemma as a variable.
-
-
-
+Variants (dialectal, etc.) are stored in a separate file with the link to the respective main lemma as a variable.
# The file var_sms2X - - - # Open question - Note that this is a weird structure, perhaps, but I want to park these variants somewhere. We do not need them now for Giellatekno-tools. However,Their final place cannot be in the main lemma list. - - diff --git a/dicts/TerminologyProjects.md b/dicts/TerminologyProjects.md index a348e219..61b5a315 100644 --- a/dicts/TerminologyProjects.md +++ b/dicts/TerminologyProjects.md @@ -1,15 +1,8 @@ - - - - Relevant links: - -* Risten 2: [satni.org|http://satni.org] (btw., [baakoe.org](http://baakoe.org) links to the same page) -* [Termwiki](http://gtsvn.uit.no/termwiki/index.php/Váldosiidu) - +- Risten 2: [satni.org|http://satni.org] (btw., [baakoe.org](http://baakoe.org) links to the same page) +- [Termwiki](http://gtsvn.uit.no/termwiki/index.php/Váldosiidu) Cooperation partners: - -* [Giellagáldu](https://www.facebook.com/SamiGiellagaldu/info) +- [Giellagáldu](https://www.facebook.com/SamiGiellagaldu/info) diff --git a/dicts/TermwikiAsDictionaryEditor.md b/dicts/TermwikiAsDictionaryEditor.md index e2dfe5f3..12373938 100644 --- a/dicts/TermwikiAsDictionaryEditor.md +++ b/dicts/TermwikiAsDictionaryEditor.md @@ -1,10 +1,6 @@ -Møte om Termwiki og ordbøker 7.6.2017 - - - - -# TermWiki slik han er organisert no +# Møte om Termwiki og ordbøker 7.6.2017 +## TermWiki slik han er organisert no ``` Semantikk: Konsept @@ -12,13 +8,10 @@ Semantikk: Konsept Uttrykk sme smj nob ... ``` - -# Termwiki-strukturen brukt for ordbøker - +## Termwiki-strukturen brukt for ordbøker Lister over ord på kvart språk (=lemmaliste): - ``` sme smj sma nob ord-i ... ... ord-a @@ -26,17 +19,13 @@ ord-j ... ... ord-b ord-k ``` - der ord-i, ... er eit sett av (lemma, stamme, POS, ...), dvs. ord-i = representasjonen av ord-i i lexc. - Ordbøker kan då representerast som ei samling med lenker frå ord-i i språk A til ord-a i språk B, så sme-ord-k kan t.d. bli lenka til nob-ord-c, nob-ord-d, nob-ord-e. - Skal vi ha éi lemmaliste per språk, eller mange? 
- Når du dreg ut -- når programmeraren dreg ut -... og för det må lingvisten skilne mellom sme-nob og sme-nob* +... og för det må lingvisten skilne mellom sme-nob og sme-nob\* diff --git a/dicts/TestingDictFST.md b/dicts/TestingDictFST.md index be69ffca..c49e53c5 100644 --- a/dicts/TestingDictFST.md +++ b/dicts/TestingDictFST.md @@ -1,56 +1,43 @@ # Testing - For å teste at FSTer for ordbøker fungerer som dem skal, analyser følgende ord. Se ellers [om tagger og FSTer](/lang//sme/KompilereFST.html). +## sme -## sme - - -```analyser-dict-gt-desc.xfst``` -* `vuovdi` skal gi både `vuovdi+N+Sg+Nom` og `vuovdi+N+NomAg+Sg+Nom` -* `girjje` skal gi girji `girji+N+Sg+Gen+Allegro` -* `tunnealla` skal gi `tunealla+v2+N+G3+Sg+Nom` -* `tunealla` skal gi `tunealla+N+G3+Sg+Nom` for bruk i NDS, men for VD skal det være `tunealla+v1+N+G3+Sg+Nom`. For øyeblikket bruker jeg for NDS disse to kommandoene før jeg kompilerer, men dette burde legges inn i Makefile: -** `perl -pi -e "s/v1\+//g" affixes/*lexc stems/*lexc ` -** `perl -pi -e "s/\+v1:/:/g" affixes/*lexc stems/*lexc` - `analyser-dict-gt-desc-mobile.xfst`: -* `cienal` skal gi `čieŋal+A+Sg+Nom` - `generator-dict-gt-norm.xfst`: -* `girji+N+Sg+Gen+Allegro` skal gi `girjje` -* `girji+N+Sg+Gen` skal gi `girjji` -* `deaivvadit+V+Ind+Prt+Pl3` skal gi `deaivvadedje` -* `deaivvadit+V+Ind+Prt+Pl3+Use/NGminip` skal gi `deaivvade` -* `vuovdi+N+Sg+Acc` skal gi `vuovddi` -* `vuovdi+N+NomAg+Sg+Acc` skal gi `vuovdi` -* `golli+N+Sg+Nom` skal gi `+?` -* `golli+N+G3+Sg+Nom` skal gi `golli` - - -## sma - - -```analyser-dict-gt-desc.xfst``` -* `govledh` skal gi `+Hom1` og `+Hom2` i analysen: -** `govledh govledh+Hom2+V+IV+Inf` -** `govledh govledh+Hom1+V+TV+Inf` +`analyser-dict-gt-desc.xfst` +- `vuovdi` skal gi både `vuovdi+N+Sg+Nom` og `vuovdi+N+NomAg+Sg+Nom` +- `girjje` skal gi girji `girji+N+Sg+Gen+Allegro` +- `tunnealla` skal gi `tunealla+v2+N+G3+Sg+Nom` +- `tunealla` skal gi `tunealla+N+G3+Sg+Nom` for bruk i NDS, men for VD skal det være 
`tunealla+v1+N+G3+Sg+Nom`. For øyeblikket bruker jeg for NDS disse to kommandoene før jeg kompilerer, men dette burde legges inn i Makefile:
+ ** `perl -pi -e "s/v1\+//g" affixes/*lexc stems/*lexc `
+ ** `perl -pi -e "s/\+v1:/:/g" affixes/*lexc stems/*lexc`
+ `analyser-dict-gt-desc-mobile.xfst`:
+- `cienal` skal gi `čieŋal+A+Sg+Nom`
+ `generator-dict-gt-norm.xfst`:
+- `girji+N+Sg+Gen+Allegro` skal gi `girjje`
+- `girji+N+Sg+Gen` skal gi `girjji`
+- `deaivvadit+V+Ind+Prt+Pl3` skal gi `deaivvadedje`
+- `deaivvadit+V+Ind+Prt+Pl3+Use/NGminip` skal gi `deaivvade`
+- `vuovdi+N+Sg+Acc` skal gi `vuovddi`
+- `vuovdi+N+NomAg+Sg+Acc` skal gi `vuovdi`
+- `golli+N+Sg+Nom` skal gi `+?`
+- `golli+N+G3+Sg+Nom` skal gi `golli`
-```generator-dict-gt-norm.xfst```
-* `govledh+Hom1+V+Ind+Prs+Sg3` skal gi `gåvla`
-* `govledh+Hom2+V+Ind+Prs+Sg3` skal gi `govloe`
-* `govledh+V+Ind+Prs+Sg3` skal gi `+?`
+## sma
+`analyser-dict-gt-desc.xfst`
+- `govledh` skal gi `+Hom1` og `+Hom2` i analysen:
+ ** `govledh govledh+Hom2+V+IV+Inf`
+ ** `govledh govledh+Hom1+V+TV+Inf`
+`generator-dict-gt-norm.xfst`
+- `govledh+Hom1+V+Ind+Prs+Sg3` skal gi `gåvla`
+- `govledh+Hom2+V+Ind+Prs+Sg3` skal gi `govloe`
+- `govledh+V+Ind+Prs+Sg3` skal gi `+?`
+## Obsolete (old infra)
-
-
-
-
-# Obsolete (old infra)
 [Obsolete (old infra)](GammelKompilereNettordbok.html)
-
-
diff --git a/dicts/TheOsloBergenTagger.md b/dicts/TheOsloBergenTagger.md
index 8d5ea9d3..2044dc30 100644
--- a/dicts/TheOsloBergenTagger.md
+++ b/dicts/TheOsloBergenTagger.md
@@ -1,31 +1,18 @@
-
-
-
-
-
-
 To run text with the Oslo-Bergen tagger within this project, here is the pipeline (with paths as of standing in `$GTHOME/st/nob/obt`:
-
+```sh
+cat textfile |./bin/mtag-osx64 | vislcg3 -g src/nob_morf-prestat.cg3 | OBT-Stat/bin/run_obt_stat.rb
 ```
-cat textfile |./bin/mtag-osx64 | vislcg3 -g src/nob_morf-prestat.cg3 | OBT-Stat/bin/run_obt_stat.rb
-```
-
-The tagger is stored in git, and documented in a
+The tagger is stored in git, and documented in
a
 [readme file](https://github.com/noklesta/The-Oslo-Bergen-Tagger/blob/master/README.md)
-
-Note that there is
+Note that there is
 [a shellscript](https://github.com/noklesta/The-Oslo-Bergen-Tagger/blob/master/tag-bm.sh)
 with a command slightly different from the one presented here.
-
 In order to run it, we need to change it a bit:
-
-* bin/mtag must be bin/mtag-osx64
-* cg/bm_morf-prestat.cg must be src/nob_morf-prestat.cg3
-
-
+- bin/mtag must be bin/mtag-osx64
+- cg/bm_morf-prestat.cg must be src/nob_morf-prestat.cg3
diff --git a/dicts/VDcheck.md b/dicts/VDcheck.md
index 55e009d4..9721d965 100644
--- a/dicts/VDcheck.md
+++ b/dicts/VDcheck.md
@@ -1,7 +1,7 @@
 # SJEKKLISTE for VD-ordboka (gammel infra):
-# Spesielt for VD
+## Spesielt for VD
 * sjekk at alle lemmaer (adjektiver, verb, substantiver, numeraler, pronIndef og propernouns) som genererer med enn en grunnform, merkes med v1 osv i fst og vmax i dict-filene. Eksempel på kommando for å sjekke dette. Pass på at isme-norm.fst er kompilert med at v1-taggene går til 0 i tag-not-save.regex før kommandoene:
@@ -19,10 +19,10 @@ grep '" | cut -d ">" -f3 |
 - evt. legge til `l_ref`
-# SJEKKLISTE for den nykompilerte VD-ordboka (gammel infra):
+## SJEKKLISTE for den nykompilerte VD-ordboka (gammel infra):
-# Minst 2-3 ord fra hver ordklassefil - og sjekk
+## Minst 2-3 ord fra hver ordklassefil - og sjekk
 - hvordan ordklasse er presentert
 - entalls- og flertallsstedsnavn
@@ -46,7 +46,7 @@ grep '" | cut -d ">" -f3 |
 - at evt. spellrelax fungerer
-# Minst ett ord fra hver statisk fil
+## Minst ett ord fra hver statisk fil
 - hvordan ordklasse er presentert
 - analysetaggene
diff --git a/dicts/WebdictCompilation.md b/dicts/WebdictCompilation.md
index 9a4e958f..16a64fa3 100644
--- a/dicts/WebdictCompilation.md
+++ b/dicts/WebdictCompilation.md
@@ -2,30 +2,28 @@
 This text documents the compilation process of [a set of Apertium-style web dictionaries](http://gtweb.uit.no/webdict/index.html).
-# Documentation
+## Documentation
 (to be rewritten, after Anders' rewrite)
+## Old, obsolete documentation
-# Old, obsolete documentation
+### Prerequisites
-## Prerequisites
-
-* Make sure the environment variable GTHOME is defined
-* The saxon library must be installed
+- Make sure the environment variable GTHOME is defined
+- The saxon library must be installed
 Both these points should be fullfilled if the advices in [Gettings started](/infra/GettingStarted.md) have been done.
-* apertium-dixtools must be installed (`git clone https://github.com/apertium/apertium-dixtools.git; cd apertium-dixtools; ant jar; sudo ant install`)
-
-## Converting
+- apertium-dixtools must be installed (`git clone https://github.com/apertium/apertium-dixtools.git; cd apertium-dixtools; ant jar; sudo ant install`)
-* Go to the dictionary folder: `cd $GTHOME/words/dicts`
-* Choose one of the dictionary folders there (e.g. `smenob`)
-* Run `scripts/gtdict2webdict.py smenob` (replace smenob with your preferred folder)
-* The resulting file will be found in `$GTHOME/apps/dicts/apertium_dict/dics/sme-nob-lr-trie.xml` (`sme` and `nob` will be different for other language pairs)
+### Converting
+- Go to the dictionary folder: `cd $GTHOME/words/dicts`
+- Choose one of the dictionary folders there (e.g.
`smenob`)
+- Run `scripts/gtdict2webdict.py smenob` (replace smenob with your preferred folder)
+- The resulting file will be found in `$GTHOME/apps/dicts/apertium_dict/dics/sme-nob-lr-trie.xml` (`sme` and `nob` will be different for other language pairs)
-## Converting the webdicts on the UiT server (for Tromsø employees)
+### Converting the webdicts on the UiT server (for Tromsø employees)
 Read [this documentation](https://divvungiellatekno.github.io/giellalt.uit.no/dicts/WebdictCompilation.html)
diff --git a/dicts/checklist.md b/dicts/checklist.md
index 96a51b68..927ba61d 100644
--- a/dicts/checklist.md
+++ b/dicts/checklist.md
@@ -2,19 +2,19 @@
 Denne sida viser hvordan vi skal sjekke at ordbøkene er i god stand, teknisk sett. Kommandoene gjelder for alle NDS-ordbøkene.
-Ordboka skal være xml-valid. Kommando:
+Ordboka skal være xml-valid. Kommando:
 ```
 xmllint --valid FIL.xml > /dev/null
 ```
-Det skal ikke være dubletter. Kommando:
+Det skal ikke være dubletter. Kommando:
 ```
 cat FIL.xml|grep '' | cut -d '>' -f3 |sort | uniq -d
 ```
-Alle lemmaene som skal genereres må ha lemma i norm-fst. Kommando for å sjekke om lemmaene er riktig skrevet og/eller er i FST:
+Alle lemmaene som skal genereres må ha lemma i norm-fst. Kommando for å sjekke om lemmaene er riktig skrevet og/eller er i FST:
 ```
 cat FIL.xml|grep ''|cut -d '>' -f3|uXNorm|grep '?'
@@ -22,29 +22,23 @@ cat FIL.xml|grep ''|cut -d '>' -f3|uXNorm|grep '?'
 Kommando for å sjekke at lemmaet er lemma i LEXC: (kommer)
-
-Sjekk at eksempler har stor forbokstav og punktum:
+Sjekk at eksempler har stor forbokstav og punktum:
 ```
-grep '' FIL.xml |l
-```
-
+grep '' FIL.xml |l
+```
-Sjekke rettskriving i eksempelsetningene.
-Eksempel fra smanob:
+Sjekke rettskriving i eksempelsetningene.
+Eksempel fra smanob:
 ```
 grep '' N_smanob.xml |tr '<' '>' | cut -d '>' -f3 |preprocess | usmaNorm | grep '?' |grep -v CLB |l
 ```
-
-Sjekke rettskriving i eksempelsetningene.
Eksempel fra nobsma:
+Sjekke rettskriving i eksempelsetningene. Eksempel fra nobsma:
 ```
 grep '' N_nobsma.xml |tr '<' '>' | cut -d '>' -f3 |preprocess | usmaNorm | grep '?' |grep -v CLB |l
 ```
-
-Stedsnavn med flere `` - den vanligste mg bør stå først fordi det er den oversettelsen som brukes i miniparadigmet
-
-
+Stedsnavn med flere `` - den vanligste mg bør stå først fordi det er den oversettelsen som brukes i miniparadigmet
diff --git a/dicts/crk/PlainsCreeDictionaryFeatures.md b/dicts/crk/PlainsCreeDictionaryFeatures.md
index a0a55f40..a6f19947 100644
--- a/dicts/crk/PlainsCreeDictionaryFeatures.md
+++ b/dicts/crk/PlainsCreeDictionaryFeatures.md
@@ -1,100 +1,72 @@
-# Lexicon files
+# Lexicon files
-
-Lexicon files are a part of the *langs/crk/src/morphology* infrastructure.
+Lexicon files are a part of the _langs/crk/src/morphology_ infrastructure.
 This documentation is not intended to be an exhaustive document for the structure of the lexicon, but so far just the lexicon elements that have an effect on the display of the dictionary through NDS.
+## Entry structure
-## Entry structure
-
-
-### level
-
+### level
 **Source attribute**
-
 In order to display a dictionary source in the entry, it should be included as an attribute on the node. This should be the full text that you wish to be displayed.
-
 ```
 ```
-
 TODO: example image
-
-### level
-
+### level
 TODO: example crk entry
-
-### level
-
+### level
 contains one or more (translation group) which can contain:
+### - a word
-### - a word
-
-
-``` TODO: example entry with ```
-
-
-### - a phrase
-
-
-``` TODO: example entry with ```
-
-
-### - An explanation: a sentence which explains the meaning of a word, but can't be used in the translation.
-
-
-``` TODO: example entry with ```
-
-
-### - Restriction
-
+`TODO: example entry with `
-* gives a restriction for the translation, f.ex. norwegian *vest* has the restriction *of clothes*, to separate it from the navigational direction.
+### - a phrase
+`TODO: example entry with `
+### - An explanation: a sentence which explains the meaning of a word, but can't be used in the translation.
+`TODO: example entry with `
-``` TODO: example entry with ```
+### - Restriction
+- gives a restriction for the translation, f.ex. norwegian _vest_ has the restriction _of clothes_, to separate it from the navigational direction.
-### level
+`TODO: example entry with `
-TODO:
+### level
-###