diff --git a/ling/LinguisticAnalysis.md b/ling/LinguisticAnalysis.md
index 47d628da..fe54efbc 100644
--- a/ling/LinguisticAnalysis.md
+++ b/ling/LinguisticAnalysis.md
@@ -4,24 +4,6 @@ Linguistic analysis
 Instead of compiling the grammatical tools yourself (as described elsewhere on these pages), you may also **download ready-compiled analysers for text analysis**. This page explains how.
 If you **have** compiled the tools on your machine, we recommend [this page](../tools/docu-sme-manual.md) instead.
 
-## Automatic grammatical analysis
-
-**Summary:** When you have downloaded the files (cf. the **Download...** links below), you will be able to run the following command in a terminal window (the language code *sme* is for North Saami, for other languages, see below):
-
-
-```
-cat yourtextfile.txt | hfst-tokenise -cg sme.pmhfst | vislcg3 -g sme.cg3
-```
-
-
-The textfile is sent through a two-step analysis: First through the morphological analyser **sme.pmhfst**,
-by using the support program **hfst-tokenise**. The flag *-cg* ensures morphological analysis in the required format.
-Thereafter the output is disambiguated with the disambiguator sme.cg3, by using the support program vislcg3.
-The flag *-g* identifies the file *sme.cg3* as the grammar file. In order to see more options, you may write
-*hfst-tokenise -h* and *vislcg3 -h*.
-
-You may also conduct automatic dictionary lookup, see below.
-
 # Download commands
 
 ## 1. Download the required *support programs*
@@ -33,20 +15,26 @@ These commands will download the compilers *hfst* and *vislcg3*. They require a
 **Download on Mac:**
 ```
 curl http://apertium.projectjj.com/osx/install-nightly.sh > install-nightly.sh
+
 chmod a+x install-nightly.sh
+
 sudo ./install-nightly.sh
 ```
 
 **Download on Linux ubuntu:**
+
 ```
 wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
+
 sudo apt-get -f install apertium-all-dev
 ```
 
 **Download on Linux fedora:**
+
 ```
 curl https://apertium.projectjj.com/rpm/install-nightly.sh |sudo bash
+
 sudo apt-get -f install apertium-all-devel
 ```
 
@@ -92,6 +80,28 @@ Replace the language code **sme** with the language you want (note! the language
 
 More languages may be added upon request, from [this list](https://giellalt.github.io/LanguageModels.html).
 
+
+# Using the programs
+
+## Automatic grammatical analysis
+
+**Summary:** When you have downloaded the files (cf. the **Download...** links below), you will be able to run the following command in a terminal window (the language code *sme* is for North Saami, for other languages, see below):
+
+
+```
+cat yourtextfile.txt | hfst-tokenise -cg sme.pmhfst | vislcg3 -g sme.cg3
+```
+
+
+The textfile is sent through a two-step analysis: First through the morphological analyser **sme.pmhfst**,
+by using the support program **hfst-tokenise**. The flag *-cg* ensures morphological analysis in the required format.
+Thereafter the output is disambiguated with the disambiguator sme.cg3, by using the support program vislcg3.
+The flag *-g* identifies the file *sme.cg3* as the grammar file. In order to see more options, you may write
+*hfst-tokenise -h* and *vislcg3 -h*.
+
+You may also conduct automatic dictionary lookup, see below.
+
+
 ## Download other programs
 
 ### dictionaries