Commit
Updates for issue tev2-tools/#27
Signed-off-by: Rieks <[email protected]>
RieksJ committed Apr 1, 2024
1 parent 7752aad commit 0401308
Showing 3 changed files with 35 additions and 30 deletions.
2 changes: 1 addition & 1 deletion docs/specs/tools/21-mrgt.md
@@ -144,7 +144,7 @@ Then, the list of [term selection instructions](@) as specified in the appropria

[^1]: Two (or more) [MRG entries](@) cannot have the same value in their `termid` fields. Therefore, if an [MRG entry](@) is added whose value in its `termid` field already exists with an [MRG entry](@) that is already in the [provisional MRG](@), then this latter [entry](mrg-entry@) will be discarded, after which the new [entry](mrg-entry@) is added.
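
The replacement rule in this footnote can be sketched as follows. This is only an illustration, not the MRGT's actual code: the function name `add_entry`, the list-of-dicts representation of the provisional MRG, and the `glossaryText` sample values are assumptions.

```python
def add_entry(provisional_mrg, entry):
    """Return a new list in which `entry` replaces any entry with the same termid."""
    # Discard the entry (if any) that already uses this termid ...
    kept = [e for e in provisional_mrg if e["termid"] != entry["termid"]]
    # ... and then add the new entry.
    kept.append(entry)
    return kept


# Hypothetical sample data: the second "party" entry displaces the first.
mrg = []
mrg = add_entry(mrg, {"termid": "party", "glossaryText": "first version"})
mrg = add_entry(mrg, {"termid": "actor", "glossaryText": "an actor"})
mrg = add_entry(mrg, {"termid": "party", "glossaryText": "second version"})
```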

#### Processing FormPhrases
#### Processing FormPhrases {#processing-form-phrases}

[Form phrases](@) that are specified in a [curated text](@) may include uppercase characters, special characters, spaces etc., all of which make their use by tools cumbersome. In order to make it easier for [TEv2 tools](@) to use them, they need to be converted into [regularized form phrases](@).

22 changes: 11 additions & 11 deletions docs/terms/form-phrase-macro-map.md
@@ -8,10 +8,10 @@ termType: concept
isa:
glossaryAbbr: "Macro Map"
glossaryTerm: "Form Phrase Macro Map"
glossaryText: "a list of [form phrase macros](@); these maps are used by tools such as the [MRGT](@) and the [TRRT](@)."
glossaryText: "a list of [form phrase macros](@) that the [MRGT](@) will use to expand [form phrases](@) as specified for [curated texts](@) into [form phrases](@) as specified for [MRGs](@)."
glossaryNotes:
- "Form-phrase macro maps can be specified in the `scope` section of a [SAF](@)"
- "Form-phrase macro maps can be specified in (the `mrgt` section) of a [configuration file](/docs/specs/files/configuration-file) that is used when calling the [MRGT](@) and/or [TRRT](@)."
- "Form-phrase macro maps can be specified in (the `mrgt` section of) a [configuration file](/docs/specs/files/configuration-file) that is used when calling the [MRGT](@)."
formPhrases: [ "formphrase macro map{ss}", "formphrase macromap{ss}", "form-phrase macro map{ss}", "form-phrase macromap{ss}", "macro map{ss}", "macromap{ss}" ]
# Curation status
status: proposed
@@ -25,9 +25,9 @@ originalLicense: "[CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/?

# Form Phrase Macro Maps

A **Form Phrase Macro Map** is a list of [form phrase macros](@). Such lists are used by tools such as the [MRGT](@) and the [TRRT](@).
A **Form Phrase Macro Map** is a list of [form phrase macros](@) that the [MRGT](@) will use to expand [form phrases](@) as specified for [curated texts](@) into [form phrases](@) as specified for [MRGs](@).

[Form phrase macros](@) are typically useful in a limited set of languages. One might even say that for a given language, a specific set of useful [Form phrase macros](@) would exist.
[Form phrase macros](@) are typically useful for [scopes](@) in which [terms](@) are used that exist in particular languages, such as English, French, Dutch, etc. For every such language, particular sets of [form phrase macros](@) would be useful, which can be specified as [form-phrase macro maps](@).

## Purpose

@@ -38,7 +38,7 @@ A [form phrase macro map](@) enables [curators](@) to define a set of [form phra
[Form phrase macro maps](@) can be specified in

- the `scope` section of the [SAF](@) of such [scopes](@), or
- a [configuration file](/docs/specs/files/configuration-file) that is used when calling the [MRGT](@) and/or [TRRT](@).
- a [configuration file](/docs/specs/files/configuration-file) that is used when calling the [MRGT](@).

Whenever a [TEv2 tool](@) (e.g., the [MRGT](@)) needs a [form phrase macro map](@), this [macro map](@) is constructed as follows:

@@ -55,10 +55,10 @@ Here is an example of a [macro map](@) that specifies a set of [form phrase macr

~~~ yaml
macros:
- "{ss}": ["", "s", "'s", "(s)"], // "act{ss}" --> "act", "acts", "act's", "act(s)"
- "{ess}": ["", "es", "'s", "(es)"], // "regex{es}" --> "regex", "regexes", "regex's", "regex(es"
- "{yies}": ["y", "y's", "ies"], // "part{yies}" --> "party", "party's", "parties"
- "{ying}": ["y", "ying", "ies", "ied"], // "identif{ying}" --> "identify", "identifying", "identifies", "identified"
- "{es}": ["e", "es", "ed", "ing"], // "mangag{es}" --> "manage", "manages", "managed", "managing"
- "{able}": ["able", "ability"] // "cap{able}" --> "capable", "capability"
  "{ss}": ["", "s", "'s", "(s)"]            # "act{ss}" --> "act", "acts", "act's", "act(s)"
  "{ess}": ["", "es", "'s", "(es)"]         # "regex{ess}" --> "regex", "regexes", "regex's", "regex(es)"
  "{yies}": ["y", "y's", "ies"]             # "part{yies}" --> "party", "party's", "parties"
  "{ying}": ["y", "ying", "ies", "ied"]     # "identif{ying}" --> "identify", "identifying", "identifies", "identified"
  "{es}": ["e", "es", "ed", "ing"]          # "manag{es}" --> "manage", "manages", "managed", "managing"
  "{able}": ["able", "ability"]             # "cap{able}" --> "capable", "capability"
~~~
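
To illustrate how such a macro map could be applied, here is a minimal Python sketch. It assumes that expansion is plain textual substitution of each macro by each of its alternatives; the names `MACROS` and `expand` are hypothetical, and the MRGT's actual implementation may differ.

```python
# A subset of the macro map shown above, as a Python dict (illustrative only).
MACROS = {
    "{ss}": ["", "s", "'s", "(s)"],
    "{yies}": ["y", "y's", "ies"],
    "{able}": ["able", "ability"],
}


def expand(form_phrase, macros=MACROS):
    """Return all phrases obtained by replacing each macro with its alternatives."""
    for macro, alternatives in macros.items():
        if macro in form_phrase:
            results = []
            for alternative in alternatives:
                # Replace one occurrence, then recurse to handle any remaining macros.
                replaced = form_phrase.replace(macro, alternative, 1)
                results.extend(expand(replaced, macros))
            return results
    # No macro left: the phrase is fully expanded.
    return [form_phrase]


acts = expand("act{ss}")
parties = expand("part{yies}")
```

With this sketch, `expand("act{ss}")` yields the four variants listed in the comment for `{ss}` above, and a phrase without macros is returned unchanged.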
41 changes: 23 additions & 18 deletions docs/terms/form-phrase.md
@@ -6,8 +6,12 @@ displayed_sidebar: tev2SideBar
term: form-phrase
termType: concept
isa:
glossaryTerm: "Form Phrase (for a Term)"
glossaryText: "a word or phrase that occurs in oral or written texts and that refers to a particular [semantic unit](@), yet is not (necessarily) the [term](@) that is used in the [definition](@) of that [semantic unit](@). Form phrases can be, e.g., plural forms, possessive extensions, verb-conjugation forms, abbreviations, and other variations."
glossaryTerm: "Form Phrase (for a Semantic Unit)"
glossaryText: "a word or phrase that refers to a particular [semantic unit](@), yet is not (necessarily) the [term](@) that is used in the [definition](@) of that [semantic unit](@). Form phrases can be, e.g., plural forms, possessive extensions, verb-conjugation forms, abbreviations, and other variations."
glossaryNotes:
- "The set of [form phrases](@) that [TEv2 tools](@) can recognize is specified in the [curated text](@) that documents that [unit](semantic-unit@). Such specifications may contain [form-phrase macros](@)."
- "To look up the documentation of a [semantic unit](@) (as specified in its corresponding [MRG entry](@)), [TEv2 tools](@) can match words or phrases they encounter with the [regularized texts](@) that are listed in the `formPhrases` field of [MRG entries](@). Such [regularized texts](@) do not contain [form-phrase macros](@)."
- "The [MRGT](@) ensures that the texts in the `formPhrases` field of a [curated text](@) are [properly converted](mrgt#processing-form-phrases@), and listed in the `formPhrases` field of the corresponding [MRG entry](@)."
formPhrases: [ "formphrase{ss}", "form-phrase{ss}" ]
# Curation status
status: proposed
@@ -23,9 +27,16 @@ originalLicense: "[CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/?

A **Form Phrase** is a word or phrase that occurs in oral or written texts and that refers to a particular [semantic unit](@), yet is not (necessarily) the [term](@) that is used in the [definition](@) of that [semantic unit](@). Form phrases can be, e.g., plural forms, possessive extensions, verb-conjugation forms, abbreviations, and other variations.

<details>
<summary>Examples</summary>

[TermRefs](@) such as `[party](@)`, `[parties](@)` or `[party(s)](@)` should all refer to the same [semantic unit](@). This is achieved by specifying "party", "parties", and "party(s)" as [form phrases](@) for that [semantic unit](@) in the [curated text](@) that documents that [unit](semantic-unit@).

</details>

## Purpose

[Form phrases](@) act as (standardized, human readable) identifiers for [semantic units](@), enabling consistent and unambiguous references across various texts such as manuals, specifications, and guidelines. This is particularly useful (if not vital) in fields where precise terminology is key, ensuring that all stakeholders have a common understanding of the terms used and thereby reducing the potential for misinterpretation or confusion.
[Form phrases](@) serve as (standardized, human readable) identifiers for [semantic units](@), enabling consistent and unambiguous references across various texts such as manuals, specifications, and guidelines. This is particularly useful (if not vital) in fields where precise terminology is key, ensuring that all stakeholders have a common understanding of the [terms](@) used and thereby reducing the potential for misinterpretation or confusion.

## Specifying Form Phrases in Curated Texts {#specifying}

@@ -49,27 +60,21 @@ The same varieties can easily be added for the human and machine actors, as foll
formPhrases: [ "actor{ss}", "human actor{ss}", "machine actor{ss}" ]
~~~

## Using/Matching Form Phrases {#matching}

Form phrases are used to refer to a particular [semantic unit](@) as known in a particular [terminology](@). In other words, they must identify the [MRG entry](@) and/or the [curated text](@) that documents this [semantic unit](@).
## Matching Form Phrases {#matching}

Here is how a [form phrase](@) is matched against:
Matching [form phrases](@) is the process of determining whether a given word or phrase refers to a particular [semantic unit](@). This is done, e.g., by the [TRRT](@) as it [tries to find](trrt#finding-mrg-entry@) an [MRG entry](@) that corresponds with the [`showtext`](trrt#interpreter-profile@) field of a [TermRef](@).

1. [MRG entries](@), given the [MRG](@):
1. [Regularize](regularized-form-phrase#regularization-process@) the [form phrase](@);
2. Find all [MRG entries](@) that have the result an an entry in its `formPhrases`-field;
3. If there is a single such an [MRG entry](@), that is the one that matches the [form phrase](@).
This matching process uses the contents of the `formPhrases` field of [MRG entries](@), which are [derived from](mrgt#processing-form-phrases@) the contents of the `formPhrases` field of [curated texts](@), and proceeds as follows:

2. [Curated texts](@):
1. [Regularize](regularized-form-phrase#regularization-process@) the [form phrase](@);
2. Find all [curated texts](@) (in the [curatedir](@) of the [current scope](@)) that have a `formPhrases`-field whose entries, after having been [regularized](regularized-form-phrase#regularization-process@), are identical to the result of step 1;
3. If there is a single such [curated text](@), that is the one that matches the [form phrase](@).
1. [Regularize](regularized-form-phrase#regularization-process@) the given word or phrase;
2. Find all [MRG entries](@) that have the result as an entry in their `formPhrases` field;
3. If there is exactly one such [MRG entry](@), then the text is a [form phrase](@) for the [semantic unit](@) described by that [MRG entry](@).

It is possible that there is no matching [MRG entries](@) and/or [curated texts](@).
It is possible that there are no matching [MRG entries](@).

If there are more than one matching [MRG entries](@) and/or [curated texts](@), that is an error condition - that should not happen. Such conditions are typically flagged, e.g., as an error by the [MRGT](@), and they need to be resolved.
If multiple [MRG entries](@) match, that is an error condition that should not happen. Such conditions are typically flagged as errors, e.g., by the [MRGT](@), and they need to be resolved.
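
The matching steps and the two edge cases above can be sketched in Python as follows. This is a hedged illustration only: the `regularize` function is an assumed stand-in (lowercase, with runs of other characters collapsed to a hyphen) for the actual regularization process, and `match`, `entries`, and `ambiguous` are hypothetical names and sample data.

```python
import re


def regularize(text):
    """Assumed regularization: lowercase, non-alphanumeric runs become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def match(text, mrg_entries):
    """Return the single MRG entry matching `text`, or None if there is none."""
    wanted = regularize(text)                                        # step 1
    hits = [e for e in mrg_entries if wanted in e["formPhrases"]]    # step 2
    if len(hits) > 1:
        # Multiple matches are an error condition that needs to be resolved.
        raise ValueError(f"'{text}' matches {len(hits)} MRG entries")
    return hits[0] if hits else None                                 # step 3


# Hypothetical MRG entries whose formPhrases are already regularized.
entries = [
    {"termid": "party", "formPhrases": ["party", "parties", "party-s"]},
    {"termid": "actor", "formPhrases": ["actor", "actors", "human-actor"]},
]
ambiguous = entries + [{"termid": "political-party", "formPhrases": ["party"]}]

found = match("Party(s)", entries)      # the entry with termid "party"
missing = match("guardian", entries)    # None: no matching entry
```

Raising an exception for multiple matches is one possible design choice; a tool could equally log the conflict and continue.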

## Guidance for choosing Form Phrases {#guidance}
## Guidance for Specifying Form Phrases in Curated Texts {#guidance}

1. **Character Composition**: A form phrase is composed of a sequence of characters that may include letters, numbers, and spaces. Spaces are permissible if they are a standard part of the term (e.g., "hard drive").
