Terms for class and data type (English and Japanese) #303

Closed
wants to merge 41 commits into from

TomKellyGenetics
Contributor

@TomKellyGenetics TomKellyGenetics commented Apr 9, 2021

Contribution of related terms for class and data type (English and Japanese)

Sorry for the complicated PR, but these are related terms, so I think reviewing them together is appropriate.

Author:

Language:

  • English

Terms defined:

  • Type (new slug: data_type)
  • 'data munging' (synonym for 'data wrangling')
  • Data Frame (expanded to the precise definition used in the R community)

Language:

  • Japanese

Terms defined:

  • クラス (class)
  • データ型 (data type)
  • データフレーム (data.frame)
  • データマンジング (data munging)
  • データラングリング (data wrangling) 
  • ベクトル (vektor / vector)
  • パス (ファイルシステム内) / path (in filesystem)

@baileythegreen baileythegreen requested review from a team April 9, 2021 09:33
glossary.yml Outdated
  en:
    term: "data wrangling"
    def: >
      A colloquial name for small-scale [data engineering](#data_engineering).
Contributor

I disagree with this. I don't believe this term is only used in cases where the data are small; I don't believe it is only used colloquially; and the wording sounds a bit like 'only data engineers working with very large data do data engineering, everyone else is just wrangling data'.

I'm going to cite O'Reilly here; in their definition of data engineering they say this:

Data engineers wrangle data into a state that can then have queries run against it by data scientists.

I would suggest the definition be something like this:

Suggested change
- A colloquial name for small-scale [data engineering](#data_engineering).
+ Another name for small-scale [data engineering](#data_engineering).

or:

Suggested change
- A colloquial name for small-scale [data engineering](#data_engineering).
+ Some of the work done by [data engineers](#data_engineer).

Contributor Author

I actually agree with you. I wasn't so sure about this definition in English (my own work with genomics data is far from "small-scale"). To clarify, it was already defined in English here:

https://github.com/carpentries/glosario/blob/master/glossary.yml#L1928-L1932

glosario/glossary.yml

Lines 1928 to 1932 in 89cfe50

- slug: data_wrangling
  en:
    term: "data wrangling"
    def: >
      A colloquial name for small-scale [data engineering](#data_engineering).

What I've done is copy it from "data_wrangling" to "data_munging" and add a Japanese translation. As you rightly pointed out in PR #298, I confused the "ref" section with synonyms, and I'm still not sure how to handle them. Does the glosario project already have any guidance on synonyms?

I'm open to changing the original definition. It doesn't clearly describe what data wrangling involves to someone unfamiliar with "data engineering" either. It seems to have been contributed originally here: #65

Contributor Author

However, I think stating that it is colloquial is important. #303 (comment)

Contributor

@baileythegreen baileythegreen left a comment

I've looked at the English additions and commented; I've also requested a review from the Japanese team for the rest.

Co-authored-by: Bailey Harrington <[email protected]>
@TomKellyGenetics
Contributor Author

> I've looked at the English additions and commented; I've also requested a review from the Japanese team for the rest.

Thanks for your prompt feedback! Masami has been alerted to this on the Carpentries Japan Slack workspace. I'm sure we'll get the Japanese reviewed in due time. We're running an online Tokyo University workshop in Japanese at the moment, so it's a good time for us to update glosario for the topics discussed.

Contributor

@naoe-tatara naoe-tatara left a comment

Thanks @TomKellyGenetics for the great effort on the Japanese translation!! I followed our agreement on how to end the translation in Glosario (ref. #229) and suggested changes. I hope they make sense to you.

@@ -1356,6 +1361,15 @@
propiedades y métodos. Los programadores generalmente definen comportamientos
genéricos o reutilizables en [superclases](#parent_class) y comportamientos
más específicos o detallados en [subclases](#child_class).
ja:
term: "クラス"
Contributor

I would suggest the following translation, which is quite a direct translation of the English definition. It is inserted line by line as suggested changes in the following lines. Regarding the last sentence, I avoided using the term "inheritance" because, IMO, its meaning is quite programming-specific and the concept could be tricky for novice learners.

オブジェクト指向プログラミングにおいて、データと操作(メソッドと呼ばれる)を結びつけた構造のこと。プログラムはコンストラクタを用いて、クラスの持つ特性(プロパティ)やメソッドを備えたオブジェクトを作成する。通常プログラマは、汎用のあるいは再利用可能な振る舞いを親クラスに、より詳細なあるいは特定の振る舞いを子クラスに定義する。

Contributor Author

@TomKellyGenetics TomKellyGenetics Apr 16, 2021

Thanks, I've updated the changes here: 7763c76 (the only difference is line breaks at 80 characters, as in the rest of the document)

As discussed above for English, "inheritance" is not defined yet, so I agree with leaving it out. I wasn't sure how to translate the last sentence, so I think this is an improvement. 😄
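
For what it's worth, the concern about linking to a term that is not defined yet could be checked mechanically. Below is a minimal sketch, not part of glosario's tooling: it assumes definitions use the `[text](#slug)` link style seen in this thread, and the function name `undefined_refs`, the sample slug set, and the sample text are all hypothetical.

```python
import re

# Matches in-glossary links of the form [text](#slug), as used in the definitions above.
LINK_RE = re.compile(r"\[[^\]]*\]\(#([A-Za-z0-9_]+)\)")


def undefined_refs(definition, defined_slugs):
    """Return the slugs linked from `definition` that are not in `defined_slugs`."""
    linked = LINK_RE.findall(definition)
    return sorted(set(linked) - set(defined_slugs))


# Hypothetical data for illustration only; not taken from glossary.yml.
defined = {"data_engineering", "parent_class", "child_class"}
text = (
    "Programmers define generic behaviors in [superclasses](#parent_class) "
    "and use [inheritance](#inheritance) to specialize them."
)

print(undefined_refs(text, defined))
```

Under these assumptions, an empty result would mean every linked slug is already defined; a real check would read the slugs and definitions from glossary.yml rather than hard-coding them.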

@TomKellyGenetics
Contributor Author

The following terms have been removed from this PR and moved to #307 for further discussion. All other terms have been reviewed without major issues, so I think it is better to move this PR forward.

Terms removed:

  • 'data munging' (synonym for 'data wrangling')
  • データマンジング (data munging)
  • データラングリング (data wrangling) 

Terms added:

  • パス (ファイルシステム内) / path (in filesystem)

@elletjies elletjies added the lang: ja issues and PR for Japanese entries label Jul 14, 2022
@elletjies
Member

Closing this pull request since the last activity took place in April 2021. Please feel free to reopen in the future to make the necessary changes.

@elletjies elletjies closed this Dec 15, 2022
4 participants