-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathweek4.qmd
522 lines (281 loc) · 13.2 KB
/
week4.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
---
title: "Week 4 語音的世界 (2) "
author: "謝舒凱 台大語言所"
format:
revealjs:
slide-number: true
preview-links: true
chalkboard: true
# multiplex: true
logo: /pics/logo_round.png
jupyter: python3
bibliography: tol.bib
---
## 今天的主題
- 音韻學基本概念與分析
- 語音音韻的題型設計 (+ 上次的題目) 討論
- lab session : 語音轉寫與自動辨識應用
## 複習一下基本概念
- Phoneme vs. Allophone
- 如何描述子音與母音?
## 複習一下基本概念
::: {.callout-tip}
COOL 本週放上去的講義 (chaper 3) 有簡短說明。
:::
測試:[中文注音符號之漢語拼音與 IPA 拼寫對照表](https://homepage.ntu.edu.tw/~karchung/ChineseIPAJimmyrev.pdf)
## 另一種較大的角度
Consonant: Place of articulation
<small>
**Labial(唇音)**:由嘴唇的關閉或部分關閉來產生的,可以進一步細分為雙唇音(兩唇之間的發音)和唇齒音(嘴唇和牙齒間的發音)。
**Coronal(舌尖音/舌冠音)**:通過舌尖或舌冠與上腭的不同部位接觸來產生的,包括舌尖前音和舌尖後音等。
**Dorsal(舌背音)**:通過舌背與上腭或軟腭的接觸來產生的,涉及到的音包括軟腭音(k音和g音等)和腭音(如“y”音)。
**Laryngeal(喉音)**:通過聲帶和喉部結構的不同配置來產生的,包括喉塞音(例如英語中的h音)和聲門閉鎖音(產生一種停頓感)。
</small>
{fig-align="right"}
## Consonant: Manner of Articulation

## Consonant: Manner of Articulation

## Consonant: Manner of Articulation
Other manners of articulation
- Flap: Very short stop
- Trill
- Rhotic, Liquid
- Non-Pulmonic: Click, Implosive, Ejective
## Vowel
- Height
- Frontness
- Roundness
- Nasalization
- Length
{fig-align="right"}
## 超音段特徵 Suprasegmental Features
> 指那些跨越單個音素(語音的最小單位)的語音特徵,它們影響的是音節、單詞甚至更長語句的發音。不僅影響語音的音質,還能傳達情感、語氣和強調等語用信息。
##
<small>
**語調(Tone)**:語調是指聲音的高低變化,它在一些語言中是區分意義的重要特徵,如漢語。不同的語調可以改變單詞的意義。例如,漢語中的“媽”(高平調)表示“媽媽”,而“麻”(上聲調)則有不同的意義。
**重音(Stress)**:重音是指在單詞或語句中某個音節相對於其他音節發音更強、更高或更長。在許多語言中,重音的位置可以改變單詞的意義或語法特性。例如,英語中“record”(名詞)與“record”(動詞)就是通過重音的不同來區分的。
**節奏(Rhythm)**:節奏是指語言中音節或音素出現的時間模式。不同語言有不同的節奏特徵,例如,有些語言如西班牙語,其節奏比較均勻,而英語則是以重音音節為基礎,形成不均勻的節奏。
**長短(Length)**:在某些語言中,音素(特別是元音)的發音長度可以區分意義。例如,芬蘭語和日語中,音素的長短不同可以構成不同的單詞。
</small>
## 聽一個例子

## 回去有練習 IPA 嗎? 😏
<small>我們用一個題目來練看看</small>

完整題目在 COOL 本週。
## 至少知道這幾個基本概念
:::{.incremental}
- **Phoneme**: a contrastive sound or category of sounds within a language
- **Inventory**: the set of phonemes (sounds) in a language.
- **Allophone**: a variant of a phoneme
- **Phonotactics**: the rules governing the combination of phonemes in a language (e.g. no two consonants can occur together in English)
:::
## Phonetics vs. Phonology
- Phonetics: What the sounds are
- Phonology: How the sounds are organized
- Sounds that are phonetically different can be phonologically same

{width=360}
## Phoneme
- Are [p] and [pʰ] different sounds?
- In English?
- In Mandarin Chinese?
- How to determine: `minimal pair`
## Allophone
- Same phoneme, different environment
- e.g. pan [pʰæn] vs. span [spæn]
## Phonological analysis (1)

## 例子
## 音節 (Syllable) 和 語音組合法 (Phonotactics)
- Syllable structure
- Sonority scale
- Phonotactic constraints
## [音節結構](https://www.wikiwand.com/zh-tw/%E9%9F%B3%E8%8A%82)
- Onset(聲母):<small> 音節的開頭部分,包含一個或多個子音。它出現在核心之前。例如,在單詞 "cat" 中,"c" 是 onset。</small>
- Nucleus (核心):<small> 音節的核心部分,通常是一個母音。它是音節中最重要的部分,因為它賦予音節音高。例如,在單詞 "cat" 中,"a" 是核心。</small>
- Coda (韻尾):<small> 音節的結尾部分,包含一個或多個子音。它跟在核心後面。例如,在單詞 "cat" 中,"t" 是 coda。</small>
{width=300}
## 語音組合法 Phonotactics
- Consonant Cluster
- Open syllable vs. closed syllable:音節結尾是母音 (e.g. `go`) 還是子音 (e.g. `cat`)。
- Swahili, Hawaiian: (C)V
- English: (C)(C)(C)V(C)(C)(C)(C)(C?)
split; sixths; strengths
## Phonotactics: Sonority Hierarchy
<small>聲響階層
一種語音學原則,用於描述不同類型的語音元素(如元音、輔音)在發聲的響亮度或音響能量上的相對排列順序。這個層級幫助語言學家理解為什麼某些音節結構在世界的多數語言中更為普遍,以及為什麼某些音的組合在發音時會感覺更自然。</small>
- Basically: The sound closer to nucleus will be louder
- Loudness: Vowel > Sonorant > Fricative > Stop
> tree, art, blend, trust but not \*rtee, \*atr, *lbedn
::: {.callout-warning}
- Notable exception: s + stop : start
:::
## 重音 Stress
- Emphasis on a syllable of a word 某個音節相比於其他音節在發音時更加突出
- Stress pattern
- Stress and vowel reduction: `schwa`
- <photographer> /fəˈtɒɡrəfər/
- <photograph> /ˈfoʊtəˌgræf/
## 音調 Tone
- Pitch that carries lexical/grammatical information
- Tone pitch are relative
- Counter tone 聲調的區別在於音高隨時間的變化輪廓,如升調、降調、升降調等。
- Register tone 聲調的區別主要基於音高的絕對高度或相對高低,而不涉及音高的變化過程。
{fig-align="right" width=400}
::: {.callout-tip}
Pitch 是指聲音的音高,由發音時聲帶振動頻率的高低決定的物理特性。Tone 則指在某些語言中,不同的音高模式用來區分單詞意義或詞形變化的語音特徵。
:::
# 變調
[台語](https://www.wikiwand.com/zh-tw/%E9%96%A9%E5%8D%97%E8%AA%9E%E9%80%A3%E8%AE%80%E8%AE%8A%E8%AA%BF)
## 構詞音韻
<!-- Swahili (p13) -->
Allomorph
- Same morpheme, different pronunciation
- Why? Different environment!
- Example: English plural
{width=7in}
## Alternation
> an alternation is the phenomenon of a morpheme exhibiting variation in its phonological realization. Each of the various realizations is called an *alternant*. The variation may be conditioned by the phonological, morphological, and/or syntactic environment in which the morpheme finds itself.
**alternation** 是個語言學術語,不是音韻學術語
:::{.callout-note}
[維基百科沒提到的](https://www.wikiwand.com/en/Alternation_(linguistics))
<!-- - a scenario in which a **morpheme** has more than one surface realization, depending on the context in which it occurs. -->
<!-- - **alternation** is a special case of **free variation**. -->
<!-- - a phonological pattern in which an underlying form has more than one surface phone realization, depending on the context in which it occurs. -->
- sense alternation: regular polysemy; metonymy
:::
## Apophony

## Harmony

## Language Comparison

##

## Features, Rules, and Derivation
- Feature and natural class
- Form of phonological rules
- Rule and derivation
## Example rule
一個規則中,通常包括 *target* (the segment(s) which the rule applies to) 和 a *restructuring* (what the rule does to the target),以及一些 *environment* (the conditions under which the rule applies).
- **Rule 1** (for English plural): `[z] → [ə] before [i] and [u]`
- **Rule 2**: `[n] → [m] __ [velar]`
## How to solve phonological problems
[入門分析方法](https://instruct.uwo.ca/anthro/247a/1_Hints.pdf)
## 練習 1
`Indonesian`

- Identify each alternant of the Indonesian prefix.
- Identify the conditions under which each alternant appears.
- Construct rules that produce the various alternants of `/meŋ-/`
## 練習 2
`German neutralization`
- which roots alternate? what generalization can you make about the alternating and non-alternating roots?

<!-- ## Alternation with zero -->
## Sound change
<small>why: people are lazy 😄 </small>

## Example
{width=200}
## Sound Change: Assimilation
<small>Why: People are lazy 😄 </small>

## Sound Change: Elision and Cheshirization
- Elision: A sound disappears altogether
- Cluster reduction: English /wr/, /kn/, /gn/ > /r/, /n/
- Loss of unstressed vowel/syllable:
Latin centum > French cent /sɑ̃/
- Cheshirization: A sound leaves a trace
- English <night> /nixt/ > /ni:t/ > /naɪ̯t/
- French <vin blanc> /vim blaŋk/ > /vɛ͂ blɑ͂/
## Other Sound Changes
- Metathesis: swapping of sounds
e.g. English third (< OE þridda, cf. three)
- Fortition
- Dissimilation
- Epenthesis: Insertion of sounds
- e.g. ME emty > English empty
- e.g. Latin status > Spanish estado
## Chain Shift
A set of linked sound change

## Features of sound change
- No memory
- Ignores grammar
- Unstoppable
- Exceptionless
- Lexical Diffusion
- Influence from similar words
## Phonology and orthography
- Orthography retains an earlier form of the language
- English <knight> /knixt/ > /naɪ̯t/
- French <chanter> /tʃãnˈtæɾ/ > /ʃɑ̃ˈte/
- This phenomenon is primarily seen in language with long historical record
- Other: script insufficient for phonology
## 🙋 語音題目可以怎麼設計?
- 聽音辨音轉寫 (上週恐怖經驗 👻)
- 聲譜圖對應分析
- ?
## 一般的語言分析題目類型
- 翻譯 | Translation
- 對應 | Correspondence
- 填空 | Fill in the blanks
- 非自然語言| Non-language
## 如何解題
how to solve phonological problems (and turn them into puzzles)
:::{.incremental}
- Make observation
- Form a hypothesis (better with linguistic background)
- Test the hypothesis
- How to test
- Does it conform with the data?
- What if the data don’t match?
- Weaken the hypothesis
- Give the exceptions explanations
- Abandon the hypothesis altogether
:::
## 上週加分作業討論
<!-- # 聽寫練習 -->
<!-- `布農巒群方言` -->
<!-- - 請助教先說明一下 -->
## 機器即將做得比人好?
- 進入新的人機協作時代
- (多模態)大型語言模型 (e.g. **chatGPT**) (即將)改變了社會生活與教育的面貌。[@huang2023language]
[天下專題](https://www.cw.com.tw/article/5124816)
## 語言邏輯的重要
- **in-context learning**: a new NLP paradigm that allows language models to learn tasks given only a few examples in the form of demonstration.
- 從怎麼下 search query,到怎麼給 prompt。
## 練習用 `chatGPT` / `Claude 3`/`Gemini` 來解題
> 怎麼讓機器解出題目來?
> 怎麼不讓機器解出題目來?(你的期末專案題目)
## 語音也可以
- [OpenAI's whisper](https://openai.com/blog/whisper/) is an open-source, multilingual, general-purpose speech recognition model by OpenAI, trained on 680,000 hours of multilingual data collected from the web. (with 100 languages supported)
## 自動轉寫練習
**Speech to Text/Transcription** with OpenAI's whisper

[lab](https://colab.research.google.com/drive/10Ve_5sWbBoYk-U1Fr51n89-LBrH04eA4?usp=sharing):開啓後,請見建立【副本】再行編輯與執行。
## 課堂練習
- 布農巒群方言轉寫
- 錄一段你覺得人聽得懂,但機器還無法理解(轉寫)的語音句子,或對話。

<!-- [參考](https://towardsdatascience.com/speech-to-text-with-openais-whisper-53d5cea9005e) -->
# 聲音的視覺化 with Praat
- 一個 **免費**、跨平台的語音分析軟體,由荷蘭阿姆斯特丹大學的 Paul Boersma 和 David Weenink 維護。
## 練習
自行從 `NACLO` 或 `UKLO` 中選擇一個題目,自己解完之後(對答案),設法讓 `chatGPT` (or `GPT-4`) 來解題,並且將此過程記錄下來,寫成文檔上傳。
- 記錄的內容
- 題目 (來源、類型)
- 解題過程與時間,以及給機器的 prompting chains)
- 解題結果(人與機器的正確率)
- 討論
- 你或機器 (錯誤的原因)
- 與其他想法
##
:::{.callout-note}
- step-by-step / chain-of-thought
:::
## 第一份正式作業
##