Deploying to gh-pages from @ ec710db 🚀
MLopez-Ibanez committed Aug 12, 2023
1 parent 7871a11 commit 9cfd8ab
Showing 4 changed files with 73 additions and 5 deletions.
37 changes: 35 additions & 2 deletions index.html
@@ -19782,11 +19782,44 @@ <h2>References</h2>
Adrian Leemann.
<b>General Northern English: Exploring regional variation in the
North of England with machine learning</b>.
<em>Frontiers in Artificial Intelligence</em>, 2020.<br />
<em>Frontiers in Artificial Intelligence</em>, 3(48), 2020.<br />
[&nbsp;<a href="index_bib.html#StrLopBroLee2020">bib</a>&nbsp;|
<a href="https://doi.org/10.3389/frai.2020.00048">DOI</a>&nbsp;]
<blockquote>
In this paper, we present a novel computational approach to the analysis of accent variation. The case study is dialect leveling in the North of England, manifested as reduction of accent variation across the North and emergence of General Northern English (GNE), a pan-regional standard accent associated with middle-class speakers. We investigated this instance of dialect leveling using random forest classification, with audio data from a crowd-sourced corpus of 105 urban, mostly highly-educated speakers from five northern UK cities: Leeds, Liverpool, Manchester, Newcastle upon Tyne, and Sheffield. We trained random forest models to identify individual northern cities from a sample of other northern accents, based on first two formant measurements of full vowel systems. We tested the models using unseen data. We relied on undersampling, bagging (bootstrap aggregation) and leave-one-out cross-validation to address some challenges associated with the data set, such as unbalanced data and relatively small sample size. The accuracy of classification provides us with a measure of relative similarity between different pairs of cities, while calculating conditional feature importance allows us to identify which input features (which vowels and which formants) have the largest influence in the prediction. We do find a considerable degree of leveling, especially between Manchester, Leeds and Sheffield, although some differences persist. The features that contribute to these differences most systematically are typically not the ones discussed in previous dialect descriptions. We propose that the most systematic regional features are also not salient, and as such, they serve as sociolinguistic regional indicators. We supplement the random forest results with a more traditional variationist description of by-city vowel systems, and we use both sources of evidence to inform a description of the vowels of General Northern English.
In this paper, we present a novel computational approach to
the analysis of accent variation. The case study is dialect
leveling in the North of England, manifested as reduction of
accent variation across the North and emergence of General
Northern English (GNE), a pan-regional standard accent
associated with middle-class speakers. We investigated this
instance of dialect leveling using random forest
classification, with audio data from a crowd-sourced corpus
of 105 urban, mostly highly-educated speakers from five
northern UK cities: Leeds, Liverpool, Manchester, Newcastle
upon Tyne, and Sheffield. We trained random forest models to
identify individual northern cities from a sample of other
northern accents, based on first two formant measurements of
full vowel systems. We tested the models using unseen
data. We relied on undersampling, bagging (bootstrap
aggregation) and leave-one-out cross-validation to address
some challenges associated with the data set, such as
unbalanced data and relatively small sample size. The
accuracy of classification provides us with a measure of
relative similarity between different pairs of cities, while
calculating conditional feature importance allows us to
identify which input features (which vowels and which
formants) have the largest influence in the prediction. We do
find a considerable degree of leveling, especially between
Manchester, Leeds and Sheffield, although some differences
persist. The features that contribute to these differences
most systematically are typically not the ones discussed in
previous dialect descriptions. We propose that the most
systematic regional features are also not salient, and as
such, they serve as sociolinguistic regional indicators. We
supplement the random forest results with a more traditional
variationist description of by-city vowel systems, and we use
both sources of evidence to inform a description of the
vowels of General Northern English.
</blockquote>
<blockquote>
Keywords: vowels, accent features, dialect leveling, Random forest
41 changes: 38 additions & 3 deletions index_bib.html
@@ -43,12 +43,12 @@ <h2>What is this?</h2>

<h2>References</h2>

<h1>tmpZEClfNpkL0.bib</h1><pre>
<h1>tmpIHPH4dDB1W.bib</h1><pre>
@comment{{This file has been generated by bib2bib 1.99}}
</pre>

<pre>
@comment{{Command line: bib2bib --warn-error --expand --expand-xrefs authors.bib abbrev.bib journals.bib articles.bib biblio.bib crossref.bib --remove pdf -ob /tmp/tmpZEClfNpkL0.bib -oc /tmp/citefiletdYMjW2af3}}
@comment{{Command line: bib2bib --warn-error --expand --expand-xrefs authors.bib abbrev.bib journals.bib articles.bib biblio.bib crossref.bib --remove pdf -ob /tmp/tmpIHPH4dDB1W.bib -oc /tmp/citefiletXmPFg7K0d}}
</pre>

<pre>
@@ -18704,10 +18704,45 @@ <h1>tmpZEClfNpkL0.bib</h1><pre>
author = { Strycharczuk, Patrycja and Manuel L{\'o}pez-Ib{\'a}{\~n}ez and Brown, Georgina and Adrian Leemann },
journal = { Frontiers in Artificial Intelligence },
year = 2020,
volume = 3,
number = 48,
keywords = {vowels, accent features, dialect leveling, Random forest
(bagging), Feature selecion},
doi = {10.3389/frai.2020.00048},
abstract = {In this paper, we present a novel computational approach to the analysis of accent variation. The case study is dialect leveling in the North of England, manifested as reduction of accent variation across the North and emergence of General Northern English (GNE), a pan-regional standard accent associated with middle-class speakers. We investigated this instance of dialect leveling using random forest classification, with audio data from a crowd-sourced corpus of 105 urban, mostly highly-educated speakers from five northern UK cities: Leeds, Liverpool, Manchester, Newcastle upon Tyne, and Sheffield. We trained random forest models to identify individual northern cities from a sample of other northern accents, based on first two formant measurements of full vowel systems. We tested the models using unseen data. We relied on undersampling, bagging (bootstrap aggregation) and leave-one-out cross-validation to address some challenges associated with the data set, such as unbalanced data and relatively small sample size. The accuracy of classification provides us with a measure of relative similarity between different pairs of cities, while calculating conditional feature importance allows us to identify which input features (which vowels and which formants) have the largest influence in the prediction. We do find a considerable degree of leveling, especially between Manchester, Leeds and Sheffield, although some differences persist. The features that contribute to these differences most systematically are typically not the ones discussed in previous dialect descriptions. We propose that the most systematic regional features are also not salient, and as such, they serve as sociolinguistic regional indicators. We supplement the random forest results with a more traditional variationist description of by-city vowel systems, and we use both sources of evidence to inform a description of the vowels of General Northern English.}
abstract = {In this paper, we present a novel computational approach to
the analysis of accent variation. The case study is dialect
leveling in the North of England, manifested as reduction of
accent variation across the North and emergence of General
Northern English (GNE), a pan-regional standard accent
associated with middle-class speakers. We investigated this
instance of dialect leveling using random forest
classification, with audio data from a crowd-sourced corpus
of 105 urban, mostly highly-educated speakers from five
northern UK cities: Leeds, Liverpool, Manchester, Newcastle
upon Tyne, and Sheffield. We trained random forest models to
identify individual northern cities from a sample of other
northern accents, based on first two formant measurements of
full vowel systems. We tested the models using unseen
data. We relied on undersampling, bagging (bootstrap
aggregation) and leave-one-out cross-validation to address
some challenges associated with the data set, such as
unbalanced data and relatively small sample size. The
accuracy of classification provides us with a measure of
relative similarity between different pairs of cities, while
calculating conditional feature importance allows us to
identify which input features (which vowels and which
formants) have the largest influence in the prediction. We do
find a considerable degree of leveling, especially between
Manchester, Leeds and Sheffield, although some differences
persist. The features that contribute to these differences
most systematically are typically not the ones discussed in
previous dialect descriptions. We propose that the most
systematic regional features are also not salient, and as
such, they serve as sociolinguistic regional indicators. We
supplement the random forest results with a more traditional
variationist description of by-city vowel systems, and we use
both sources of evidence to inform a description of the
vowels of General Northern English.}
}
</pre>

Binary file modified testbib.pdf
Binary file not shown.
Binary file modified testshortbib.pdf
Binary file not shown.
