The second CorpusPhon Workshop, a satellite event at LabPhon 2026, is up.
Please check this out and submit your abstract!
#corpusphonetics #phonetics #labphon2026 #corpusphon
Chomsky on Foucault: “I'd never met anyone who was so totally amoral”
I guess Foucault didn’t retain that distinction forever
I do have some cool figures that I can share here:
Work on Chinese tone/speech errors tends to show that speakers replace entire tones with different ones, e.g. tone /51/ instead of tone /213/. To me that has always meant a kind of holistic melody that is non-decomposable. That’s different from languages where contours are decomposable.
ASA2022 was based on the GAMM chapter and the kinematic analysis chapter, and PCC2023 was based on the duration chapter. I didn't collect new data or perform new analysis.
www.researchgate.net/publication/... Our paper on the nature of Changsha Xiang positional sensitive tone sandhi has been accepted for publication in the Journal of Phonetics.
Tswana (language code: tn) is among the languages with a really large VF0 in our data!
This Vowel Intrinsic F0 (VF0) pattern shows a deep cognitive bias toward uniform representation of vowels, modulated by flexible, communicative adjustments.
Read the preprint here: doi.org/10.31234/osf... #Linguistics #Phonetics #CognitiveScience #VF0 #PhoneticUniversals
🤯 Phonetic Universal Uncovered!
We analyzed over 60,000 speakers across 75 languages and confirmed a universal phonetic bias: High vowels (like /i, u/) are consistently spoken with a slightly higher pitch (F0) than low vowels (/a/).
#statstab #465 How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis
Thoughts: An accessible paper on communicating your results with nuance.
#bayes #bayesian #uncertainty #error #bias #guide #tutorial
sites.stat.columbia.edu/gelman/resea...
Finally out in Journal of Phonetics, tutorial with @paulbuerkner.com
"Bayesian beta regressions with brms in R: A tutorial for phoneticians"
Accepted manuscript here: doi.org/10.31219/osf...
Repo: github.com/stefanocoret...
Publisher link: www.sciencedirect.com/science/arti...
Excited to share our new preprint with @mzhang89.bsky.social: “A crosslinguistic corpus phonetic analysis of intrinsic vowel duration”
osf.io/preprints/ps...
Bring back the iPod classic and the 3.5mm headphone jack
output from a GAM in the linked essay
Simon Wood, the GOAT of generalized additive models & creator of the mgcv #rstats package, has an Annual Review of Statistics essay on GAMs, available open access #statssky #mlsky
www.annualreviews.org/content/jour...
To me, text corrected by Grammarly feels less AI-generated than text corrected by general-purpose AI tools (ChatGPT-like). Is it just my imagination?
Interspeech 2025 poster on Quantifying and reducing speaker heterogeneity within the Common Voice Corpus
🗣️ Mozilla Common Voice users! 🗣️
Important notice: the client ID does not always correspond to a single speaker ID! Every so often, a single client ID contains more than one speaker’s voice. Our #Interspeech2025 paper examines the extent of this problem and proposes a solution.
Similarity scores: huggingface.co/datasets/pac...
Paper: www.isca-archive.org/interspeech_...
Code: github.com/pacscilab/CV...
This was joint work with @mzhang89.bsky.social, Aref Farhadipour, Annie Baker, Jiachen Ma, and Bogdan Pricop
LabPhon 20 will be held in Montréal June 25–28, 2026, on the theme “Looking Back and Looking Forward,” to reflect on the field’s foundational contributions while highlighting new directions in laboratory phonology. Abstract submission deadline: Dec 1, 2025 labphon.org/labphon20/home
The similarity score file can be found in our VoxCommunis huggingface repo: huggingface.co/datasets/pac.... You can also see the scripts we used to obtain the similarity scores here: github.com/areffarhadi/...
We presented our attempt to clean the Common Voice client ID for phonetic analysis at Interspeech 2025. Please check the poster here: www.researchgate.net/publication/.... The paper is also available at: www.isca-archive.org/interspeech_...
Introducing the tidynorm package! It's got convenience functions for applying your favorite vowel normalization methods to point measures, formant tracks, and DCT coefficients in a tidyverse workflow, as well as a flexible framework for defining your own normalization methods!
New insights into German #prosody! How do speakers & listeners distinguish utterance-medial vs. utterance-final #intonation boundaries in #German? Subtle differences in intonation, particularly in the rhyme's f0, are key cues for listeners. #LabPhon #openaccess #kinematics doi.org/10.16995/lab...
When people talk about neutralization in phonology, it's very important to check some phonetic data. It's quite likely that, as non-native listeners, we either failed to perceive a contrast or overinterpreted some variance.
ggplot2 is turning 18!
For nearly two decades, it’s helped data scientists turn complex data into clear, beautiful insights.
We’re throwing a birthday party at Data+AI Summit, with treats and limited-edition swag. Come celebrate with us and @hadley.nz!
Posit Lounge (402)
June 10, 6–8pm
arxiv.org/abs/2506.00733 Our Interspeech 2025 preprint.
The worst kind of echo chamber.
The debate doesn't exist in China, we just call it [ʈʂi˥ aɪ˥ ɛ˧˥fu]
In case people don't use it very often, or never knew it existed: glimpse() from dplyr is a much better function than head() or summary() when you want a very rough look at your dataset.
Congratulations! Wow, starting as an Associate Professor sounds really great!