Miao Zhang (@mzhang89)

The second CorpusPhon Workshop, a satellite event at LabPhon 2026, is now up.
Please check it out and submit your abstract!

#corpusphonetics #phonetics #labphon2026 #corpusphon

02.03.2026 18:02 👍 1 🔁 1 💬 0 📌 0

Chomsky on Foucault: “I'd never met anyone who was so totally amoral”

I guess Foucault didn’t retain that distinction forever

03.02.2026 22:27 👍 35 🔁 4 💬 2 📌 1

Newly released files shed new light on Chomsky and Epstein relationship

Latest communications undermine Chomsky’s earlier claims that he primarily had financial dealings with Epstein

www.theguardian.com/us-news/2026...

04.02.2026 06:54 👍 1 🔁 1 💬 0 📌 0

I do have some cool figures that I can share here:

15.01.2026 13:01 👍 2 🔁 0 💬 0 📌 0

Work on Chinese tone/speech errors tends to show that speakers replace entire tones with different ones, e.g. tone /51/ instead of tone /213/. To me that has always meant a kind of holistic melody that is non-decomposable. That’s different from languages where contours are decomposable.

11.01.2026 05:00 👍 2 🔁 1 💬 2 📌 0

ASA2022 was based on the GAMM chapter and the kinematic analysis chapter, and PCC2023 was based on the duration chapter. I didn't collect new data or perform new analysis.

15.01.2026 12:27 👍 0 🔁 0 💬 0 📌 0

(PDF) Tone sandhi and tonal coarticulation in disyllabic sequences in Changsha Xiang PDF | This study investigates tone sandhi and tonal coarticulation in disyllabic sequences in Changsha Xiang, a Sinitic language with six lexical tones... | Find, read and cite all the research you ne...

www.researchgate.net/publication/... Our paper on the nature of positionally sensitive tone sandhi in Changsha Xiang has been accepted for publication in the Journal of Phonetics.

18.12.2025 07:38 👍 2 🔁 0 💬 0 📌 0

Tswana (language code: tn) is among the languages with a really large VF0 in our data!

09.12.2025 15:45 👍 4 🔁 0 💬 1 📌 0

This Vowel Intrinsic F0 (VF0) pattern shows a deep cognitive bias toward uniform representation of vowels, modulated by flexible, communicative adjustments.

Read the preprint here: doi.org/10.31234/osf... #Linguistics #Phonetics #CognitiveScience #VF0 #PhoneticUniversals

08.12.2025 11:24 👍 2 🔁 0 💬 0 📌 0

🤯 Phonetic Universal Uncovered! 🎤
We analyzed over 60,000 speakers across 75 languages and confirmed a universal phonetic bias: High vowels (like /i, u/) are consistently spoken with a slightly higher pitch (F0) than low vowels (/a/).

08.12.2025 11:24 👍 12 🔁 4 💬 2 📌 0

#statstab #465 How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis

Thoughts: An accessible paper on communicating your results with nuance.

#bayes #bayesian #uncertainty #error #bias #guide #tutorial

sites.stat.columbia.edu/gelman/resea...

21.11.2025 17:19 👍 3 🔁 2 💬 0 📌 0

ALT: a close up of a rat looking at the camera with the word drunken written in the corner

🎉 Finally out in the Journal of Phonetics: a tutorial with @paulbuerkner.com

📖 "Bayesian beta regressions with brms in R: A tutorial for phoneticians"

Accepted manuscript here: doi.org/10.31219/osf...

Repo: github.com/stefanocoret...

Publisher link: www.sciencedirect.com/science/arti...

15.11.2025 15:21 👍 29 🔁 8 💬 1 📌 0

Excited to share our new preprint with @mzhang89.bsky.social: “A crosslinguistic corpus phonetic analysis of intrinsic vowel duration” 🎉

🔗 osf.io/preprints/ps...

02.10.2025 16:46 👍 11 🔁 2 💬 1 📌 0

Bring back the iPod classic and the 3.5mm headphone jack

20.09.2025 10:51 👍 12 🔁 2 💬 0 📌 0

output from a GAM in the linked essay

Simon Wood, the GOAT of generalized additive models & creator of the mgcv #rstats package, has an Annual Review of Statistics essay on GAMs, available open access #statssky #mlsky

www.annualreviews.org/content/jour...

10.09.2025 02:14 👍 90 🔁 39 💬 0 📌 1

To me, text corrected by Grammarly feels less AI-generated than text corrected by general-purpose AI tools (ChatGPT and the like). Is it just my imagination?

05.09.2025 13:58 👍 0 🔁 0 💬 0 📌 0

Interspeech 2025 poster on Quantifying and reducing speaker heterogeneity within the Common Voice Corpus

🗣️Mozilla Common Voice users!🗣️

Important notice: the client ID does not always correspond to a single speaker ID! Every so often, a single client ID contains more than one speaker’s voice. Our #Interspeech2025 paper examines the extent of this problem and proposes a solution

29.08.2025 10:25 👍 16 🔁 3 💬 1 📌 0

pacscilab/VoxCommunis at main

✅Similarity scores: huggingface.co/datasets/pac...

📄Paper: www.isca-archive.org/interspeech_...

💻Code: github.com/pacscilab/CV...

💫This was joint work with @mzhang89.bsky.social, Aref Farhadipour, Annie Baker, Jiachen Ma, and Bogdan Pricop

29.08.2025 10:25 👍 2 🔁 1 💬 0 📌 0

LabPhon 20 will be held in Montréal June 25–28, 2026, on the theme “Looking Back and Looking Forward,” to reflect on the field’s foundational contributions while highlighting new directions in laboratory phonology. Abstract submission deadline: Dec 1, 2025 labphon.org/labphon20/home

26.08.2025 13:32 👍 4 🔁 4 💬 0 📌 0

The similarity score file can be found in our VoxCommunis huggingface repo: huggingface.co/datasets/pac.... You can also see the scripts we used to obtain the similarity scores here: github.com/areffarhadi/...

21.08.2025 10:50 👍 2 🔁 0 💬 0 📌 0

(PDF) Quantifying and Reducing Speaker Heterogeneity within the Common Voice Corpus for Phonetic Analysis PDF | With its crosslinguistic and cross-speaker diversity, the Mozilla Common Voice Corpus (CV) has been a valuable resource for multilingual speech... | Find, read and cite all the research you need...

We presented our attempt to clean the Common Voice client ID for phonetic analysis at Interspeech 2025. Please check the poster here: www.researchgate.net/publication/.... The paper is also available at: www.isca-archive.org/interspeech_...

21.08.2025 10:33 👍 3 🔁 0 💬 0 📌 1

Introducing tidynorm – Væl Space Here’s a brief introduction to the new tidynorm package.

Introducing the tidynorm package! It's got convenience functions for applying your favorite vowel normalization methods to point measures, formant tracks, and DCT coefficients in a tidyverse workflow, as well as a flexible framework for defining your own normalization methods!

16.06.2025 15:35 👍 44 🔁 15 💬 2 📌 2

How final is final: The production and perception of utterance-medial and utterance-final boundaries We examine the production and perception of two types of phrase-final prosodic boundaries, specifically, utterance-medial and utterance-final intonation phrase (IP) boundaries in German. These two typ...

New insights into German #prosody! How do speakers & listeners distinguish utterance-medial vs. utterance-final #intonation boundaries in #German? Subtle differences in intonation, particularly in the rhyme's f0, are key cues for listeners. #LabPhon #openaccess #kinematics doi.org/10.16995/lab...

14.06.2025 03:17 👍 8 🔁 3 💬 0 📌 0

When people talk about neutralization in phonology, it's very important to check some phonetic data. It's very probable that, as non-native listeners, we either failed to perceive a contrast or overinterpreted some variance.

10.06.2025 20:47 👍 2 🔁 0 💬 0 📌 0

ggplot2 is turning 18! 🎂

For nearly two decades, it’s helped data scientists turn complex data into clear, beautiful insights.

We’re throwing a birthday party at Data+AI Summit, with treats and limited-edition swag. Come celebrate with us and @hadley.nz!

📍 Posit Lounge (402)
📅 June 10, 6–8pm

09.06.2025 21:58 👍 119 🔁 16 💬 6 📌 8

Quantifying and Reducing Speaker Heterogeneity within the Common Voice Corpus for Phonetic Analysis With its crosslinguistic and cross-speaker diversity, the Mozilla Common Voice Corpus (CV) has been a valuable resource for multilingual speech technology and holds tremendous potential for research i...

arxiv.org/abs/2506.00733 Our Interspeech 2025 preprint.

03.06.2025 13:15 👍 1 🔁 0 💬 0 📌 0

The worst kind of echo chamber.

31.05.2025 21:45 👍 2 🔁 0 💬 0 📌 0

The debate doesn't exist in China; we just call it [ʈʂi˥ aɪ˥ ɛ˧˥fu]

27.05.2025 12:10 👍 0 🔁 0 💬 0 📌 0

In case people don't use it very often or never knew it existed: glimpse() from dplyr is a much better function than head() or summary() when you want a very rough look at your dataset.

27.05.2025 11:43 👍 2 🔁 0 💬 1 📌 0

Congratulations! Wow, starting as an Associate Professor sounds really great!

26.05.2025 11:10 👍 1 🔁 0 💬 1 📌 0
