On Monday, I'm pursuing the quixotic enterprise of talking to an NLP workshop about phonetics.
LRNLP ("Language Resources for NLP") 2018 is a workshop associated with COLING 2018 in Santa Fe, NM. My abstract:
Semi-automatic analysis of digital speech collections is transforming the science of phonetics, and offers interesting opportunities to researchers in other fields. Convenient search and analysis of large published bodies of recordings, transcripts, metadata, and annotations – as much as three or four orders of magnitude larger than a few decades ago – has created a trend towards “corpus phonetics,” whose benefits include greatly increased researcher productivity, better coverage of variation in speech patterns, and essential support for reproducibility.
The results of this work include insight into theoretical questions at all levels of linguistic analysis, along with applications within phonetics itself and in fields as diverse as psychology, sociology, medicine, and poetics. Crucially, analytic inputs include annotation or categorization of speech recordings along many dimensions, from words and phrase structures to discourse structures, speaker attitudes, speaker demographics, and speech styles. Among the many near-term opportunities in this area we can single out the possibility of improving parsing algorithms by incorporating features from speech as well as text.
Due to semester-initial commitments at Penn, I won't be able to stay for COLING, but I'm looking forward to an interesting day of presentations at the workshop.