November 17, 2025 · 07:39

Chopping Up a TextGrid in Praat

Praat now has built-in functionality for extracting portions of an audio file based on labeled intervals in a TextGrid, something that used to require a script. Here are the steps, demonstrated on an audio file by reader AG from the Librivox project “Her Hair”.

1. Open the audio file and the corresponding text grid, and create a “word” interval tier that has labeled intervals. The “sentence” tier in the example is just for convenient navigation between larger chunks of audio.

2. With the sound and the text grid both selected in the Praat Object window, click on “Extract”, and select “non-empty intervals.” In my example, the tier I want is the 2nd tier (number 2 on the left in the pic above), so I enter 2 as the tier number. Leave “preserve times” unchecked.

3. With all the extracted sound files selected in the object window, click on Annotate>To TextGrid… Enter some sensible tier names (I went with “word”, “allophone”, and “context”). Remove the text from the box for “point tiers”. Click “OK”.

4. You should now have sound objects and corresponding text grids in your object window. Select a sound file along with its corresponding text grid, being careful to match the objects. Annotate the text grid (here, I am showing “dot org”, so I labeled the allophone as a flap and entered its context as “word-final” in the corresponding tier 3).

5. Once you have annotated all the text grids, you might want to check that everything matches correctly. Then select the sound files and the text grids, being careful NOT to select the original large sound file and mega-textgrid. Go to Save>Save as binary file… and give the file a name that at the very least identifies the speaker by initials. This is what it looks like on my system:

6. Before removing the objects from your object window, check that your collection opens and that everything you just did looks right: go to “Open>Read from file”, navigate to where you just saved your collection, and open it. You’ll end up with a list of the same objects you just saved (sounds + text grids) placed below the objects you were just working with. Take care not to confuse the two sets, and don’t save the whole list again. Once you’ve verified everything worked, you can remove the objects from your window and move on to the next speaker.
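For readers curious about what the extraction step does under the hood, here is a minimal sketch in plain Python. It is not Praat's implementation and does not use Praat's API: the tier is represented as a hypothetical list of (start, end, label) tuples, the function names are made up for illustration, and the audio is a second of generated silence. Each clip's timeline restarts at zero, which roughly corresponds to leaving “preserve times” unchecked.

```python
import io
import wave


def non_empty_intervals(tier):
    """Keep only the (start, end, label) intervals whose label is not blank."""
    return [(start, end, label) for start, end, label in tier if label.strip()]


def extract_interval(wav_bytes, start, end):
    """Cut the stretch from start to end (in seconds) out of WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        first = int(round(start * rate))
        last = int(round(end * rate))
        src.setpos(first)
        frames = src.readframes(last - first)
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)      # same channels, sample width, and rate
        dst.writeframes(frames)    # the frame count in the header is corrected on close
    return out.getvalue()


def make_silence(seconds, rate=8000):
    """Build a mono 16-bit WAV of silence to stand in for a real recording."""
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return out.getvalue()


# Hypothetical "word" tier: only "dot" and "org" are labeled.
word_tier = [(0.0, 0.25, ""), (0.25, 0.5, "dot"), (0.5, 0.75, "org"), (0.75, 1.0, "")]
recording = make_silence(1.0)
clips = [(label, extract_interval(recording, start, end))
         for start, end, label in non_empty_intervals(word_tier)]
```

The result is a list of (label, WAV data) pairs in tier order, one per labeled interval; the unlabeled stretches are skipped.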

Filed under Uncategorized

Tagged as praat, praat.collection, textgrids

April 14, 2025 · 07:54

NYU Linguistics Colloquium Speakers Since 2001

Some of the talks in the older years (2006 and before) are job talks, but I decided to leave them on the list since they were listed on the NYU webpages. You can find the old department webpages on archive.org.

Wang	Yang	2026
Peltier	Joy	2026
Major	Travis	2026
Jasbi	Masoud	2026
Kazanina	Nina	2025
Dillon	Brian	2025
Rett	Jessica	2025
Arregi	Karlos	2025
Kuo	Jennifer	2025
Chodroff	Eleanor	2025
Hacquard	Valentine	2025
Wright	Kelly	2025
Stockall	Linnaea	2024
Nadathur	Prerna	2024
Diercks	Michael	2024
Weissler	Rachel	2024
Murray	Sarah	2024
Bennett	Ryan	2024
Abels	Klaus	2023
Despic	Miloje	2023
Hall	Erin	2023
Matchin	William	2023
Coon	Jessica	2023
Beckford Wassink	Alicia	2023
Armoskaite	Solveiga	2023
Coppock	Elizabeth	2023
Sundara	Megha	2023
Zimman	Lal	2023
James	Ariel	2022
Yuan	Michelle	2022
Katz	Jonah	2022
Harley	Heidi	2022
Polinsky	Maria	2022
Moody	Simanique	2022
Sudo	Yasutada	2022
Jarosz	Gaja	2022
DiCanio	Christian	2021
Degen	Judith	2021
Wehbe	Leila	2021
Hou	Lina	2021
Garellek	Marc	2020
Aravind	Athulya	2020
Walker	Abby	2020
Deo	Ashwini	2020
Kramer	Ruth	2020
Hill	Joseph	2020
Momma	Shota	2020
D’Imperio	Mariapaola	2020
Carstens	Vicki	2020
Degen	Judith	2020
Halpert	Claire	2019
Kaufmann	Magda	2019
McPherson	Laura	2019
AnderBois	Scott	2019
Kalin	Laura	2019
Deal	Amy Rose	2019
Tessier	Anne-Michelle	2019
D’Onofrio	Annette	2018
Potts	Chris	2018
Roelofsen	Floris	2018
Steriade	Donca	2018
Schuler	Kathryn	2018
Bedny	Marina	2018
Cheng	Lisa	2018
Embick	Dave	2018
Cristia	Alex	2018
Sharma	Devyani	2018
Legate	Julie	2017
Brennan	John	2017
Morzycki	Marcin	2017
Davidson	Kathryn	2017
Omaki	Akira	2017
Kennedy	Chris	2017
Tamminga	Meredith	2017
Dodsworth	Robin	2017
Zuraw	Kie	2017
White	James	2017
Heycock	Caroline	2017
Nagy	Naomi	2016
Grenoble	Lenore	2016
Wiltschko	Martina	2016
Syrett	Kristen	2016
Sprouse	Jon	2016
Kang	Yoonjung	2016
Mielke	Jeff	2016
Wellwood	Alexis	2016
Citko	Barbara	2016
Benor	Sarah	2015
Jesney	Karen	2015
Coon	Jessica	2015
Anand	Pranav	2015
Keating	Pat	2015
Yang	Charles	2015
Xiang	Ming	2015
Henderson	Robert	2015
Landau	Idan	2015
Coetzee	Andries	2014
Pesetsky	David	2014
Piantadosi	Steven	2014
Miyagawa	Shigeru	2014
Peperkamp	Sharon	2014
Culbertson	Jennifer	2014
Murray	Sarah	2014
Babel	Molly	2014
Schwarzschild	Roger	2014
Fruehwald	Josef	2014
Emmorey	Karen	2013
Frank	Michael	2013
Pancheva	Roumi	2013
Larson	Richard	2013
Keshet	Ezra	2013
Baayen	Harald	2013
Merchant	Jason	2013
Yu	Alan	2013
Squires	Lauren	2013
Feldman	Naomi	2013
Rickford	John	2013
Bhatt	Rajesh	2012
Pearl	Lisa	2012
Yu	Kristine	2012
Johnson	Kyle	2012
Krifka	Manfred	2012
Walker	Rachel	2012
Hall-Lew	Lauren	2012
Hinrichs	Lars	2012
Hackl	Martin	2012
Padgett	Jaye	2011
Levy	Roger	2011
Holmberg	Anders	2011
Richards	Norvin	2011
Gick	Bryan	2011
Blevins	Juliette	2011
Thomason	Sarah	2011
Potts	Chris	2011
Hacquard	Valentine	2011
Carter	Phillip	2011
Bobaljik	Jonathan	2010
Childs	Tucker	2010
Kahnemuyipour	Arsalan	2010
Beaver	David	2010
Kendall	Tyler	2010
Goldstein	Louis	2010
Jaeger	Florian	2010
Flemming	Edward	2010
Kiesling	Scott	2010
den Dikken	Marcel	2009
Munson	Ben	2009
Stabler	Edward	2009
McCarthy	Corrinne	2009
Jacobson	Pauline	2009
Hale	John	2009
Demuth	Katherine	2009
Mallinson	Christine	2009
Kennedy	Chris	2009
Hornstein	Norbert	2009
Sigurðsson	Halldor	2009
Podesva	Robert	2008
Chemla	Emmanuel	2008
Winter	Yoad	2008
Nunes	Jairo	2008
Selkirk	Elisabeth	2008
Mendoza-Denton	Norma	2008
Becker	Michael	2008
Bresnan	Joan	2008
Matthewson	Lisa	2008
Hornstein	Norbert	2008
Pierrehumbert	Janet	2008
Clopper	Cynthia	2007
Wurmbrand	Susi	2007
Bernstein	Judy	2007
Wagner	Michael	2007
von Fintel	Kai	2007
Yang	Charles	2007
Bakovic	Eric	2007
Boberg	Charles	2007
von Stechow	Arnim	2007
Green	Lisa	2007
Davis	Matt	2007
McCarthy	John	2007
Anand	Pranav	2007
Labov	Bill	2007
Gordon	Matt	2006
Carey	Susan	2006
Harley	Heidi	2006
Hazen	Kirk	2006
Gouskova	Maria	2006
Broselow	Ellen	2006
Paster	Mary	2006
Bhatt	Rajesh	2006
Postal	Paul	2006
Stowell	Tim	2006
Pulvermuller	Friedemann	2006
Bucholtz	Mary	2006
Marantz	Alec	2005
Zukowski	Andrea	2005
Wedel	Andy	2005
Chierchia	Gennaro	2005
Richards	Norvin	2005
Lahiri	Aditi	2005
Kiparsky	Paul	2004
Kratzer	Angelika	2004
Baker	Mark	2004
Senghas	Annie	2004
Rickford	John	2004
Marcus	Gary	2004
Blommaert	Jan	2004
Adger	David	2004
Jacobson	Polly	2004
Hayes	Bruce	2004
Beaver	David	2003
Phillips	Colin	2003
Steriade	Donca	2003
Thomas	Erik	2003
Marantz	Alec	2003
Collins	Chris	2003
Legate	Julie	2003
Honorof	Doug	2003
Davidson	Lisa	2003
Deprez	Viviane	2003
Cho	Taehong	2003
Hróarsdóttir	Þorbjörg	2003
Cutler	Cece	2003
Kenstowicz	Michael	2003
Pollock	Jean-Yves	2002
Zilles	Ana	2002
Landau	Idan	2002
Broselow	Ellen	2002
Szabo	Zoltan	2002
Bobaljik	Jonathan	2002
Sportiche	Dominique	2002
Carpenter	Bob	2002
Pater	Joe	2002
Pylkkanen	Liina	2002
Elbourne	Paul	2002
Schlenker	Philippe	2002
Sauerland	Uli	2002
Roeper	Tom	2002
Zoll	Cheryl	2002
Williams	Edwin	2002
Lasnik	Howard	2002
Grosu	Alex	2002
Mesthrie	Rajend	2002
Hornstein	Norbert	2002
Auer	Peter	2002
Kayne	Richard	2002
Gueron	Jacqueline	2002
Brody	Michael	2001
Dresher	Elan	2001
Blake	Renee	2001
Moltmann	Friederike	2001
Weldon	Tracey	2001
Delilkan	Ann	2001
Wilder	Chris	2001
Bruening	Benjamin	2001
Déchaine	Rose-Marie	2001
Crosswhite	Katherine	2001
Kavitskaya	Darya	2001
Starke	Michal	2001
Anttila	Arto	2001
Gross	Maurice	2001
Terry	J. Michael	2001
Blondeau	Helene	2001
Van Lancker	Diana	2001

Filed under Uncategorized

April 8, 2025 · 17:10

Grad School and Honors Theses

Table Of Contents

Should you apply?
Before You Apply
Writing Samples: Dos and Don'ts
Personal and Research Statements
CV, transcripts, tests
Letters of Recommendation
The Interview
The Waiting, and the Waitlist
The Honors Thesis Option
Research Assistantships and Teaching

Here is some advice I have been giving to students who are interested in graduate school in linguistics. Some of it is a more detailed version of what you will find elsewhere, including on the NYU linguistics FAQ page for grad applicants. But some of the advice below is based on my own experience of reading applications, making admissions decisions, and advising honors theses. It doesn’t necessarily reflect how other people in the department think about these things, or how other universities do their admissions.

Should you apply?

Probably not. Grad school is not for everyone. Grad school trains you for a professional career. The primary skill you learn is how to conduct research, and this is a different skill than being “good at college”–you need creativity and an ability to function in a somewhat adversarial environment. Plenty of people who finish PhD programs do not succeed at getting jobs in academia, but the professors teaching you are primarily skilled at getting academic jobs. There are not many academic jobs. For example, in any given year, there may be anywhere between 5 and 7 tenure-track jobs for phonology nationwide in the United States. There may be as many as 100 candidates applying for these jobs. Of the jobs that are available, many will not be in universities or departments like the one where you get your PhD. Quite a lot of doctoral students end up working as adjuncts, with little job security or benefits. There are definitely easier ways to make a living, and there is a lot to be said for entering the job market right after college, so you can accumulate experience and retirement benefits earlier.

In addition to the job market being abysmal, the grad school environment does not agree with everyone. It is much more competitive than your college experience. All of a sudden, your coursework is also your job. Everyone else is asking smarter questions than you and has a longer and more impressive CV–or so it seems. Everyone else is presenting at conferences seemingly every other week, while you get rejections or end up with writer’s block. There is a ton of reading every week, and the homework is really hard. After the first year or two, there is no homework, but you are somehow expected to learn how to be your own boss and produce research without someone else setting deadlines for you. Some people thrive in this environment, but others find it miserable. Hint: your professors are probably people who did great in grad school, and they might tell you that it was the happiest time of their lives. But there are plenty of people who do not do well. You don’t know which type you are until you go through it.

Got all that? Good. Still want to apply? Sigh, okay. You should do it if you really love the idea of being a professor and cannot imagine doing anything else. After all, there are some great things about this job, and about grad school. Everyone around you is pretty smart–well obviously, they like the same things as you, so they must be, right? You get to make your own choices about what you study or write about. The schedule is relatively flexible. You get to teach and put mold in young minds (or something like that). So, how do you get in?

Before You Apply

You need to start thinking about this early, probably in your sophomore year or early in your junior year. The reason is that you really need to do some work at the graduate level, both to ensure that you can function in that setting and to show the programs you are applying to that you know what you are getting into. So, take whatever basic undergrad courses you need as early as you can, and then take graduate courses–we let undergrads enroll in those with the professor’s permission.

A graduate course will usually require you to write a term paper. A standard part of any grad school application is a writing sample. The writing sample demonstrates both your ability to do research and your ability to write, so it is essential. And the fastest way by far to produce a good writing sample is to take a graduate or advanced class.

Writing Samples: Dos and Don’ts

Do include at least one writing sample in your area of interest. An application from someone interested in phonology needs to include a paper on phonology. Even better if your paper is in a specific area that you describe in your personal statement.
Your writing sample must be proofread, not just by you but by a professor in your area of interest. Make it shine.
Your writing sample should not be a first draft. If you cranked the paper out on the eve of the due date on Adderall and haven’t looked at it since you got a grade, and you never did anything with the comments, you are not grad school material. Let me repeat that: people who do not make use of the comments that their professors painstakingly write on their papers are not good candidates for grad school.
Do include several writing samples, as long as they are all of decent quality (thus, all have been through at least one round of revisions). NYU officially asks for one, but you can append multiple PDFs. I personally like to see evidence of range.
Research on an original topic of your choosing is the best kind of writing sample. If a paper includes a lit review but no new ideas of your own, it should not be the only thing in the packet.
Do include a little note, if necessary, that explains what each of the samples is about and its history. For example, I assign a final project in my phonology class that involves making up a problem and coming up with a solution (hat tip to Beverley Goodman, who assigned such a project in my first phonology class). But if I were to include such a piece as one of my writing samples, I would also explain what the assignment was. Term papers usually stand on their own, but you can still explain that the paper was written for a seminar, or is not in your main area.
I do not recommend including papers from courses unrelated to linguistics. I rarely look at such papers, and I wonder why they are in the packet (especially if there is no linguistics sample).
Do not include subpar work just to pad your application out. Do not include every exercise and problem set you ever did in a phonology class.
Very long papers will probably not get read closely (50+ pages is on the long side), but if you have an MA thesis and it’s your best and only written work in linguistics, then of course you should include it.

I hope this clarifies why it is a good idea to take multiple classes that include a term paper as a component. You want to give yourself options.

Personal and Research Statements

When I was applying to graduate school, there was only one statement, but now some departments (including NYU) ask for two, and students often don’t know what the difference is.

The research statement is sometimes called “Statement of Purpose”. This is the essential document that explains what you plan to do in grad school, what your specific interests are, and what you have already done. A good research statement is specific: it is not enough to say you are interested in phonology, or even in stress or syllable structure or whatever. The statement needs to show your familiarity with theoretical issues or your experience in designing and running experiments. I also like to see more than one potential area of interest, with some specificity. Graduate programs usually require that you write two qualifying papers in distinct areas, and it is good if you already have a tentative plan for what those might be. You are allowed to change your mind later.

Should you mention specific potential advisors? It’s not essential. NYU’s online application system does ask applicants to check names of “people of interest”. You do not need to name-check every faculty member in the department in your statement. But if you do mention specific people, make sure you know what they actually work on and what kind of advising they have done. When in doubt, leave it out. At the very least, talk to faculty in the same area before you mention people.

It should go without saying that your statement should be read by a professor before you submit it.

Personal statements are optional at NYU, and I wouldn’t submit one myself unless I absolutely had to. I just assume that nobody wants to hear my life story. I also have seen quite a wide range of things that people put in these statements. Sometimes it’s relevant to the grad school application, but often it is not. So, my advice is to write whatever is relevant to the grad school application in that statement. For example, your path toward the decision to apply could be described there, whether it’s your childhood love of dictionaries or your chemistry MA that didn’t work out. Corollary: I don’t think that stuff needs to be in your research statement. I would make that strictly about your research interests, not your life story.

CV, transcripts, tests

Transcripts are usually not optional, and they are usually self-explanatory. If there are any blots on your transcript–any grade of C or lower, I’d say–then I would include a note if you have an explanation for that.

The Curriculum Vitae is a document that summarizes your professional experience. You should take a look at CVs of faculty and grad students on the department websites for examples and inspiration. Ideally, you have something like a conference presentation and maybe some research assistantships to include. It isn’t expected to be long.

Tests such as the GRE used to be a normal part of the application but have been largely phased out. If they come back, find out what a decent score is, and sit the test twice if necessary. For the GRE, the raw score would come with a percentile, and you’d want to be somewhere in the top quarter of the pack in every category. Linguistics involves quantitative and verbal reasoning, so having abnormally low scores in either area is a worrying sign.

Letters of Recommendation

You usually need three of these. Ideally they come from faculty who have seen your research, so again taking a wide variety of high-level courses is essential.
You should ask your professors early–not two days before these letters are due. I would give them at least a couple of weeks.
You should verify that the professor is able to write you a strong letter. If the answer is no, find someone else. (But also ask yourself if you have some self-improvement to do.)
Finally, if you are asking more than a semester after you took the class, remind the professor what kind of work you did for the class. I can usually reconstruct this from my records, but it is better if the student provides me with some clues.

The Interview

We do Zoom interviews with people whose applications caught our eye. Many schools do not. Some schools will invite their top choices for a campus visit and make the decisions only after the open house is over.

So, how do you prepare for the interview? I would ask a faculty member to give you a mock interview. You should expect to describe your research interests, and your past experience as well as future plans. You might get follow-up questions about your writing samples, so make sure you are prepared for some grilling. I would even go so far as to ask to be interviewed in the same medium as the grad school will be using: for a Zoom interview, practice on Zoom.

When doing a video interview, you need to pay attention to whether the people are still listening to you, and make sure you do not talk too long. Ask follow-up questions: “Did I answer your question or would you like more detail?” That kind of thing.

The Waiting, and the Waitlist

Usually you can expect to hear something by mid-February. If you get in, great! If you get multiple offers, discuss them with your professors. The schools would like to know what you plan, either way, as soon as possible. A quick “no” is useful: grad schools compete for some of the same candidates, so they do not expect everyone who gets an offer to say yes. Quick decisions also allow them to move on to the waitlist if needed.

When I applied to grad school, I got waitlisted at all three programs. Then, right before April 15, I got offers from all three. (Looking back at my application, I can see why I didn’t make the first cut: even though I had taken a ton of classes, my writing was awful. One of my professors gave me a writing book, Williams’s Style: Clarity and Grace, as a parting present. Perhaps it was a hint.)

Don’t take it personally if you get waitlisted or rejected; sometimes it has nothing to do with the quality of your application. It can matter, for example, whether other subfields in the department have enough students. At NYU, we have a target every year (e.g. 6 or fewer admits). If we get too many students in one year, we are often not allowed to make many offers in the following year, so many good applicants get rejected.

If you do not hear from anyone for a long time after applying, it is fine to write to the department and ask when and whether decisions have been made. Do not expect a detailed explanation of why you didn’t get in. But you shouldn’t just submit the same application again to the same programs. Talk to your professors, but one option would be to apply for a Master’s, possibly in a European university where such programs are more plentiful. You can also find a gig as a lab manager, either in the US or abroad. That kind of experience greatly boosts your chances of succeeding the following year.

If you are applying to the same program for the second time, you should make clear what is different in your application this time around. I would put that information prominently in the research statement (which is usually the first part of the application I read).

The Honors Thesis Option

Now we get to the Honors thesis. I generally do not recommend this option for students interested in grad school. The reason is that the thesis takes too long. Deadlines for grad programs are usually in December or so, and theses are not completed until May, at the end of the following semester. You shouldn’t send an unfinished thesis, and if all you have is the proposal, you will not stand out compared to someone with extensive research experience.

In Italy in the 1970s, everyone who went to college had to write a thesis. Apparently it was such a nightmare to advise these that Umberto Eco wrote an entire book explaining what a thesis is and how to do research for one, and how long it takes. Look it up: it’s called, shockingly, How to Write a Thesis. It is very good, and it has much of the same advice that I have given to students for decades.

In American universities, theses are not usually required, though some schools do have this requirement. Most just make it an option for graduating with honors. Is it worth doing a thesis just so you can graduate with honors? Honestly, I don’t think so. If you plan to go on the non-academic job market, nobody cares what grades you got, or whether you got honors. They probably care that you went to college, and possibly which one. But even that might be a thing of the past. Talking to non-academics about this, the general consensus I got was that any job that asks you for your GPA or whether you got honors in college is one you should run from.

Also, pragmatically, an honors thesis is a dubious preparatory experience for non-academic life. You learn to write something quite long, while real life usually requires you to be brief. You write for an audience of two people (thesis advisor + reader), on something quite obscure. You do learn how to organize information and go deep into a subject, but there are other, more useful ways of doing that.

For the rare few who write a thesis after having completed many good term papers in advanced courses, of course, the thesis demonstrates the ability to engage with a long-term research project and to write regularly. I have seen this come to fruition exactly twice in my teaching career, and neither thesis ended up being developed into a publication.

If you are intent on writing a thesis anyway, here’s my advice.

Take an advanced class, preferably a grad one, in your area, and test-drive the topic as a term paper first.
Establish a relationship with a professor, ensuring that your styles mesh. I tend to want regular, timely work. If your style isn’t like that, we won’t get along. It can go the other way: if you are prompt and diligent, but your prospective advisor does not give you feedback and is hard to reach, you should know before you embark on the project.
The topic should be chosen with the advisor’s help. I usually recommend to all my students that they narrow things down to two or three topics, do enough research on each to explain them to the prof, and select among them with the prof’s help. Do not just bring the topic to the prof as an ultimatum; it needs to be a two-way street.
If you intend to use the thesis as a writing sample for grad school, you either need to complete the writing and revisions before December or apply in the year following your graduation.

Research Assistantships and Teaching

These fall under the category of ‘transferable skills’—probably more transferable than any honors thesis writing.

In a research assistantship, you perform tasks set to you by a professor. This can be data preparation or processing, or it can involve collecting data from experimental participants. RAs are often trained to use specialized software such as Praat. You might be asked to do some library research. When I was a student, I was asked to help a professor find video materials that could be useful in teaching an introductory linguistics course–this is somewhere between an RAship and a TAship.

At NYU, TA-ing is normally done by our own grad students, but we sometimes hire former BA majors, as well. TAs do grading, lead recitation sections, and occasionally have to prepare their own materials for teaching. TAs also often have to re-explain material that students either missed in lecture or didn’t understand on first pass, so they have to have a good grasp of the subject being taught. Even if you do not work as a TA, though, you can do some tutoring at NYU’s University Learning Center.

It is obvious how these skills would help in an academic job. But how do you transfer these skills to a corporate job? Well, you have to do tasks set to you by a manager, on a schedule. You also might have to train employees and supervise their work. Training involves breaking the activity up into separate easy steps, and you have to understand the activity fairly deeply in order to do that. So, you should seek out these opportunities if you want to acquire these skills.

Filed under Uncategorized

February 10, 2025 · 10:24

Gouskova 2025

Phonological Selection in Small Sublexicons

In Proceedings of Annual Meeting on Phonology 2023-2024, edited by Gerard Avelino, Merlin Balihaxi, Quartz Colvin, Vincent Czarnecki, Hyunjung Joo, Chenli Wang, Utku Zorbarlar, Adam Jardine, Adam McCollum.

[PDF with grayscale figures] [PDF with color figures] [git repo/supp materials]

Note: if you do not see lines in certain figures that claim to have lines, please try a different PDF viewer.

Affixal phonological selection tends to be coarse-grained rather than granular. Stress, syllable count, and C/V composition figure in many examples, while subsegmental/featural generalizations are less common. I argue that this is a consequence of statistical learning. When the learning dataset is comparatively small, only coarse generalizations are reliable. My case study investigates the Russian suffix -ast, which predominantly attaches to body part nouns to form adjectives (e.g., [glaz-ast-ij] ‘big-eyed’). There are only 17 lemmata in common usage, and they respect a size limit (mono- and disyllables). I show that the disyllabic maximum is productive in a corpus study, and that productive use shows frequency matching for syllable count to the learning data but not to nominal stems in general. By contrast, fine-grained featural generalizations about the stem-final consonants appear to be largely ignored in productive use. Speakers extend the suffix to featural contexts unseen in learning data, assuming that the sparse sample is representative–a conclusion supported by the composition of the lexicon. I relate my findings to the Subcategorization Frame theory of selection, the modular separation theory of Scheer (2016), and tolerance theory of Yang (2016).

Filed under Uncategorized

December 13, 2023 · 16:08

How to Make a .Collection File

A Praat .Collection file allows you to save all the audio files and text grids that go with them into a single file. It’s a bit like a zip file, except it isn’t compressed. It also, conveniently, keeps the order in which the original audio files and text grids (and any other objects) appeared in your Praat object window.

1. Make sure all your objects (sound files, text grids, etc.) are named what you want them to be named, and that they appear in the correct order. For another person looking at your files, it’s most convenient if your sound file is adjacent to the corresponding text grid in the Praat object window. Here, you can see that I have audio files and text grids paired by name:

2. Select all the objects in the Praat window that you want to save together. On my operating system (Linux), Ctrl+A selects all the objects. Windows is probably similar. On macOS, use the Command key instead of Ctrl.

3. Go to “Save” and select “Save as binary file…”:

4. Give the file some name that will make sense to you later. The default is “praat.Collection”. Hit “save”.

5. To reopen the .Collection file, first open Praat, then go to “Open>Read from file” and navigate to wherever you originally saved it. If all went well, you will see a window that looks exactly like step 2 above.
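The pairing advice in step 1 (each sound file sitting right next to its text grid, matched by name) can be checked mechanically if you keep a list of your object names. The sketch below is plain Python, not a Praat script: the (type, name) tuples and the function name are assumptions made for illustration, mirroring how objects are listed in the Praat object window.

```python
def unpaired_objects(objects):
    """Return the objects that are not part of a Sound-then-TextGrid pair.

    `objects` is a list of (type, name) tuples in object-window order.
    """
    problems = []
    i = 0
    while i < len(objects):
        kind, name = objects[i]
        following = objects[i + 1] if i + 1 < len(objects) else None
        if kind == "Sound" and following == ("TextGrid", name):
            i += 2  # a well-formed pair: skip over both objects
        else:
            problems.append(f"{kind} {name}")
            i += 1
    return problems


# Hypothetical object lists: the second one is missing a text grid for AG_002.
good = [("Sound", "AG_001"), ("TextGrid", "AG_001"),
        ("Sound", "AG_002"), ("TextGrid", "AG_002")]
bad = [("Sound", "AG_001"), ("TextGrid", "AG_001"),
       ("Sound", "AG_002"),
       ("Sound", "AG_003"), ("TextGrid", "AG_003")]

good_report = unpaired_objects(good)
bad_report = unpaired_objects(bad)
```

An empty report means every sound is immediately followed by a text grid of the same name; anything listed is worth fixing before you save the collection.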

Filed under Uncategorized

August 14, 2023 · 13:13

FAQ

Reviewing

I am currently (2025-2028) serving as an Associate Editor of Language. As an editorial board member for NLLT and Phonology, I do a fair amount of reviewing for those journals as well. I cannot accept review requests on topics unrelated to my current areas of research. If in doubt, ask, but please understand if I decline.

Prospective Grad Students

I often get email inquiries from people interested in our PhD program, with a CV or a writing sample attached and a request that I look at the file. The way our program does admissions, no one faculty member has power over who gets in. There is a detailed explanation of this on our department’s FAQ for prospective applicants.

If you think that your research is a close match to my interests (morphology and phonology), of course I will consider advising you if you get accepted–but writing to me instead of applying to the program is not going to help you get in. Instead, I recommend that you study the FAQ closely. When you submit your application, make sure to check my name as someone who influenced your decision to apply to NYU Linguistics, and I will read your file then. Good luck, and thank you for your interest in our program.

Undergrads Interested In Research Opportunities

The best way by far to get into research is to take a course with me and to do well in it. If I have not taught you, I am unlikely to choose you as a research assistant.

You should also read this page, which goes into some detail about our undergraduate program and advising structure: https://as.nyu.edu/departments/linguistics/undergraduate/frequently-asked-questions-for-undergraduate-students.html. I have some expectations of students who want to do an honors thesis under my direction, as outlined here.


March 25, 2022 · 13:52

Linguist names

The goal of this page is to collect some names of linguists whose pronunciations are non-obvious.

Table Of Contents

My name
Everyone else's names: notes
A
B
C
D
E
F
G
H
J
K
L
M
N
O
P
R
S
T
V
W
Y
Z
Acknowledgments

My name

The audio files and transcriptions for my name are here. Please do not call me Masha. Relatedly, did you know that Michael Becker and Misha Becker are not the same person?

Everyone else’s names: notes

People who are linked without comment have included IPA transcriptions of their name pronunciations on their websites. NB: If people pronounce your name differently from how you’d like it to be pronounced, or if you’ve ever been asked how to pronounce your name, that’s a hint that you should put that information on your website. It is more likely to reach the target audience if it’s on your site than on mine. Roman Jakobson–you’re off the hook on this one.

Transcription notes:

the bias is towards anglicized pronunciations here, with some sloppiness in whether vowels are reduced. The focus is on stress placement and non-obvious consonant and vowel qualities, or vowels you might think are silent that aren’t.
I offer neither explanation nor apology for my inconsistent use of [r] vs. [ɹ] vs. [ɚ].
I mark stresses before the stressed vowels because life is too short to try to figure out the syllabification in some of these names.
In a few names, there is a dorsal fricative. I decided to leave that as [x] in the transcriptions, as many American linguists seem comfortable with approximating it. If you are not one of them, my prescriptive suggestion would be to substitute [h]; to my ears it sounds like the next best thing. Maybe a [k] if Ident[place]>>Ident[continuant] for you.

If you have a request or a correction, send me an email!

A

John Alderete: [ɑldərˈɛɾi] not [ɑldərɛt]

Scott AnderBois: [ˈændɚbwɑ]

Amalia Arvaniti: [amalˈia arvanˈiti]

B

William Badecker: [bˈeɪɾəkɚ]

Eric Baković

Alicia Beckford Wassink: [əlˈisiə] not [əlˈiʃə].

Gašper Beguš: [gˈaʃpɚ bˈɛɡuʃ]

Stefan Benus (Štefan Beňuš)

Christina Bethin: [bətˈin] not [bˈɛθən]

Claire Bowern: [bˈoʊɚn]

David Bowie: [bˈu.i], like the town in Maryland not like the singer

Adrian Brasoveanu: [braʃovˈanu]

Canaan Breiss: [kˈeɪnən braɪs]

Marisa Brook: [mərˈisə]

Mary Bucholtz: [bˈʌkəlts]

Rachel Burdin

C

Pavel Caha: [tsˈɑhə] or [sˈɑhə], not [kɑhə] or [tʃahə]

Dustin Chacón: [tʃakˈon] not [ʃ…]

Vincent Chanethom

Ioana Chitoran: [ioˈɑnə kitsorˈɑn]

Eleanor Chodroff

Catherine Chvany: [tʃvˈɑni]; Hypocoristic: [kˈatʲə].

Andries Coetzee

Uriel Cohen Priva: [ˈuriəl kˈoʊən prˈivə] not [juriəl]!

Ailís Cournane

Meg Cychosz

Patricia Cukor-Avila: [sˈukɚ ˈævɪlə]

D

Éva Dékány

Derek Denis: [dˈɛnɪs]

Katherine Demuth: [dˈiməθ]

Ray Dougherty: [dˈɑgɚti]. As with other Irish surnames, the Anglicization of this one is variable.

Anthony Dubach Green: [dˈubax] — need verification here, this is just an educated guess

Karthik Durvasula

E

Ben Eischens: [ˈaɪʃənz]

Masha (Maria) Esipova

F

Donka Farkas: [fˈɑrkɑʃ]

Matthew Faytak: [fˈeɪtˌæk]

Paul Foulkes: [faʊks]

Joseph Fruehwald: [dʒˈoʊsɨf frˈuːwɔld]

G

Adamantios Gafos: [adamˈɑntios gˈɑfəs] (or [ɣˈafos]). Hypocoristic: Diamandis [dˌiəməndˈis]

Mark Garellek

Eleanor Glewwe: [glˈɛvi]

Anna Grabovac: [grˈɑbovak]

Vera Gribanova: [vˈɛrə ɡribˈɑnəvə]

John Gumperz: [ˈɡʌmpəɹz] [source]

H

Liliane Haegeman: [hˈɑxəmən] not [heɪɡə…]

Boris Harizanov: [bˈoris harizˈanov]

Stephanie Harves: [hˈɑrvəs] not [hˈɑrvz]

Martin Haspelmath

J

Roman Jakobson: [jəkabsˈon] or [jˈɑkəbsən], not [dʒeɪkəbsən]

Gaja Jarosz

Peter Jurgec: [jˈurgəts]

K

René Kager: [kˈɑxɚ] not [keɪdʒɚ]!

Roni Katzir: [rˈɔni kətsˈir]

Abigail Kaun: [kɑn]

Jaklin Kornfilt: [ʒaklˈin kˈɔrnfɪlt]

Jelena Krivokapić: [jˈɛlɛnə krivokˈapitʃ]

L

William Labov: [ləbˈoʊv] (since people immediately started arguing with me on this, I’m including a citation to an authoritative source; 2006, Journal of English Linguistics 34:4)

Terry Langendoen: [lˈæŋɡəndən]

Beth Levin: [ləvˈin] (note stress location and vowel quality)

Erez Levon: [ˈɛɹɛz ˈlɛvɑn] not [əˈɹɛz ləˈvon]

Mark Liberman: [lˈɪbɚmən] not [lˈibɚmən]

Anna (Ania) Łubowicz: [wubˈovitʃ]

M

Sally McConnell-Ginet: [ʒɪˈne]

K.P. Mohanan: [mˈoʊhanan] not [moʊhˈænən]!

Marcin Morzycki

N

Naomi Nagy: [nˈeɪɡi]

Savithry Namboodiripad

Luiza Newlin-Łukowicz: [luˈiza nˈɛvlin wukˈovitʃ]

Máire Ní Chiosáin: [mˈɑrə nˈi xisˈɑn], or [xˈisɑn], depending on dialect, or even [mˠˈɑːrʲə nʲiː xʲˈisˠɑːnʲ] depending on your preferred level of pedanticism (I have also been given [kisˈɑn] but that has been challenged so I will leave all of these here and let you decide)

Yining Nie

Jennifer Nycz

O

Elinor Ochs: [oʊks]

David Odden: [oʊdən]

Cemil Orhan Orgun: [dʒemˈil orhˈan orɡˈun]

P

Joe Pater: [peɪɾɚ] not [pɑ…]

Charles Peirce: [pɚs]

Katya Pertsova: [pertsˈovə] not [pˈertsəvə]

Janet Pierrehumbert: [pieɹhˈʌmbɚt]

Glyne Piggott: [glɪn] not [ɡlaɪn]

Omer Preminger: [ˈoʊmɚ prˈɛməndʒɚ]; [ŋɡ] in German or Hebrew

Adam Przepiórkowski

R

Ezer Rasin

Yulia Rodina: [jˈuliə rˈodinə]

Marcos Rohena-Madrazo

Nicholas Rolle

Jerzy Rubach: [jˈeʒi rˈubax] not “Jersey”. You can try a retroflex ʐ, [jɛʐɨ]

Amanda Rysling

S

Elizabeth Sagey: [sˈeɪdʒi] not [seɪɡi]

Gillian Sankoff: [gˈɪliən sˈæŋkˌɑf] not [dʒɪliən]

Philippe Schlenker: [filˈip ʃlɛnkˈɛr] (note, final stresses in both)

James Scobbie: [skˈɑbi] not [skoʊbi]

Márton Sóskuthy: [ʃˈoʃkuti]

Michal Starke: [mˈixal ʃtˈarkə]

Robert Staubs: [stɑbz]

William Stokoe: [stoʊki]

Patrycja Strycharczuk: [patrˈitsjə strixˈartʃək] or [stri{h|k}ˈartʃək] (Polish: [patrˈɨtsja strɨxˈarʈʂuk])

Anna Szabolcsi: [ˈænə sˈɑboltʃi] (note, not [ɑnə])

Benedikt Szmrecsanyi

T

Sali Tagliamonte: [tˌæɡliəmˈɑnti]

Meredith Tamminga: [tˈæmɪŋɡə] not [təmˈɪŋɡə]

Anne-Michelle Tessier: [tˈɛsiˌeɪ] not [tɛsiɚ]

One Tlale Boyer: [ˈone tˈɑle bˈojɚ]

Peter Trudgill: [tɹˈʌdɡˌɪl] not [tɹʌdʒɪl]

V

Ljuba Veselinova: [liˈubə veselˈinəvə]

W

Gert Webelhuth: [vˈeːbəlhˌut]

Y

Charles Yang: [jɑŋ], not [jæŋ]/[jeŋ]

Z

Draga Zec: [drˈɑɡə zˈɛts]

Erik Zyman

Elizabeth Zsiga: [zˈigə]

Kie Zuraw: [kˈaɪ zˈurˌɑ] not [zurˈaʊ]

Acknowledgments

Special thanks to Laurel MacKenzie, who supplied many transcriptions and suggestions. Thanks also (in no particular order and without links, sorry) to Tricia Irwin, Wayles Browne, Steven Franks, Michael Becker, Ryan Bennett, Ailís Cournane, Greg Guy, Lisa Davidson, Anna Szabolcsi, and Emily Gasser for suggestions and transcription verification.


November 10, 2021 · 19:13

Using Praat Scripts

Table Of Contents

Praat vs. the internet and your phone
A confidence builder
A random list of problems and solutions
Some other gotchas in Praat scripting
Respecting spelling and whitespace
Ready to go beyond these basics?
And now, the obligatory metaphor

Goal: Before you attempt to write your own Praat script, you will probably try to use one of the many existing scripts. This guide is not going to teach you how to write your own scripts, or even how to modify existing scripts in minor ways for your own needs. All it covers is how to use scripts that someone else wrote. It is intended for students in an introductory phonetics course, who usually have little to no background in working with scripts or programming languages.

The point of Praat scripting is to automate tasks, so you, the human, can do all the smart stuff, and leave the computer to do repetitive boring stuff. For example, suppose you are working on a study where your interest is in measuring durations of certain consonants. Praat cannot accurately label the edges of those consonants for you–that task falls under “smart stuff” that requires human eyes and ears. But Praat can rapidly collect the durations of labeled intervals from a TextGrid for you, if you prepare the right sort of TextGrid and tell Praat exactly what you want.
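To make the “collect the durations” part concrete, here is a minimal sketch in Python (not Praat’s own scripting language). It assumes the TextGrid intervals have already been read into a list of (start, end, label) tuples; the times and labels are made up for illustration, and the function name is mine.

```python
# Hypothetical intervals from one tier: (start time, end time, label).
# Empty labels stand for the unlabeled stretches between consonants.
intervals = [
    (0.00, 0.12, ""),
    (0.12, 0.21, "t"),
    (0.21, 0.40, ""),
    (0.40, 0.52, "s"),
]

def labeled_durations(tier):
    """Return (label, duration in seconds) for every non-empty interval."""
    return [
        (label, round(end - start, 4))
        for start, end, label in tier
        if label.strip()
    ]

for label, duration in labeled_durations(intervals):
    print(label, duration)
```

The boring part (subtracting start from end for every labeled interval) is what the computer does; deciding where the boundaries go is still your job.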

The most basic thing you will do with a Praat script is make it run and tell it to do something to a file you provide, and then save the result to your computer. The confidence builder exercise below talks you through that.

But first…

Praat vs. the internet and your phone

Praat cannot read your mind.

If you are like most students taking an introductory phonetics course right now (I’m writing in 2021), the internet has always existed for you, and it isn’t your mom’s internet of 2002. Most of the apps on your phone are fronts for web apps–they allow you to interact with a website or a distributed network. Thus, your familiarity with computers is intimately tied with how the internet has evolved. Websites and web apps are designed to try to read the user’s mind, and to fail gracefully in the face of typos and missing pages. They keep your stuff organized for you, so that you can search for pictures or music or videos and somehow always find it. All the magic is hidden. One side effect of this is that kids these days can’t find things on their computers.

By contrast, Praat was originally written in the late 1990s. It has been continually maintained, and features have been added to it, but its basic nature is very 90’s. It is meant for people who know their way around their computer and who understand how to get the computer to do things the old-fashioned way. The challenge before you, then, is to learn how to talk to a 90’s computer. You need to tell the computer exactly where stuff is. (You need to tell the computer the order in which to do things, too. If you’re working with existing scripts, as in this tutorial, you generally don’t have to worry about that.)

Another side effect of Praat’s not knowing how to read your mind is that you have to be very careful with typos and whitespace. (I’ll explain what “whitespace” means below.) Approximate is not good enough; you have to be precise. You need to be extremely attentive to detail.

You are eventually going to encounter some situations where things go wrong, but Praat doesn’t tell you why. See, Praat cannot read your mind, but you have to read its mind. This is what we call debugging. You need to decompose the situation that leads to failure into steps that you can change, one at a time, to isolate the step that is causing problems. This requires an analytic approach that can be a lot of fun, as long as you frame the problem correctly.

A confidence builder

Let’s try a simple script, which is known to run on a pre-chosen audio file. The script we’ll use is Mietta Lennes’s mark pauses, and the audio file is Newgatenovelist’s reading of Her Hair, a poem from the free Librivox project. You can hear the original mp3 at the linked page; here is the .wav file I made (I used ffmpeg). As a rule, you should be working with .wav files in Praat, not .mp3 files, because the time stamps in mp3 files are not accurate.

1. First, go to your Desktop and make a folder there named pauses. [Mac instructions][Windows instructions]
2. Save the .wav file to that folder. Once the file is on your computer, you can view its properties by right-clicking on it (Mac users, hit Control+click). The properties will give you the location of the file, as well as its size and other details.
3. Download the script file to the same folder.
4. Open Praat, and load the audio file into it as a LongSound object. To do this, from Praat, go to Open>Open long sound file. The newest sound object opened should be selected by default. If it isn’t, click on it to select it. This will become important in a sec.
5. Now open the script: click on Praat>Open Praat script.... You’ll see a boring-looking text file, like this:
6. Hit “Run”. You now see a dialog window called "Run script: Give the parameters for pause analysis."
7. Okay, now it’s time to edit something. Pay attention! You need to change the path in the last field, the one called Save TextGrid file to folder:, to the location of your audio file. If you put it in pauses on your Desktop, then the path will end in Desktop/pauses/ if you’re on a Mac or Linux machine (or Chromebook?), and Desktop\pauses\ if you’re on Windows. The part before that will be your home directory; look at the path to the file as explained in step 2 if you don’t know how to find your home directory. The slash at the end of that is important–and if you make any typos, things will break.
8. When you’re ready, hit OK. If you did everything correctly, you should see a white Praat Info window that says Ready! The TextGrid file was saved as /wherever/you/told/Praat/to/save/it.TextGrid. If not, well, it’s time to debug things, so read the rest of this page and come back to step 7.
9. You should also be able to see a new TextGrid object in your Praat object window. Open it together with the sound file, and you’ll see that there is a TextGrid with a “sentence” level that has intervals marked. The intervals are silences longer than 0.6 seconds. Look at the Run script window again and you’ll see that this is a default value defined in the third field of that dialog. You can also figure out how the script identified pauses by looking at that window: there is an intensity threshold (59 decibels) that defines a pause.
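If you are curious how a threshold and a minimum duration can turn into pause intervals, here is a rough sketch in Python. This is not Mietta Lennes’s code and not Praat script; it only illustrates the general idea, assuming you already have one intensity value (in dB) per fixed time step. The function name and the sample numbers are made up.

```python
def find_pauses(intensity_db, time_step, threshold_db=59.0, min_pause=0.6):
    """Return (start, end) times of runs that stay below threshold_db
    for at least min_pause seconds."""
    pauses = []
    run_start = None  # index where the current quiet run began, if any
    for i, value in enumerate(intensity_db):
        quiet = value < threshold_db
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if (i - run_start) * time_step >= min_pause:
                pauses.append((run_start * time_step, i * time_step))
            run_start = None
    # A quiet run that lasts until the end of the recording also counts.
    if run_start is not None:
        n = len(intensity_db)
        if (n - run_start) * time_step >= min_pause:
            pauses.append((run_start * time_step, n * time_step))
    return pauses

# Made-up intensity frames, one every 0.25 s: speech, a 1 s silence,
# speech, a 0.5 s silence (too short to count), speech.
frames = [70] * 4 + [40] * 4 + [72] * 4 + [45] * 2 + [68] * 2
print(find_pauses(frames, 0.25))  # -> [(1.0, 2.0)]
```

Changing min_pause or threshold_db here is the same kind of experiment as changing the corresponding fields in the Run script dialog.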

Okay, so what was the point of that? If you look at Mietta Lennes’s other scripts, you begin to see how this might be useful. There is a script on that page that will label a TextGrid with labels from a text file, such as your poem. You will need to edit the TextGrid to make sure that the pauses coincide with line breaks, but it’s a lot quicker than doing it from scratch. In the case of this recording, you just need to remove some interval boundaries.

There is also a script that will save labeled intervals to separate audio files, and create matching TextGrids to boot. I hope you can see how that would be useful for the final project I’ve assigned you.

A random list of problems and solutions

Many problems can be fixed by starting with a clean Praat session. Remove all objects from the Object window, and start afresh. Praat scripts often operate by putting objects in the window and doing stuff to them; if there are already objects there, and they are of the wrong type, it can break the script.
“Unable to save file”… If you get errors about files not being found, you messed up your path somehow. Maybe you left out a slash at the end of the path, or added a slash when you shouldn’t have. Maybe you skipped a crucial directory in your path. Check everything–and use the information about file locations that you get from looking at their properties, as described above.
Script works on some files but breaks on others. For scripts that are supposed to work on a bunch of files: if the script is working for some of your files and failing for others, you should examine closely those text grids where the script is having trouble. That will give you an idea of what could be going wrong (for example, one of the interval boundaries might be missing in that file, but not in the others–that can break a script.)
The script is asking you to enter a non-zero value, and you don’t know what to enter. (The default value in the “save labeled intervals to wav files” script is 0.000, which I normally change to 0.0001.) Try a few different numbers and see what the output looks like. The script doesn’t modify your original data, so there’s no harm in experimenting. That’s how you learn what the script does.
The script is giving you undefined values for some of your intervals. This could be because you mis-segmented your text grids–if you placed the boundary on something that isn’t a vowel and has no formants to measure, then of course the script will not be able to measure your formants. (Ditto for intensity, possibly.) Solution: make sure your segmentation identifies vowels.
Are you running the script on the wrong file? You’re having trouble running the formant/intensity/duration collecting script on your large data file with a single text grid. That’s because the script is intended for a bunch of small individual files paired with text grids. Make sure you read the homework instructions carefully, and don’t skip steps.
Empty results.txt file. This one has many causes, but probably the chief one is that you’re telling the script to look for files in the wrong location. Did you change the paths to ones for your computer, or did you leave them as the defaults that were already there? On a Mac, your path probably should start with “/Users”; on a Windows machine, with “C:”. The paths below are from my Linux machine; your paths will need to be different.
The script breaks on files that have spaces in their names. Your labeled intervals have spaces in them. Your sound files and your text grid files do not have matching names, so the script chokes. What’s the solution? Respect whitespace! See below.
You’re running your data collection script on empty text grids. This is another source of the “empty results.txt file” problem. Did you save the TextGrid files after you changed their contents? Remember, Praat doesn’t save files automatically for you, unlike other programs you’re used to. There is no autosave of changed objects in the Object Window. You have to overwrite the files in the directory where Praat is looking for annotated text grids.
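Several of the problems in this list (wrong folder, spaces in names, sounds and text grids that don’t match) can be checked before you ever hit “Run”. Here is a minimal sketch of such a check in Python, not in Praat script. The function name preflight and the demo file names are invented for illustration, and I am assuming the usual .wav and .TextGrid extensions.

```python
import os
import tempfile

def preflight(folder):
    """Return a list of problems found in folder (an empty list if none)."""
    if not os.path.isdir(folder):
        return [f"folder not found: {folder}"]
    names = sorted(os.listdir(folder))
    wavs = {os.path.splitext(n)[0] for n in names if n.lower().endswith(".wav")}
    grids = {os.path.splitext(n)[0] for n in names if n.endswith(".TextGrid")}
    problems = [f"space in file name: {n!r}" for n in names if " " in n]
    problems += [f"no TextGrid for: {stem}.wav" for stem in sorted(wavs - grids)]
    problems += [f"no sound file for: {stem}.TextGrid" for stem in sorted(grids - wavs)]
    return problems

# Demo on a throwaway folder with one good pair and one mismatched pair.
with tempfile.TemporaryDirectory() as demo:
    for name in ["dot_org.wav", "dot_org.TextGrid", "her hair.wav", "her_hair.TextGrid"]:
        open(os.path.join(demo, name), "w").close()
    demo_problems = preflight(demo)

for problem in demo_problems:
    print(problem)
```

To use it on your own data, call preflight with the same folder path you are about to type into the script’s dialog; if the list comes back empty, at least these particular problems are ruled out.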

Some other gotchas in Praat scripting

Praat itself will warn you if you try to overwrite an existing file, but a Praat script will not. Always test a script in such a way that there is no danger of it overwriting important work.
- Solution: create a copy of the directory with your text grids and run the script on the copy, so there is no danger of screwing up your originals. If all goes well, you can delete the originals.
A Praat script might be configured to have a default file name for a text grid or an audio file it created, so if you want to save multiple audio files cut out of a single big audio file, it may overwrite the same file with new content without telling you.
- Solution: If you end up with one file but you expected several, and the one file you see is the last file you expected to be created, then the answer to your problem lies in how files are named.
A typo in a key location will break things. There’s no solution here but to pay attention; see also below.
Praat will not give you interpretable error messages unless the script you’re running provided for them. Again, there’s no solution other than to be careful.
There are no helpful mouseover (or finger press) hints that explain what buttons do. Once you hit that “Run” button, you might be running a risk of losing stuff. So check everything many times before you run a script.
There might be technical problems with a particular measurement: for some acoustic analyses, the time window has to be of a certain length, so if your vowel, say, is too short, a Praat script measuring it will fail for inexplicable reasons.
- Solution: attempt to do the measurement by hand at the point where the script is failing.

Respecting spelling and whitespace

What is whitespace, you ask? It’s all the stuff that looks white on a screen (or black, if you like Dark Mode) but contains invisible characters that the computer does not ignore.

Examples of whitespace:

An actual space, as in between words.
A tab, like what you use in a word processor. It’s usually 4-5 spaces wide, but it’s distinct from a sequence of spaces. Just ask a Python programmer to explain this to you.
An end-of-line character. There are actually several of these, and they differ depending on the operating system. This is one of the reasons why simple .txt files sometimes look insane when they travel from a Windows computer to a Linux or a Mac computer.
A truly diabolical example of a whitespace character is a letter that is rendered in a font color that matches the background. Select the line below with your mouse, and you’ll see.

Hahaha! I am whitespace in white color! I got you, sucka!
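One way to catch ordinary invisible characters (not the white-on-white trick, which is a formatting matter) is to ask the computer to print them as escape codes. A minimal Python sketch, with made-up labels:

```python
# Four labels that look the same on screen but are not the same string.
labels = ["vowel", "vowel ", "vowel\t", "vowel\r\n"]

def same_label(a, b):
    """Exact comparison, the way a script matching labels would do it."""
    return a == b

for label in labels:
    # repr() shows escapes such as \t and \r\n instead of blank space
    print(repr(label), len(label), same_label(label, "vowel"))
```

Only the first label matches “vowel”; the others differ by a trailing space, a tab, or a Windows-style line ending.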

Why are we talking about whitespace so much? Because Praat does not like spaces. In the olden days, many computers could not handle spaces in file names at all. This is the reason why oldsters such as myself use underscores for readability, as in, sound_and_language_syllabus.pdf. Behind the scenes, spaces in file names have to be modified; in URLs, for example, spaces are replaced with other character sequences. Look at the URL of any website and you will not see spaces in it. This is because computers treat spaces as meaningful breaks between objects.

So, if you open a file in Praat with spaces in the filename, Praat will automatically convert them to underscores in the name of an object. Try it! If you ignore the underscores in your handling of files, which I think many non-programmers tend to do, you will be punished for your carelessness. You’ll wonder why you have what looks like duplicates, or wonder where your file went, or, if you have spaces in the names of your folders, things might not work at all.

Solution: Do not use spaces in file names. Use a concatenation of lower and uppercase letters, as in SoundLangSyllabus.pdf, or, better, use underscores, as above.
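If you already have a pile of files with spaces in their names, the replacement is easy to automate. A small Python sketch (the function name is mine; it only computes the new name and leaves the actual renaming to you):

```python
import re

def safe_name(filename):
    """Replace each run of whitespace in a file name with one underscore."""
    return re.sub(r"\s+", "_", filename.strip())

print(safe_name("sound and language syllabus.pdf"))
# -> sound_and_language_syllabus.pdf
```

Remember to rename the sound file and its text grid the same way, so that the pair still matches.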

Capital vs. Lowercase

Speaking of capitals: capital and lowercase letters are not the same characters to a computer. Whenever a program seems to treat upper- and lowercase as the same, it was explicitly told to do this. So, make sure to respect that difference.

Differences between symbols that look the same to you

Now we get to some really old-school stuff, like the ASCII vs. non-ASCII distinction. The ASCII character set (pronounced “ass key”) is the oldest, designed for English. Even commonly used Western European characters such as <ñ> and <ü> are not part of the basic ASCII set of 128 characters (see it here, if you need to). The oldest programs do not know how to deal with anything other than ASCII.

And this includes Praat. If you want to annotate a TextGrid, Praat will not let you enter IPA characters in Unicode; you can enter them using deadkey combinations, as explained in the manual, but you cannot paste them in directly from your word processor that uses a proper, modern IPA font.

Where this becomes a problem in working with Praat scripts is if your script depends on labels matching something. For example, suppose you want to extract all the intervals labeled “g”, but some of your “g” labels are IPA [ɡ], and others are regular font [g]. (Look carefully and you’ll see a difference in those symbols.) For me, some problems I’ve run into involved Russian script “а” vs ASCII “a”; they look identical but are different to a computer. (There were scam URLs a while back that capitalized on Latin and Cyrillic using some similar looking letters… You think you’re visiting Amazon.com, but really you’re somewhere in the middle of the Azov sea. But I digress.)
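When two labels look identical but refuse to match, you can ask for each character’s Unicode code point and official name. A minimal Python sketch using the standard unicodedata module (the helper name describe is mine):

```python
import unicodedata

def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Keyboard g, IPA script g, Latin a, Cyrillic a (written as escapes
# so that the look-alikes cannot get mixed up in this file).
for ch in ["g", "\u0261", "a", "\u0430"]:
    print(describe(ch))
```

The four lines of output name four different characters, which is exactly why a script looking for one will not find the other.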

Ready to go beyond these basics?

In the confidence builder example, you didn’t change the script text itself, and we did not look at the contents of the scripts. But all the Praat scripts linked above were created by people other than the Praat programmers themselves (Boersma and Weenink)–and most can be modified by users such as yourself. You can take these scripts, look at their contents, and change them to suit your needs.

For another confidence builder, then, open the mark_pauses.praat script as in the instructions above, but before you hit “Run”, do the following: go to “search”, enter “lennes”. That will take you to a line that says

text folder /home/lennes/

Change /home/lennes/ in this line to /home/yourname/, and save the script. Now run it–what do you see in the field at the bottom of the dialog?

What you did is modify the form that the script generates. That window is part of the script, and the field is pre-populated by a default value. All of the form values can be defined either in the form when you run the script or by changing the script itself.

Once you understand that the contents of that file are not magic, you can take on the Praat scripting tutorial in Praat itself, or check out the various resources written by other people.

I like Eleanor Chodroff’s Praat scripting tutorial here. She has some other nice resources on her page.

And now, the obligatory metaphor

If you’ve taken classes with me, you know how I like tortured metaphors. And similes.

The way young people use computers is sort of like Hermione’s magic bag from the Harry Potter and the Deathly Hallows book. It’s a tiny bag, magically crammed with stuff, and she often uses the Accio charm to get things out of it. This is a summoning charm that just finds the right thing in the mess–sort of like that search function you rely on. Hermione doesn’t bother to organize books, Muggle clothing, essence of Dittany, and all the rest of the junk in the bag because it’s bound to fall over anyway. And why bother, when you have a magic summoning charm? When you rely on search to find stuff on your computer, or on the internet, it’s pretty much like using the Accio charm.

[Another tortured metaphor: relying on the internet to keep your stuff for you is pretty much Leprechaun gold territory. Internet stuff disappears. You can practice a Zen-like non-attachment to your stuff, or you can keep it locally, and organized, if you want it to be around for years to come.]

To an older person, hearing about this Search approach is akin to learning that you keep your socks and your forks and your toothpaste and your books just randomly piled all over your room, jumbled together. We keep things in separate cabinets and drawers. We know where to find forks not because we have a magic summoning charm, but because we put the forks into the place where forks go.

The way computers actually work in reality is that you have to tell them where the fork drawer is, in explicit detail, without typos. Even when you have an operating system that can search through your files, this becomes useless if you don’t know what to search for. And believe me, a time will come when you won’t remember what you called that one file, or where on the internet you saw that one thing.

Because you see, the search-for-it approach is only good if you know what to search for.

Back to Praat. In setting up an experimental project, there are a few common-sense, fork-drawer-organizing things you should consider doing:

Put files into directories. For example, have a “raw_audio” directory where you keep the unsegmented data, and a “segmented_audio” and maybe “textgrids” for the eponymous things.
Give files sensible names without spaces. Just ask yourself–would I be able to figure out what all these files are 10 years from now?
Keep a small local “readme.txt” file that explains what order you did things in, and what each folder is for, in case you forget (because believe me, you will… it’s not just a matter of how well your brain works but also how much stuff is cluttering your memory; the more you know, the harder it is to recall things, paradoxically).
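If you want to set up the fork drawers in one go, here is a small Python sketch. The folder names come from the list above; the function name and the readme wording are just placeholders, and the demo runs in a throwaway temporary folder rather than in your real project.

```python
import os
import tempfile

def make_project(root):
    """Create the three data folders and a readme stub under root."""
    for folder in ("raw_audio", "segmented_audio", "textgrids"):
        os.makedirs(os.path.join(root, folder), exist_ok=True)
    readme = os.path.join(root, "readme.txt")
    if not os.path.exists(readme):  # never overwrite notes you already wrote
        with open(readme, "w", encoding="utf-8") as f:
            f.write("raw_audio: unsegmented recordings\n")
            f.write("segmented_audio: audio cut out of the recordings\n")
            f.write("textgrids: annotated TextGrid files\n")
    return sorted(os.listdir(root))

with tempfile.TemporaryDirectory() as demo_root:
    created = make_project(demo_root)

print(created)
# -> ['raw_audio', 'readme.txt', 'segmented_audio', 'textgrids']
```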

Okay, I’m done hectoring you for now! Did this help? Do you have any tips to include here, or anything you ran into that I didn’t cover above? Let me know.


July 5, 2021 · 13:36

Cyrillic encodings

It seems to be increasingly uncommon for programs to have an easy way to guess encodings correctly and to manipulate them gracefully. Or maybe this is just a Linux problem. In any case, I’ve found myself looking this up repeatedly, after failing to get the right results in various text editors (VIM, Gedit, xed). This used to be so easy in Mac OS’s TextWrangler–there’s an actual menu item, “Reopen using encoding…”, with a drop-down list. You just pecked around the Cyrillic options until you saw something other than alphabet salad.

Anyway, here is the way to view and change encoding. One common source of problems is that the encoding in the file’s metadata is often wrong. For example, Sharoff’s frequency lists are supposed to be using CP-1251. But the files claim to be in ISO-8859.


$ file lemma_al.txt

lemma_al.txt: ISO-8859 text, with CRLF line terminators

This mismatch is what causes VIM and its ilk to display trash instead of Cyrillic. Since the original information is lost, you have to do some guessing. In this case, the following command worked on first try:

$ iconv -f cp1251 -t utf8 lemma_al.txt -o lemma_al_utf8.txt

This converts from encoding CP-1251 (the -f argument) to encoding UTF-8 (the -t argument), taking the next argument as the input and the -o argument as the output. Open the output in a text editor to see if it did the trick.
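If iconv is not available, roughly the same conversion can be done in a few lines of Python. This is a sketch, not a drop-in replacement for iconv: the function name is mine, and the demo writes a tiny made-up CP-1251 file to a temporary folder instead of touching the real frequency list.

```python
import os
import tempfile

def convert_encoding(src, dst, from_enc="cp1251", to_enc="utf-8"):
    """Read src as from_enc and write the same text to dst as to_enc."""
    with open(src, "rb") as f:
        text = f.read().decode(from_enc)
    # Binary mode on both ends leaves the CRLF line endings untouched.
    with open(dst, "wb") as f:
        f.write(text.encode(to_enc))

with tempfile.TemporaryDirectory() as demo:
    src = os.path.join(demo, "sample_cp1251.txt")
    dst = os.path.join(demo, "sample_utf8.txt")
    with open(src, "wb") as f:
        f.write("слово\r\n".encode("cp1251"))
    convert_encoding(src, dst)
    with open(dst, "rb") as f:
        converted = f.read()

print(converted.decode("utf-8"))
```

If the source encoding is guessed wrong, decoding either raises an error or quietly produces the wrong letters, so you still need to look at the result.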

Now, CP-1251 is just one encoding. What others might you have to try? There is a good review here. The usual legacy encodings are koi8r, koi8u, cp866, ruscii, cp1251, iso8859. They are known under different names sometimes, so you might have to do some digging to get it right.
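The “peck around until it stops being alphabet salad” approach can also be sketched in Python: decode the same bytes under each candidate encoding and eyeball the previews. The codec names below are Python’s spellings for five of the encodings just listed (I left ruscii out), and the sample bytes are a made-up stand-in for a real file.

```python
# Python codec names for the usual legacy Cyrillic encodings.
CANDIDATES = ["cp1251", "koi8_r", "koi8_u", "cp866", "iso8859_5"]

def preview_decodings(raw, encodings=CANDIDATES, length=40):
    """Decode raw bytes under each encoding; return {encoding: preview}."""
    previews = {}
    for enc in encodings:
        # errors="replace" keeps going if a byte is undefined in that encoding
        previews[enc] = raw.decode(enc, errors="replace")[:length]
    return previews

# Stand-in for the first bytes of a mystery file (really CP-1251).
raw = "слово".encode("cp1251")
for enc, text in preview_decodings(raw).items():
    print(f"{enc:10} {text}")
```

Every candidate will print Cyrillic letters of some kind, so the computer cannot pick the winner for you here; you are looking for the one line that is an actual word.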


January 17, 2021 · 10:06

Gouskova 2023

Gouskova, Maria. 2023. Phonological asymmetries between roots and affixes. Wiley Blackwell Companion to Morphology, Eds. Peter Ackema, Sabrina Bendjaballah, Eulàlia Bonet, and Antonio Fábregas. [doi]

pdf Download

This review surveys the phonological asymmetries between roots and non-roots (affixes, clitics). It starts with an extraphonological, structural definition of roots, and considers those non-phonological properties that are phonologically relevant: they are easily borrowed, and they are most deeply embedded. The empirical portion of the review concentrates on templaticism and size restrictions, asymmetries in segmental contrast/inventories, the properties of multi-root words (compounds), and accentual characteristics that differ between roots and affixes. The theoretical section surveys theories that account for these properties: Prosodic Morphology, Positional Faithfulness, the cycle and its analogs, and Anti-Faithfulness. I then critically review several recent and not-so-recent proposals that blur the line between affixes and roots, using the ‘root’ designation diacritically or recasting diacritic distinctions as structural distinctions. The concluding section discusses the role of roots in phonological learnability.
