| 2022 |  How Well Do LSTM Language Models Learn Filler-gap Dependencies? Proceedings of the Society for Computation in Linguistics. | 
| 2021 |  East Tusom: A phonetic and phonological sketch of a largely undocumented Tangkhulic language. Linguistics of the Tibeto-Burman Area 44(2). | 
| 2020 |  Neural Polysynthetic Language Modelling. Technical Report, Frederick Jelinek Memorial Summer Workshop. | 
| | Computerized Forward Reconstruction for Analysis in Diachronic Phonology, and Latin to French Reflex Prediction. Proceedings of the 1st Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA 2020). |
| | AlloVera: A Multilingual Allophone Database. Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). |
| | Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods. Proceedings of the Society for Computation in Linguistics 2020 (SCiL 2020). |
| | Universal Phone Recognition with a Multilingual Allophone System. Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020). |
| | Towards Zero-shot Learning for Automatic Phonemic Transcription. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020). |
| 2019 |  Hmong (Mong Leng). Chapter in The Mainland Southeast Asia Linguistic Area. | 
| 2018 |  Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). | 
| | Epitran: Precision G2P for Many Languages. Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018). |
| | Parser combinators for Tigrinya and Oromo morphology. Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018). |
| | The ARIEL-CMU situation frame detection pipeline for LoReHLT16: a model translation approach. Machine Translation 32(1–2). |
| 2017 |  URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017). | 
| | Hmong-Mien Languages. Chapter in Oxford Research Encyclopedia of Linguistics. |
| 2016 |  Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik. Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016). | 
| | PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors. Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016). |
| | Phonologically Aware Neural Model for Named Entity Recognition in Low Resource Transfer Settings. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016). |
| | Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). |