Publications and Papers

This page was generated by a super-secret proprietary Perl script (Fri Dec 26 12:05:56 PST 2025).

To Appear

Catching Fire in the News: The Necessary Conditions for Media Storms. Amber E. Boydstun, Jill Laufer, Dallas Card, and Noah A. Smith. Cambridge University Press, 2025.

2025

The Leaderboard Illusion. Shivalika Singh, Yiyang Nan, Alex Wang, Daniel D’Souza, Sayash Kapoor, Ahmet Üstün, Sanmi Koyejo, Yuntian Deng, Shayne Longpre, Noah A. Smith, Beyza Ermis, Marzieh Fadaee, and Sara Hooker. In Proceedings of the Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2025), San Diego, December 2025.
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations. Brian Siyuan Zheng, Alisa Liu, Orevaoghene Ahia, Jonathan Hayase, Yejin Choi, and Noah A. Smith. In Proceedings of the Neural Information Processing Systems (NeurIPS 2025), San Diego, December 2025.
FlexOLMo: Open Language Models for Flexible Data Use. Weijia Shi, Akshita Bhagia, Kevin Farhat, Niklas Muennighoff, Jacob Morrison, Evan Pete Walsh, Dustin Schwenk, Shayne Longpre, Jake Poznanski, Allyson Ettinger, Daogao Liu, Margaret Li, Mike Lewis, Wen-tau Yih, Dirk Groeneveld, Luca Soldaini, Kyle Lo, Noah A. Smith, Luke Zettlemoyer, Pang Wei Koh, Hannaneh Hajishirzi, Ali Farhadi, and Sewon Min. In Proceedings of the Neural Information Processing Systems (NeurIPS 2025), San Diego, December 2025.
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation. David Heineman, Valentin Hofmann, Ian Magnusson, Yuling Gu, Noah A. Smith, Hannaneh Hajishirzi, Kyle Lo, and Jesse Dodge. In Proceedings of the Neural Information Processing Systems (NeurIPS 2025), San Diego, December 2025.
Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index. Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, and Hannaneh Hajishirzi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), Suzhou, November 2025.
Tülu 3: Pushing Frontiers in Open Language Model Post-Training. Nathan Lambert, Jacob Morrison, Valentina Pyatkin, Shengyi Huang, Hamish Ivison, Faeze Brahman, Lester James V. Miranda, Alisa Liu, Nouha Dziri, Shane Lyu, Yuling Gu, Saumya Malik, Victoria Graf, Jena D. Hwang, Jiangjiang Yang, Ronan Le Bras, Oyvind Tafjord, Chris Wilhelm, Luca Soldaini, Noah A. Smith, Yizhong Wang, Pradeep Dasigi, and Hannaneh Hajishirzi. In Proceedings of the Conference on Language Models (COLM 2025), Montréal, October 2025.
Establishing Task Scaling Laws via Compute-Efficient Model Ladders. Akshita Bhagia, Jiacheng Liu, Alexander Wettig, David Heineman, Oyvind Tafjord, Ananya Harsh Jha, Luca Soldaini, Noah A. Smith, Dirk Groeneveld, Pang Wei Koh, Jesse Dodge, and Hannaneh Hajishirzi. In Proceedings of the Conference on Language Models (COLM 2025), Montréal, October 2025.
OLMo 2: Pareto-Optimal, Fully Open Language Modeling. Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William Merrill, Lester James V. Miranda, Jacob Morrison, Tyler Murray, Crystal Nam, Valentina Pyatkin, Aman Rangapur, Michael Schmitz, Sam Skjonsberg, David Wadden, Christopher Wilhelm, Michael Wilson, Luke Zettlemoyer, Ali Farhadi, Noah A. Smith, and Hannaneh Hajishirzi. In Proceedings of the Conference on Language Models (COLM 2025), Montréal, October 2025.
SuperBPE: Space Travel for Language Models. Alisa Liu, Jonathan Hayase, Valentin Hofmann, Sewoong Oh, Noah A. Smith, and Yejin Choi. In Proceedings of the Conference on Language Models (COLM 2025), Montréal, October 2025.
Fluid Language Model Benchmarking. Valentin Hofmann, David Heineman, Ian Magnusson, Kyle Lo, Jesse Dodge, Maarten Sap, Pang Wei Koh, Chun Wang, Hannaneh Hajishirzi, and Noah A. Smith. In Proceedings of the Conference on Language Models (COLM 2025), Montréal, October 2025.
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback. Lester James Validad Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, and Pradeep Dasigi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2025), Vienna, July/August 2025.
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens. Jiacheng Liu, Taylor Blanton, Yanai Elazar, Sewon Min, Yen-Sung Chen, Arnavi Chheda-Kothary, Huy Tran, Byron Bischoff, Eric Marsh, Michael Schmitz, Cassidy Trier, Aaron Sarnat, Jenna James, Jon Borchardt, Bailey Kuehl, Evie Yu-Yen Cheng, Karen Farley, Taira Anderson, David Albright, Carissa Schoenick, Luca Soldaini, Dirk Groeneveld, Rock Yuren Pang, Pang Wei Koh, Noah A. Smith, Sophie Lebrecht, Yejin Choi, Hannaneh Hajishirzi, Ali Farhadi, and Jesse Dodge. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Demonstration Papers) (ACL 2025), Vienna, July/August 2025.
LlamaPIE: Proactive In-Ear Conversation Assistants. Tuochao Chen, Nicholas Scott Batchelder, Alisa Liu, Noah A. Smith, and Shyamnath Gollakota. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2025 Findings), Vienna, July/August 2025.
DataDecide: How to Predict Best Pretraining Data with Small Experiments. Ian Magnusson, Nguyen Tai, Ben Bogin, David Heineman, Jena D. Hwang, Luca Soldaini, Akshita Bhagia, Jiacheng Liu, Dirk Groeneveld, Oyvind Tafjord, Noah A. Smith, Pang Wei Koh, and Jesse Dodge. In Proceedings of the International Conference on Machine Learning (ICML 2025), Vancouver, July 2025.
Troubles in Text: Using Natural Language Processing to Recognize Government Rationalizations for Rights Abuses. Sarah K. Dreier, Sofia Serrano, Emily K. Gade, and Noah A. Smith. Journal of Politics 87(3), July 2025.
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models. Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, Yen-Sung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda-Kothary, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Evan Pete Walsh, Christopher Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross Girshick, Ali Farhadi, and Aniruddha Kembhavi. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR 2025), Nashville, TN, June 2025.
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation. Shivam Duggal, Yushi Hu, Oscar Michel, Aniruddha Kembhavi, William T. Freeman, Noah A. Smith, Ranjay Krishna, Antonio Torralba, Ali Farhadi, and Wei-Chiu Ma. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR 2025), Nashville, TN, June 2025.
RewardBench: Evaluating Reward Models for Language Modeling. Nathan Lambert, Valentina Pyatkin, Jacob Morrison, Lester James Validad Miranda, Bill Yuchen Lin, Khyathi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, and Hannaneh Hajishirzi. In Findings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2025 Findings), Albuquerque, NM, April/May 2025.
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models. Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, NM, April/May 2025.
ComPO: Community Preferences for Language Model Personalization. Sachin Kumar, Chan Young Park, Yulia Tsvetkov, Noah A. Smith, and Hannaneh Hajishirzi. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, NM, April/May 2025.
The Leaderboard Illusion. Shivalika Singh, Yiyang Nan, Alex Wang, Daniel D’Souza, Sayash Kapoor, Ahmet Üstün, Sanmi Koyejo, Yuntian Deng, Shayne Longpre, Noah A. Smith, Beyza Ermis, Marzieh Fadaee, and Sara Hooker. April 2025.
MUSE: Machine Unlearning Six-Way Evaluation for Language Models. Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A. Smith, and Chiyuan Zhang. In Proceedings of the International Conference on Learning Representations (ICLR 2025), Singapore, April 2025.
OLMoE: Open Mixture-of-Experts Language Models. Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Evan Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, and Hannaneh Hajishirzi. In Proceedings of the International Conference on Learning Representations (ICLR 2025), Singapore, April 2025.
On Linear Representations and Pretraining Data Frequency in Language Models. Jack Merullo, Noah A. Smith, Sarah Wiegreffe, and Yanai Elazar. In Proceedings of the International Conference on Learning Representations (ICLR 2025), Singapore, April 2025.

2024

Scaling Expert Language Models with Unsupervised Domain Discovery. Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, and Luke Zettlemoyer. Journal of Machine Learning Research, 2024.
Learning Syntax Without Planting Trees: Understanding Hierarchical Generalization in Transformers. Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, and Yulia Tsvetkov. Transactions of the Association for Computational Linguistics, 2024.
Paloma: A Benchmark for Evaluating Language Model Fit. Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, and Jesse Dodge. In Proceedings of the Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2024), Vancouver, December 2024.
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback. Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A. Smith, Yejin Choi, and Hannaneh Hajishirzi. In Proceedings of the Neural Information Processing Systems (NeurIPS 2024), Vancouver, December 2024.
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models. Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, and Ranjay Krishna. In Proceedings of the Neural Information Processing Systems (NeurIPS 2024), Vancouver, December 2024.
Evaluating Copyright Takedown Methods for Language Models. Boyi Wei, Weijia Shi, Yangsibo Huang, Noah A. Smith, Chiyuan Zhang, Luke Zettlemoyer, Kai Li, and Peter Henderson. In Proceedings of the Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2024), Vancouver, December 2024.
Decoding-Time Language Model Alignment with Multiple Objectives. Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, and Simon Shaolei Du. In Proceedings of the Neural Information Processing Systems (NeurIPS 2024), Vancouver, December 2024.
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization. Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Valentin Hofmann, Tomasz Limisiewicz, Yulia Tsvetkov, and Noah A. Smith. In Proceedings of the Neural Information Processing Systems (NeurIPS 2024), Vancouver, December 2024.
Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions. Jonathan Hayase, Alisa Liu, Yejin Choi, Sewoong Oh, and Noah A. Smith. In Proceedings of the Neural Information Processing Systems (NeurIPS 2024), Vancouver, December 2024.
The Art of Saying No: Contextual Noncompliance in Language Models. Faeze Brahman, Sachin Kumar, Vidhisha Balachandran, Pradeep Dasigi, Valentina Pyatkin, Abhilasha Ravichander, Sarah Wiegreffe, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, and Hannaneh Hajishirzi. In Proceedings of the Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2024), Vancouver, December 2024.
Summarization-Based Document IDs for Generative Retrieval with Language Models. Alan Li, Phillip Keung, Jungo Kasai, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Advancing Natural Language Processing for Wikipedia, Miami, November 2024.
Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals. Yanai Elazar, Bhargavi Paranjape, Hao Peng, Sarah Wiegreffe, Khyathi Chandu, Vivek Srikumar, Sameer Singh, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 Findings), Miami, FL, November 2024.
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models. Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, and Luke Zettlemoyer. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, FL, November 2024.
Evaluating n-Gram Novelty of Language Models Using Rusty-DAWG. William Merrill, Noah A. Smith, and Yanai Elazar. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, FL, November 2024.
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects. Orevaoghene Ahia, Anuoluwapo Aremu, Diana Abagyan, Hila Gonen, David Ifeoluwa Adelani, Daud Abolade, Noah A. Smith, and Yulia Tsvetkov. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, FL, November 2024.
CPS-TaskForge: Generating Collaborative Problem Solving Environments for Diverse Communication Tasks. Nikita Haduong, Irene Wang, Bo-Ru Lu, Prithviraj Ammanabrolu, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Customizable NLP, Miami, November 2024.
Toward a More Complete OMR Solution. Guang Yang, Muru Zhang, Lin Qiu, Yanming Wan, and Noah A. Smith. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR 2024), San Francisco, CA, November 2024.
Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging. Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, and Pradeep Dasigi. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 Findings), Miami, FL, November 2024.
Does Collaborative Human–LM Dialogue Generation Help Information Extraction from Human–Human Dialogues? Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, and Mari Ostendorf. In Proceedings of the Conference on Language Models (COLM 2024), Philadelphia, PA, October 2024.
Tuning Language Models by Proxy. Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference on Language Models (COLM 2024), Philadelphia, PA, October 2024.
BLINK: Multimodal Large Language Models Can See but Not Perceive. Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, and Ranjay Krishna. In Proceedings of the European Conference on Computer Vision (ECCV 2024), Milano, September/October 2024.
Time is Encoded in the Weights of Finetuned Language Models. Kai Nylund, Suchin Gururangan, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, August 2024.
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, and Kyle Lo. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, August 2024.
OLMo: Accelerating the Science of Language Models. Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, and Hannaneh Hajishirzi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, August 2024.
Set the Clock: Temporal Alignment of Pretrained Language Models. Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hannaneh Hajishirzi, and Noah A. Smith. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2024 Findings), Bangkok, August 2024.
How Language Model Hallucinations Can Snowball. Muru Zhang, Ofir Press, William Merrill, Alisa Liu, and Noah A. Smith. In Proceedings of the International Conference on Machine Learning (ICML 2024), Vienna, June 2024.
A Call for Clarity in Beam Search: How It Works and When It Stops. Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir Radev, Yejin Choi, and Noah A. Smith. In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 77–90, Torino, Italy, May 2024.
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore. Sewon Min, Suchin Gururangan, Eric Wallace, Weijia Shi, Hannaneh Hajishirzi, Noah A. Smith, and Luke Zettlemoyer. In Proceedings of the International Conference on Learning Representations (ICLR 2024), Vienna, May 2024.
In-Context Pretraining: Language Modeling Beyond Document Boundaries. Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Wen-tau Yih, and Mike Lewis. In Proceedings of the International Conference on Learning Representations (ICLR 2024), Vienna, May 2024.
What’s In My Big Data? Yanai Elazar, Akshita Bhagia, Ian Helgi Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Evan Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hannaneh Hajishirzi, Noah A. Smith, and Jesse Dodge. In Proceedings of the International Conference on Learning Representations (ICLR 2024), Vienna, May 2024.
What Can Natural Language Processing Do for Peer Review? Ilia Kuznetsov, Osama Mohammed Afzal, Koen Dercksen, Nils Dycke, Alexander Goldberg, Tom Hope, Dirk Hovy, Jonathan K. Kummerfeld, Anne Lauscher, Kevin Leyton-Brown, Sheng Lu, Mausam, Margot Mieskes, Aurélie Névéol, Danish Pruthi, Lizhen Qu, Roy Schwartz, Noah A. Smith, Thamar Solorio, Jingyan Wang, Xiaodan Zhu, Anna Rogers, Nihar B. Shah, and Iryna Gurevych. May 2024.
Know Your Audience: The Benefits and Pitfalls of Generating Plain Language Summaries Beyond the “General” Audience. Tal August, Kyle Lo, Noah A. Smith, and Katharina Reinecke. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI 2024), Honolulu, May 2024.
Estimating the Causal Effect of Early ArXiving on Paper Acceptance. Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, and Noah A. Smith. In Proceedings of the Conference on Causal Learning and Reasoning (CLeaR 2024), April 2024.
Third-Party Language Model Performance Prediction from Instruction. Rahul Nadkarni, Yizhong Wang, and Noah A. Smith. March 2024.
Encode Once and Decode in Parallel: Efficient Transformer Decoding. Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, and Mari Ostendorf. March 2024.

2023

Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation. Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, and Hannaneh Hajishirzi. 2023.
Transparency Helps Reveal When Language Models Learn Meaning. Zhaofeng Wu, William Merrill, Hao Peng, Iz Beltagy, and Noah A. Smith. Transactions of the Association for Computational Linguistics 11:617–634, 2023.
Morphosyntactic Probing of Multilingual BERT Models. Judit Ács, Endre Hamerlik, Roy Schwartz, Noah A. Smith, and Andras Kornai. Journal of Natural Language Engineering, 2023.
That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings), Singapore, December 2023.
RealTime QA: What’s the Answer Right Now? Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Velocity Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, and Kentaro Inui. In Proceedings of the Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2023), New Orleans, LA, December 2023.
Measuring and Narrowing the Compositionality Gap in Language Models. Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, and Mike Lewis. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings), Singapore, December 2023.
Demystifying Prompts in Language Models via Perplexity Estimation. Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, and Luke Zettlemoyer. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings), Singapore, December 2023.
We’re Afraid Language Models Aren’t Modeling Ambiguity. Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, and Yejin Choi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 2023.
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models. Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, and Yulia Tsvetkov. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 2023.
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training. Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, and Hannaneh Hajishirzi. In Proceedings of the Neural Information Processing Systems (NeurIPS 2023), New Orleans, LA, December 2023.
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, and Hannaneh Hajishirzi. In Proceedings of the Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2023), New Orleans, LA, December 2023.
Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements. Jiacheng Liu, Wenya Wang, Dianzhuo Wang, Noah A. Smith, Yejin Choi, and Hannaneh Hajishirzi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 2023.
Using Proprietary Language Models in Academic Research Requires Explicit Justification. Alexis Palmer, Noah A. Smith, and Arthur Spirling. Nature Computational Science (Correspondences), December 2023.
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2. Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, and Hannaneh Hajishirzi. November 2023.
Language Models: A Guide for the Perplexed. Sofia Serrano, Zander Brumbaugh, and Noah A. Smith. November 2023.
Prompt-Guided Image Captioning for VQA with GPT-3. Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, and Jiebo Luo. In Proceedings of the International Conference on Computer Vision (ICCV 2023), Paris, October 2023.
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering. Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, and Noah A. Smith. In Proceedings of the International Conference on Computer Vision (ICCV 2023), Paris, October 2023.
Risks and NLP Design: A Case Study on Procedural Document QA. Nikita Haduong, Alice Gao, and Noah A. Smith. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023 Findings), Toronto, July 2023.
LEXPLAIN: Improving Model Explanations via Lexicon Supervision. Orevaoghene Ahia, Hila Gonen, Vidhisha Balachandran, Yulia Tsvetkov, and Noah A. Smith. In Proceedings of the Joint Conference on Lexical and Computational Semantics (*SEM 2023), Toronto, July 2023.
Elaboration-Generating Commonsense Question Answering at Scale. Wenya Wang, Vivek Srikumar, Hannaneh Hajishirzi, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, July 2023.
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors. Hamish Ivison, Noah A. Smith, Hannaneh Hajishirzi, and Pradeep Dasigi. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023 Findings), Toronto, July 2023.
One Embedder, Any Task: Instruction-Finetuned Text Embeddings. Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, and Tao Yu. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023 Findings), Toronto, July 2023.
Self-Instruct: Aligning Language Models with Self-Generated Instructions. Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, and Hannaneh Hajishirzi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, July 2023.
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference. Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, July 2023.
Stubborn Lexical Bias in Data and Models. Sofia Serrano, Jesse Dodge, and Noah A. Smith. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023 Findings), Toronto, July 2023.
Reproducibility in NLP: What Have We Learned from the Checklist? Ian Magnusson, Noah A. Smith, and Jesse Dodge. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2023 Findings), Toronto, July 2023.
Selective Annotation Makes Language Models Better Few-Shot Learners. Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, and Tao Yu. In Proceedings of the International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, May 2023.
Binding Language Models in Symbolic Languages. Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, and Tao Yu. In Proceedings of the International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, May 2023.

2022

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models. Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, and Luke Zettlemoyer. 2022.
Saturated Transformers are Constant-Depth Threshold Circuits. William Merrill, Ashish Sabharwal, and Noah A. Smith. Transactions of the Association for Computational Linguistics 10:843–856, 2022.
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers. Michael Hassid, Hao Peng, Daniel Rotem, Jungo Kasai, Ivan Montero, Noah A. Smith, and Roy Schwartz. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022 Findings), Abu Dhabi, UAE, December 2022.
Toward Reproducible and Standardized Human Evaluation for Text Generation. Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, and Daniel Weld. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, UAE, December 2022.
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation. Alisa Liu, Swabha Swayamdipta, Noah A. Smith, and Yejin Choi. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022 Findings), Abu Dhabi, UAE, December 2022.
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, and Tao Yu. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, UAE, December 2022.
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection. Suchin Gururangan, Dallas Card, Sarah K. Dreier, Emily K. Gade, Leroy Zhifei Wang, Zeyu Wang, Luke Zettlemoyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, UAE, December 2022.
In-Context Learning for Few-Shot Dialogue State Tracking. Yushi Hu, Chia-Hsuan Lee, Tianbao Xie, Tao Yu, Mari Ostendorf, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022 Findings), Abu Dhabi, UAE, December 2022.
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ Tasks. Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Atharva Naik, Arjun Ashok, Arut Selvan Dhanasekaran, Anjana Arunkumar, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Virendrabhai Purohit, Ishani Mondal, Jacob William Anderson, Kirby C. Kuznia, Krima Doshi, Kuntal Kumar Pal, Maitreya Patel, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Savan Doshi, Shailaja Keyur Sampat, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi, and Daniel Khashabi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, UAE, December 2022.
Twist Decoding: Diverse Generators Guide Each Other. Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, UAE, December 2022.
Unsupervised Learning of Hierarchical Conversation Structure. Bo-Ru Lu, Yushi Hu, Hao Cheng, Mari Ostendorf, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022 Findings), Abu Dhabi, UAE, December 2022.
Modeling Context With Linear Attention for Scalable Document-Level Translation. Zhaofeng Wu, Hao Peng, Nikolaos Pappas, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022 Findings), Abu Dhabi, UAE, December 2022.
Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow. Maarten Sap, Anna Jafarpour, Yejin Choi, Noah A. Smith, James W. Pennebaker, and Eric Horvitz. Proceedings of the National Academy of Sciences, November 2022.
Patterns of Bias: How Mainstream Media Operationalize Links between Mass Shootings and Terrorism. Sarah K. Dreier, Emily K. Gade, Dallas Card, and Noah A. Smith. Political Communication, September 2022.
Transparent Human Evaluation for Image Captioning. Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Daniel Morrison, Ronan Le Bras, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), Seattle, WA, July 2022.
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand. Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Daniel Morrison, Alexander Fabbri, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), Seattle, WA, July 2022.
DEMix Layers: Disentangling Domains for Modular Language Modeling. Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, and Luke Zettlemoyer. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), Seattle, WA, July 2022.
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection. Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), Seattle, WA, July 2022.
Time Waits for No One! Analysis and Challenges of Temporal Misalignment. Kelvin Luu, Daniel Khashabi, Suchin Gururangan, Karishma Mandyam, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), Seattle, WA, July 2022.
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics. Ximing Lu, Sean Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, and Yejin Choi. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), Seattle, WA, July 2022.
The Engage Corpus: A Social Media Dataset for Text-Based Recommender Systems. Daniel Cheng, Kyle Yan, Phillip Keung, and Noah A. Smith. In Proceedings of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France, June 2022.
Domain Mismatch Doesn’t Always Prevent Cross-lingual Transfer Learning. Daniel Edmiston, Phillip Keung, and Noah A. Smith. In Proceedings of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France, June 2022.
Measuring Machine Learning Software Carbon Intensity in Cloud Instances. Jesse Dodge, Taylor Prewitt, Remi Tachet des Combes, Erika Odmark, Roy Schwartz, Emma Strubell, Alexandra Sasha Luccioni, Noah A. Smith, Nicole DeCario, and Will Buchanan. In Conference on Fairness, Accountability, and Transparency (FAccT 2022), Seoul, June 2022.
Generating Scientific Definitions with Controllable Complexity. Tal August, Katharina Reinecke, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, May 2022.
Is GPT-3 Text Indistinguishable from Human Text? SCARECROW: A Framework for Scrutinizing Machine Text. Yao Dou, Maxwell Forbes, Rik Koncel-Kedziorski, Noah A. Smith, and Yejin Choi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, May 2022.
ABC: Attention with Bounded-memory Control. Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, May 2022.
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation. Ofir Press, Noah A. Smith, and Mike Lewis. In Proceedings of the International Conference on Learning Representations (ICLR 2022), virtual conference, April 2022.

2021

Infusing Finetuning with Semantic Dependencies. Zhaofeng Wu, Hao Peng, and Noah A. Smith. Transactions of the Association for Computational Linguistics 9:226–242, 2021.
Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? William Merrill, Roy Schwartz, Yoav Goldberg, and Noah A. Smith. Transactions of the Association for Computational Linguistics 9:1047–1060, 2021.
Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent. William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Punta Cana, Dominican Republic, November 2021.
Measuring Association Between Labels and Free-Text Rationales. Sarah Wiegreffe, Ana Marasović, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Punta Cana, Dominican Republic, November 2021.
Finetuning Pretrained Transformers into RNNs. Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Punta Cana, Dominican Republic, November 2021.
Probing Across Time: What Does RoBERTa Know and When? Leo Zeyu Liu, Yizhong Wang, Jungo Kasai, Hannaneh Hajishirzi, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021 Findings), Punta Cana, Dominican Republic, November 2021.
Competency Problems: On Finding and Removing Artifacts in Language Data. Matt Gardner, William Merrill, Jesse Dodge, Matthew Peters, Alexis Ross, Sameer Singh, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Punta Cana, Dominican Republic, November 2021.
Specializing Multilingual Language Models: An Empirical Study. Ethan C. Chau and Noah A. Smith. In Proceedings of the EMNLP Workshop on Multilingual Representation Learning (MRL 2021), Punta Cana, Dominican Republic, November 2021.
Sentence Bottleneck Autoencoders from Transformer Language Models. Ivan Montero, Nikolaos Pappas, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Punta Cana, Dominican Republic, November 2021.
Expected Validation Performance and Estimation of a Random Variable’s Maximum. Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021 Findings), Punta Cana, Dominican Republic, November 2021.
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study. Rahul Nadkarni, David Wadden, Iz Beltagy, Noah A. Smith, Hannaneh Hajishirzi, and Tom Hope. In Proceedings of the Conference on Automated Knowledge Base Construction (AKBC 2021), Irvine, CA, October 2021.
Explaining Relationships Between Scientific Documents. Kelvin Luu, Xinyi Wu, Rik Koncel-Kedziorski, Kyle Lo, Isabel Cachola, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021), virtual conference, August 2021.
Promoting Graph Awareness in Linearized Graph-to-Text Generation. Alexander Miserlis Hoyle, Ana Marasović, and Noah A. Smith. In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021 Findings), virtual conference, August 2021.
Shortformer: Better Language Modeling using Shorter Inputs. Ofir Press, Noah A. Smith, and Mike Lewis. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021), virtual conference, August 2021.
DExperts: On-the-Fly Controlled Text Generation with Experts and Anti-Experts. Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith, and Yejin Choi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021), virtual conference, August 2021.
All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text. Elizabeth Clark, Tal August, Sofia Serrano, Nikita Haduong, Suchin Gururangan, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021), virtual conference, August 2021.
A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers. Pradeep Dasigi, Kyle Lo, Iz Beltagy, Arman Cohan, Noah A. Smith, and Matt Gardner. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021), virtual conference, June 2021.
Choose Your Own Adventure: Paired Suggestions in Collaborative Writing for Evaluating Story Generation Models. Elizabeth Clark and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021), virtual conference, June 2021.
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation. Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, and Noah A. Smith. In Proceedings of the International Conference on Learning Representations (ICLR 2021), virtual conference, May 2021.
Random Feature Attention. Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, and Lingpeng Kong. In Proceedings of the International Conference on Learning Representations (ICLR 2021), virtual conference, May 2021.
Challenges in Algorithmic Debiasing for Toxic Language Detection. Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), virtual conference, April 2021.
GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation. Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, and Daniel S. Weld. January 2021.

2020

Green AI. Roy Schwartz, Jesse Dodge, Noah A. Smith, and Oren Etzioni. Communications of the Association for Computing Machinery, 2020.
Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings. Phillip Keung, Julian Salazar, Yichao Lu, and Noah A. Smith. Transactions of the Association for Computational Linguistics 8:828–841, 2020.
Evaluating NLP Models via Contrast Sets. Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hannaneh Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, and Ben Zhou. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020 Findings), virtual conference, November 2020.
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics. Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, and Yejin Choi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), virtual conference, November 2020.
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020 Findings), virtual conference, November 2020.
Grounded Compositional Outputs for Adaptive Language Modeling. Nikolaos Pappas, Phoebe Mulcaire, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), virtual conference, November 2020.
Parsing with Multilingual BERT, a Small Treebank, and a Small Corpus. Ethan C. Chau, Lucy H. Lin, and Noah A. Smith. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020 Findings), virtual conference, November 2020.
Multilevel Text Alignment with Cross-Document Attention. Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), virtual conference, November 2020.
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs. Ana Marasović, Chandra Bhagavatula, Jae Sung Park, Ronan Le Bras, Noah A. Smith, and Yejin Choi. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020 Findings), virtual conference, November 2020.
Writing Strategies for Science Communication: Data and Computational Analysis. Tal August, Lauren Kim, Katharina Reinecke, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), virtual conference, November 2020.
The Multilingual Amazon Reviews Corpus. Phillip Keung, Yichao Lu, György Szarvas, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), virtual conference, November 2020.
Plug and Play Autoencoders for Conditional Text Generation. Florian Mai, Nikolaos Pappas, Ivan Montero, Noah A. Smith, and James Henderson. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), virtual conference, November 2020.
Thinking Like a Skeptic: Defeasible Inference in Natural Language. Rachel Rudinger, Vered Shwartz, Jena D. Hwang, Chandra Bhagavatula, Maxwell Forbes, Ronan Le Bras, Noah A. Smith, and Yejin Choi. In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020 Findings), virtual conference, November 2020.
Improving Transformer Models by Reordering their Sublayers. Ofir Press, Noah A. Smith, and Omer Levy. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
Social Bias Frames: Reasoning about Social and Power Implications of Language. Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, and Yejin Choi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
The Right Tool for the Job: Matching Model and Instance Complexities. Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
A Formal Hierarchy of RNN Architectures. William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, and Eran Yahav. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks. Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
A Mixture of h – 1 Heads is Better than h Heads. Hao Peng, Roy Schwartz, Dianqi Li, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
Recollection versus Imagination: Exploring Human Memory and Cognition via Large Language Models. Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, and James Pennebaker. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020), virtual conference, July 2020.
Exploring the Effect of Author and Reader Identity in Online Story Writing: the StoriesInTheWild Corpus. Tal August, Maarten Sap, Elizabeth Clark, Katharina Reinecke, and Noah A. Smith. In ACL Workshop on Narrative Understanding, Storylines, and Events (NUSE 2020), July 2020.
Multilingual and Interlingual Semantic Representations for Natural Language Processing: A Brief Introduction. Marta R. Costa-jussà, Cristina España-Bonet, Pascale Fung, and Noah A. Smith. Computational Linguistics, June 2020.
Contextual Word Representations: Putting Words into Computers. Noah A. Smith. Communications of the Association for Computing Machinery 63(6), June 2020.
On Consequentialism and Fairness. Dallas Card and Noah A. Smith. Frontiers in Artificial Intelligence 3, May 2020.
Explain like I am a Scientist: The Linguistic Barriers of Entry to r/science. Tal August, Dallas Card, Gary Hsieh, Noah A. Smith, and Katharina Reinecke. In Proceedings of the ACM CHI Conference on Human Factors in Computing Systems (CHI 2020), April 2020.
Multi-View Learning for Vision-and-Language Navigation. Qiaolin Xia, Xiujun Li, Chunyuan Li, Yonatan Bisk, Zhifang Sui, Jianfeng Gao, Yejin Choi, and Noah A. Smith. March 2020.
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping. Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, and Noah A. Smith. February 2020.
Etch-a-Sketching: Evaluating the Post-Primary Rhetorical Moderation Hypothesis. Brice D. L. Acree, Yanchuan Sim, Justin H. Gross, Amber E. Boydstun, and Noah A. Smith. American Politics Research 48(1):99–131, January 2020.

2019

Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning. Pradeep Dasigi, Nelson F. Liu, Ana Marasović, Noah A. Smith, and Matt Gardner. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
Topics to Avoid: Demoting Latent Confounds in Text Classification. Sachin Kumar, Shuly Wintner, Noah A. Smith, and Yulia Tsvetkov. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
PaLM: A Hybrid Parser and Language Model. Hao Peng, Roy Schwartz, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
Robust Navigation with Language Pretraining and Stochastic Sampling. Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah A. Smith, and Yejin Choi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
Show Your Work: Improved Reporting of Experimental Results. Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
RNN Architecture Learning with Sparse Regularization. Jesse Dodge, Roy Schwartz, Hao Peng, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
Knowledge Enhanced Contextual Word Representations. Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 2019.
Low-Resource Parsing with Crosslingual Contextualized Representations. Phoebe Mulcaire, Jungo Kasai, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2019), Hong Kong, November 2019.
Measuring Online Debaters’ Persuasive Skill from Text over Time. Kelvin Luu, Chenhao Tan, and Noah A. Smith. Transactions of the Association for Computational Linguistics 7:537–550, September 2019.
Situating Sentence Embedders with Nearest Neighbor Overlap. Lucy H. Lin and Noah A. Smith. September 2019.
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks. Matthew Peters, Sebastian Ruder, and Noah A. Smith. In Proceedings of the ACL Workshop on Representation Learning for Natural Language Processing (RepL4NLP 2019), Florence, Italy, August 2019.
Shallow Syntax in Deep Water. Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, and Noah A. Smith. August 2019.
Evaluating Gender Bias in Machine Translation. Gabriel Stanovsky, Noah A. Smith, and Luke Zettlemoyer. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, July 2019.
Variational Pretraining for Semi-supervised Text Classification. Suchin Gururangan, Tam Dang, Dallas Card, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, July 2019.
Is Attention Interpretable? Sofia Serrano and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, July 2019.
Sentence Mover’s Similarity: Automatic Evaluation for Multi-Sentence Texts. Elizabeth Clark, Asli Celikyilmaz, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, July 2019.
The Risk of Racial Bias in Hate Speech Detection. Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, July 2019.
Polyglot Contextual Representations Improve Crosslingual Transfer. Phoebe Mulcaire, Jungo Kasai, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), Minneapolis, MN, June 2019.
Linguistic Knowledge and Transferability of Contextual Representations. Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), Minneapolis, MN, June 2019.
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets. Nelson F. Liu, Roy Schwartz, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), Minneapolis, MN, June 2019.
Analyzing Privacy Policies at Scale: From Crowdsourcing to Automated Annotations. Shomir Wilson, Florian Schaub, Frederick Liu, Kanthashree Mysore Sathyendra, Daniel Smullen, Sebastian Zimmeck, Rohan Ramanath, Peter Story, Fei Liu, Norman Sadeh, and Noah A. Smith. ACM Transactions on the Web 13(1), February 2019.
ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning. Maarten Sap, Ronan Le Bras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, Hannah Rashkin, Brendan Roof, Noah A. Smith, and Yejin Choi. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2019), Honolulu, HI, January/February 2019.
Deep Weighted Averaging Classifiers. Dallas Card, Michael Zhang, and Noah A. Smith. In Proceedings of the ACM Fairness, Accountability, and Transparency Conference (FAT* 2019), New York, NY, January 2019.

2018

Rational Recurrences. Hao Peng, Roy Schwartz, Sam Thomson, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium, November 2018.
Neural Cross-lingual Named Entity Recognition with Minimal Resources. Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, and Jaime Carbonell. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium, November 2018.
Syntactic Scaffolds for Semantic Structures. Swabha Swayamdipta, Sam Thomson, Kenton Lee, Luke Zettlemoyer, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium, November 2018.
Framing Effect: The Choice of Slogans Used to Advertise Online Experiments Can Boost Recruitment and Lead to Sample Biases. Tal August, Nigini Oliveira, Chenhao Tan, Noah A. Smith, and Katharina Reinecke. In Proceedings of the Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2018), Jersey City, NJ, November 2018.
You May Not Need Attention. Ofir Press and Noah A. Smith. October 2018.
Natural Language Processing for Analyzing Disaster Recovery Trends Expressed in Large Text Corpora. Lucy H. Lin, Scott B. Miles, and Noah A. Smith. In IEEE Global Humanitarian Technology Conference (GHTC 2018), San Jose, CA, October 2018.
Semantic Matching Against a Corpus: New Methods and Applications. Lucy H. Lin, Scott B. Miles, and Noah A. Smith. August 2018.
Neural Models for Documents with Metadata. Dallas Card, Chenhao Tan, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, July 2018.
LSTMs Exploit Linguistic Attributes of Data. Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, and Noah A. Smith. In Proceedings of the ACL Workshop on Representation Learning for Natural Language Processing (RepL4NLP 2018), Melbourne, Australia, July 2018.
Polyglot Semantic Role Labeling. Phoebe Mulcaire, Swabha Swayamdipta, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, July 2018.
Backpropagating through Structured Argmax using a SPIGOT. Hao Peng, Sam Thomson, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, July 2018.
Event2Mind: Commonsense Inference on Events, Intents, and Reactions. Hannah Rashkin, Maarten Sap, Emily Allaway, Noah A. Smith, and Yejin Choi. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, July 2018.
Bridging CNNs, RNNs, and Weighted Finite-State Machines. Roy Schwartz, Sam Thomson, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, July 2018.
The Importance of Calibration for Estimating Proportions from Annotations. Dallas Card and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, LA, June 2018.
Neural Text Generation in Stories using Entity Representations as Context. Elizabeth Clark, Yangfeng Ji, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, LA, June 2018.
Sounding Board: A User-Centric and Content-Driven Social Chatbot. Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, and Mari Ostendorf. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (Demonstration Papers) (NAACL 2018), New Orleans, LA, June 2018.
Discovering Phonesthemes with Sparse Regularization. Nelson F. Liu, Gina-Anne Levow, and Noah A. Smith. In Proceedings of the NAACL Workshop on Subword and Character-Level Models in Natural Language Processing, New Orleans, LA, June 2018.
Parsing Tweets into Universal Dependencies. Yijia Liu, Yi Zhu, Wanxiang Che, Bing Qin, Nathan Schneider, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, LA, June 2018.
Learning Joint Semantic Parsers from Disjoint Data. Hao Peng, Sam Thomson, Swabha Swayamdipta, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, LA, June 2018.
Annotation Artifacts in Natural Language Inference Data. Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, LA, June 2018.
“You are no Jack Kennedy”: On Media Highlights of Presidential Debates. Chenhao Tan, Hao Peng, and Noah A. Smith. In Proceedings of the International World Wide Web Conference (WWW 2018), Lyon, France, April 2018.
Creative Writing with a Machine in the Loop: Case Studies on Slogans and Stories. Elizabeth Clark, Anne Spencer Ross, Chenhao Tan, Yangfeng Ji, and Noah A. Smith. In Proceedings of the Conference on Intelligent User Interfaces (IUI 2018), Tokyo, Japan, March 2018.
Politifact Language Audit. Dallas Card, Lucy H. Lin, and Noah A. Smith. March 2018.

2017

End-to-End Neural Segmental Models for Speech Recognition. Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, and Steve Renals. IEEE Journal of Selected Topics in Signal Processing 11(8):1254–1264, December 2017.
Dynamic Entity Representations in Neural Language Models. Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), Copenhagen, Denmark, September 2017.
The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task. Roy Schwartz, Maarten Sap, Ioannis Konstas, Leila Zilles, Yejin Choi, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2017), Vancouver, BC, August 2017.
Multitask Learning with CTC and Segmental CRF for Speech Recognition. Liang Lu, Lingpeng Kong, Chris Dyer, and Noah A. Smith. In Proceedings of InterSpeech (InterSpeech 2017), Stockholm, Sweden, August 2017.
Deep Multitask Learning for Semantic Dependency Parsing. Hao Peng, Sam Thomson, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2017), Vancouver, BC, July/August 2017.
Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts. Chenhao Tan, Dallas Card, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2017), Vancouver, BC, July/August 2017. Also available: appendix.
Neural Discourse Structure for Text Categorization. Yangfeng Ji and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2017), Vancouver, BC, July/August 2017.
Greedy Transition-based Dependency Parsing with Stack LSTMs. Miguel Ballesteros, Chris Dyer, Yoav Goldberg, and Noah A. Smith. Computational Linguistics 43(2), June 2017.
Open Loop Hyperparameter Optimization and Determinantal Point Processes. Jesse Dodge, Kevin Jamieson, and Noah A. Smith. June 2017.
Frame-Semantic Parsing with Softmax-Margin Segmental RNNs and a Syntactic Scaffold. Swabha Swayamdipta, Sam Thomson, Chris Dyer, and Noah A. Smith. June 2017.
World Vaping Day: Contextualizing Vaping Culture in Online Social Media using a Mixed Methods Approach. Jason B. Colditz, Joel Welling, Noah A. Smith, A. Everette James, and Brian A. Primack. Journal of Mixed Methods Research, April 2017.
What Do Recurrent Neural Network Grammars Learn About Syntax? Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, and Noah A. Smith. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain, April 2017.
Story Cloze Task: UW NLP System. Roy Schwartz, Maarten Sap, Ioannis Konstas, Li Zilles, Yejin Choi, and Noah A. Smith. In Proceedings of the Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem 2017), pages 52–55, Valencia, Spain, April 2017.

2016

Analyzing Framing through the Casts of Characters in the News. Dallas Card, Justin H. Gross, Amber E. Boydstun, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, TX, November 2016.
A Neural Model for Language Identification in Code-Switched Tweets. Aaron Jaech, Phoebe Mulcaire, Mari Ostendorf, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Computational Approaches to Linguistic Code Switching (LICS 2016), Austin, TX, November 2016.
Hierarchical Character-Word Models for Language Identification. Aaron Jaech, Phoebe Mulcaire, Shobhit Hathi, Mari Ostendorf, and Noah A. Smith. In Proceedings of the International Workshop on Natural Language Processing for Social Media (SocialNLP 2016), Austin, TX, November 2016.
Character Sequence Models for Colorful Words. Kazuya Kawakami, Chris Dyer, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, TX, November 2016.
Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser. Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, TX, November 2016.
Semi-Supervised Learning of Sequence Models with Method of Moments. Zita Marinho, André F. T. Martins, Shay B. Cohen, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, TX, November 2016.
Friends with Motives: Using Text to Infer Influence on SCOTUS. Yanchuan Sim, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, TX, November 2016.
Training with Exploration Improves a Greedy Stack LSTM Parser. Miguel Ballesteros, Yoav Goldberg, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, TX, November 2016.
Linguistic Markers of Status in Food Culture: Bourdieu’s Distinction in a Menu Corpus. Dan Jurafsky, Victor Chahuneau, Bryan R. Routledge, and Noah A. Smith. Cultural Analytics, October 2016.
Segmental Recurrent Neural Networks for End-to-end Speech Recognition. Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, and Steve Renals. In Proceedings of InterSpeech (InterSpeech 2016), San Francisco, CA, September 2016.
Greedy, Joint Syntactic-Semantic Parsing with Stack LSTMs. Swabha Swayamdipta, Miguel Ballesteros, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2016), Berlin, Germany, August 2016.
Many Languages, One Parser. Waleed Ammar, Phoebe Mulcaire, Miguel Ballesteros, Chris Dyer, and Noah A. Smith. Transactions of the Association for Computational Linguistics 4:431–444, July 2016.
Recurrent Neural Network Grammars. Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2016), San Diego, CA, June 2016.
Generation from Abstract Meaning Representation using Tree Transducers. Jeffrey Flanigan, Chris Dyer, Noah A. Smith, and Jaime Carbonell. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2016), San Diego, CA, June 2016.
CMU at SemEval-2016 Task 8: Graph-based AMR Parsing with Infinite Ramp Loss. Jeffrey Flanigan, Chris Dyer, Noah A. Smith, and Jaime Carbonell. In Proceedings of the NAACL Workshop on Semantic Evaluations (SemEval 2016), San Diego, CA, June 2016.
UW-CSE: Detecting Multiword Expressions and Supersenses using Double-Chained Conditional Random Fields. Mohammad Javad Hosseini, Noah A. Smith, and Su-In Lee. In Proceedings of the NAACL Workshop on Semantic Evaluations (SemEval 2016), San Diego, CA, June 2016.
Segmental Recurrent Neural Networks. Lingpeng Kong, Chris Dyer, and Noah A. Smith. In Proceedings of the International Conference on Learning Representations (ICLR 2016), San Juan, PR, May 2016.
Crowdsourcing Annotations for Websites’ Privacy Policies: Can It Really Work? Shomir Wilson, Florian Schaub, Rohan Ramanath, Norman Sadeh, Fei Liu, Noah A. Smith, and Frederick Liu. In Proceedings of the International World Wide Web Conference (WWW 2016), Montréal, Québec, April 2016.
Massively Multilingual Word Embeddings. Waleed Ammar, Phoebe Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, and Noah A. Smith. February 2016.

2015

A Sparse and Adaptive Prior for Time-Dependent Model Parameters. Dani Yogatama, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the NIPS Workshop on Time Series, Montréal, Québec, December 2015.
Annotating Character Relationships in Literary Texts. Philip Massey, Patrick Xia, David Bamman, and Noah A. Smith. December 2015.
Open Extraction of Fine-Grained Political Statements. David Bamman and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
Improved Transition-based Parsing by Modeling Characters instead of Words with LSTMs. Miguel Ballesteros, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
A Utility Model of Authors in the Scientific Community. Yanchuan Sim, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
Extractive Summarization by Maximizing Semantic Volume. Dani Yogatama, Fei Liu, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
Bayesian Optimization of Text Representations. Dani Yogatama, Lingpeng Kong, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
The Media Frames Corpus: Annotations of Frames Across Issues. Dallas Card, Amber E. Boydstun, Justin H. Gross, Philip Resnik, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
Transition-Based Dependency Parsing with Stack Long Short-Term Memory. Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
Sparse Overcomplete Word Vector Representations. Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
A Supertag-Context Model for Weakly-Supervised CCG Parser Learning. Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2015), Beijing, China, July 2015.
Frame-Semantic Role Labeling with Heterogeneous Annotations. Meghana Kshirsagar, Sam Thomson, Nathan Schneider, Jaime Carbonell, Noah A. Smith, and Chris Dyer. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
Learning Word Representations with Hierarchical Sparse Coding. Dani Yogatama, Manaal Faruqui, Chris Dyer, and Noah A. Smith. In Proceedings of the International Conference on Machine Learning (ICML 2015), Lille, France, July 2015.
Retrofitting Word Vectors to Semantic Lexicons. Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard Hovy, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
Transforming Dependencies into Phrase Structures. Lingpeng Kong, Alexander M. Rush, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
Toward Abstractive Summarization Using Semantic Representations. Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
A Corpus and Model Integrating Multiword Expressions and Supersenses. Nathan Schneider and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
Contextualized Sarcasm Detection on Twitter. David Bamman and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2015), Oxford, UK, May 2015.
Modeling User Arguments, Interactions, and Attributes for Stance Prediction in Online Debate Forums. Minghui Qiu, Yanchuan Sim, Noah A. Smith, and Jing Jiang. In Proceedings of the SIAM Conference on Data Mining (SDM 2015), Vancouver, BC, April/May 2015. Also available: appendix.
AD³: Alternating Directions Dual Decomposition for MAP Inference in Graphical Models. André F. T. Martins, Mário A. T. Figueiredo, Pedro M. Q. Aguiar, Noah A. Smith, and Eric P. Xing. Journal of Machine Learning Research 16:495–545, March 2015.
The Utility of Text: The Case of Amicus Briefs and the Supreme Court. Yanchuan Sim, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2015), Austin, TX, January 2015.
Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning. Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2015), Austin, TX, January 2015.

2014

Conditional Random Field Autoencoders for Unsupervised Structured Prediction. Waleed Ammar, Chris Dyer, and Noah A. Smith. In Advances in Neural Information Processing Systems 27 (NIPS 2014), Montréal, Québec, December 2014.
Identifying Relevant Text Fragments to Help Crowdsource Privacy Policy Annotations. Rohan Ramanath, Florian Schaub, Shomir Wilson, Fei Liu, Norman Sadeh, and Noah A. Smith. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2014), Pittsburgh, PA, November 2014.
Diffusion of Language Change in Social Media. Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, and Eric P. Xing. PLoS ONE, November 2014.
A Dependency Parser for Tweets. Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar, October 2014.
Unsupervised Discovery of Biographical Structure from Text. David Bamman and Noah A. Smith. Transactions of the Association for Computational Linguistics 2:363–376, October 2014.
Tracking the Development of Media Frames within and across Policy Issues. Amber E. Boydstun, Dallas Card, Justin H. Gross, Philip Resnik, and Noah A. Smith. August 2014.
A Step Towards Usable Privacy Policy: Automatic Alignment of Privacy Statements. Fei Liu, Rohan Ramanath, Norman Sadeh, and Noah A. Smith. In Proceedings of the International Conference on Computational Linguistics (COLING 2014), Dublin, Ireland, August 2014.
CMU: Arc-Factored, Discriminative Semantic Dependency Parsing. Sam Thomson, Brendan O’Connor, Jeffrey Flanigan, David Bamman, Jesse Dodge, Swabha Swayamdipta, Nathan Schneider, Chris Dyer, and Noah A. Smith. In Proceedings of the International (COLING) Workshop on Semantic Evaluations (SemEval 2014), Dublin, Ireland, August 2014.
A Bayesian Mixed Effects Model of Literary Character. David Bamman, Ted Underwood, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
Distributed Representations of Geographically Situated Language. David Bamman, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
A Discriminative Graph-Based Parser for the Abstract Meaning Representation. Jeffrey Flanigan, Sam Thomson, Jaime Carbonell, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
Weakly-Supervised Bayesian Learning of a CCG Supertagger. Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2014), Baltimore, MD, June 2014.
Simplified Dependency Annotations with GFL-Web. Michael T. Mordowanec, Nathan Schneider, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2014 demonstration track), Baltimore, MD, June 2014.
Unsupervised Alignment of Privacy Policies using Hidden Markov Models. Rohan Ramanath, Fei Liu, Norman Sadeh, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014. Also available: appendix.
Linguistic Structured Sparsity in Text Categorization. Dani Yogatama and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014. Also available: talk slides.
Making the Most of Bag of Words: Sentence Regularization with Alternating Direction Method of Multipliers. Dani Yogatama and Noah A. Smith. In Proceedings of the International Conference on Machine Learning (ICML 2014), Beijing, China, June 2014.
Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features. Kevin Gimpel and Noah A. Smith. Computational Linguistics 40(2), June 2014.
Overview of the 2014 NLP Unshared Task in PoliInformatics. Noah A. Smith, Claire Cardie, Anne L. Washington, and John D. Wilkerson. In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, pages 5–7, Baltimore, MD, June 2014.
Comprehensive Annotation of Multiword Expressions in a Social Web Corpus. Nathan Schneider, Spencer Onuffer, Nora Kazour, Emily Danchik, Michael T. Mordowanec, Henrietta Conrad, and Noah A. Smith. In Proceedings of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik, Iceland, May 2014.
An Empirical Comparison of Parsing Methods for Stanford Dependencies. Lingpeng Kong and Noah A. Smith. April 2014.
Narrative Framing of Consumer Sentiment in Online Restaurant Reviews. Dan Jurafsky, Victor Chahuneau, Bryan R. Routledge, and Noah A. Smith. First Monday 19(4), April 2014.
Dynamic Models of Streaming Text. Dani Yogatama, Chong Wang, Bryan R. Routledge, Noah A. Smith, and Eric P. Xing. Transactions of the Association for Computational Linguistics 2:181–192, April 2014.
Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut. Nathan Schneider, Emily Danchik, Chris Dyer, and Noah A. Smith. Transactions of the Association for Computational Linguistics 2:193–206, April 2014.
Frame-Semantic Parsing. Dipanjan Das, Desai Chen, André F. T. Martins, Nathan Schneider, and Noah A. Smith. Computational Linguistics 40(1):9–56, March 2014.

2013

Translating into Morphologically Rich Languages with Synthetic Phrases. Victor Chahuneau, Eva Schlinger, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, WA, October 2013.
Learning Topics and Positions from Debatepedia. Swapna Gottipati, Minghui Qiu, Yanchuan Sim, Jing Jiang, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, WA, October 2013. Also available: appendix.
Measuring Ideological Proportions in Political Speeches. Yanchuan Sim, Brice D. L. Acree, Justin H. Gross, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, WA, October 2013. Also available: appendix.
Predicting the NFL Using Twitter. Shiladitya Sinha, Chris Dyer, Kevin Gimpel, and Noah A. Smith. In Proceedings of the ECML/PKDD Workshop on (Machine Learning and Data Mining for) Sports Analytics, Prague, Czech Republic, September 2013.
Learning to Extract International Relations from Political Context. Brendan O’Connor, Brandon Stewart, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 2013. Also available: appendix.
Learning Latent Personas of Film Characters. David Bamman, Brendan O’Connor, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 2013.
Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers. André F. T. Martins, Miguel Almeida, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 2013.
A Framework for (Under)specifying Dependency Syntax without Overloading Annotators. Nathan Schneider, Brendan O’Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Noah A. Smith, Chris Dyer, and Jason Baldridge. In Proceedings of the ACL Linguistic Annotation Workshop (LAW 2013), Sofia, Bulgaria, August 2013. Also available: extended technical report.
Testing the Etch-a-Sketch Hypothesis: A Computational Analysis of Mitt Romney’s Ideological Makeover During the 2012 Primary vs. General Elections. Justin H. Gross, Brice D. L. Acree, Yanchuan Sim, and Noah A. Smith. Presented at the Annual Meeting of the American Political Science Association, August 2013.
A Penny for your Tweets: Campaign Contributions and Capitol Hill Microblogs. Tae Yano, Dani Yogatama, and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2013), Boston, MA, July 2013.
Knowledge-Rich Morphological Priors for Bayesian Language Models. Victor Chahuneau, Noah A. Smith, and Chris Dyer. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
A Simple, Fast, and Effective Reparameterization of IBM Model 2. Chris Dyer, Victor Chahuneau, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters. Olutobi Owoputi, Brendan O’Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
Supersense Tagging for Arabic: the MT-in-the-Middle Attack. Nathan Schneider, Behrang Mohit, Chris Dyer, Kemal Oflazer, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
New Alignment Methods for Discriminative Summarization. David Bamman and Noah A. Smith. May 2013.
Linguistic Structure Prediction with the Sparseptron. Noah A. Smith and André F. T. Martins. ACM Crossroads 19(3):44–48, April 2013.

2012

Mapping the Geographical Diffusion of New Words. Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, and Eric P. Xing. In Proceedings of the NIPS Workshop on Social Network and Social Media Analysis: Methods, Models and Applications, Lake Tahoe, NV, December 2012.
Automatic Categorization of Privacy Policies: A Pilot Study. Waleed Ammar, Shomir Wilson, Norman Sadeh, and Noah A. Smith. Pittsburgh, PA, December 2012.
pycdec: A Python Interface to cdec. Victor Chahuneau, Noah A. Smith, and Chris Dyer. Prague Bulletin of Mathematical Linguistics 98:51–61, October 2012.
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning. Shay B. Cohen and Noah A. Smith. Computational Linguistics 38(3), September 2012.
Adversarial Evaluation for Models of Natural Language. Noah A. Smith. July 2012.
Transliteration by Sequence Labeling with Lattice Encodings and Reranking. Waleed Ammar, Chris Dyer, and Noah A. Smith. In Proceedings of the ACL Named Entities Workshop, Jeju, Korea, July 2012.
Word Salad: Relating Food Prices and Descriptions. Victor Chahuneau, Kevin Gimpel, Bryan R. Routledge, Lily Scherlis, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP 2012), Jeju, Korea, July 2012. Also available: appendix.
Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study. Nathan Schneider, Behrang Mohit, Kemal Oflazer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju, Korea, July 2012.
Discovering Factions in the Computational Linguistics Community. Yanchuan Sim, Noah A. Smith, and David A. Smith. In Proceedings of the ACL Workshop on Rediscovering Fifty Years of Discoveries, Jeju, Korea, July 2012.
A Probabilistic Model for Canonicalizing Named Entity Mentions. Dani Yogatama, Yanchuan Sim, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju, Korea, July 2012.
An Exact Dual Decomposition Algorithm for Shallow Semantic Parsing with Constraints. Dipanjan Das, André F. T. Martins, and Noah A. Smith. In Proceedings of the Joint Conference on Lexical and Computational Semantics (*SEM 2012), Montréal, Québec, June 2012.
Graph-Based Lexicon Expansion with Sparsity-Inducing Penalties. Dipanjan Das and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), Montréal, Québec, June 2012. Also available: talk slides.
Structured Ramp Loss Minimization for Machine Translation. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), Montréal, Québec, June 2012.
Concavity and Initialization for Unsupervised Dependency Parsing. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), Montréal, Québec, June 2012.
Textual Predictors of Bill Survival in Congressional Committees. Tae Yano, Noah A. Smith, and John D. Wilkerson. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), pages 793–802, Montréal, Québec, June 2012. Also available: talk slides.
The CMU-Oxford Translation System for the NIST Open Machine Translation 2012 Evaluation. Chris Dyer, Noah A. Smith, Graham Morehead, Phil Blunsom, and Abby Levenberg. May 2012.
Recall-Oriented Learning of Named Entities in Arabic Wikipedia. Behrang Mohit, Nathan Schneider, Rishav Bhowmick, Kemal Oflazer, and Noah A. Smith. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon, France, April 2012. Also available: extended technical report. Also available: appendix.
Censorship and Content Deletion in Chinese Social Media. David Bamman, Brendan O’Connor, and Noah A. Smith. First Monday 17(3), March 2012.

2011

Computational Text Analysis for Social Science: Model Complexity and Assumptions. Brendan O’Connor, David Bamman, and Noah A. Smith. In Proceedings of the NIPS Workshop on Computational Social Science and the Wisdom of Crowds, Sierra Nevada, Spain, December 2011.
Unsupervised Bilingual POS Tagging with Markov Random Fields. Desai Chen, Chris Dyer, Shay B. Cohen, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Unsupervised Learning in NLP (UNSUP 2011), Edinburgh, UK, July 2011.
Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance. Shay B. Cohen, Dipanjan Das, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
The CMU-ARK German-English Translation System. Chris Dyer, Kevin Gimpel, Jonathan H. Clark, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Statistical Machine Translation (SMT 2011), Edinburgh, UK, July 2011.
Structured Databases of Named Entities from Bayesian Nonparametrics. Jacob Eisenstein, Tae Yano, William W. Cohen, Noah A. Smith, and Eric P. Xing. In Proceedings of the EMNLP Workshop on Unsupervised Learning in NLP (UNSUP 2011), Edinburgh, UK, July 2011. Also available: talk slides.
Quasi-Synchronous Phrase Dependency Grammars for Machine Translation. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
Generative Models of Monolingual and Bilingual Gappy Patterns. Kevin Gimpel and Noah A. Smith. In Proceedings of the EMNLP Workshop on Statistical Machine Translation (SMT 2011), Edinburgh, UK, July 2011.
Dual Decomposition with Many Overlapping Components. André F. T. Martins, Noah A. Smith, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
Structured Sparsity in Structured Prediction. André F. T. Martins, Noah A. Smith, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
Predicting a Scientific Community’s Response to an Article. Dani Yogatama, Michael Heilman, Brendan O’Connor, Chris Dyer, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011. Minor revisions. Also available: extended technical report.
An Augmented Lagrangian Approach to Constrained MAP Inference. André F. T. Martins, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Noah A. Smith, and Eric P. Xing. In Proceedings of the International Conference on Machine Learning (ICML 2011), Bellevue, WA, June/July 2011.
Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability. Jonathan H. Clark, Chris Dyer, Alon Lavie, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2011), Portland, OR, June 2011.
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates. Dipanjan Das and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2011), Portland, OR, June 2011.
Unsupervised Word Alignment with Arbitrary Features. Chris Dyer, Jonathan H. Clark, Alon Lavie, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2011), Portland, OR, June 2011.
Discovering Sociolinguistic Associations with Structured Sparsity. Jacob Eisenstein, Noah A. Smith, and Eric P. Xing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2011), Portland, OR, June 2011.
Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments. Kevin Gimpel, Nathan Schneider, Brendan O’Connor, Dipanjan Das, Daniel Mills, Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2011), Portland, OR, June 2011.
Author Age Prediction from Text using Linear Regression. Dong Nguyen, Noah A. Smith, and Carolyn P. Rosé. In Proceedings of the ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LATECH 2011), Portland, OR, June 2011.
Linguistic Structure Prediction. Noah A. Smith. Morgan and Claypool, May 2011.
Online Learning of Structured Predictors with Multiple Kernels. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2011), Fort Lauderdale, FL, April 2011.
Products of Weighted Logic Programs. Shay B. Cohen, Robert J. Simmons, and Noah A. Smith. Theory and Practice of Logic Programming 11(2–3):263–296, January 2011.
Favor Short Dependencies: Parsing with Soft and Hard Constraints on Dependency Length. Jason Eisner and Noah A. Smith. In Harry Bunt, Paola Merlo, and Joakim Nivre (eds.), Trends in Parsing Technology: Dependency Parsing, Domain Adaptation, and Deep Parsing, Text, Speech, and Language Technology 43, chapter 8, pages 121–150. Springer, 2011.

2010

Empirical Risk Minimization with Approximations of Probabilistic Grammars. Shay B. Cohen and Noah A. Smith. In Advances in Neural Information Processing Systems 23 (NIPS 2010), Vancouver, BC, December 2010. Also available: appendix.
Online Multiple Kernel Learning for Structured Prediction. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the NIPS Workshop on New Directions in Multiple Kernel Learning, Whistler, BC, December 2010.
Augmenting Dual Decomposition for MAP Inference. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the International Workshop on Optimization for Machine Learning (OPT 2010), Whistler, BC, December 2010.
Discovering Demographic Language Variation. Brendan O’Connor, Jacob Eisenstein, Eric P. Xing, and Noah A. Smith. In Proceedings of the NIPS Workshop on Machine Learning for Social Computing, Whistler, BC, December 2010.
Covariance in Unsupervised Learning of Probabilistic Grammars. Shay B. Cohen and Noah A. Smith. Journal of Machine Learning Research 11:3017–3051, November 2010.
A Latent Variable Model for Geographic Lexical Variation. Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, and Eric P. Xing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), Cambridge, MA, October 2010.
Turbo Parsers: Dependency Parsing by Approximate Variational Inference. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), Cambridge, MA, October 2010.
Nonparametric Word Segmentation for Machine Translation. ThuyLinh Nguyen, Stephan Vogel, and Noah A. Smith. In Proceedings of the International Conference on Computational Linguistics (COLING 2010), Beijing, China, August 2010.
SEMAFOR: Frame Argument Resolution with Log-Linear Models. Desai Chen, Nathan Schneider, Dipanjan Das, and Noah A. Smith. In Proceedings of the International (ACL) Workshop on Semantic Evaluations (SemEval 2010), Uppsala, Sweden, July 2010.
Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization. Shay B. Cohen and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2010), pages 1502–1511, Uppsala, Sweden, July 2010.
Distributed Asynchronous Online Learning for Natural Language Processing. Kevin Gimpel, Dipanjan Das, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2010), Uppsala, Sweden, July 2010.
Visualizing Topical Quotations Over Time to Understand News Discourse. Nathan Schneider, Rebecca Hwa, Philip Gianfortoni, Dipanjan Das, Michael Heilman, Alan W. Black, Frederick L. Crabbe, and Noah A. Smith. Pittsburgh, PA, July 2010.
Variational Inference for Adaptor Grammars. Shay B. Cohen, David M. Blei, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
Probabilistic Frame-Semantic Parsing. Dipanjan Das, Nathan Schneider, Desai Chen, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010. Also available: extended technical report.
Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions. Kevin Gimpel and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010. Also available: extended technical report.
Softmax-Margin Training for Structured Log-Linear Models. Kevin Gimpel and Noah A. Smith. Pittsburgh, PA, June 2010.
Good Question! Statistical Ranking for Question Generation. Michael Heilman and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010. Also available: extended technical report.
Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions. Michael Heilman and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010. Also available: appendix.
Rating Computer-Generated Questions with Mechanical Turk. Michael Heilman and Noah A. Smith. In Proceedings of the NAACL-HLT Workshop on Creating Speech and Language Data With Mechanical Turk, Los Angeles, CA, June 2010.
Extracting Simplified Statements for Factual Question Generation. Michael Heilman and Noah A. Smith. In Proceedings of the AIED Workshop on Question Generation, Pittsburgh, PA, June 2010.
Movie Reviews and Revenues: An Experiment in Text Regression. Mahesh Joshi, Dipanjan Das, Kevin Gimpel, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
Aggressive Online Learning of Structured Classifiers. André F. T. Martins, Kevin Gimpel, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. Pittsburgh, PA, June 2010.
Shedding (a Thousand Points of) Light on Biased Language. Tae Yano, Philip Resnik, and Noah A. Smith. In Proceedings of the NAACL-HLT Workshop on Creating Speech and Language Data With Mechanical Turk, Los Angeles, CA, June 2010.
From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. Brendan O’Connor, Ramnath Balasubramanyan, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2010), pages 122–129, Washington, DC, May 2010.
What’s Worthy of Comment? Content and Comment Volume in Political Blogs. Tae Yano and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2010), Washington, DC, May 2010.
Text-Driven Forecasting. Noah A. Smith. March 2010.

2009

Leveraging Structural Relations for Fluent Compressions at Multiple Compression Rates. Sourish Chaudhuri, Naman K. Gupta, Noah A. Smith, and Carolyn P. Rosé. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, companion volume (ACL 2009), pages 101–104, Singapore, August 2009.
Variational Inference for Grammar Induction with Prior Knowledge. Shay B. Cohen and Noah A. Smith. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, companion volume (ACL 2009), pages 1–4, Singapore, August 2009.
Paraphrase Identification as Probabilistic Quasi-Synchronous Recognition. Dipanjan Das and Noah A. Smith. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing (ACL 2009), pages 468–476, Singapore, August 2009.
Feature-Rich Translation by Quasi-Synchronous Lattice Parsing. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), pages 219–228, Singapore, August 2009.
Concise Integer Linear Programming Formulations for Dependency Parsing. André F. T. Martins, Noah A. Smith, and Eric P. Xing. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing (ACL 2009), pages 342–350, Singapore, August 2009.
Ranking Automatically Generated Questions as a Shared Task. Michael Heilman and Noah A. Smith. In Proceedings of the AIED Workshop on Question Generation, Brighton, UK, July 2009.
Question Generation via Overgenerating Transformations and Ranking. Michael Heilman and Noah A. Smith. Pittsburgh, PA, June 2009.
Polyhedral Outer Approximations with Application to Natural Language Parsing. André F. T. Martins, Noah A. Smith, and Eric P. Xing. In Proceedings of the International Conference on Machine Learning (ICML 2009), pages 713–720, Montréal, Québec, June 2009.
Summarization with a Joint Model for Sentence Extraction and Compression. André F. T. Martins and Noah A. Smith. In Proceedings of the NAACL-HLT Workshop on Integer Linear Programming for Natural Language Processing, Boulder, CO, June 2009.
Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction. Shay B. Cohen and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 74–82, Boulder, CO, May/June 2009.
Predicting Risk from Financial Reports with Regression. Shimon Kogan, Dimitry Levin, Bryan R. Routledge, Jacob S. Sagi, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 272–280, Boulder, CO, May/June 2009. Also available: talk slides.
Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation. Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 236–244, Boulder, CO, May/June 2009.
Predicting Response to Political Blog Posts with Topic Models. Tae Yano, William W. Cohen, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 477–485, Boulder, CO, May/June 2009.
From Episodes to Sagas: Understanding the News by Identifying Temporally Related Story Sequences. Ramnath Balasubramanyan, Frank Lin, William W. Cohen, Matthew Hurst, and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2009), San Jose, CA, May 2009.
Nonextensive Information Theoretic Kernels on Measures. André F. T. Martins, Noah A. Smith, Eric P. Xing, Mário A. T. Figueiredo, and Pedro M. Q. Aguiar. Journal of Machine Learning Research 10:935–975, April 2009.
Cube Summing, Approximate Inference with Non-Local Features, and Dynamic Programming without Semirings. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009), pages 157–166, Athens, Greece, March/April 2009.

2008

Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction. Shay B. Cohen, Kevin Gimpel, and Noah A. Smith. In Advances in Neural Information Processing Systems 21 (NIPS 2008), pages 321–328, Vancouver, BC, December 2008.
Dynamic Programming Algorithms as Products of Weighted Logic Programs. Shay B. Cohen, Robert J. Simmons, and Noah A. Smith. In Proceedings of the International Conference on Logic Programming (ICLP 2008), Udine, Italy, December 2008. Also available: extended technical report.
The Shared Logistic Normal Distribution for Grammar Induction. Shay B. Cohen and Noah A. Smith. In Proceedings of the NIPS Workshop on Speech and Language: Unsupervised Latent-Variable Models, Whistler, BC, December 2008.
Stacking Dependency Parsers. André F. T. Martins, Dipanjan Das, Noah A. Smith, and Eric P. Xing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), pages 157–166, Waikiki, HI, October 2008.
Wider Pipelines: N-Best Alignments and Parses in MT Training. Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA 2008), Waikiki, HI, October 2008.
Question Generation as a Competitive Undergraduate Course Project. Noah A. Smith, Michael Heilman, and Rebecca Hwa. In Proceedings of the NSF Workshop on the Question Generation Shared Task and Evaluation Challenge, Arlington, VA, September 2008.
Review of Computational Approaches to Morphology and Syntax by Brian Roark and Richard Sproat. Noah A. Smith. Computational Linguistics 34(3):453–457, September 2008.
Nonextensive Entropic Kernels. André F. T. Martins, Mário A. T. Figueiredo, Pedro M. Q. Aguiar, Noah A. Smith, and Eric P. Xing. In Proceedings of the International Conference on Machine Learning (ICML 2008), pages 640–647, Helsinki, Finland, July 2008.
Competitive Grammar Writing. Jason Eisner and Noah A. Smith. In Proceedings of the ACL Workshop on Issues in Teaching Computational Linguistics, pages 97–105, Columbus, OH, June 2008.
Rich Source-Side Context for Statistical Machine Translation. Kevin Gimpel and Noah A. Smith. In Proceedings of the ACL Workshop on Statistical Machine Translation (SMT 2008), pages 9–17, Columbus, OH, June 2008.
SOUR CREAM: Toward Semantic Processing of Recipes. Dan Tasse and Noah A. Smith. Pittsburgh, PA, May 2008.
Relative Keyboard Input System. Daniel R. Rashid and Noah A. Smith. In Proceedings of the International Conference on Intelligent User Interfaces (IUI 2008), pages 397–400, Canary Islands, Spain, January 2008.

2007

Weighted and Probabilistic Context-Free Grammars Are Equally Expressive. Noah A. Smith and Mark Johnson. Computational Linguistics 33(4):477–491, December 2007.
Joint Morphological and Syntactic Disambiguation. Shay B. Cohen and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), pages 208–217, Prague, Czech Republic, June 2007.
Probabilistic Models of Nonprojective Dependency Trees. David A. Smith and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), pages 132–140, Prague, Czech Republic, June 2007.
Computationally Efficient M-Estimation of Log-Linear Structure Models. Noah A. Smith, Douglas L. Vail, and John D. Lafferty. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 752–759, Prague, Czech Republic, June 2007. Also available: talk slides.
What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA. Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), pages 22–32, Prague, Czech Republic, June 2007.

2006

Novel Estimation Methods for Unsupervised Discovery of Latent Structure in Natural Language Text. Noah A. Smith. Ph.D. thesis, Department of Computer Science, Johns Hopkins University, Baltimore, MD, October 2006.
Annealing Structural Bias in Multilingual Weighted Grammar Induction. Noah A. Smith and Jason Eisner. In Proceedings of the International Conference on Computational Linguistics and Annual Meeting of the Association for Computational Linguistics (COLING-ACL 2006), pages 569–576, Sydney, Australia, July 2006.
Vine Parsing and Minimum Risk Reranking for Speed and Precision. Markus Dreyer, David A. Smith, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2006), pages 201–205, New York, NY, June 2006.

2005

Compiling Comp Ling: Practical Weighted Dynamic Programming and the Dyna Language. Jason Eisner, Eric Goldlust, and Noah A. Smith. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (EMNLP 2005), pages 281–290, Vancouver, BC, October 2005.
Context-Based Morphological Disambiguation with Random Fields. Noah A. Smith, David A. Smith, and Roy W. Tromble. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (EMNLP 2005), pages 475–482, Vancouver, BC, October 2005.
Parsing with Soft and Hard Constraints on Dependency Length. Jason Eisner and Noah A. Smith. In Proceedings of the International Workshop on Parsing Technologies (IWPT 2005), pages 30–41, Vancouver, BC, October 2005.
Guiding Unsupervised Grammar Induction Using Contrastive Estimation. Noah A. Smith and Jason Eisner. In Proceedings of the IJCAI Workshop on Grammatical Inference Applications, pages 73–82, Edinburgh, UK, July 2005.
Contrastive Estimation: Training Log-Linear Models on Unlabeled Data. Noah A. Smith and Jason Eisner. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2005), pages 354–362, Ann Arbor, MI, June 2005.

2004

Annealing Techniques for Unsupervised Statistical Language Learning. Noah A. Smith and Jason Eisner. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2004), pages 487–494, Barcelona, Spain, July 2004.
Dyna: A Declarative Language for Implementing Dynamic Programs. Jason Eisner, Eric Goldlust, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2004), pages 218–221, Barcelona, Spain, July 2004.
Bilingual Parsing with Factored Estimation: Using English to Parse Korean. David A. Smith and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), pages 49–56, Barcelona, Spain, July 2004.

2003

The Web as a Parallel Corpus. Philip Resnik and Noah A. Smith. Computational Linguistics 29(3):349–380, September 2003.

2002

From Words to Corpora: Recognizing Translation. Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), pages 95–102, Philadelphia, PA, July 2002.
An Adjective Analysis. Noah A. Smith. January 2002.

2001

Ellipsis Happens, and Deletion is How. Noah A. Smith. In Andrea Gualmini, Soo-Min Hong, and Mitsue Motomura, eds., University of Maryland Working Papers in Linguistics, pages 176–191, Department of Linguistics, University of Maryland, 2001.
Detection of Translational Equivalence. Noah A. Smith. Technical report 4253, Department of Computer Science, University of Maryland College Park, College Park, MD, May 2001.

20th Century

Statistical Machine Translation. Yaser Al-Onaizan, Jan Curin, Michael Jahr, Kevin Knight, John Lafferty, I. Dan Melamed, Noah A. Smith, Franz-Josef Och, David Purdy, and David Yarowsky. Technical report, CLSP Research Notes 42, Johns Hopkins University, Baltimore, MD, 1999.
Cairo: An Alignment Visualization Tool. Noah A. Smith and Michael E. Jahr. In Proceedings of the Language Resources and Evaluation Conference (LREC 2000), pages 549–552, Athens, Greece, May/June 2000.