Noah Smith is a computer scientist working at the junction of natural language processing (NLP), machine learning (ML), and computational social science. He recently wrote Language Models: A Guide for the Perplexed, a general-audience tutorial, and he co-directs the OLMo open language modeling effort with Hanna Hajishirzi.

His research spans core problems in NLP, general-purpose ML methods for NLP, NLP research methodology, and a wide range of applications. You can watch videos of some of his talks, read his papers, and learn about his research groups, Noah’s ARK and AllenNLP. Smith is most proud of his mentoring accomplishments: as of 2024, he has graduated 28 Ph.D. students and mentored 15 postdocs, with 25 alumni now in faculty positions around the world. Twenty of his undergraduate and master’s mentees have gone on to Ph.D. programs. His group’s alumni have started companies and are technological leaders both inside and outside the tech industry.

Appointments & Education

He is Amazon Professor of Machine Learning in the Paul G. Allen School of Computer Science & Engineering at the University of Washington (also Adjunct in Linguistics, Affiliate of the Center for Statistics and the Social Sciences, and Senior Data Science Fellow at the eScience Institute), as well as Senior Director of NLP Research at the Allen Institute for Artificial Intelligence. Previously, he was an Associate Professor of Language Technologies and Machine Learning in the School of Computer Science at Carnegie Mellon University. He earned his Ph.D. in Computer Science from Johns Hopkins University and his B.S. in Computer Science and B.A. in Linguistics from the University of Maryland.

Service

Smith was the general chair of EMNLP 2022. He has served on the editorial boards of the journals Computational Linguistics (2009–2011), Journal of Artificial Intelligence Research (2011–2019), and Transactions of the Association for Computational Linguistics (2012–present), as the secretary-treasurer of SIGDAT (2012–2015 and 2018–2019), and as program co-chair of ACL 2016. He co-organized the Ninth Annual Conference on New Directions in Analyzing Text as Data (TADA 2018), Language Technologies and Computational Social Science (a workshop at ACL 2014), and Twenty Years of Bitext (a workshop at EMNLP 2013).

Recognition

Smith was elected a Fellow of the Association for Computational Linguistics “for significant contributions to linguistic structure prediction, computational social sciences, and improving NLP research methodology” (2020). UW’s Sounding Board team, led by Profs. Mari Ostendorf, Yejin Choi, and Noah Smith, won the inaugural Amazon Alexa Prize in 2017. Smith’s research was recognized with a UW Innovation award “to stimulate innovation among faculty from a range of disciplines and to reward some of their most terrific ideas” (2016–2018), the Finmeccanica career development chair at CMU “to acknowledge promising teaching and research potential in junior faculty members” (2011–2014), an NSF CAREER award (2011–2016), and a Hertz Foundation graduate fellowship (2001–2006). He has coauthored conference papers recognized as outstanding, finalist, honorable mention, and in some cases “best paper” or “best student paper” at ICLP 2008, ACL 2009, COLING 2010, NAACL 2013, ACL 2014, NAACL 2015, WWW 2016, EACL 2017, NAACL 2018, ACL 2018, ACL 2019 (twice), ACL 2020, ACL 2021, and NAACL 2022.

Teaching

Winter 2023: Ph.D.-level course on natural language processing

Smith’s recent teaching includes courses on NLP for undergraduates, professional master’s students, and Ph.D. students, as well as a “capstone” projects course on NLP. Lecture videos from 2021 are available.

Personal

Smith lives in Seattle with his spouse, where they serve two felines. When he is not working, he is often playing clarinet or saxophone, swimming, running, dancing tango, or mixing cocktails. He was interviewed by Devi Parikh for Humans of AI: Stories, Not Stats on September 23, 2020.

Academic Genealogy

Smith’s main intellectual influences are his undergraduate mentors Philip Resnik (still a frequent collaborator) and Norbert Hornstein, his Ph.D. advisor Jason Eisner, Frederick Jelinek, and his many mentees.

Smith’s academic ancestors had varied interests and careers. A few examples (links discovered thanks to the Mathematics Genealogy Project):

  • In 1232, Nasir al-Din al-Tusi (34th degree ancestor) wrote Akhlāq-i Nāsirī, a treatise on philosophical ethics. He was the director of the Maragheh Observatory (in present-day Iran), said to be “the most advanced scientific institution in the Eurasian world” at the time.
  • In 1514, Nicolaus Copernicus (24th degree ancestor) wrote his Commentariolus, explicitly proposing a return to the heliocentric model. In 1543, he published De Revolutionibus Orbium Coelestium, and within about 50 years the Copernican Revolution had begun. Whether Copernicus was directly influenced by Islamic scholars like al-Tusi is a matter of debate. Copernicus completed his doctorate at Bologna University and worked mainly in Warmia (present-day Poland).
  • In 1542, Elia Levita (24th degree ancestor) published Shemot Devarim, a translation dictionary for German, Latin, Hebrew, and Yiddish. Levita also authored the Bovo-Bukh, the most popular chivalric romance written in Yiddish. Levita worked mainly in Italy (Padua, Rome, Venice) but spent some time in Bavaria (Isny), where he mentored Paul Fagius (23rd degree ancestor), who was the publisher of the Bovo-Bukh. Fagius later taught at the University of Strasbourg. When the counter-Reformation was heating up, he moved to the University of Cambridge at the invitation of Thomas Cranmer (24th degree ancestor, different line).
  • In 1636, Marin Mersenne (21st degree ancestor) published Harmonie Universelle, the most complete compendium on (western) music theory to that time. Composer Ottorino Respighi attributed an aria in the second suite of his Ancient Airs and Dances (1923) to Mersenne. You can listen to it here.
  • In 1684, Gottfried Wilhelm Leibniz (18th degree ancestor) published Nova Methodus pro Maximis et Minimis, the first published work on differential calculus.
  • In 1713, Ars Conjectandi by Jacob Bernoulli (16th degree ancestor) was posthumously published; it is a work on combinatorics and probability that includes the first statement of the law of large numbers. It was Siméon Denis Poisson (12th degree ancestor) who named it “the law of large numbers,” in 1837 (Probabilité des jugements en matière criminelle et en matière civile, précédées des règles générales du calcul des probabilités).
  • In 1832, according to Wikipedia, Carl Friedrich Gauss (12th degree ancestor) didn’t want his (biological) children to study math, but he didn’t want them to study languages, either. His son Eugene quarreled with him about this and left for America, where (among other accomplishments) he learned the Sioux language. Wikipedia includes a fascinating letter from Gauss’s grandson Robert to Felix Klein (9th degree ancestor).
  • In 1967, Francis Fan Lee (4th degree ancestor) published work on a reading machine for the blind (“Automatic grapheme-to-phoneme translation of English”). Lee also founded a company called Lexicon, later sold to Harman International (now part of Samsung); Lexicon received an Emmy award for its Model 1200 Audio Time Compressor and Expander in 1984. Lee was also the recipient of a Hertz Foundation graduate fellowship.

Other academic ancestors include linguist Roman Jakobson (6th degree ancestor, founder of modern phonology), polymath Pierre-Simon Laplace (12th degree ancestor, developer of the so-called Bayesian interpretation of probability), Enlightenment figure Jean le Rond d’Alembert (14th degree ancestor), and theologian Desiderius Erasmus Roterodamus (29th degree ancestor).

There is, of course, much debate to be had about which of the scholars in the MGP really completed anything we would consider comparable to the modern Ph.D., and how similar the attested relationships are to modern Ph.D. advising.