John Thickstun

Assistant Professor - Cornell University - Computer Science.

I work on machine learning and generative models. I’m interested in methods that control the behavior of models, both from the perspective of a user who hopes to use a model to accomplish concrete tasks, and from the perspective of a model provider or policymaker who hopes to broadly regulate the outputs of a model. I am also interested in applications of generative models that push beyond the standard text and image modalities, including music technologies.

Previously I was a Postdoctoral Scholar at Stanford University, advised by Percy Liang. I completed my PhD in the Allen School of Computer Science & Engineering at the University of Washington, where I was co-advised by Sham Kakade and Zaid Harchaoui. I studied Applied Mathematics as an undergraduate at Brown University, advised by Eugene Charniak and Björn Sandstede.

The MusicNet dataset has moved to permanent hosting at Zenodo.

news

Apr 3, 2025 A retrospective conversation with Sophie Barthes on Stories for the Future: a workshop convened between filmmakers and AI researchers at Stanford University.
Mar 14, 2025 A response to the NSF’s Request for Information on the White House’s Development of an AI Action Plan.
Aug 6, 2024 I wrote a blog post on co-composing music using the Anticipatory Music Transformer.
Jul 1, 2024 I am joining Cornell University as an Assistant Professor of Computer Science, starting Fall 2024!
Jun 17, 2024 Hooktheory released Aria: an AI co-creator for chords and melodies powered by the Anticipatory Music Transformer. Read about it on the AudioCipher Blog.
Dec 7, 2023 Stanford HAI featured my recent work on the Anticipatory Music Transformer!

Oct 11, 2023 Megha and I released human-LM interaction data that we collected last year for HALIE. We wrote a blog post that documents the data release and highlights some qualitative trends in the data that we found interesting.
Jul 30, 2023 Rohith and I wrote a blog post about our recent work on Watermarking LLMs. I wrote a JavaScript implementation of the watermark detector, which is included in the post: try it out!
Jun 16, 2023 I wrote an introduction to the Anticipatory Music Transformer. This includes a summary of my generative music research program, samples of music generated by these models, and resources for using these models yourself.

selected publications

  1. TMLR
    Robust distortion-free watermarks for language models
    Kuditipudi, Rohith, Thickstun, John, Hashimoto, Tatsunori, and Liang, Percy
    Transactions on Machine Learning Research 2024
  2. TMLR
    Anticipatory music transformer
    Thickstun, John, Hall, David, Donahue, Chris, and Liang, Percy
    Transactions on Machine Learning Research 2024
  3. JMLR
    MAUVE Scores for Generative Models: Theory and Practice
    Pillutla, Krishna, Liu, Lang, Thickstun, John, Welleck, Sean, Swayamdipta, Swabha, Zellers, Rowan, Oh, Sewoong, Choi, Yejin, and Harchaoui, Zaid
    Journal of Machine Learning Research 2023
  4. ACL Outstanding Paper
    Backpack language models
    Hewitt, John, Thickstun, John, Manning, Christopher D., and Liang, Percy
    In Proceedings of the Association for Computational Linguistics 2023
  5. NeurIPS Oral Presentation
    Diffusion-LM improves controllable text generation
    Li, Xiang Lisa, Thickstun, John, Gulrajani, Ishaan, Liang, Percy, and Hashimoto, Tatsunori B.
    In Advances in Neural Information Processing Systems 2022
  6. NeurIPS Outstanding Paper
    MAUVE: measuring the gap between neural text and human text using divergence frontiers
    Pillutla, Krishna, Swayamdipta, Swabha, Zellers, Rowan, Thickstun, John, Welleck, Sean, Choi, Yejin, and Harchaoui, Zaid
    In Advances in Neural Information Processing Systems 2021
  7. ICML
    Parallel and flexible sampling from autoregressive models via Langevin dynamics
    Jayaram, Vivek, and Thickstun, John
    In International Conference on Machine Learning 2021
  8. ICML
    Source separation with deep generative priors
    Jayaram, Vivek, and Thickstun, John
    In International Conference on Machine Learning 2020
  9. ICASSP Oral Presentation
    Invariances and data augmentation for supervised music transcription
    Thickstun, John, Harchaoui, Zaid, Foster, Dean P, and Kakade, Sham M
    In International Conference on Acoustics, Speech and Signal Processing 2018
  10. ICLR
    Learning features of music from scratch
    Thickstun, John, Harchaoui, Zaid, and Kakade, Sham M
    In International Conference on Learning Representations 2017