John Thickstun

Assistant Professor, Cornell University, Computer Science.


I work on machine learning and generative models. I’m interested in methods that control the behavior of models, both from the perspective of a user who hopes to use a model to accomplish concrete tasks, and from the perspective of a model provider or policymaker who hopes to broadly regulate the outputs of a model. I am also interested in applications of generative models that push beyond the standard text and image modalities, including music technologies.

Previously I was a Postdoctoral Scholar at Stanford University, advised by Percy Liang. I completed my PhD in the Allen School of Computer Science & Engineering at the University of Washington, where I was co-advised by Sham Kakade and Zaid Harchaoui. I studied Applied Mathematics as an undergraduate at Brown University, advised by Eugene Charniak and Björn Sandstede.

Prospective students: I am recruiting students to join my group starting Fall 2025! Please apply to the Cornell CS PhD program. I plan to admit multiple PhD students this year who are interested in generative models, music technologies, or both.

The MusicNet dataset has moved to permanent hosting at Zenodo.

news

Jul 1, 2024 I am joining Cornell University as an Assistant Professor of Computer Science, starting Fall 2024!
Jun 17, 2024 Hooktheory released Aria: an AI co-creator for chords and melodies powered by the Anticipatory Music Transformer. Read about it on the AudioCipher Blog.
Mar 18, 2024 We released a new 780M-parameter Anticipatory Music Transformer. Additional discussion here.
Dec 7, 2023 Stanford HAI featured my recent work on the Anticipatory Music Transformer!
Oct 11, 2023 Megha and I released human-LM interaction data that we collected last year for HALIE. We wrote a blog post that documents the data release and highlights some qualitative trends in the data that we found interesting.

selected publications

  1. TMLR
    Robust distortion-free watermarks for language models
    Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, and Percy Liang
    Transactions on Machine Learning Research 2024
  2. TMLR
    Anticipatory music transformer
    John Thickstun, David Hall, Chris Donahue, and Percy Liang
    Transactions on Machine Learning Research 2024
  3. JMLR
    MAUVE scores for generative models: theory and practice
    Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, and Zaid Harchaoui
    Journal of Machine Learning Research 2023
  4. ACL Outstanding Paper
    Backpack language models
    John Hewitt, John Thickstun, Christopher D. Manning, and Percy Liang
    In Proceedings of the Association for Computational Linguistics 2023
  5. NeurIPS Oral Presentation
    Diffusion-LM improves controllable text generation
    Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, and Tatsunori B. Hashimoto
    In Advances in Neural Information Processing Systems 2022
  6. NeurIPS Outstanding Paper
    MAUVE: measuring the gap between neural text and human text using divergence frontiers
    Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, Sean Welleck, Yejin Choi, and Zaid Harchaoui
    In Advances in Neural Information Processing Systems 2021
  7. ICML
    Parallel and flexible sampling from autoregressive models via Langevin dynamics
    Vivek Jayaram and John Thickstun
    In International Conference on Machine Learning 2021
  8. ICML
    Source separation with deep generative priors
    Vivek Jayaram and John Thickstun
    In International Conference on Machine Learning 2020
  9. ICASSP Oral Presentation
    Invariances and data augmentation for supervised music transcription
    John Thickstun, Zaid Harchaoui, Dean P. Foster, and Sham M. Kakade
    In International Conference on Acoustics, Speech and Signal Processing 2018
  10. ICLR
    Learning features of music from scratch
    John Thickstun, Zaid Harchaoui, and Sham M. Kakade
    In International Conference on Learning Representations 2017