Mitchell Wortsman

I am a member of the technical staff on the pretraining team at Anthropic. Previously, I was a PhD student at the University of Washington, advised by Ali Farhadi and Ludwig Schmidt.

Email  /  Google Scholar  /  Twitter

Select publications, preprints & projects
(* indicates equal contribution)

Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith
ICLR, 2024 (oral)

Replacing softmax with ReLU in Vision Transformers
Mitchell Wortsman, Jaehoon Lee, Justin Gilmer, Simon Kornblith
arXiv, 2023

DataComp: In search of the next generation of multimodal datasets
Samir Yitzhak Gadre*, Gabriel Ilharco*, Alex Fang*, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, et al., Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
NeurIPS, 2023

Stable and low-precision training for large-scale vision-language models
Mitchell Wortsman*, Tim Dettmers*, Luke Zettlemoyer, Ari S. Morcos, Ali Farhadi, Ludwig Schmidt
NeurIPS, 2023

OpenFlamingo: an open-source framework for training large multimodal models
Anas Awadalla, Irena Gao, Joshua Gardner, Jack Hessel, et al., Mitchell Wortsman, Ludwig Schmidt
GitHub, 2023

lo-fi: distributed fine-tuning without communication
Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt, Michael Rabbat, Ari S. Morcos
TMLR, 2022

Patching open-vocabulary models by interpolating weights
Gabriel Ilharco*, Mitchell Wortsman*, Samir Yitzhak Gadre*, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt
NeurIPS, 2022

CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration
Samir Yitzhak Gadre, Mitchell Wortsman, Gabriel Ilharco, Ludwig Schmidt, Shuran Song
arXiv, 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon**, Simon Kornblith**, Ludwig Schmidt**
ICML, 2022

Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, Ludwig Schmidt
ICML, 2022

Robust fine-tuning of zero-shot models
Mitchell Wortsman*, Gabriel Ilharco*, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo-Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt
CVPR, 2022 (oral, best paper finalist)
arxiv / code

OpenCLIP: An open source implementation of CLIP
Gabriel Ilharco*, Mitchell Wortsman*, Ross Wightman*, Cade Gordon*, Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar, Hongseok Namkoong, John Miller, Hannaneh Hajishirzi, Ali Farhadi, Ludwig Schmidt
GitHub, 2021

What's Hidden in a Randomly Weighted Neural Network?
Vivek Ramanujan*, Mitchell Wortsman*, Aniruddha Kembhavi, Ali Farhadi, Mohammad Rastegari
CVPR, 2020
arxiv / code
