Here is a list of papers that I have authored; they are also available on my Semantic Scholar and Google Scholar profiles. α indicates equal contribution; ω indicates core contributors.

2026

2025

  • Olmo 3

  • FlexOlmo: Open Language Models for Flexible Data Use

  • The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

  • 2 OLMo 2 Furious

  • Tülu 3: Pushing Frontiers in Open Language Model Post-Training

  • OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

  • OLMoE: Open Mixture-of-Experts Language Models

  • Language models scale reliably with over-training and on downstream tasks

  • Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

2024

2023

  • Queer In AI: A Case Study in Community-Led Participatory AI

  • The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces

  • The Semantic Scholar Open Data Platform

2022

2021

2020

2018

2017

2016

2015

2014

The content of this website is licensed under CC BY 4.0