Publications
This page contains a list of manuscripts that I have authored. * indicates equal contribution.
2023
- Kyle Lo, Joseph Chee Chang, Andrew Head, Jonathan Bragg, Amy X. Zhang, Cassidy Trier, Chloe Anastasiades, Tal August, Russell Authur, Danielle Bragg, Erin Bransom, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Yen-Sung Chen, Evie Yu-Yen Cheng, Yvonne Chou, Doug Downey, Rob Evans, Raymond Fok, Fangzhou Hu, Regan Huff, Dongyeop Kang, Tae Soo Kim, Rodney Kinney, Aniket Kittur, Hyeonsu Kang, Egor Klevak, Bailey Kuehl, Michael Langan, Matt Latzke, Jaron Lochner, Kelsey MacMillan, Eric Marsh, Tyler Murray, Aakanksha Naik, Ngoc-Uyen Nguyen, Srishti Palani, Soya Park, Caroline Paulic, Napol Rachatasumrit, Smita Rao, Paul Sayre, Zejiang Shen, Pao Siangliulue, Luca Soldaini, Huy Tran, Madeleine van Zuylen, Lucy Lu Wang, Christopher Wilhelm, Caroline Wu, Jiangjiang Yang, Angele Zamarron, Marti A. Hearst, and Daniel S. Weld. “The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces.” ArXiv 2303.14334. PREPRINT.
- Sean MacAvaney* and Luca Soldaini*. “One-Shot Labeling for Automatic Relevance Estimation.” ArXiv 2302.11266. PREPRINT.
- Rodney Michael Kinney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan Bragg, Alexandra Buraczynski, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Arman Cohan, Miles Crawford, Doug Downey, Jason Dunkelberger, Oren Etzioni, R. Evans, Sergey Feldman, Joseph Gorney, D. Graham, F.Q. Hu, Regan Huff, Daniel King, Sebastian Kohlmeier, Bailey Kuehl, Michael Langan, Daniel Lin, Haokun Liu, Kyle Lo, Jaron Lochner, Kelsey MacMillan, Tyler Murray, Christopher Newell, Smita Rao, Shaurya Rohatgi, P. Sayre, Zejiang Shen, Amanpreet Singh, Luca Soldaini, Shivashankar Subramanian, A. Tanaka, Alex D Wade, Linda M. Wagner, Lucy Lu Wang, Christopher Wilhelm, Caroline Wu, Jiangjiang Yang, A. Zamarron, Madeleine van Zuylen, and Daniel S. Weld. “The Semantic Scholar Open Data Platform.” ArXiv 2301.10140. PREPRINT.
- Raymond Fok, Hita Kambhamettu, Luca Soldaini, Jonathan Bragg, Kyle Lo, Andrew Head, Marti A. Hearst, and Daniel S. Weld. “SCIM: Intelligent Skimming Support for Scientific Papers.” IUI 2023.
- Jon Saad-Falcon, Amanpreet Singh, Luca Soldaini, Mike D’Arcy, Arman Cohan, and Doug Downey. “Embedding Recycling for Language Models.” Findings of EACL 2023.
2022
- John Giorgi, Luca Soldaini, Bo Wang, Gary Bader, Kyle Lo, Lucy Lu Wang, and Arman Cohan. “Exploring the Challenges of Open Domain Multi-Document Summarization.” ArXiv 2212.10526. PREPRINT.
- Matteo Gabburo, Rik Koncel-Kedziorski, Siddhant Garg, Luca Soldaini, and Alessandro Moschitti. “Knowledge Transfer from Answer Ranking to Answer Generation.” EMNLP 2022.
- Luca Di Liello, Siddhant Garg, Luca Soldaini, and Alessandro Moschitti. “Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection.” EMNLP 2022.
- Yoshitomo Matsubara, Luca Soldaini, Eric Lind, and Alessandro Moschitti. “Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems.” Findings of EMNLP 2022.
- Benjamin Muller, Luca Soldaini, Rik Koncel-Kedziorski, Eric Lind, and Alessandro Moschitti. “Cross-Lingual GenQA: A Language-Agnostic Generative Question Answering Approach for Open-Domain Question Answering.” AACL 2022.
- Luca Di Liello, Siddhant Garg, Luca Soldaini, and Alessandro Moschitti. “Paragraph-based Transformer Pre-training for Multi-Sentence Inference.” NAACL-HLT 2022.
2021
- Chao-Chun Hsu, Eric Lind, Luca Soldaini, and Alessandro Moschitti. “Answer Generation for Retrieval-based Question Answering Systems.” Findings of ACL 2021.
- Rujun Han, Luca Soldaini, and Alessandro Moschitti. “Modeling Context in Answer Sentence Selection Systems on a Latency Budget.” EACL 2021.
2020
- Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, and Chengwei Su. “Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention”. COLING 2020.
- Luca Soldaini and Alessandro Moschitti. “The Cascade Transformer: Efficient Answer Sentence Selection”. ACL 2020.
- Subendhu Rongali, Luca Soldaini, Emilio Monti, and Wael Hamza. “Don’t Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing”. WWW 2020.
- Sean MacAvaney, Luca Soldaini, and Nazli Goharian. “Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning”. ECIR 2020.
2018
- Sean MacAvaney, Andrew Yates, Arman Cohan, Luca Soldaini, Kai Hui, Nazli Goharian, and Ophir Frieder. “Overcoming Low-Utility Facets for Complex Answer Retrieval”. Information Retrieval Journal, 2018.
- Ziling Fan, Luca Soldaini, Arman Cohan, and Nazli Goharian. “Relation Extraction for Protein-Protein Interactions Affected by Mutation”. ACM-BCB 2018.
- Arman Cohan*, Bart Desmet*, Andrew Yates*, Luca Soldaini, Sean MacAvaney, and Nazli Goharian. “SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions”. COLING 2018.
- Luca Soldaini, Timothy Walsh, Arman Cohan, Julien Han, and Nazli Goharian. “Helping or Hurting? Predicting Changes in Users’ Risk of Self-Harm Through Online Community Interactions”. CLPsych Workshop, NAACL-HLT 2018.
- Luca Soldaini. “The Knowledge and Language Gap in Medical Information Seeking”. PhD Thesis, Georgetown University. 2018.
- Sean MacAvaney, Bart Desmet, Arman Cohan, Luca Soldaini, Andrew Yates, Ayah Zirikly, and Nazli Goharian. “RSDD-Time: Temporal Annotation of Self-Reported Mental Health Diagnoses”. CLPsych Workshop, NAACL-HLT 2018.
- Sean MacAvaney, Andrew Yates, Arman Cohan, Luca Soldaini, Kai Hui, Nazli Goharian, and Ophir Frieder. “Characterizing Question Facets for Complex Answer Retrieval”. SIGIR 2018.
- Sean MacAvaney, Luca Soldaini, Arman Cohan, and Nazli Goharian. “Tree-LSTMs for Scientific Relation Classification”. SemEval Workshop, NAACL-HLT 2018.
2017
- Luca Soldaini, Andrew Yates, and Nazli Goharian. “Denoising Clinical Notes for Medical Literature Retrieval with Convolutional Neural Model”. CIKM 2017.
- Luca Soldaini, Andrew Yates, and Nazli Goharian. “Learning to Reformulate Long Queries for Clinical Decision Support”. JASIST 2017.
- Luca Soldaini and Elad Yom-Tov. “Inferring Individual Attributes from Search Engine Queries and Auxiliary Information”. WWW 2017.
- Luca Soldaini and Nazli Goharian. “Learning to Rank for Consumer Health Search: a Semantic Approach”. ECIR 2017.
2016
- Luca Soldaini and Nazli Goharian. “QuickUMLS: a Fast, Unsupervised Approach for Medical Concept Extraction”. MedIR workshop, SIGIR 2016.
- Arman Cohan, Luca Soldaini, and Nazli Goharian. “Identifying Significance of Discrepancies in Radiology Reports”. DMMH Workshop, SDM 2016.
- Luca Soldaini, Andrew Yates, Elad Yom-Tov, Ophir Frieder, and Nazli Goharian. “Enhancing Web Search in the Medical Domain via Query Clarification”. Information Retrieval Journal, 2016.
2015
- Arman Cohan, Luca Soldaini, and Nazli Goharian. “Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach”. NAACL-HLT 2015.
- Luca Soldaini, Arman Cohan, Andrew Yates, Nazli Goharian, and Ophir Frieder. “Retrieving Medical Literature for Clinical Decision Support”. ECIR 2015.
2014
- Arman Cohan, Luca Soldaini, Andrew Yates, Nazli Goharian, and Ophir Frieder. “On Clinical Decision Support”. ACM-BCB 2014.