An archive of all news posts:    2021   2020   2019


Oct 22, 2021: Paper to HTML Converter won the Best Artifact Award at this year's ASSETS conference!! Our demo paper can be found here.

Sep 14, 2021: We launched a public beta of Paper to HTML, which rerenders scientific papers as HTML on-demand.

Sep 7, 2021: I gave a talk at the Conference on AI and Theorem Proving (AITP) on "Mathematics in the Scholarly Literature," discussing some of our work on the S2ORC dataset.

Aug 26, 2021: Our paper on medical multi-document summarization and literature review automation has been accepted to EMNLP 2021! Led by intern Jay DeYoung.

Aug 13, 2021: My team won the AI2 Hackathon Common Good award with our project "A11y2: making research accessible, AI2 and beyond"! We developed a way to create accessible HTML renders of paper PDFs on request, which we're hoping to launch publicly soon. Awesome collab with teammates Alex Buraczynski, Daniel King, Matt Latzke, and Sam Skjonsberg.

Jul 29, 2021: I gave a talk at the Science of Science Summer School (S4) on NLP and scientific text mining.

Jul 16, 2021: Our demo paper "SciA11y: Converting Scientific Papers to Accessible HTML" has been accepted to ASSETS 2021!

Jun 22, 2021: I participated in a panel on "Biomedical Informatics Career Development" at the annual NLM Informatics Training Conference, which I last attended as an NLM informatics trainee in 2017!

Jun 22, 2021: New preprint out on arXiv: "Incorporating Visual Layout Structures for Scientific Text Classification," which investigates ways of injecting visual layout structure into language models to improve document understanding! Led by Zejiang (Shannon) Shen.

Jun 10, 2021: I wrote a two-part blog post discussing SciA11y. The first part, published in late May, discusses the current state of scientific PDF accessibility. The second part introduces a potential system solution for the problem as well as some highlights from our user study.

Jun 10, 2021: I helped organize a successful 2nd Scholarly Document Processing (SDP) Workshop, held this year at NAACL.

May 13, 2021: I summited Mount Rainier!

May 7, 2021: I gave a talk at the ICLR Machine Learning for Preventing and Combating Pandemics workshop.

May 4, 2021: We released a new preprint on creating accessible HTML renders of scientific papers from PDF! An accessible PDF of the preprint is available here. We describe and quantify the current state of scientific PDF accessibility, which is quite poor, and introduce the SciA11y system.

Apr 22, 2021: I participated in a panel discussing AI and COVID-19 with Relativity Media. Video here.

Apr 13, 2021: New preprint on medical multi-document summarization and literature review automation, "MS^2: multi-document summarization of medical studies," is out! Work led by intern Jay DeYoung.

Apr 1, 2021: I gave a keynote at the BIR Workshop at ECIR on the topic of "Text mining insights from the COVID-19 pandemic."

Mar 11, 2021: Our paper "Gender Trends in Computer Science Authorship" is out in the March issue of the Communications of the ACM. Our late-breaking work submission to CHI on the citation diversity of accessibility publications has also been accepted! You can find that here.

Mar 8, 2021: I gave a talk on "Practical NLP for biomedicine" at the Northwestern University CS Colloquium.

Feb 18, 2021: I was a presenter and panelist at the 1st GTM2021 Virtual Forum on "Fast-Track learning: Growing insights from text-mining COVID-19 data."

Jan 14, 2021: Our mixed methods survey paper on accessibility research has been accepted to CHI 2021. A preprint is available on arXiv.


Dec 7, 2020: Our survey paper of COVID-19 text mining resources is now out in Briefings in Bioinformatics (doi: 10.1093/bib/bbaa296)! Another paper on CORD-19 dataset biases (collaborative work with Microsoft Academic) is also now available online (doi: 10.3389/frma.2020.596624).

Nov 16, 2020: I'm attending EMNLP virtually, where my co-authors are presenting "Fact or Fiction: Verifying Scientific Claims" at the main conference and "MedICaT: A Dataset of Medical Images, Captions, and Textual References" at the Scholarly Document Processing workshop.

Nov 13, 2020: I presented our work on CORD-19 and COVID-19 text mining at the Global Tech Mining Conference.

Nov 11, 2020: Really enjoyed being on a panel about open publishing and open data with bioRxiv/medRxiv's Richard Sever and eLife's Anna Akhmanova. Thanks to The Neuro and Gairdner Foundation for hosting me at the Open Science in Action Symposium!

Nov 3, 2020: The main round of the EPIC-QA shared task is open and accepting submissions. We are looking for submissions of systems that answer questions about COVID-19 for expert and non-expert lay audiences.

Oct 25, 2020: We are looking for research interns for the Semantic Scholar team in 2021. Apply here! Deadline for this cycle: Nov 15, 2020.

Oct 19, 2020: Thanks to CMU Libraries for hosting me at the Artificial Intelligence for Data Discovery and Reuse Symposium. I presented work on reusable data resources for COVID-19 and participated in a panel on community data ecosystems, with Ross Epstein (SafeGraph), Alison Specht (U. Queensland), and Keith Webster (CMU).

Oct 7, 2020: Our review paper "Text mining approaches for dealing with the rapidly expanding literature on COVID-19" has been accepted to Briefings in Bioinformatics (doi: 10.1093/bib/bbaa296)!

Sep 16, 2020: Our ontology review paper for the KPMP, "Modelling kidney disease using ontology: insights from the Kidney Precision Medicine Project," is now published in Nature Reviews Nephrology (doi: 10.1038/s41581-020-00335-w)!

Aug 3, 2020: The final round of TREC-COVID (Round 5) is complete!

Jul 29, 2020: Thanks to the AWS Education Research Seminar Series for hosting me as part of a session on CORD-19 Search!

Jul 5, 2020: I am attending ACL virtually, where my co-authors and I will be presenting "S2ORC: The Semantic Scholar Open Research Corpus" and "SUPP.AI: finding evidence for supplement-drug interactions" at the main conference and "CORD-19: The COVID-19 Open Research Dataset" at the NLP-COVID workshop.

Jun 25, 2020: We had a very successful first SciNLP workshop at AKBC (which I helped organize), with nine excellent invited speakers and an engaging panel discussion on "The role of scientific NLP during an epidemic."

Jun 18, 2020: Thanks to Microsoft Research Project Hanover for inviting me to their research group meeting! I gave a talk on "Improving access to scientific literature for NLP."

Jun 12, 2020: Spoke about CORD-19 at "Connected Health and COVID-19: Now and Beyond the Great Lockdown."

Jun 1, 2020: First round results for TREC-COVID are out in this SIGIR paper.

May 27, 2020: Gave a talk about CORD-19 at the Center for Science and Technology Studies at Leiden University.

May 4, 2020: We've released a paper explaining the rationale and structure of the TREC-COVID shared task in JAMIA.

Apr 27, 2020: Thanks to Seth Grimes for inviting Kyle and me to give a talk on CORD-19 at a meetup of the NY-NLP, A2D-NLP, DC-NLP, Hungarian NLP, and London Text Analytics groups!

Apr 22, 2020: Our preprint on CORD-19 is now available on arXiv! "CORD-19: The COVID-19 Open Research Dataset"

Apr 20, 2020: The podcast I recorded on CORD-19 with Practical AI is now out!

Apr 15, 2020: The TREC-COVID shared task is open and accepting submissions! The shared task aims to understand how ad hoc retrieval methods work best in a landscape of changing queries and a changing corpus.

Apr 14, 2020: I gave a talk about CORD-19 at the Semantic Indexing and Information Retrieval for Health Workshop at ECIR.

Apr 3, 2020: Our paper on the S2ORC corpus was accepted to ACL. The SUPP.AI paper was accepted to ACL Demo.

Apr 2, 2020: Our paper on "Gender trends in computer science authorship" has been accepted to the CACM!

Mar 16, 2020: Washington state has officially begun its COVID-19 closures. AI2 offices are closed and everyone at the house is now working remotely.

Mar 13, 2020: We've launched the first version of the CORD-19 dataset!

Jan 29, 2020: I gave a guest lecture on SPARQL for BIME 550: Knowledge Representation.


Dec 12, 2019: I am attending the ML4H workshop at NeurIPS in Vancouver, BC, where I'll be presenting our paper on SUPP.AI.

Oct 28, 2019: The Semantic Scholar team has moved into our new offices at Pacific Pointe, one block away from the old offices.

Sep 20, 2019: I was interviewed on King5 about SUPP.AI! My first time on live TV.

Sep 17, 2019: The companion paper for SUPP.AI is now out on arXiv: "Extracting evidence of supplement-drug interactions from literature"

Sep 15, 2019: We launched SUPP.AI! What was just a prototype for the hackathon has been built out and released as a public tool for discovering supplement and drug interactions. Check it out!

Jul 26, 2019: My team won the AI2 Hackathon with our project "WhatSUPP: identifying supplement-drug interactions from literature." We won first place and audience choice!

Jun 19, 2019: Our preprint "Gender trends in computer science authorship" is now on arXiv.

Jun 17, 2019: I'm attending the Allen Institute's Workshop on Ontology Frameworks.

Jun 16, 2019: I moved into ELS2, our new vegetarian biking stronghold, now at the top of Capitol Hill.

Jun 14, 2019: I graduated! Officially received my PhD in Biomedical and Health Informatics from the UW.

May 20, 2019: I am attending AKBC in Amherst, MA. I will be giving a talk at the Scientific Literature KB Workshop on "Ontology-based integration of biological pathway data."

May 20, 2019: Our paper on extending the Pathway Ontology, "Predicting instances of pathway ontology classes for pathway integration," has been accepted to the Journal of Biomedical Semantics.

May 16, 2019: We presented our poster on Spinal MRI denoising at ISMRM in Montreal.

May 2, 2019: I attended the KPMP meeting at the NIH in DC. Lots of interesting discussions on the role of ontology in data management and how to expand current biomedical ontologies to represent concepts in kidney disease.

Apr 8, 2019: I started as a Young Investigator in the Semantic Scholar Research Team at AI2!

Apr 6, 2019: I moved back to Seattle! Took a short break between finishing my PhD and my new postdoc position at AI2 to visit family and friends on the East Coast and bike from SF to LA with B!

Feb 26, 2019: I successfully defended my PhD dissertation: "Ontology-based pathway data integration"!!