Research Projects

My primary research interests lie in the areas of shared resources for scientific NLP, NLP applications in the biomedical domain, biomedical ontologies and KR, and meta-science. Links to my work are listed below.


Corpora for Scientific NLP

2020. Lo K*, Wang LL*, Neumann M, Kinney R, Weld DS. S2ORC: The Semantic Scholar Open Research Corpus. ACL. ACL: 2020.acl-main.447. ArXiv: 1911.02782. GitHub.

2020. Wang LL*, Lo K*, Chandrasekhar Y, Reas R, Yang J, Burdick D, Eide D, Funk K, Katsis Y, Kinney R, Li Y, Liu Z, Merrill W, Mooney P, Murdick D, Rishi D, Sheehan J, Shen Z, Stilson B, Wade AD, Wang K, Wang NXR, Wilhelm C, Xie B, Raymond D, Weld DS, Etzioni O, Kohlmeier S. CORD-19: The COVID-19 Open Research Dataset. Preprint. PMC ID: PMC7251955. ArXiv: 2004.10706. Dataset.

2018. Ammar W, Groeneveld D, Bhagavatula C, Beltagy I, Crawford M, Downey D, Dunkelberger J, Elgohary A, Feldman S, Ha V, Kinney R, Kohlmeier S, Lo K, Murray T, Ooi H, Peters M, Power J, Skjonsberg S, Wang LL, Wilhelm C, Yuan Z, van Zuylen M, Etzioni O. Construction of the literature graph in Semantic Scholar. NAACL. DOI: 10.18653/v1/N18-3011. ACL: N18-3011. ArXiv: 1805.02262.

NLP Applications

2020. Wang LL, Tafjord O, Cohan A, Jain S, Skjonsberg S, Schoenick C, Botner N, Ammar W. SUPP.AI: finding evidence for supplement-drug interactions. ACL: System Demonstrations. ACL: 2020.acl-demos.41. ArXiv: 1909.08135. Demo.

2020. Wadden D, Lin S, Lo K, Wang LL, van Zuylen M, Cohan A, Hajishirzi H. Fact or Fiction: Verifying Scientific Claims. Preprint. ArXiv: 2004.14974. Demo.


Meta-science

2020. Wang LL, Stanovsky G, Weihs L, Etzioni O. Gender trends in computer science authorship. To appear: CACM. ArXiv: 1906.07883.

Biomedical Ontology

2020. Ong E*, Wang LL*, Schaub J, O’Toole JF, Steck B, Rosenberg AZ, Dowd F, Hansen J, Barisoni L, Jain S, de Boer IH, Valerius MT, Waikar SS, Park C, Crawford DC, Alexandrov T, Anderton CR, Stoeckert C, Weng C, Diehl AD, Mungall CJ, Haendel M, Robinson PN, Himmelfarb J, Iyengar R, Kretzler M, Mooney SD, He Y, KPMP. Modeling kidney disease using ontology: perspectives from the KPMP. To appear: Nature Reviews Nephrology.

2019. Wang LL, Hayman GT, Smith JR, Tutaj M, Shimoyama ME, Gennari JH. Predicting instances of Pathway Ontology classes for pathway integration. Journal of Biomedical Semantics: 10(1):11. DOI: 10.1186/s13326-019-0202-8. PMID: 31196182.

2018. Wang LL, Bhagavatula C, Neumann M, Lo K, Wilhelm C, Ammar W. Ontology alignment in the biomedical domain using entity definitions and context. BioNLP at ACL. DOI: 10.18653/v1/W18-2306. ACL: W18-2306. ArXiv: 1806.07976.

2017. Wang LL, Gennari JH. Similarity metrics for determining overlap among biological pathways. ICBO. CEUR Workshop Proceedings.

2016. Wang LL, Gennari JH, Abernethy NF. An analysis of differences in biological pathway resources. ICBO and BioCreative. CEUR Workshop Proceedings.

2015. Wang LL, Grunblatt E, Jung H, Kalet IJ, Whipple ME. Biological model development as an opportunity to provide content auditing for the foundational model of anatomy ontology. AMIA Annual Symposium. PMID: 26958311.

Biomedical Data Modeling

2018. Wang LL, Lin H, Bao X, Sengupta S, Busby B, Butler RR III. PhenotypeXpression: sub-classification of disease states using public gene expression data and literature. Preprint. bioRxiv: 10.1101/461301v2.

2017. Kaminsky DA, Wang LL, Bates JH, Thamrin C, Shade DM, Dixon AE, Wise RA, Peters S, Irwin CG. Fluctuation analysis of peak expiratory flow and its associations with treatment failure in asthma. American Journal of Respiratory and Critical Care Medicine, 195(8): 993-9. DOI: 10.1164/rccm.201601-0076OC. PMID: 27814453.

2016. Jung H, Law AB, Grunblatt E, Wang LL, Kusano A, Mejino JLV Jr, Whipple ME. Development of a novel Markov chain model for the prediction of head and neck squamous cell carcinoma dissemination. AMIA Annual Symposium. PMID: 28269942.

2015. Zaidman CM, Wang LL, Connolly AM, Florence J, Wong BL, Parsons JA, Apkon S, Goyal N, Williams E, Escolar D, Rutkove SB, Bohorquez JL; DART-EIM Clinical Evaluators Consortium. Electrical impedance myography in Duchenne muscular dystrophy and healthy controls: a multi-center study of reliability and validity. Muscle & Nerve, 52(4): 592-7. DOI: 10.1002/mus.24611. PMID: 25702806.

2011. Wang LL, Spieker AJ, Li J, Rutkove SB. Electrical impedance myography for monitoring motor neuron loss in the SOD1 G93A amyotrophic lateral sclerosis rat. Clinical Neurophysiology, 122: 2505-11. DOI: 10.1016/j.clinph.2011.04.021. PMID: 21612980.

2011. Wang LL, Ahad M, McEwan A, Li J, Jafarpoor M, Rutkove SB. Assessment of alterations in the electrical impedance of muscle after experimental nerve injury via finite-element analysis. IEEE Transactions on Biomedical Engineering, 58(6): 1585-91. DOI: 10.1109/TBME.2011.2104957. PMID: 21224171.

Shared Tasks & Workshops

2020. EPIC-QA: Epidemic Question Answering track challenge

2020. SciNLP Workshop: Natural Language Processing and Data Mining for Scientific Text

2020. TREC-COVID Ad-hoc Retrieval Challenge

2020. Voorhees E, Alam T, Bedrick S, Demner-Fushman D, Hersh WR, Lo K, Roberts K, Soboroff I, Wang LL. TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection. SIGIR Forum: 54(1). ArXiv: 2005.04474.

2020. Roberts K, Alam T, Bedrick S, Demner-Fushman D, Lo K, Soboroff I, Voorhees E, Wang LL, Hersh WR. TREC-COVID: Rationale and Structure of an Information Retrieval Shared Task for COVID-19. Journal of the American Medical Informatics Association: ocaa091. DOI: 10.1093/jamia/ocaa091. PMID: 32365190.

Talks & Tutorials

2020 Jul 29. "CORD-19 Search: Using Machine Learning to Explore COVID-19 Scientific Literature." AWS Education: Research Seminar Series. Online. Video.

2020 Jun 18. "Improving access to scientific literature for NLP." MSR Hanover Group. Online.

2020 Jun 12. "The COVID-19 Open Research Dataset." Connected Health and COVID-19: Now and Beyond the Great Lockdown. Online.

2020 May 27. "The COVID-19 Open Research Dataset." Centre for Science and Technology Studies, Leiden University. Online.

2020 Apr 27. "CORD-19: The COVID-19 open research dataset." NLP Meetup (NY-NLP, A2D-NLP, DC-NLP, Hungarian NLP, London Text Analytics). Online. Video.

2020 Apr 20. "Exploring the COVID-19 Open Research Dataset." Practical AI. Podcast.

2020 Apr 14. "The COVID-19 open research dataset." SIIRH at ECIR. Online. Video.

2019 May 22. "Ontology-based integration of biological pathway data." SLKB at AKBC. Amherst, MA.

2018 Oct 11. "Ontologies and algorithms for integrating biological pathway data." BIME 590: Departmental Seminar. Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA.

2018 Oct 10. "Learning from Biomedical Knowledge." The Allen Institute for AI, Seattle, WA. Video.

2018 Jul 24. "A Brief Introduction to Ontology." Kidney Precision Medicine Project Ontology Workshop. Seattle, WA.

2018 Jan 24. "A SPARQL Tutorial." BIME 550: Knowledge Representation. Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA.

2017. "Biological Pathway Analysis: Trends and Applications." BIME 591: Winter 2017 Seminar Course. Course website.

Selected Press Coverage

Roll Call: AI researchers seeking COVID-19 answers face hurdles
Geekwire: Software tools for mining COVID-19 research studies go viral among scientists
King5: Free online tool identifies dangerous drug/supplement combinations
Geekwire: How do drugs interact with supplements? Supp.AI search engine tracks down clues
VentureBeat: Supp AI uses machine learning to identify supplement interactions
Axios: Another century of gender inequality in computer science
NYTimes: The Gender Gap in Computer Science Research Won’t Close for 100 Years
Geekwire: A study about studies suggests men will still prevail in computer science in 2100