Authors: Sagi Zisman (National Renewable Energy Laboratory (NREL)), Gretchen Greene (National Institute of Standards and Technology (NIST)), Eric Stephan (Pacific Northwest National Laboratory (PNNL)), Marshall McDonnell (Oak Ridge National Laboratory (ORNL)), Stuart Chalk (University of North Florida), Ambarish Nag (National Renewable Energy Laboratory (NREL)), Dmitry Duplyakin (National Renewable Energy Laboratory (NREL)), Graham Johnson (National Renewable Energy Laboratory (NREL)), David Rager (National Renewable Energy Laboratory (NREL)), Harrison Goldwyn (National Renewable Energy Laboratory (NREL)), Nalinrat Guba (National Renewable Energy Laboratory (NREL)), Gabe Fierro (Colorado School of Mines)
Abstract: Diverse big data, interdisciplinary science, ML/AI applications, and in-situ computations necessitate knowledge representation. Knowledge graphs, which organize knowledge in graph form for machine understanding, augment large-scale science. For example, biology and the Semantic Web use large knowledge graphs. Combined with AI, knowledge graphs enable natural language querying of linked information, semantic recommendation systems, and knowledge completion. HPC challenges abound, including parallelizing queries, retrieval-efficient knowledge representation, and AI that exploits knowledge graph context. This BoF will introduce big ideas through lightning talks and then engage a general audience in a discussion of emerging research topics, aiming to seed a community for collaboration.
Long Description: The topic of this BoF session is knowledge graphs for science and their computational considerations. We have not held this BoF previously and are excited to share this topic with the community. The main goal is to spark the imagination of the HPC community around the usage of knowledge graphs for science. More granularly, we have three subgoals. First, we will present a brief introduction to knowledge graphs, ontologies, computational considerations, and their potential impact for science. To make the discussion as inclusive as possible, the introduction is designed to provide vital background so everyone can participate in the ensuing discussion. Second, we aim to focus the BoF discussion around emerging topics, standing challenges, and identification of opportunities. Third and most important, we aim to seed a scientific community interested in leveraging knowledge graphs in their respective scientific or research domains. We are particularly interested in the application of large knowledge graphs to scientific domains. The HPC community is the perfect group to tackle the computational challenges involved.
During the introduction, a series of brief lightning talks, taking roughly 15-20 minutes in total, will be presented:
1. Introduction to knowledge graphs: a computational and HPC perspective
2. Giving meaning to data using JSON-LD
3. Revolutionizing Science Knowledge Representation: Unleashing the Power of RDF
4. Building Knowledge Discovery into the NIST Cross Disciplinary Research Infrastructure
Relevance to HPC Audience:
Knowledge graphs are key to accelerating science in the age of complex, cross-disciplinary, and data-driven modeling and analysis. Encoding domain knowledge in a graph rooted in semantic vocabularies opens the door to efficient human-to-human collaboration and imparts vast automatable knowledge to machines. Knowledge graphs can be large, sometimes exceeding hundreds of millions of nodes, especially if data is incorporated as part of the graph. In such cases, HPC resources may be used to run parallel graph queries in the SPARQL language and to introduce new capabilities such as large language models (LLMs) and domain conceptual design. Intelligent big data retrieval schemes may be required, and these have implications for the semantic representation of the domain, thereby tying computation to knowledge representation. Machine learning applications such as graph link prediction, or further downstream tasks such as using graph context for recommender systems, can be powered by HPC. Audience members will gain an appreciation of knowledge graphs in general and of how HPC can play a role in enabling their application.
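To make the idea of a graph query concrete, the following is a minimal sketch in plain Python of how a knowledge graph can be held as subject-predicate-object triples and queried with a pattern that joins several hops. It is an illustration only, not a SPARQL engine: the `ex:` identifiers, the toy data, and the helper names `match` and `labs_for_sample_type` are all hypothetical and do not come from any real ontology or from the session materials.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical toy graph; every identifier here is made up for illustration.
TRIPLES: List[Triple] = [
    ("ex:sample42", "rdf:type", "ex:PerovskiteSample"),
    ("ex:sample42", "ex:measuredBy", "ex:xrdRun7"),
    ("ex:xrdRun7", "rdf:type", "ex:XRDMeasurement"),
    ("ex:xrdRun7", "ex:performedAt", "ex:labA"),
    ("ex:sample43", "rdf:type", "ex:PerovskiteSample"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def labs_for_sample_type(triples: List[Triple], sample_type: str) -> Set[str]:
    """Three-pattern join: samples of a type -> their measurements -> the labs."""
    labs: Set[str] = set()
    for sample, _, _ in match(triples, p="rdf:type", o=sample_type):
        for _, _, run in match(triples, s=sample, p="ex:measuredBy"):
            for _, _, lab in match(triples, s=run, p="ex:performedAt"):
                labs.add(lab)
    return labs


if __name__ == "__main__":
    print(labs_for_sample_type(TRIPLES, "ex:PerovskiteSample"))
```

Each call to `match` in this sketch is a linear scan over all triples, so the nested join becomes expensive as the graph grows. That cost is one reason graphs with hundreds of millions of nodes may call for indexing, partitioning, and parallel query execution on HPC resources.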
Expected Outcome:
The intended outcome of this Birds of a Feather session is twofold: to inform the community about the promise of knowledge graphs and, most importantly, to seed a community of those interested in applying knowledge graphs to their domain or institution. We will invite attendees to join a community of researchers that includes domain experts, computational scientists, data scientists, and informaticians who are interested in tackling big scientific problems with the help of knowledge graphs. As part of seeding the community, we intend to compile a list of research questions and ideas that audience members are interested in discussing further.