Manara - Qatar Research Repository
Browse

RDFFrames: knowledge graph access for machine learning tools

Download (866.76 kB)
journal contribution
posted on 2022-11-22, 21:12 authored by Aisha Mohamed, Ghadeer Abuoda, Abdurrahman Ghanem, Zoi Kaoudi, Ashraf Aboulnaga

Knowledge graphs represented as RDF datasets are integral to many machine learning applications. RDF is supported by a rich ecosystem of data management systems and tools, most notably RDF database systems that provide a SPARQL query interface. Surprisingly, machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system. This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style. Machine learning tools work on data in tabular format and process it using an imperative programming style, while SPARQL is declarative and has as its basic operation matching graph patterns to RDF triples. We posit that a good interface to knowledge graphs from a machine learning software stack should use an imperative, navigational programming paradigm based on graph traversal rather than the SPARQL query paradigm based on graph patterns. In this paper, we present RDFFrames, a framework that provides such an interface. RDFFrames provides an imperative Python API that gets internally translated to SPARQL, and it is integrated with the PyData machine learning software stack. RDFFrames enables the user to make a sequence of Python calls to define the data to be extracted from a knowledge graph stored in an RDF database system, and it translates these calls into a compact SPQARL query, executes it on the database system, and returns the results in a standard tabular format. Thus, RDFFrames is a useful tool for data preparation that combines the usability of PyData with the flexibility and performance of RDF database systems.

Other Information

Published in: The VLDB Journal
License: https://creativecommons.org/licenses/by/4.0
See article on publisher's website: http://dx.doi.org/10.1007/s00778-021-00690-5

Funding

Open access funding provided by the Qatar National Library.

History

Language

  • English

Publisher

Springer Nature

Publication Year

  • 2021

License statement

This Item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Science and Engineering - HBKU
  • Qatar Computing Research Institute - HBKU

Usage metrics

    College of Science and Engineering - HBKU

    Licence

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC