diff --git a/doc/report/endpoints.tex b/doc/report/endpoints.tex
index db66855c11e44b6f1d0ccb15b5ed327a77172f3a..8fe03e5726260e85d6b1b8df09be27084cd79780 100644
--- a/doc/report/endpoints.tex
+++ b/doc/report/endpoints.tex
@@ -12,63 +12,63 @@ one based around the standardized SPARQL query language and the other on
 the RDF4J Java library implemented by various vendors. Both approaches
 have unique advantages.
 
-\subsection{Available Application Interfaces}
+\begin{description}
+ \item[SPARQL] is a standardized query language for RDF triplet
+ data~\cite{sparql}. The specification includes not just syntax
+ and semantics of the language itself, but also a standardized
+ REST interface for querying databases.
 
-\begin{itemize}
- \item SPARQL is a standardized query language for RDF triplet
- data~\cite{sparql}. The spec includes not just syntax and
- semantics of the language itself, but also a standardized REST
- interface for querying databases. Various implementations of
- this standard, e.g.~\cite{gosparql}, are available so using
- SPARQL has the advantage of making us independent of a specific
- programming language or environment.
-
- SPARQL is inspired by SQL and as such the \texttt{SELECT}
- \texttt{WHERE} syntax should be familiar to many software
- developers. A simple query that returns all triplets in the
- store looks like
- \begin{verbatim}
- SELECT * WHERE { ?s ?p ?o }
- \end{verbatim}
+ \textbf{Syntax} SPARQL is inspired by SQL and as such the
+ \texttt{SELECT} \texttt{WHERE} syntax should be familiar to many
+ software developers. A simple query that returns all triplets
+ in the store looks like
+ \begin{lstlisting}
+ SELECT * WHERE { ?s ?p ?o }
+ \end{lstlisting}
  where \texttt{?s}, \texttt{?p} and \texttt{?o} are query
- variables. The result of a query are valid substitutions for the
- query variables. In this case, the database would return a table
- of all triplets in the store sorted by subject~\texttt{?o},
- predicate~\texttt{?p} and object~\texttt{?o}.
+ variables. The results of any query are valid substitutions for
+ the query variables. In this particular case, the database would
+ return a table of all triplets in the store with columns for
+ subject~\texttt{?s}, predicate~\texttt{?p} and
+ object~\texttt{?o}.
 
- Of course, queries might return a lot of data. Importing just
- the Isabelle exports into GraphDB results in more than 200
- million triplets. For practical applications it will be
- necessary to limit the number of result or use pagination
- techniques~\cite{sparqlpagination}.
+ \textbf{Advantage} Probably the biggest advantage is that
+ SPARQL is ubiquitous. As it is the de facto standard for
+ querying triplet stores, a lot of literature and documentation is
+ available~\cite{sparqlbook, sparqlimpls, gosparql}.
 
- \item RDF4J is a Java API for interacting with triplet stores,
- implemented based on a superset of
- {SPARQL}~\cite{rdf4j}. GraphDB supports RDF4J, in fact it is the
- recommended way of interacting with GraphDB
- repositories~\cite{graphdbapi}. Instead of formulating textual
- queries, RDF4J allows developers to query a repository by
- calling Java API methods. Above query that returns all triplets
- in the store looks like
- \begin{verbatim}
- connection.getStatements(null, null, null);
- \end{verbatim}
- in RDF4J. Method \texttt{getStatements(s, p, o)} returns all
- triplets that have matching subject~\texttt{s},
- predicate~\texttt{p} and object~\texttt{o}. If any of these
- arguments is \texttt{null}, it can be any value, i.e.\ it is a
- query variable that is to be filled by the call to
- \texttt{getStatements}.
+ \item[RDF4J] is a Java API for interacting with triplet stores,
+ implemented based on a superset of the {SPARQL} REST interface~\cite{rdf4j}.
+ GraphDB supports RDF4J; in fact, it is the recommended way of
+ interacting with GraphDB repositories~\cite{graphdbapi}.
 
- Using RDF4J does introduce a dependency on the JVM family of
- languages, but also offers some conveniences. For example, we
- can generate Java classes that contain all URIs in an OWL
- ontology as constants~\cite{rdf4jgen}. In combination with IDE
- support, we found this to be very convenient when writing
- applications that interface with ULO data sets.
+ \textbf{Syntax} Instead of formulating textual queries, RDF4J
+ allows developers to query a repository by calling Java API
+ methods. The previous query that requests all triplets in the store
+ looks like
+ \begin{lstlisting}
+ connection.getStatements(null, null, null);
+ \end{lstlisting}
+ in RDF4J. \texttt{getStatements(s, p, o)} returns all triplets
+ that have matching subject~\texttt{s}, predicate~\texttt{p} and
+ object~\texttt{o}. Any argument that is \texttt{null} can be
+ replaced with any value, i.e.\ it is a query variable to be
+ filled by the call to \texttt{getStatements}.
 
-\end{itemize}
+ \textbf{Advantage} Using RDF4J does introduce a dependency on
+ the JVM and its languages. But in practice, we found RDF4J to be
+ quite convenient, especially for simple queries, as it allows us
+ to formulate everything in a single programming language rather
+ than mixing languages and awkward string literals.
 
-\subsection{Comparison}
+ We also found it quite helpful to generate Java classes from
+ OWL ontologies that contain all definitions of the ontology and
+ make them accessible in any IDE~\cite{rdf4jgen}.
+\end{description}
 
-\emph{TODO}
+We see that both SPARQL and RDF4J have unique advantages. While SPARQL
+is an official W3C standard and implemented by more database systems,
+RDF4J can be more convenient when dealing with JVM-based code bases.
+For \emph{ulo-storage}, we experimented with both interfaces and
+chose whatever seemed more convenient at the moment. We recommend that
+implementors do the same.
diff --git a/doc/report/references.bib b/doc/report/references.bib
index 4a0ec1110f4df2214c7c6b160b156fe820a39052..9560db2112eae989deff1487f8120d3598370be1 100644
--- a/doc/report/references.bib
+++ b/doc/report/references.bib
@@ -14,6 +14,13 @@
  url = {https://godoc.org/github.com/knakk/sparql},
 }
 
+@online{sparqlimpls,
+ title = {SparqlImplementations},
+ organization = {W3C},
+ url = {https://www.w3.org/wiki/SparqlImplementations},
+ urldate = {2020-07-06},
+}
+
 @online{uloisabelle,
  title = {Isabelle: Libraries of the Isabelle proof assistant in OMDoc/MMT representation},
  organization = {MathHub},
@@ -140,3 +147,10 @@
  urldate = {2020-07-06},
  url = {https://kwarc.info/people/frabe/Research/GKKMR_alignments_17.pdf},
 }
+
+@book{sparqlbook,
+ title={Learning SPARQL: querying and updating with SPARQL 1.1},
+ author={DuCharme, Bob},
+ year={2013},
+ publisher={O'Reilly Media, Inc.}
+}