Despite some notable shortcomings in today’s technology landscape, XML is still a powerful language that offers key advantages in complex data storage scenarios.
Compared with a popular data interchange format like JSON, for example, XML’s syntax emphasizes machine readability over human readability, which makes its error-checking process more efficient. Most importantly, XML excels at storing unique data types with multiple variables, while JSON is optimized for relatively simple and concise object storage. Both the advantages and the disadvantages of XML stem from the fact that it is not a dedicated data exchange format like JSON; rather, it is a complex markup language (more similar to HTML) with powerful data exchange capabilities.
Once we choose XML to store complex data relationships, we can begin to use powerful XML technologies such as XQuery to extract and manipulate complex data for various purposes. With a good understanding of XQuery and a consistent, reliable environment for executing our queries against XML files, we can extend the usefulness of complex XML data.
In this article, we’ll briefly review how XQuery works (with a basic example), then learn how to query one or more XML files through a pair of free APIs using complementary Java code examples.
Understanding XQuery
Thanks to XML’s characteristic hierarchical data structure, querying content from complex XML files is not that complicated.
At a high level, relationships in XML data are neatly represented in a parent-child structure. This creates an easy path to navigate with well-formed, targeted query expressions, even when those relationships are highly unique. Any element in XML syntax (structured like HTML with opening and closing tags, e.g. <example>hello world</example>) can have its own specific attributes and contain multiple additional elements or data types, and we can use XQuery to efficiently access all of this.
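To make the parent-child idea concrete, here is a minimal sketch that parses a tiny XML string with the DOM parser bundled in the JDK and then reads the root element, one of its attributes, and the text of a child element. The class name, the sample XML, and the `id` attribute are made up for illustration; this only shows the structure that XQuery later navigates and is not XQuery itself.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlStructureSketch {
    // Hypothetical sample: a parent element with one attribute and two child elements.
    static final String SAMPLE =
        "<example id=\"1\"><greeting>hello world</greeting><note>nested</note></example>";

    // Parses the XML string and returns its root (outermost parent) element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the text of the first descendant element with the given name.
    static String childText(Element parent, String childName) {
        return parent.getElementsByTagName(childName).item(0).getTextContent();
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE);
        System.out.println(root.getTagName());           // example
        System.out.println(root.getAttribute("id"));     // 1
        System.out.println(childText(root, "greeting")); // hello world
    }
}
```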
Whether we’re trying to filter and reuse data from an XML file, aggregate specific XML file data for calculations and reports, or simply search for matching data across multiple XML files stored in a single database, XQuery is up to the task. It is a core part of the XML family of technologies (along with its cousins XPath and XSLT) and offers versatility in a wide range of XML data handling scenarios.
A simple XQuery example
The most basic use case for XQuery is data retrieval, so let’s look at a simple example that shows how we can use XQuery to find specific data within an XML file.
Below we have an example XML file that stores information about popular movies, with each movie labeled by genre through a category attribute. Within each movie element we have the title of the film, its director, the year of release, and the price of a cinema ticket.
<?xml version="1.0" encoding="UTF-8"?>
<movies>
<movie category="ACTION">
<title lang="en">Inception</title>
<director>Christopher Nolan</director>
<year>2010</year>
<ticket_price>10.00</ticket_price>
</movie>
<movie category="COMEDY">
<title lang="en">The Grand Budapest Hotel</title>
<director>Wes Anderson</director>
<year>2014</year>
<ticket_price>12.50</ticket_price>
</movie>
<movie category="DRAMA">
<title lang="en">The Shawshank Redemption</title>
<director>Frank Darabont</director>
<year>1994</year>
<ticket_price>8.00</ticket_price>
</movie>
<movie category="SCIFI">
<title lang="en">Blade Runner 2049</title>
<director>Denis Villeneuve</director>
<year>2017</year>
<ticket_price>15.00</ticket_price>
</movie>
</movies>
By writing declarative expressions in XQuery (using basic constructs like for, let, where, order by, and return, known collectively as FLWOR expressions), we can easily filter through parent-child relationships in one or more XML files and get the data we need. XQuery uses XPath to navigate through the files in question while applying the above constructs to take specific actions against matching elements.
Let’s say we want to query our example file above to retrieve data about movies whose tickets cost more than $10. We could write a simple XQuery expression like this:
for $movie in /movies/movie
where number($movie/ticket_price) > 10
return $movie
This would return the full movie elements for “The Grand Budapest Hotel” and “Blade Runner 2049”. “Inception” is excluded because its ticket price is exactly $10, not more than $10.
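As a quick local sanity check of that result, the same filter can be approximated with the XPath engine that ships with the JDK. This is a sketch under stated assumptions: it uses XPath 1.0 rather than XQuery (the JDK has no built-in XQuery processor), it returns only the titles rather than whole movie elements, and the class name, method name, and trimmed-down sample XML (director and year omitted) are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MovieFilterSketch {
    // Trimmed copy of the article's sample data (director and year omitted for brevity).
    static final String MOVIES_XML =
        "<movies>"
        + "<movie category=\"ACTION\"><title lang=\"en\">Inception</title>"
        + "<ticket_price>10.00</ticket_price></movie>"
        + "<movie category=\"COMEDY\"><title lang=\"en\">The Grand Budapest Hotel</title>"
        + "<ticket_price>12.50</ticket_price></movie>"
        + "<movie category=\"DRAMA\"><title lang=\"en\">The Shawshank Redemption</title>"
        + "<ticket_price>8.00</ticket_price></movie>"
        + "<movie category=\"SCIFI\"><title lang=\"en\">Blade Runner 2049</title>"
        + "<ticket_price>15.00</ticket_price></movie>"
        + "</movies>";

    // Returns the titles of movies whose ticket price is strictly greater than minPrice.
    static List<String> titlesAbove(String xml, double minPrice) {
        // XPath predicate that mirrors the "where" clause of the XQuery expression.
        String expr = "/movies/movie[number(ticket_price) > " + minPrice + "]/title";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Prints the matching titles in document order.
        System.out.println(titlesAbove(MOVIES_XML, 10.0));
    }
}
```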
How and where to run XQuery expressions
To execute XQuery statements against one or more XML files, we need to run them through an XQuery processor, and there are several kinds of processors to choose from.
We can, for example, enter our XQuery expressions directly into a compatible database that stores XML files. For local development projects, we can run XQuery expressions with embedded technologies when using popular programming languages like Java. We can even look for and use online tools to run XQuery expressions against XML files on a one-time basis.
In addition to these options, we can leverage specialized web APIs to abstract XQuery expression processing entirely away from our local server. If our XML files are stored outside of an XQuery-compatible database, this option represents a highly scalable, low-maintenance, and cost-effective solution for querying our XML content.
Further down the page, we’ll look at two free APIs that we can use to query one or more XML files with a single XQuery statement via a multipart/form-data request.
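For readers unfamiliar with multipart/form-data, the sketch below shows roughly what a single file part looks like on the wire, built with nothing but the JDK. In practice the client SDK assembles the request for us, so this is purely illustrative. The class name, the boundary value, and the `inputFile` field name are assumptions, not confirmed details of either API, and the sketch does not cover how the XQuery expression itself is transmitted.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class MultipartSketch {
    // Builds a multipart/form-data body containing one file part.
    // fieldName is a hypothetical form field name; a real API defines its own.
    static byte[] buildBody(String boundary, String fieldName, String fileName, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Each part starts with "--boundary", followed by its headers and a blank line.
        byte[] head = ("--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"" + fieldName
            + "\"; filename=\"" + fileName + "\"\r\n"
            + "Content-Type: application/xml\r\n\r\n").getBytes(StandardCharsets.UTF_8);
        out.write(head, 0, head.length);

        // The raw file bytes follow the blank line.
        out.write(content, 0, content.length);

        // The body ends with the closing delimiter "--boundary--".
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] xml = "<movies/>".getBytes(StandardCharsets.UTF_8);
        byte[] body = buildBody("boundary123", "inputFile", "movies.xml", xml);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

The matching request header would be `Content-Type: multipart/form-data; boundary=boundary123`, so the server knows where each part begins and ends.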
Demonstration
Using the complementary ready-to-run Java code examples provided below, we can leverage two separate APIs optimized to query one or more XML files by passing our files and XQuery expression together in a single request. To authorize our requests, we only need a free API key, which allows up to 800 requests per month.
Installing the client SDK
We can start structuring our API calls by installing the client SDK.
- In our Maven POM file, let’s add a reference to the repository (JitPack is used to dynamically compile the library):
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
- Next, let’s add a reference to the dependency:
<dependencies>
<dependency>
<groupId>com.github.Cloudmersive</groupId>
<artifactId>Cloudmersive.APIClient.Java</artifactId>
<version>v4.25</version>
</dependency>
</dependencies>
Now we can start calling each individual API function.
Adding import classes and calling functions
- Using the first set of code examples below, we can call an API optimized to query a single XML document as input. The provided XML document is automatically loaded as the default context; to access elements in the document, simply refer to them without a document reference:
// Import classes:
//import java.io.File;
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ConvertDataApi;
ApiClient defaultClient = Configuration.getDefaultApiClient();
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
ConvertDataApi apiInstance = new ConvertDataApi();
File inputFile = new File("/path/to/inputfile"); // File | Input XML file to perform the operation on.
String xquery = "xquery_example"; // String | Valid XML XQuery 3.1 or earlier query expression; multi-line expressions are supported
try {
    // Send the file and the query in one request and print the result
    XmlQueryWithXQueryResult result = apiInstance.convertDataXmlQueryWithXQuery(inputFile, xquery);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertDataApi#convertDataXmlQueryWithXQuery");
    e.printStackTrace();
}
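The "xquery_example" placeholder above is where a real expression goes. As a minimal sketch, the multi-line movie query from earlier in this article could be assembled into a single Java string like this. The class and method names are hypothetical helpers, not part of the client SDK.

```java
class XQueryStringSketch {
    // Builds the article's movie query as one multi-line string.
    // minPrice is a numeric parameter, so no untrusted text is spliced into the query.
    static String moviesAbove(double minPrice) {
        return String.join("\n",
            "for $movie in /movies/movie",
            "where number($movie/ticket_price) > " + minPrice,
            "return $movie");
    }

    public static void main(String[] args) {
        // The resulting string could be assigned to the xquery variable shown above.
        System.out.println(moviesAbove(10));
    }
}
```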
- Using the examples below, we can call an API optimized for querying multiple XML documents as input. Note that we can refer to the content of a specific document by name (e.g. “movies.xml” or “books.xml”) if we include two named documents. If our input files do not have filenames, they will default to names like “input1.xml”, “input2.xml”, etc.
// Import classes:
//import java.io.File;
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ConvertDataApi;
ApiClient defaultClient = Configuration.getDefaultApiClient();
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
ConvertDataApi apiInstance = new ConvertDataApi();
File inputFile1 = new File("/path/to/inputfile"); // File | First input XML file to perform the operation on.
String xquery = "xquery_example"; // String | Valid XML XQuery 3.1 or earlier query expression; multi-line expressions are supported
File inputFile2 = new File("/path/to/inputfile"); // File | Second input XML file to perform the operation on.
File inputFile3 = new File("/path/to/inputfile"); // File | Third input XML file to perform the operation on.
File inputFile4 = new File("/path/to/inputfile"); // File | Fourth input XML file to perform the operation on.
File inputFile5 = new File("/path/to/inputfile"); // File | Fifth input XML file to perform the operation on.
File inputFile6 = new File("/path/to/inputfile"); // File | Sixth input XML file to perform the operation on.
File inputFile7 = new File("/path/to/inputfile"); // File | Seventh input XML file to perform the operation on.
File inputFile8 = new File("/path/to/inputfile"); // File | Eighth input XML file to perform the operation on.
File inputFile9 = new File("/path/to/inputfile"); // File | Ninth input XML file to perform the operation on.
File inputFile10 = new File("/path/to/inputfile"); // File | Tenth input XML file to perform the operation on.
try {
    // Send all input files and the query in one request and print the result
    XmlQueryWithXQueryMultiResult result = apiInstance.convertDataXmlQueryWithXQueryMulti(inputFile1, xquery, inputFile2, inputFile3, inputFile4, inputFile5, inputFile6, inputFile7, inputFile8, inputFile9, inputFile10);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ConvertDataApi#convertDataXmlQueryWithXQueryMulti");
    e.printStackTrace();
}
We now have an easy method to query content from our XML files on a file-by-file basis or in bulk using multiple files at once.
Conclusion
By using an API to process our XQuery strings instead of built-in functions or open source libraries, we’ve abstracted the query operation away from our servers, minimizing the amount of code we need to run. We’ve introduced a simple solution to locate and collect XML data that we don’t have to worry about updating or maintaining in the future.
Of course, it’s important to note that APIs won’t be viable for every project. As such, we must use our own judgment to determine when and where web API calls are appropriate in our application architecture.