Seshu Kumar Adiraju

Related Topics: XML Magazine

XQuery: Reporting XML Data

Using XQuery and templates

Reporting is an important function of business software applications and is increasingly required for XML data. Applications typically generate reports by extracting relevant information from a database. Applications that do not store their information in a database resort to XSL-FO or proprietary techniques, while other solutions migrate the data to a database solely to enable reporting. These techniques are inflexible and expensive for reporting XML data.

A cost-effective and flexible reporting framework allows quick report creation and keeps the reporting independent of the existing code base and the XML data source. This article examines a reporting framework that uses Report Templates and XQuery to fit the bill. The report templates contain static content and formatting information for the report (in HTML) and embedded XQuery constructs to generate dynamic content.

The reporting framework provides an execution context and some value-added services, such as converting the generated report to formats like PDF, RTF, and HTML. XQuery is designed for XML data and has a powerful but easy-to-use syntax, much as SQL does for relational data. It is system independent and applicable across all sorts of XML data, from files to databases. Using report templates can shorten development cycles, improve turnaround time for report creation, and offer greater flexibility.

Context
As an increasing amount of information is stored, exchanged, and presented using XML, the ability to intelligently query XML data sources and report this data becomes more important.

While using XML for data representation brings a lot of flexibility, extensibility, and openness, the lack of a high-level query language has, until now, made reporting XML data complex. The XQuery specification provides a powerful and convenient language designed for processing XML data.

In this article, we propose an XML reporting framework using XQuery. The framework allows the creation of reports that are not merely a transformation of XML data but also involve querying, extraction, and analysis. XQuery aims to provide a standard querying mechanism for all kinds of XML data sources. Therefore, a reporting framework using XQuery can be used in a variety of applications that work with XML data.

The reporting framework was developed for and used successfully in a UML-based modeling tool developed at Infosys.

Current Approaches
Most existing solutions use XSL files for reporting, but this approach has several disadvantages. For one, XSL is mainly a transformation language, not a query language. Its syntax is difficult for nonprogrammers. In addition, the technique is file based and cannot work with other XML data sources. Other approaches are proprietary and come at additional cost. Some solutions take a template-based approach to reporting but use proprietary tags in the template.

These limitations call for a standards-based solution that allows querying of disparate data sources. We require a reporting/analysis framework that:

  • Uses a query language that is easy to use (by domain experts)
  • Uses a query language that supports arithmetic and Boolean operations, conditional logic, recursion, aggregation, sorting, filtering and access to environment information
  • Supports disparate XML data sources including streamed XML data
  • Allows easy formatting of the report, preferably through templates
  • Facilitates easy customization of reports

Introduction to XQuery
Let's take a high-level look at XQuery and cover some aspects relevant for this article. (For more details on XQuery see References.)

XQuery introduces an easy-to-use, high-level query language syntax for XML. XQuery is to XML what SQL is to databases. It is designed to be a language in which queries are concise, easily understood, and human readable. It is also flexible enough to query a broad spectrum of XML information sources, including both databases and documents. Like SQL for relational databases, XQuery will become a system-independent standard, applicable across all sorts of XML data.

XQuery offers several kinds of expressions. The most prominent are FLWR (pronounced "flower") expressions, named for the XQuery keywords for, let, where, and return. XQuery also supports path expressions and element/attribute constructors. With these features, XQuery can query, transform, or even construct XML documents.
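
For readers more at home in a general-purpose language, the four FLWR clauses map roughly onto a loop, a local binding, a filter, and a constructed result. The sketch below mimics that shape in Python with the standard xml.etree.ElementTree module. It is an analogy rather than XQuery, and the orders document and the flwr_totals function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for this illustration.
ORDERS = """
<orders>
  <item name="pen" qty="10" unit="2"/>
  <item name="ink" qty="1" unit="15"/>
  <item name="pad" qty="4" unit="3"/>
</orders>
"""

def flwr_totals(xml_text, minimum):
    """Return <line> elements for items whose total is at least `minimum`."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.findall("item"):                          # for: iterate over nodes
        total = int(item.get("qty")) * int(item.get("unit"))   # let: bind a derived value
        if total >= minimum:                                   # where: filter
            line = ET.Element("line", name=item.get("name"))   # return: construct an element
            line.text = str(total)
            results.append(line)
    return results

lines = flwr_totals(ORDERS, 15)
print([ET.tostring(line, encoding="unicode") for line in lines])
```

In XQuery the same four steps are written declaratively in a single expression, which is what keeps report templates short.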

XQuery can be used in different scenarios, such as:

  • XML Transformation: XQuery provides a powerful (extensible) mechanism for nontrivial transformations of XML documents from one format into another.
  • XML Data Extraction: XQuery can be used to extract XML data from different sources, such as XML documents, relational databases, or native XML databases.
  • HTML/XHTML Generation: HTML/XHTML pages can be generated dynamically using embedded XQuery, similar to techniques such as JSP, ASP, and PHP.

Proposed Framework: Report Templates Using XQuery
In this section we present a new Reporting framework using Report templates and XQuery. This reporting framework can be used by any application that stores application data in XML.

Reporting Framework
The application's "XML Data" can be available in different XML formats. The report content and structure are defined in an "XQuery Report Template". The "Template Runner" runs the Report Template against the XML Data to generate a report in XHTML format.

Optionally, this report can be converted into other formats such as PDF or RTF. Also, for applications that store data in non-XML formats, an additional component, the "XML Generator", can create an XML representation of the data for reporting purposes (see Figure 1).

The Reporting Framework contains two main components, "XQuery Report Template" and "Template Runner," described below.

XQuery Report Template
The Report Templates are essentially HTML documents with embedded XQuery. This approach is very similar to server-side page technologies such as JSP or ASP.

The HTML part of the template defines the overall report layout and formatting information. The embedded XQuery expressions define the dynamic content that should be queried from the project XML. HTML and XQuery can be combined in two ways: we can embed XQuery expressions within HTML tags, and we can embed HTML within XQuery. This provides a lot of flexibility in structuring the report template.

Template Runner
The template runner provides the execution context for the XQuery. It binds the XML Data and the user input required for report generation to the XQuery processor. It also handles the output from the XQuery processor and optionally converts the report into formats like PDF, RTF, and HTML (see Figure 2).
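
To make the runner's role concrete, here is a toy sketch in Python. It does not use Qexo or XQuery: the {path} and {$name} mini-syntax is an invented stand-in for embedded XQuery, and run_template is a hypothetical name. It only illustrates the two bindings described above, the XML data and the user input.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

def run_template(template, xml_text, params=None):
    """Toy runner: replace each {expr} segment in the template with its result.

    {$name} is looked up in params (the bound user input); any other
    expression is treated as an ElementTree path evaluated against the data.
    """
    root = ET.fromstring(xml_text)   # bind the XML data
    params = params or {}            # bind the user input

    def evaluate(match):
        expr = match.group(1).strip()
        if expr.startswith("$"):
            return escape(str(params[expr[1:]]))
        return ", ".join(escape(node.text or "") for node in root.findall(expr))

    return re.sub(r"\{([^{}]*)\}", evaluate, template)

# Hypothetical template and data, invented for this illustration.
template = "<p>Report for {$user}: {book/title}</p>"
data = "<catalogue><book><title>A</title></book><book><title>B</title></book></catalogue>"
print(run_template(template, data, {"user": "Asha"}))
```

A real runner would hand the embedded segments to an XQuery processor instead of a regular expression, and would pass the resulting XHTML on to a format converter.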

We used Qexo, the GNU Kawa XQuery implementation, to process the XQuery report templates.

Using a similar approach, we have also published the model data as a browsable Web site. This has proved to be an effective way of disseminating model data among the project team.

This approach provides an efficient mechanism to query model data from XML and then use model data for analysis and reporting. The key benefits of this approach are:

  • Easy-to-write XQuery language that is human readable and has high-level constructs similar to SQL.
  • Easy-to-extend process as new models are introduced.
  • Templates can be created independently of the tool, without changes to the tool code. Hence, there are fewer maintenance problems and greater productivity.

An Example
Let's illustrate the XQuery template concept with the simple example of a book catalogue.

Step 1: XML Data
Consider an XML representation of a book catalogue that records details such as the name of each book and its author. Listing 1 shows the XML for the book catalogue.

The XML has one node for each book and subnodes containing the details like title, author, and price information for each book.
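
As a hypothetical stand-in for Listing 1, a catalogue of the shape described above might look like the following. The element names (catalogue, book, title, author, format, price) and all values are assumptions made for illustration. The Python wrapper simply parses the document and prints one line per book to confirm the structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 1; names and values are assumptions.
CATALOGUE_XML = """
<catalogue>
  <book>
    <title>Sample Book One</title>
    <author>Author One</author>
    <format>paperback</format>
    <price>12.50</price>
  </book>
  <book>
    <title>Sample Book Two</title>
    <author>Author Two</author>
    <format>hardcover</format>
    <price>30.00</price>
  </book>
  <book>
    <title>Sample Book Three</title>
    <author>Author One</author>
    <format>paperback</format>
    <price>9.99</price>
  </book>
</catalogue>
"""

catalogue = ET.fromstring(CATALOGUE_XML)
for book in catalogue.findall("book"):
    # One node per book, with subnodes holding the details.
    print(book.findtext("title"), "|", book.findtext("author"), "|", book.findtext("price"))
```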

Step 2: Report Template
Suppose that we want to create a simple report showing the list of books. We need to create a Report template that queries the XML file for that data and presents the data in a table. Listing 2 shows a report template that does this.

This template is a simple HTML document that mainly contains a <table> tag with a header row and an embedded XQuery segment. The XQuery segment (enclosed in curly braces, '{' and '}') consists of:

  • let clause: Uses the document function to load data from an XML document
  • for clause: Iterates over every "book" in the "catalogue"
  • return clause: Returns the book title and author

The enclosing HTML provides the required formatting.
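
The template itself is XQuery embedded in HTML. The Python sketch below only mirrors its three clauses so the flow can be run and inspected. The catalogue data, the element names, and the books_report function are assumptions, not the article's listing.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical two-book catalogue; element names are assumptions.
CATALOGUE_XML = (
    "<catalogue>"
    "<book><title>Sample Book One</title><author>Author One</author></book>"
    "<book><title>Sample Book Two</title><author>Author Two</author></book>"
    "</catalogue>"
)

def books_report(xml_text):
    """Build an HTML table listing the title and author of every book."""
    catalogue = ET.fromstring(xml_text)                 # let: load the XML document
    rows = []
    for book in catalogue.findall("book"):              # for: every book in the catalogue
        title = escape(book.findtext("title", ""))
        author = escape(book.findtext("author", ""))
        rows.append(f"<tr><td>{title}</td><td>{author}</td></tr>")  # return: title, author
    header = "<tr><th>Title</th><th>Author</th></tr>"   # static header row
    return "<table>" + header + "".join(rows) + "</table>"

report = books_report(CATALOGUE_XML)
print(report)
```

The static strings play the part of the enclosing HTML, and the loop plays the part of the embedded query.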

Step 3: Template Execution
Next, the Template runner executes the Report template (Listing 2) against the XML data (Listing 1).

The Template runner creates the execution context for the XQuery processor. The execution context provides access to the XML data and executes the XQuery against it. The context can also be used to bind any other variables that might be provided when the report is run. For example, the report might require user input, such as an author name, as a filtering criterion. The Template runner performs this binding during report execution.
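
A minimal sketch of this binding step, again in Python rather than XQuery: the execution context is modelled as a plain dictionary that holds the parsed data plus any user-supplied variables. The names run_report and titles_by_author, the context layout, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue; element names are assumptions.
CATALOGUE_XML = (
    "<catalogue>"
    "<book><title>Sample Book One</title><author>Author One</author></book>"
    "<book><title>Sample Book Two</title><author>Author Two</author></book>"
    "</catalogue>"
)

def run_report(query, xml_text, **bindings):
    """Build a context (XML data plus bound variables) and run the query in it."""
    context = {"doc": ET.fromstring(xml_text), **bindings}
    return query(context)

def titles_by_author(ctx):
    """Query that filters on the user-supplied 'author' variable."""
    return [
        book.findtext("title")
        for book in ctx["doc"].findall("book")
        if book.findtext("author") == ctx["author"]
    ]

print(run_report(titles_by_author, CATALOGUE_XML, author="Author One"))
```

Keeping the bindings outside the query means the same template can serve different users and filter values without being edited.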

The output from the Template Runner is an XHTML document, which can be converted to PDF, RTF, or HTML format. Figure 3 shows an RTF report generated from the Report template shown in Listing 2.

Step 4: Template Modification
One of the main advantages of the template-based approach is that the report templates can be modified easily, without affecting the rest of the application. For example, assume that we need to modify the report created above to show the book price in addition to the title and author. Also, let's say that we require this information only for the books available in paperback format. The modified template would be as shown in Listing 3.

The modified template makes three changes. It introduces an extra column in the HTML table for displaying the price information. It adds a where clause that selects only those books available in "paperback" format. Finally, it adds an expression to query and display the price. The generated report is shown in Figure 4.
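
The same three changes can be traced in a runnable Python analogue. This is a sketch under assumptions: the format and price element names, the sample data, and the paperback_report function are invented, and the code stands in for the XQuery template rather than reproducing it.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical two-book catalogue; <format> and <price> are assumed names.
CATALOGUE_XML = (
    "<catalogue>"
    "<book><title>Sample Book One</title><author>Author One</author>"
    "<format>paperback</format><price>12.50</price></book>"
    "<book><title>Sample Book Two</title><author>Author Two</author>"
    "<format>hardcover</format><price>30.00</price></book>"
    "</catalogue>"
)

def paperback_report(xml_text):
    """Build an HTML table of title, author, and price for paperback books."""
    rows = []
    for book in ET.fromstring(xml_text).findall("book"):
        if book.findtext("format") != "paperback":   # change 2: the where clause
            continue
        # change 3: price is queried alongside title and author
        cells = [book.findtext(tag, "") for tag in ("title", "author", "price")]
        rows.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    # change 1: the header gains a Price column
    header = "<tr><th>Title</th><th>Author</th><th>Price</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"

report = paperback_report(CATALOGUE_XML)
print(report)
```

Each change touches one line of the query or one line of the markup, and nothing outside the report function.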

This simple example shows how easy it is to change both the content and the formatting of the report with the XQuery template approach.

Conclusion
XQuery is designed for XML data and has a powerful but easy-to-use syntax, much as SQL does for relational data. This makes it easy for domain experts to author templates. Using Report templates can result in faster development cycles and keeps the application and its reports independent of each other. This improves the overall turnaround time for report creation and also offers greater flexibility.

XQuery is a system-independent standard, applicable across all sorts of XML data. The majority of XML database vendors and tool vendors, and even some relational database vendors, have developed or announced plans for XQuery support. Therefore, XQuery template-based reporting is a promising approach.

References

  • XQuery 1.0 W3C Working Draft, Nov. 2002: www.w3.org/TR/xquery
  • XQuery Requirements, May 2003: www.w3.org/TR/xquery-requirements/
  • XQuery-Intro, Per Bothner, 2002: www.gnu.org/software/qexo/XQuery-Intro.html
  • InFlux Methodology, Infosys Technologies Limited: www.infosys.com/technology/influx.asp
  • "XQuery: A Flexible Query Language for XML," Alex Cheng: www.ddj.com/documents/s=7067/ddj0250a/cheng.htm
  • "Using XQuery to generate HTML," Per Bothner, Dec. 2002: www.xml.com/pub/a/2002/12/23/xquery.html
  • Qexo: The GNU Kawa XQuery implementation: www.gnu.org/software/qexo

    Seshu Kumar Adiraju is a technical architect with SETLabs, an applied research unit of Infosys Technologies Limited (www.infosys.com). His areas of interest include J2EE, EAI, and methodologies. He is currently part of the development team of InFlux Workbench, a business process modeling application developed by Infosys. Seshu Kumar has worked on various J2EE software development and Enterprise Application Integration projects. He holds a master of technology degree in chemical engineering from the Indian Institute of Technology, Bombay.

    Srinivas Thonse is a principal architect with SETLabs, the applied research unit of Infosys Technologies Limited (www.infosys.com). His areas of interest include object technology, methodologies, business process management, and software architecture. Srinivas has directed the architecture and design of many enterprise projects. He regularly mentors and trains architects and designers in software design and object methodologies. Srinivas holds a bachelor of engineering degree in computer science from Bangalore University.
