Chapter 2. Content Handling

An important goal in the design of the XOE Framework was to simplify handling the diverse array of content that most users encounter daily. Content on the Web can be thought of as simple blobs of data that might be converted into something useful to the user. The key piece of information about any data blob is its type. Once the type of the data is discovered, a handler is required. The handler is a service which claims to understand the relevant data type and, given the data blob, returns the desired content, be it an HTML page, an MP3 file, or any of the myriad other types that might be encountered.

Content Type

XOE Content Types consist of three pieces of information: the MIME Type, the Document Type Name, and the Document Type Namespace.

Given a URL, there are various ways to tell the type of data to which the URL refers. With HTTP, the most obvious way is to ask the web server for the URL's Content-Type, which will be returned in the form of an Internet Media Type and Subtype, collectively known as the MIME Type.
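As a rough sketch of the HTTP approach, the following uses the standard java.net.HttpURLConnection class to issue a HEAD request and read the Content-Type header. The names ContentTypeProbe, probe and stripParameters are hypothetical and not part of the XOE API; how XOE itself performs this query is not described here.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of the XOE API.
final class ContentTypeProbe {
    // Drops parameters such as "; charset=UTF-8" so only "type/subtype" remains.
    static String stripParameters(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        int semicolon = headerValue.indexOf(';');
        String mime = semicolon >= 0 ? headerValue.substring(0, semicolon) : headerValue;
        return mime.trim().toLowerCase(Locale.ROOT);
    }

    // Asks the server for the Content-Type without downloading the body.
    static String probe(String url) throws IOException {
        HttpURLConnection connection =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        connection.setRequestMethod("HEAD");
        try {
            return stripParameters(connection.getContentType());
        } finally {
            connection.disconnect();
        }
    }
}
```

A HEAD request is used here so that only the headers travel over the network; a server that does not support HEAD would need a GET instead.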

Another option is to look up the MIME Type mapping based on the filename suffix in the URL. For the URL: http://www.xoe.org/xoe.pdf, the filename suffix pdf would be used to find the corresponding MIME Type, if one has been registered. In this example, the MIME Type application/pdf would be returned, indicating that the data was encoded in the Portable Document Format.
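The suffix lookup can be sketched with a plain map. The class SuffixMimeTypes and its methods are assumptions made for illustration; the actual XOE registration mechanism is not shown in this chapter.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical suffix registry, not part of the XOE API.
final class SuffixMimeTypes {
    private final Map<String, String> bySuffix = new HashMap<>();

    void register(String suffix, String mimeType) {
        bySuffix.put(suffix.toLowerCase(Locale.ROOT), mimeType);
    }

    // Returns the MIME Type registered for the URL's filename suffix, or null.
    String lookup(String url) {
        // Ignore any query string or fragment after the path.
        int end = url.length();
        int query = url.indexOf('?');
        if (query >= 0) {
            end = Math.min(end, query);
        }
        int fragment = url.indexOf('#');
        if (fragment >= 0) {
            end = Math.min(end, fragment);
        }
        String path = url.substring(0, end);

        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash || dot == path.length() - 1) {
            return null; // no suffix in the last path segment
        }
        return bySuffix.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        SuffixMimeTypes types = new SuffixMimeTypes();
        types.register("pdf", "application/pdf");
        System.out.println(types.lookup("http://www.xoe.org/xoe.pdf"));
    }
}
```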

When using XOE, one of the most commonly encountered MIME Types is text/xml, which refers to a text file containing XML data. In many cases, this MIME Type is too broad and doesn't provide the granularity needed to handle the content properly. For example, the data used by the XOE Calendar application has a MIME Type of text/xml. How then, when handling this data, does the Framework know to launch the Calendar? One more piece of information is needed to narrow the scope of this type. It is found by parsing the XML file and extracting the document element's name and namespace. Taken together, these values provide what we call the Document Type Name and Namespace.
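A minimal sketch of extracting the document element's name and namespace, using the standard JAXP DOM parser. The class DocumentTypeSniffer and the namespace urn:example:calendar are made up for illustration; the real Calendar namespace and the parser XOE uses internally are not given in this chapter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the XOE API.
final class DocumentTypeSniffer {
    // Returns { local name, namespace URI } of the document element.
    // The namespace URI is null when the element is not in a namespace.
    static String[] sniff(InputStream xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getLocalName/getNamespaceURI
            Element root = factory.newDocumentBuilder().parse(xml).getDocumentElement();
            return new String[] { root.getLocalName(), root.getNamespaceURI() };
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        // The namespace below is an invented example value.
        String xml = "<calendar xmlns=\"urn:example:calendar\"/>";
        String[] docType = sniff(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(docType[0] + " " + docType[1]);
    }
}
```

This sketch parses the whole document for simplicity; a streaming parser could stop as soon as the document element is seen.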

The Stash

The Stash is a virtual read/write file system provided as the primary storage mechanism for the XOE Framework. The Stash is implemented as a DOM tree of content elements which are essentially wrappers around the content they represent. Content from the Stash is accessed by using the stash:/ protocol.

A listing of the Stash reveals:

An example of a Stash URL would be: stash:/home/default/data/tvt-addressbook which refers to the directory where the default user's address entry data might be stored.

To fetch the content programmatically:


    ContentElement addressDirectory = Stash.fetch ("stash:/home/default/data/tvt-addressbook");
  

Content Elements

When accessing content over the web, in most cases the data should be retrieved only when there is a means to handle it once it is downloaded to the device. To facilitate this, a content element is used to provide a proxy to the content, which allows for discovery of the type and verification that a handler exists before the binary data is downloaded and stored locally.
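The proxy idea can be illustrated with a small stand-alone sketch. LazyContent and HandlerRegistry are hypothetical stand-ins, not the XOE ContentElement class or its handler lookup; the point is only that the type is known up front while the data is fetched on demand.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a content element: the type is known up front,
// the data is only fetched when open() is called.
final class LazyContent {
    private final String type;
    private final Supplier<InputStream> loader;

    LazyContent(String type, Supplier<InputStream> loader) {
        this.type = type;
        this.loader = loader;
    }

    String getType() {
        return type;
    }

    InputStream open() {
        return loader.get();
    }
}

// Hypothetical registry of types for which a handler exists.
final class HandlerRegistry {
    private final Set<String> handledTypes = new HashSet<>();

    void register(String type) {
        handledTypes.add(type);
    }

    // Opens the content only when a handler is registered for its type;
    // otherwise returns null and nothing is downloaded.
    InputStream openIfHandled(LazyContent content) {
        return handledTypes.contains(content.getType()) ? content.open() : null;
    }

    public static void main(String[] args) {
        LazyContent pdf = new LazyContent("application/pdf",
            () -> new ByteArrayInputStream(new byte[0]));
        HandlerRegistry registry = new HandlerRegistry();
        System.out.println(registry.openIfHandled(pdf) != null);
        registry.register("application/pdf");
        System.out.println(registry.openIfHandled(pdf) != null);
    }
}
```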

A content element consists of:

The Content Cache

The XOE Content Cache is a local cache which encompasses all the data in the Framework, including content pulled from the Web as well as local content stored in the Stash. Using this single entry point to all cached data, services can pass arbitrary URLs to the Content Cache without having to know where the content resides, greatly simplifying data handling.

The following is a programmatic example in which we fetch some content via its URL and look at its type before actually getting the binary data.


// fetch some content by URL
ContentElement content = ContentCache.fetch (url);

// did we find some content?
if (content != null)
  {
    // get its type
    ContentType type = content.getType ();

    // is this the type we want?
    if (type.equals (theTypeWeWant))
      {
        // read the binary data
        readTheData (content.getInputStream ());
      }
  }