


Inside the Open eBook Publication Structure
Fuelling the eBook revolution

Should all go as planned for the Open eBook (OEB) Authoring Group, the Open eBook Publication Structure (OEPS) will be the catalyst that fuels the eBook revolution. The 50-company-strong group has devised a non-proprietary specification that details an XML-based eBook file format and structure. Files in the new format, commonly known as OEB documents, are available for use by all purveyors of electronic book content. The main rationale for the OEB format is the premise that "...in order for electronic-book technology to achieve widespread success in the marketplace, reading systems must have convenient access to a large number and variety of titles...". Vendors seem to be off to a solid start, with software-based eBook readers such as the Microsoft Reader and MobiPocket Reader already providing OEB compliance. Time will tell how this format compares with its more print-oriented competitor, Adobe's Portable Document Format (PDF). The remainder of this article gives an outline of the new file format. (Note: Planet Publish's sister site will soon publish all its free eBooks for download in EPUB and PDF formats.)

Overview

Publishers and authors, referred to as 'content providers' by the OEPS, provide publications to one or more reading systems in a form defined by the OEPS specification. A publication is a set of files/documents comprising the various media types, text, and graphics to be published, and a reading system is a combination of hardware and software used to view the OEB document. OEB documents are essentially XML documents which conform to the OEPS. There are two kinds:

  • Basic OEB Document
  • Extended OEB Document

The OEB specification is based on XML and hence ensures that for any basic OEB document, there is a syntax form that:

  • is a valid XML document
  • conforms fully to the OEB document DTD
  • is expected to conform to XHTML 1.0 when that specification is issued
  • is effectively previewable in typical version 4 HTML browsers

A publication that conforms to the above specification should include exactly one OEB package file. This is necessary for the reading system to recognize the objects within the publication.

OEB Package

An OEB package is a file that describes an OEB publication, namely its associated files and the access information. Simply stated, the OEB package specifies the OEB documents, images, bookmarks and other objects that make up the OEB publication and how they relate to each other.

It is highly recommended that all package files use the extension ".OPF", to distinguish them from the other files that make up a publication. Package files are of MIME (Multipurpose Internet Mail Extensions) media type "text/xml". The specification does not define a means of physically bundling files together into a single data transfer object (such as a zip or tar archive).

Whilst an OEB package must be a valid XML document conforming to the OEB package Document Type Definition (DTD), a publication is not required to physically include the OEB package DTD.
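As a rough illustration of the two rules above (the recommended ".OPF" extension and the requirement for exactly one package file per publication), a reading system might locate the package with a check along the following lines. This is only a sketch: the function name and the sample file names are hypothetical, and matching on the extension alone is an assumption, since the extension is recommended rather than required.

```python
from pathlib import PurePosixPath


def find_package_file(file_names):
    """Return the single package file among a publication's files.

    Assumption: package files are recognised by the recommended
    ".opf" extension, compared case-insensitively.
    """
    candidates = [
        name for name in file_names
        if PurePosixPath(name).suffix.lower() == ".opf"
    ]
    if len(candidates) != 1:
        raise ValueError(
            f"expected exactly one package file, found {len(candidates)}"
        )
    return candidates[0]


# Hypothetical publication contents.
publication = ["mybook.opf", "chapter1.html", "chapter2.html", "cover.png", "style.css"]
package_name = find_package_file(publication)
```

A publication with no package file, or with more than one, is rejected with an error rather than guessed at.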

Inside the package file (".OPF")

The major parts of the OEB package file include:

  • Package Identity—a unique identifier for the OEB publication as a whole.
  • Metadata—Publication metadata (title, author, publisher, etc.).
  • Manifest—A list of files (documents, images, style sheets, etc.) that make up the publication. The manifest also includes fallback declarations for files of types not supported by this specification.
  • Spine—An arrangement of documents providing a linear reading order.
  • Tours—A set of alternate reading sequences through the publication, such as selective views for various reading purposes, reader expertise levels, etc.
  • Guide—A set of references to fundamental structural features of the publication, such as table of contents, foreword, bibliography, etc.
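To make the parts listed above more concrete, here is a minimal sketch that parses a small, hypothetical package file and derives the linear reading order from the spine. The element and attribute names (`package`, `unique-identifier`, `dc-metadata`, `item`, `itemref`, `reference`) and the Dublin Core namespace URI are written from memory of OEPS 1.0 and should be treated as assumptions to be checked against the package DTD. Tours are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical package file. Element and attribute names are assumed
# from OEPS 1.0 and should be verified against the package DTD.
SAMPLE_OPF = """<?xml version="1.0"?>
<package unique-identifier="bookid">
  <metadata>
    <dc-metadata xmlns:dc="http://purl.org/dc/elements/1.0/">
      <dc:Title>Sample Book</dc:Title>
      <dc:Identifier id="bookid">sample-0001</dc:Identifier>
    </dc-metadata>
  </metadata>
  <manifest>
    <item id="toc" href="toc.html" media-type="text/x-oeb1-document"/>
    <item id="ch1" href="chapter1.html" media-type="text/x-oeb1-document"/>
    <item id="ch2" href="chapter2.html" media-type="text/x-oeb1-document"/>
    <item id="css" href="style.css" media-type="text/x-oeb1-css"/>
  </manifest>
  <spine>
    <itemref idref="toc"/>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
  <guide>
    <reference type="toc" title="Table of Contents" href="toc.html"/>
  </guide>
</package>
"""

DC = "{http://purl.org/dc/elements/1.0/}"  # assumed Dublin Core namespace


def read_package(opf_text):
    """Extract identity, title, manifest, reading order and guide."""
    root = ET.fromstring(opf_text)

    # Package identity: the metadata element whose id matches
    # the package's unique-identifier attribute.
    uid = root.get("unique-identifier")
    identifier = None
    for el in root.iter(DC + "Identifier"):
        if el.get("id") == uid:
            identifier = el.text

    # Manifest: map each item id to its file reference.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("manifest/item")
    }

    # Spine: resolve each itemref against the manifest, in order.
    reading_order = [
        manifest[ref.get("idref")]
        for ref in root.findall("spine/itemref")
    ]

    # Guide: map structural feature type to its location.
    guide = {
        ref.get("type"): ref.get("href")
        for ref in root.findall("guide/reference")
    }

    return {
        "identifier": identifier,
        "title": root.findtext(f".//{DC}Title"),
        "manifest": manifest,
        "reading_order": reading_order,
        "guide": guide,
    }


package = read_package(SAMPLE_OPF)
```

Note that the stylesheet appears in the manifest but not in the spine: the manifest lists every file in the publication, while the spine orders only the documents meant to be read in sequence.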

Cascading Style Sheets (CSS)

Cascading Style Sheets (CSS) are a mechanism that enables both authors and readers to attach style (e.g. fonts, colors, spacing) to HTML and XML documents. That is, CSS is needed to define the appearance of XML documents. CSS uses common desktop publishing terminology, which should make it easy for professional as well as untrained designers to make use of its features. There are two ways to create style sheets. Firstly, you can use a normal text editor to write them entirely by yourself. Secondly, you may prefer to use a tool that assists you in creating the CSS, such as SoftQuad's XMetaL. To stay in the eBook race and produce OEB-compliant eBooks, understanding CSS is a definite necessity.

How it applies to OEB

For the highly technical user who wants to know exactly how cascading styles apply to the Open eBook specification: the OEPS defines a style language based on the style sheet mechanisms CSS1 and CSS2, with a MIME media type of "text/x-oeb1-css". Stylesheets of other MIME media types may be substituted for the text/x-oeb1-css stylesheets at the discretion of the reading system.

Not all properties of the CSS1 and CSS2 mechanisms have been included in the OEB format. The CSS-based stylesheet constructs are included in order to define a baseline rendering functionality. Apart from the default properties of CSS1 and CSS2, a few other properties and values have been added to support page layout, headers, and footers.

Additionally, the specification supports the inline style attribute, the style element, and externally linked stylesheets. When processing stylesheets, the reading system is not required to handle XML namespaces. Reading systems that implement only the OEB CSS subset may ignore any stylesheets using other style languages, whereas those that support extended stylesheet functionality may choose among any of the other external stylesheets. Non-OEB elements may also be added to an OEB document, as long as such elements are provided with style definitions in accompanying style sheets.
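The stylesheet selection behaviour described above can be sketched as a simple filter on MIME media type. The function name, the (href, media type) pair layout, and the "text/xsl" entry are all hypothetical and chosen only for illustration. The one value taken from the specification is "text/x-oeb1-css".

```python
OEB_CSS_TYPE = "text/x-oeb1-css"


def usable_stylesheets(stylesheets, supported_types=(OEB_CSS_TYPE,)):
    """Keep only stylesheets whose media type the reading system supports.

    `stylesheets` is a list of (href, media_type) pairs; this layout is
    an assumption made for the example, not part of the specification.
    """
    supported = {t.lower() for t in supported_types}
    return [href for href, mime in stylesheets if mime.lower() in supported]


# Hypothetical stylesheets linked from one OEB document.
linked = [
    ("book.css", "text/x-oeb1-css"),
    ("book.xsl", "text/xsl"),
]

# A reading system implementing only the OEB CSS subset.
basic = usable_stylesheets(linked)

# A reading system with extended stylesheet support.
extended = usable_stylesheets(linked, supported_types=(OEB_CSS_TYPE, "text/xsl"))
```

Here the basic reading system silently ignores the stylesheet it does not understand, while the extended one keeps both and is free to choose between them.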

Not (immediately) a deliverable format

This early version of the specification does not address issues such as Digital Rights Management (DRM) and compressed distribution packaging (an intentional exclusion from the first release). As a result, OEB is unlikely to be seen as suited to secure and timely delivery over the Internet. Software developers and e-reading device manufacturers are therefore still likely to use their own digital wrappers for their end-user distributable files, in formats such as Microsoft's ".LIT" and MobiPocket's ".PRC".

For a publisher to make its OEB-compliant content available on different target devices, it must 'wrap' that content to comply with each reader-specific format. For example, if the publication is intended to reach a large number of recipients using a wide variety of readers, then a separate file must be created for each of Microsoft Reader, MobiPocket Reader, REB 1100 and REB 1200, not to mention Primer, goReader and Cybook, which are also due to release OEB-compliant readers in the near future.

What about PDF and OEB?

Technically, the eBook specification allows embedding of PDF files (or any other non-OEB file) in a publication, as long as that publication contains an alternate representation of the content for use by reading systems that lack support for that file type. However, since Open eBook reading systems are not required to support PDF, embedding is possible in principle but highly unlikely to be implemented in the real world. It seems much more likely that PDF content will be converted to an Open eBook document by format converters such as BCL's GoHTM. The same applies to other file formats, such as Quark and PageMaker (see our article on avenue.quark for more information on converting Quark to XML/OEB).
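The alternate-representation rule can be pictured as a chain of fallbacks in the manifest: a reading system follows the chain until it reaches a media type it supports. The sketch below assumes that each manifest item names its alternate through a `fallback` attribute holding another item's id. That attribute name, the dictionary layout, and the sample entries are assumptions for illustration, not details confirmed by this article.

```python
def resolve_item(manifest, item_id, supported_types):
    """Follow fallback declarations until a supported media type is found.

    Returns the href of the first usable representation, or None if the
    chain ends (or loops) without reaching a supported type.
    """
    seen = set()
    current = item_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against circular fallbacks
        entry = manifest.get(current)
        if entry is None:
            return None
        if entry["media-type"] in supported_types:
            return entry["href"]
        current = entry.get("fallback")
    return None


# Hypothetical manifest: a PDF with an OEB document as its alternate.
manifest = {
    "report": {
        "href": "report.pdf",
        "media-type": "application/pdf",
        "fallback": "report-oeb",
    },
    "report-oeb": {
        "href": "report.html",
        "media-type": "text/x-oeb1-document",
    },
}

# A reading system without PDF support ends up with the OEB document.
oeb_only = resolve_item(manifest, "report", {"text/x-oeb1-document"})

# A reading system that does support PDF uses the original file.
pdf_capable = resolve_item(
    manifest, "report", {"text/x-oeb1-document", "application/pdf"}
)
```

Because every reading system is guaranteed to reach the OEB alternate, the embedded PDF adds nothing for readers that cannot display it, which is part of why conversion looks like the more practical route.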

The Future of ePublishing

Electronic books are the next logical step for the publishing industry over the coming years. The Open eBook Publication Structure details a non-proprietary content format that may well provide a mechanism to facilitate this transformation. The major threat to the success of this initiative is likely to be the integration of distribution formats and DRM initiatives from vendors who are seeking to provide what the specification does not yet offer. This is a challenge that needs to be met if eBooks are in fact to be as interoperable as their paper cousins.






© 2007 Nitro PDF, Inc. All Rights Reserved.