Monday, November 20, 2006

SOA for viewable documents

As customers extend their SOA strategies, a question keeps coming up: is SOA a good fit for documents (such as PDFs, TIFFs and Word documents) and other binary content? Of course, SOA powered by an Enterprise Service Bus (ESB), or some other mechanism for composing services, can handle binary data. It's just a question of whether it really makes sense to push all of this data through it.

My background is in document imaging. Imaging and high volume document management systems typically have built up extremely functional image and document viewing capabilities over the course of their often lengthy existences. These viewing capabilities were built with the following design criteria:

  • Responsiveness - how quickly can a user be presented with the specific information they are requesting, so they can continue working?
  • Server performance - do not request more data from the server than the user really needs to view the document. Don't send unnecessary resolution, color or pages, depending on predefined user requirements.
  • Network performance - in the days before even 100Mb networks were commonplace, managing network usage was important. In large-scale or distributed implementations, it still is.
  • Seamless presentation of multiple types - documents come in many different types. To process them easily, the user should not have to navigate a different native application for each type, let alone wait for some of those applications to load.
  • Onion-skin annotation and redaction - the ability to mark up any type of document you can view, without damaging the original document, is essential in some environments.
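
To make the server and network criteria concrete, here is a minimal sketch of the kind of request a viewer might build: one page only, at just enough resolution to fill the screen. The endpoint path, the parameter names (`page`, `dpi`, `color`) and the host `repository.example` are all made up for illustration and do not belong to any particular product.

```python
from urllib.parse import urlencode


def build_page_request(base_url, doc_id, page, viewport_width_px,
                       page_width_in, max_dpi=300, grayscale=True):
    """Build a GET URL asking for a single page at just enough resolution.

    Hypothetical API: the URL layout and parameter names are assumptions.
    """
    # Resolution needed for the page to fill the viewport width,
    # capped at the native resolution of the stored scan.
    needed_dpi = min(max_dpi, round(viewport_width_px / page_width_in))
    params = {
        "page": page,                            # only the page being viewed
        "dpi": needed_dpi,                       # no more resolution than the screen can show
        "color": "gray" if grayscale else "full",
    }
    return f"{base_url}/documents/{doc_id}/render?{urlencode(params)}"


# Page 3 of a letter-size (8.5 inch wide) document in a 1024 pixel wide viewer.
url = build_page_request("http://repository.example", "INV-1001", 3, 1024, 8.5)
print(url)
# http://repository.example/documents/INV-1001/render?page=3&dpi=120&color=gray
```

In this example the viewer asks for 120 dpi rather than the full 300 dpi scan, and the other pages are never transferred unless the user turns to them.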

Viewer technology was largely based on a thick-client paradigm to allow it to meet most of these requirements. Stellent (to be acquired by Oracle) offers a range of image viewer technology as its Outside In product line, which has been the backbone of many thick-client image viewer apps. Spicer offers a range of viewers, especially focusing on complex CAD formats. There are others as well, but their number is limited.

Even now, there are very few applications that can present thin-client views of documents and meet the previous design criteria. Daeja is one third-party Java applet that can be integrated to meet this type of requirement for image and PDF documents. Global 360 has powerful thin-client image viewing, annotation and capture to support its BPMS and Content products.

The thing with all of these options is that they have traditionally been designed to plug directly into their own imaging repository, either through proprietary TCP/IP file transfers or, for the thin-client versions, through standard HTTP GET requests. And that is just for the image viewing; the upload of annotations is specific to each application. This does not fit SOA well where organizations take the approach to extremes and insist that the ESB sits between all end-user applications and their servers, using pure SOAP web services.
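
To put a rough number on one cost of the pure SOAP route, the sketch below assumes the document bytes are carried inline in the XML message as Base64 text. That is an assumption about how the service is built; attachment mechanisms such as MTOM exist to avoid it. The 300 KB page size is just a stand-in figure.

```python
import base64
import os


def inline_payload_size(binary: bytes) -> int:
    """Size in bytes of the data once Base64-encoded for embedding in XML."""
    return len(base64.b64encode(binary))


# Random bytes standing in for a 300 KB scanned page image.
page_image = os.urandom(300_000)

encoded = inline_payload_size(page_image)
overhead = encoded / len(page_image) - 1

print(f"raw: {len(page_image)} bytes, base64: {encoded} bytes, overhead: {overhead:.0%}")
# raw: 300000 bytes, base64: 400000 bytes, overhead: 33%
```

Base64 turns every 3 bytes into 4 characters, so each page grows by about a third before the envelope and any processing on the bus are counted. A plain HTTP GET from the viewer to the repository returns the raw bytes.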

Many of the advantages of the viewers come from the smart ways they access their servers to get the best performance. It seems like a poor use of resources to build an SOA layer between a viewer application and its related repository server just to enforce a dogmatic approach to SOA, unless it is really justifiable to be able to reuse any viewer technology (of which, as I say, there is limited choice) or to allow a single viewer to access any repository (a complex proposition to do well).

Even if ESBs can handle this type of binary file effectively and efficiently, I am struggling to see whether this is really a pragmatic approach to SOA. There must be some value to doing this that I have missed, probably because of my outdated background in this area.
