The DocEvents platform provides event-driven iteration over generic documents, treated as nested sets of items.
The UnionStation framework is a highly abstracted way of treating various types of documents uniformly. Built on object-oriented methodology and a number of design patterns, the resulting system architecture supports vastly different applications in varying environments such as web and desktop. The universal data explorer dives into the internals of files in many different formats, from documents to records and from fonts to images. It is a great research and diagnostic tool for developers.
DocEvents is a document processing platform built on the UnionStation framework. It is designed for processing multi-page documents or multi-record datasets. Two key concepts are involved. First, a document is viewed as a sequence of objects, and the document itself takes care of how to iterate over these objects; the platform doesn’t dive into the document to discover things. Second, at each step of the iteration, a set of custom-defined actions is executed. In short, a document is like a “process”, while the processing actions are like “objects”.
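The two concepts above can be illustrated with a minimal sketch. It is not the DocEvents API: the class `Document` and the methods `on_item` and `run` are hypothetical names chosen for illustration, and pages are assumed to be plain strings.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical action signature: receives the item's position and the item itself.
Action = Callable[[int, str], None]


class Document:
    """A document that owns its iteration and fires actions on each item."""

    def __init__(self, items: Iterable[str]) -> None:
        self._items = list(items)
        self._actions: List[Action] = []

    def on_item(self, action: Action) -> None:
        # Register a custom-defined action to run at every iteration step.
        self._actions.append(action)

    def _iterate(self) -> Iterator[str]:
        # Only the document knows how its items are laid out; callers never
        # look inside it to discover pages or records.
        return iter(self._items)

    def run(self) -> None:
        # The document drives the loop and hands each item to every action.
        for index, item in enumerate(self._iterate()):
            for action in self._actions:
                action(index, item)


seen: List[str] = []
doc = Document(["page one", "page two", "page three"])
doc.on_item(lambda i, text: seen.append(f"{i}:{text.upper()}"))
doc.run()
print(seen)  # ['0:PAGE ONE', '1:PAGE TWO', '2:PAGE THREE']
```

The caller never loops over the pages; it only attaches an action and tells the document to run, which is the inversion the paragraph describes.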
Comparison with prior art
To begin with, let’s look at how people have been doing the job. At first, each specific operation required a different tool: merging documents into one, extracting indices about pages, splitting a document into smaller ones, and so on. People tried hard to make many different tools work together, and the makers of such tools added dozens of parameters to offer flexibility; yet the further they went in that direction, the more obstacles they created for themselves. This old-fashioned workflow has the following drawbacks:
- Discrete workflows, each with rigid restrictions and limited functionality
- Loss of information during repeated re-processing
- Maintenance issues between new and old tools
- Lots of tweaking of parameters and managing intermediate products
Later, people proposed a more unified production-line model, in which multiple components run in a hosted space or platform. All of these components are created by the same vendor and are closely related to each other. Each component still plays its own predefined role, but data sharing and processing control are handled better. The model has these characteristics:
- Documents are broken into pages to be passed around
- Each component runs independently but can’t be controlled individually
- Based on pages only; can’t deal with data records or page groups etc.
- Communication among components is poorly maintained
Now let us show off DocEvents
In short, it is like a car that drives itself. A document is no longer a static stack of pages waiting to be processed by workflows, however intelligently woven. Rather, the document is blessed with the ability to iterate over and process its pages on its own; in fact, the tools are given not to the workflow designers but to the document itself.
- A document is an intelligent process
- A document iterates itself as a series of pages, records, or anything else
- A document controls the workflow via event-based dynamic logic
- Each piece of business logic is a custom-built gadget
- Multiple paths of processing logic can be hooked on the same document
- Each path can work on a different type of item for a different purpose
- Efficient in-situ operations
- Minimized data migration or transformation
- Maximized preservation and usage of original data
- Not susceptible to loss in translation
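As a rough sketch of the points listed above, the following shows several processing paths hooked on one document, each bound to a different item type. The names `EventedDocument` and `hook` and the event labels "page" and "record" are assumptions made for this example and are not taken from DocEvents.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Iterator, List, Tuple

Handler = Callable[[Any], None]


class EventedDocument:
    """Hypothetical document whose pages each hold a list of records."""

    def __init__(self, pages: List[List[str]]) -> None:
        self._pages = pages
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def hook(self, item_type: str, handler: Handler) -> None:
        # Attach one processing path to one kind of item.
        self._handlers[item_type].append(handler)

    def _events(self) -> Iterator[Tuple[str, Any]]:
        # The document walks its own nested structure and announces each item.
        for page in self._pages:
            yield "page", page
            for record in page:
                yield "record", record

    def run(self) -> None:
        for item_type, item in self._events():
            # Handlers receive the original object in place; nothing is
            # copied or converted before it reaches them.
            for handler in self._handlers.get(item_type, ()):
                handler(item)


page_sizes: List[int] = []
name_index: List[str] = []

doc = EventedDocument([["alice", "bob"], ["carol"]])
doc.hook("page", lambda page: page_sizes.append(len(page)))       # path 1: pages
doc.hook("record", lambda rec: name_index.append(rec.title()))    # path 2: records
doc.run()

print(page_sizes)  # [2, 1]
print(name_index)  # ['Alice', 'Bob', 'Carol']
```

Both paths run during a single pass over the document, so no intermediate files or page dumps are produced between them.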
In this drawing, the “components” 1, 2, 3, 4, etc. are not true gadgets from a so-called toolkit or widget collection of the kind you would see elsewhere. As described above, each is a process that is created from XML and blessed into live actions. Simplicity is power.
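One way to picture actions being created from XML is sketched below. The XML layout (an `actions` root with `action` elements carrying `on` and `do` attributes) and the `BEHAVIORS` registry are invented for this illustration; the real DocEvents XML format is not described here. The final loop merely stands in for the document's own iteration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Tuple

# Hypothetical configuration: which behavior to run on which item type.
CONFIG = """
<actions>
  <action on="page" do="upper"/>
  <action on="page" do="length"/>
</actions>
"""

results: List[Tuple[str, object]] = []

# Hypothetical registry mapping a name in the XML to live code.
BEHAVIORS: Dict[str, Callable[[str], None]] = {
    "upper": lambda item: results.append(("upper", item.upper())),
    "length": lambda item: results.append(("length", len(item))),
}


def load_actions(xml_text: str) -> Dict[str, List[Callable[[str], None]]]:
    """Turn each <action> element into a callable, grouped by item type."""
    actions: Dict[str, List[Callable[[str], None]]] = {}
    for node in ET.fromstring(xml_text.strip()).findall("action"):
        item_type = node.get("on", "")
        behavior = BEHAVIORS[node.get("do", "")]
        actions.setdefault(item_type, []).append(behavior)
    return actions


actions = load_actions(CONFIG)
for page in ["cover", "index"]:  # stand-in for the document iterating itself
    for act in actions.get("page", []):
        act(page)

print(results)
# [('upper', 'COVER'), ('length', 5), ('upper', 'INDEX'), ('length', 5)]
```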