UnionStation Explorer: a unique tool for exploring the internals of diverse documents in a uniform way.
UnionStation can decode many types of files: images such as JPEG, TIFF, PNG, ICO, and GIF; page-based documents such as PDF and AFP; font files such as TrueType, Type-1, and CID fonts; ZIP-based formats; Office documents; and other commonly seen files. Its strongest advantages are in PDF and AFP technologies and in font and image handling. The following introduction uses PDF as the example, but the concepts and practices apply to all other data formats, because the explorer itself is format-neutral.
Supported File Types: AFP, BMP, CFF, CID, CSO, CSV, DB, DOC, DOCX, EMF, FDF, GIF, ICC, ICM, JPG, MNG, OTF, PDF, PFB, PNG, PPS, PPT, PPTX, PUB, RAT, SVG, TAR, TIF, TTC, TTF, VSD, WDP, WMF, WOFF, WPD, XLS, XLSX, XML, ZIP
The explorer is a utility for drilling into a given document, say a PDF document, as if it were a collection of folders and items, revealing and visualizing as much information as it can. It can also dive into the internals of files embedded in a PDF, such as a font. The tool is written entirely in C# on .NET and has a clean, intuitive, yet powerful interface. As the program evolves, the interface might differ slightly from the screenshots.
* Most functions do not expire, so you can keep using them. However, the object actions (often used to create disk files) and the ability to save as new files (so far available for PDF) are disabled until a license is obtained. Please use the form below to contact us.
Part 1: Basic Use
Once you open a file, the initial screen shows a collection of items, including virtual “folders” and “items”. For example, if you open a PDF file, there is a virtual folder called “Pages” and an entry for the document information. Some of the items are expandable, such as the “Resources” item, which can be expanded into a folder on demand: double-click it to see exactly what resources this PDF carries. It is not expanded until you ask for it, because expanding it is a time-consuming operation.
When a “folder” or “expandable item” is expanded, you have a choice of expanding it in a separate card (table) that is displayed side-by-side, where each card can be independently explored.
Viewing Multiple Collections
The program supports loading multiple files at the same time, and viewing multiple folders of the same document in sliding tabs. This makes it convenient to examine the internals of several different data items simultaneously, without having to scroll back and forth in a plain text editor.
Double-clicking a collection takes you into it; holding the Ctrl key while double-clicking opens it in a new slide tab. Alternatively, select a collection or an expandable item and press the arrow button in the bottom toolbar for the same effect. Each of these additional tabs can be closed.
Radial Menu and Visualizers
When you select an item or folder, the radial menu for visualizers appears, offering many ways to visualize the corresponding data object, be it XML, an image, or a renderable page. The available ones are marked with a circular arc. Some items support custom visualizations, offered in a submenu that you open by clicking the arrow part of the border. However, before we talk more about the visualizations, there is a powerful feature we should introduce first.
Did you notice the highlighted little eye symbol at the bottom left corner? It controls whether the radial menu appears. If it is checked, the radial menu floats above the view in the form of an eye, and it expands into full view when you click on it.
Item Property Grid
A list view item can be further examined for properties if you click on the magnifying glass icon. This is where you can explore the details of an object to help diagnose whatever problem you are working on.
This is one of the most wanted features: you can view the raw PDF data conveniently, without loading the file into a text editor that can cope with the binary parts of a PDF and hoping that saving it will not turn the file into invalid garbage. If your job involves debugging PDF, you will appreciate the hierarchical representation of raw data, especially the ability to load referenced objects in the same view in a pop-up window: just click the drop-down arrow.
Data Exploring and Editing
In fact, when browsing a PDF, you can edit a dictionary, an array, a stream object’s dictionary, or even a string in a GUI, each in a way suited to its type. Just click the drop-down arrow that appears when the corresponding data entry is selected. Note the green check button: it means your changes can be saved. The modified content can then be saved into a new PDF file using the “Save” button at the top right corner. Be extremely careful when you make such modifications: PDF is a very sensitive data structure, and you must know what you are doing.
Part 2: Data Visualization
As mentioned above, there are several different ways to visualize an item. One typical way is to view the content of a page, or equally of an XObject, an annotation’s appearance, a form field’s appearance, and so on. The viewer comes with convenient functions such as highlighting the bounding boxes of items, showing the background as transparent, showing the names of XObjects or annotations on screen, and even printing. All of them are valuable tools for developers.
Sometimes you want to view the raw bytes of a stream object, the decoded raw PDF operation commands, or the content of an XML item such as the metadata or a form definition. All of these are supported, and more. The text editor can highlight the content according to syntax; the XML viewer can show tags in a tree; and the byte viewer can show a base64-encoded version or convert from base64, which is handy for sending data by email between colleagues. Here are some examples:
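For instance, the base64 round trip offered by the byte viewer can be reproduced outside the tool. The following is a minimal Python sketch, assuming the stream bytes have already been extracted; the function names and the sample bytes are illustrative and are not part of UnionStation.

```python
import base64


def to_base64_text(raw: bytes) -> str:
    """Encode raw stream bytes as ASCII base64 text that survives email."""
    return base64.b64encode(raw).decode("ascii")


def from_base64_text(text: str) -> bytes:
    """Decode base64 text back to bytes, ignoring line breaks added by mail clients."""
    compact = "".join(text.split())  # drop spaces and newlines before decoding
    return base64.b64decode(compact, validate=True)


# Hypothetical sample: a few content-stream bytes with non-text values mixed in.
sample = b"\x00\x01BT /F1 12 Tf ET\xff"
encoded = to_base64_text(sample)
restored = from_base64_text(encoded)
```

Stripping whitespace before decoding is a design choice for the email use case, where long lines are often wrapped in transit.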
To format text, first click the drop-down arrow to select a language, then click the “Format” button. You can switch between fixed-width and variable-width fonts. When you change the text, you should reapply formatting. The text viewer automatically converts inline binary data, such as binary values in strings or inline image data, into hex-encoded text form, so that you can view it correctly.
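The idea of showing binary string values in hex-encoded form can be sketched in a few lines of Python. This is only an illustration based on the two string notations of PDF syntax (literal strings in parentheses, hexadecimal strings in angle brackets); how the text viewer actually decides when to convert is an assumption here, and all function names are hypothetical.

```python
def is_printable_ascii(data: bytes) -> bool:
    """True when every byte is a visible ASCII character or a space."""
    return all(0x20 <= b <= 0x7E for b in data)


def to_pdf_hex_string(data: bytes) -> str:
    """Render bytes in PDF hexadecimal string notation, for example <00FF41>."""
    return "<" + data.hex().upper() + ">"


def display_form(data: bytes) -> str:
    """Show printable data as a literal string, anything else as hex (assumed rule)."""
    if is_printable_ascii(data):
        text = data.decode("ascii")
        # Backslashes and parentheses are escaped inside a PDF literal string.
        text = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return "(" + text + ")"
    return to_pdf_hex_string(data)
```

With this rule, `display_form(b"Hello")` yields `(Hello)`, while a value containing non-printable bytes such as `b"\x00\xffA"` yields `<00FF41>`.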
A very important feature is that you can actually save changed content if you are editing a page or an XObject. Again, be sure about the syntactic and semantic validity of the text. All output files are for research purposes only.
Right-clicking the ellipsis on the toolbar expands it to reveal a couple of buttons, one of which shows the image-visualizable items in an album view. This can be quite useful when you are examining the glyphs in an embedded Type-1 or TrueType font, viewing all the shared images found in a document, or viewing the inline images found on a page.
What’s more, you can save selected images to disk files. A similar function is found in the single-image visualizer; however, that visualizer shows the image at its original dimensions and can zoom, flip, or rotate it, while the album view shows only a thumbnail that does not exceed a predefined size.
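The size limit on album thumbnails comes down to a small piece of arithmetic. The Python sketch below shows one common way to compute it, keeping the aspect ratio and never enlarging small images; the 128-pixel default and the function name are assumptions for illustration, not values taken from the tool.

```python
from typing import Tuple


def thumbnail_size(width: int, height: int, max_side: int = 128) -> Tuple[int, int]:
    """Scale (width, height) down so neither side exceeds max_side.

    The aspect ratio is preserved, and images that already fit are left unchanged.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    # The tighter of the two constraints wins; 1.0 prevents upscaling.
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 1024 x 512 image becomes 128 x 64 under the assumed limit, while a 100 x 50 image keeps its size.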
Some items or collections have custom visualization methods. For example, the “Pages” collection of a PDF or the “Page Group” of an AFP can be shown as thumbnails, which are easy to browse and switch between. Other visualization methods include an incoming or outgoing reference map of objects, a document structure tree map, or the data structure itself.
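To make the idea of incoming and outgoing reference maps concrete, here is a minimal Python sketch. It assumes the objects of a PDF have already been split into a mapping from object number to raw body text, and it finds indirect references of the form `12 0 R` with a simple pattern. This is a naive illustration: a real parser must skip strings and stream data, and nothing here reflects how UnionStation builds its maps.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches an indirect reference such as "12 0 R" (object number, generation, R).
REF = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def outgoing_refs(objects: Dict[int, bytes]) -> Dict[int, List[int]]:
    """Map each object number to the sorted object numbers it refers to."""
    return {
        num: sorted({int(m.group(1)) for m in REF.finditer(body)})
        for num, body in objects.items()
    }


def incoming_refs(outgoing: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """Invert the outgoing map: which objects refer to each target."""
    incoming = defaultdict(list)
    for source, targets in outgoing.items():
        for target in targets:
            incoming[target].append(source)
    return {target: sorted(sources) for target, sources in incoming.items()}


# Hypothetical four-object document used only for illustration.
objects = {
    1: b"<< /Type /Catalog /Pages 2 0 R >>",
    2: b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    3: b"<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>",
    4: b"<< /Length 0 >>",
}
out_map = outgoing_refs(objects)
in_map = incoming_refs(out_map)
```

In this sample, object 3 (the page) refers to objects 2 and 4, and object 2 (the page tree) is referred to by objects 1 and 3.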