PDF 2 HTML with Comments

Convert PDF to HTML with comments, via either a desktop application or a web service.

The PDF2HTML5 scheme creates an HTML version of PDF pages in print layout and format, so the web page looks exactly the same as the PDF page. Conversion can be done at the document level, creating thumbnails and outlines along with a navigator interface, or at the page level, creating a single AJAX-based output that hosts multiple embedded pages. It can even work at the sub-page level, converting a cropped part of a page (a “snip”) into HTML or a pure image. Link targets can be converted to HTML targets, a feature that is critical for multi-page documentation.

HTML5 Conversion

  • Single page, selected pages, or entire document; choice of monolithic or Ajax-friendly output.
  • Page can be freely scaled with all elements in proportion, even font sizes.
  • Original drawing order is preserved, with vector graphics automatically grouped and converted into transparent PNG.
  • Choice of Canvas + JavaScript or embedded PNG image for representing vector graphics.
  • Creation of embedded web font for high fidelity.
  • Text can be created with absolute or relative positioning, and can be reflowed for readability.
  • Advanced CSS settings such as for rotated text.
  • Built-in, highly polished Ajax reader.
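The proportional scaling and absolute positioning listed above can be illustrated with a minimal Python sketch. It is not the PDF2HTML5 implementation: the `TextRun` class, the `render_page` function, and the top-left coordinate convention are all assumptions made for illustration. The point is only that multiplying every coordinate and font size by one factor keeps all elements in proportion.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class TextRun:
    """A hypothetical text run extracted from a PDF page, in PDF points."""
    text: str
    x: float          # left edge, measured from the left side of the page
    y: float          # top edge, measured from the top side of the page
    font_size: float


def render_page(runs, page_width, page_height, scale=1.0):
    """Return an HTML fragment with every run absolutely positioned.

    Positions, page size, and font sizes are all multiplied by the same
    scale factor, so the layout stays in proportion at any zoom level.
    """
    parts = [
        f'<div class="page" style="position:relative;'
        f'width:{page_width * scale:.2f}px;height:{page_height * scale:.2f}px">'
    ]
    for run in runs:
        parts.append(
            f'<span style="position:absolute;'
            f'left:{run.x * scale:.2f}px;top:{run.y * scale:.2f}px;'
            f'font-size:{run.font_size * scale:.2f}px">{escape(run.text)}</span>'
        )
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    # A US Letter page (612 x 792 points) rendered at 150% zoom.
    print(render_page([TextRun("Hello, PDF", 72, 72, 12)], 612, 792, scale=1.5))
```

Relative positioning or reflowed text would need a different strategy, since reflow deliberately gives up the fixed coordinates used here.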

Advanced Features

  • Create thumbnails for all pages, as well as an auto-folding outlines page.
  • Form fields can be converted to equivalent HTML input elements, whose data can be submitted online and merged with the original PDF.
  • Comments can be loaded from, or saved to, a separate XML or Adobe Shared Review data file.
  • Comments can be created, edited, dropped, and exported with built-in JavaScript.
  • PDF properties can be preserved in HTML meta tags.
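As an illustration of the last item, here is a minimal Python sketch that turns a dictionary of PDF document properties into HTML head elements. The function name `properties_to_head` and the mapping used (Title becomes `<title>`, every other property becomes a lowercase-named `<meta>` tag) are assumptions for this example, not a description of what PDF2HTML5 emits.

```python
from html import escape


def properties_to_head(properties):
    """Build HTML head lines from a dict of PDF document properties.

    Assumed mapping: "Title" becomes a <title> element; every other
    non-empty property becomes <meta name="..." content="...">.
    """
    lines = []
    for name, value in properties.items():
        if value is None or str(value) == "":
            continue  # skip properties the PDF did not set
        if name.lower() == "title":
            lines.append(f"<title>{escape(str(value))}</title>")
        else:
            meta_name = escape(name.lower(), quote=True)
            content = escape(str(value), quote=True)
            lines.append(f'<meta name="{meta_name}" content="{content}">')
    return "\n".join(lines)


if __name__ == "__main__":
    print(properties_to_head({
        "Title": "Annual Report",
        "Author": "J. Doe",
        "Keywords": "finance, 2013",
    }))
```

Escaping the values matters because PDF properties are free text and may contain quotes or angle brackets.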

Online Demo

Comments are an integral part of the PDF2HTML5 framework: annotations in a PDF can be saved into XML that is dynamically loaded by JSON commands and then organized into live HTML elements with jQuery. Moreover, users can add new comments by drawing directly on the page, a feature that depends heavily on JavaScript programming.

A simplified demo is accessible here. UnionStation, the toolset behind it, is able to handle Adobe Shared Review XML data as well as the more traditional XFDF data. The hierarchy and relationships between the comments are recognized and preserved.
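To make the idea of preserving comment hierarchy concrete, here is a minimal Python sketch that reads XFDF annotations and rebuilds the reply threads. It is not UnionStation's code and does not cover the Adobe Shared Review format. It relies only on standard XFDF attributes (`name`, `inreplyto`, `page`, `title`) and the `contents` child element; the function name `load_comment_tree`, the dictionary layout, and the sample data are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "{http://ns.adobe.com/xfdf/}"

# Hypothetical sample data: two top-level comments and one reply.
SAMPLE_XFDF = """
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <annots>
    <text page="0" name="c1" title="Alice"><contents>Check this figure</contents></text>
    <text page="0" name="c2" title="Bob" inreplyto="c1"><contents>Fixed</contents></text>
    <highlight page="1" name="c3" title="Alice"><contents>Typo here</contents></highlight>
  </annots>
</xfdf>
"""


def load_comment_tree(xfdf_text):
    """Parse XFDF text and return top-level comments with nested replies."""
    root = ET.fromstring(xfdf_text.strip())
    annots = root.find(f"{XFDF_NS}annots")
    if annots is None:
        return []

    # First pass: index every named annotation.
    nodes = {}
    for el in annots:
        name = el.get("name")
        if name is None:
            continue  # a reply cannot reference an unnamed annotation
        nodes[name] = {
            "name": name,
            "type": el.tag.replace(XFDF_NS, ""),
            "page": int(el.get("page", "0")),  # XFDF pages are zero-based
            "author": el.get("title", ""),
            "text": el.findtext(f"{XFDF_NS}contents", default=""),
            "parent": el.get("inreplyto"),
            "replies": [],
        }

    # Second pass: attach each reply to its parent; the rest are top level.
    top_level = []
    for node in nodes.values():
        parent = nodes.get(node["parent"])
        if parent is not None:
            parent["replies"].append(node)
        else:
            top_level.append(node)
    return top_level


if __name__ == "__main__":
    for comment in load_comment_tree(SAMPLE_XFDF):
        print(comment["page"], comment["author"], comment["text"],
              [reply["text"] for reply in comment["replies"]])
```

A tree like this can be serialized to JSON and handed to browser-side script, which is one plausible way to arrive at the live HTML elements described above.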



Ideally, comments would be collected and submitted to a server, which would then dispatch them to online users in real time and store them in a database. This kind of collaboration is beyond the scope of the HTML conversion itself.

Desktop PDF/HTML Program

We provide a desktop program called PDF2HTML5 Experience that shows off the HTML conversion part of the project. It is free to use but carries no warranty or support, and it has an expiry date. Its purpose is to help you explore the various options of HTML conversion, such as zoom level, font substitution, rasterization, use of canvas, and thumbnails.

Use the “Open” button to load a PDF into the page viewer, choose a font mapping (use the “Auto” button there), and then click the “Make” button to start the conversion. The result is one or more HTML files plus a folder for image resources, some CSS files, and some JavaScript files. You can use one of the drop-down menu items on the “View” button to examine the output.



The above screenshot shows a conversion output, enhanced with thumbnail-based navigation.



About the Author: Cyphia