Converting PDF Files to HTML

Captions
In this session, we'll discuss how to convert a PDF file into an editable HTML file while maintaining all the images, tables, hyperlinks, and even the table of contents. There have been a few improvements to the PDF-to-HTML conversion process in this version of Acrobat, but you may find that you still need to clean up the HTML file in a tool like Adobe Dreamweaver after it's been exported.

If you're following along, I've got the Board Manual.pdf file open, and before we convert this file to HTML, I just want to point out a few things about it. This particular file contains an interactive table of contents where I can click any of the entries. It has headers and footers on some of the pages. You'll find bulleted lists and numbered lists, and this page has a table as well as graphics. This particular file also contains bookmarks that you can select to navigate to various sections of the document.

To convert this file to HTML, simply select the File > Save As Other > HTML Web Page command. In the Save As dialog, there's a really important button that you need to select first, called Settings, which I'll go ahead and do. In the Settings dialog, you can choose whether you want to produce a single HTML page or multiple pages, and for each of these options your existing headings and bookmarks can play a key role. For example, if you choose Single HTML Page, you have the option to create a separate HTML navigation frame based on your headings or bookmarks. Since I know this file has bookmarks, I'm going to go ahead and choose the bookmark option. If you want multiple HTML pages, which are much easier to navigate if you have a long document, you can also break the file apart based on your headings or bookmarks. The Multiple HTML Pages selection will also create a separate folder on your system with all the individual HTML pages and images.

I'm going to select Single HTML Page. Then, under the Content Settings section, you can choose to include your images and to remove headers or footers, which is important since the page numbering will no longer be the same. I'll keep both of these checked. The last setting is helpful if your PDF content originated as a scanned document. I'll click OK and Save, and now Acrobat will run through the process of converting the file to HTML.

Now I have opened the resulting file in Firefox. The export produces an XHTML 1.0 or HTML 4.01 conforming document. You'll see that there is a separate navigation pane on the left with all the links that were created from the PDF bookmarks, and I can click any one of these to navigate in the file. The table of contents at the top is still completely interactive. The various headings have been converted to HTML heading styles. Numbered lists are converted to ordered lists in HTML, so they're very easy to update. The table up here is still completely formatted, the headers and footers have been removed, and the graphic came across as well, in the center here.

You may have noticed that there are no specific controls for the graphics, so if you need more precise control over the format and the DPI of a graphic, you might consider using the Export All Images command. This is located in the Document Processing panel, and here you can choose various formats for export. Specifically for the web, you can choose JPEG, PNG, or JPEG 2000, and under the Settings button you have additional options.

If you want to set the defaults for the HTML export, like whether you choose single or multiple HTML pages all the time, these can be found under Edit > Preferences, or just Preferences on the Mac. Choose the Convert From PDF category, then select HTML and click the Edit Settings button. Here you'll find the same settings that we found in the Save As dialog.

You can also selectively export content in your file if you don't need everything. To do this, simply make sure the Select tool is selected in Acrobat, grab the content, right-click (or Control-click on the Mac), and choose Export Selection As. Then, in the dialog, make sure that HTML is selected under Save as type. I'm going to cancel out of here.

One final note: if you have the Pro version of Acrobat, you can convert PDF files to HTML in a batch process by using an action. Just open the Action Wizard, select the Create New Action button, choose the Save and Export panel, click Save and add it to the right column, click Specify Settings, click the Export File button, and choose HTML. Also remember that it's not possible to convert a PDF to HTML if document security has been set to disallow changes, and it's also not possible to convert forms that have been designed in Adobe LiveCycle Designer to HTML.
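
Since the exported file may still need cleanup, one quick way to see what survived the conversion is to inventory the HTML. The following is a minimal sketch in Python, not part of Acrobat or of the original session. It uses only the standard-library html.parser module; the names ExportInventory and inventory, and the SAMPLE markup standing in for a real exported file, are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class ExportInventory(HTMLParser):
    """Count the elements the session says should survive the export."""

    # Headings, lists, tables, images, and links (hypothetical selection).
    TRACKED = {"h1", "h2", "h3", "h4", "h5", "h6",
               "ol", "ul", "table", "img", "a"}

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.internal_links = 0  # links that point inside the same page

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in self.TRACKED:
            self.counts[tag] += 1
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith("#"):
                self.internal_links += 1


def inventory(html_text):
    """Return (element counts, number of in-page links) for an HTML string."""
    parser = ExportInventory()
    parser.feed(html_text)
    parser.close()
    return parser.counts, parser.internal_links


# Hypothetical stand-in for a small piece of exported HTML.
SAMPLE = """
<html><body>
<h1 id="intro">Introduction</h1>
<p><a href="#intro">Introduction</a> <a href="http://example.com">site</a></p>
<ol><li>First</li><li>Second</li></ol>
<table><tr><td>cell</td></tr></table>
<img src="images/figure1.jpg" alt="figure" />
</body></html>
"""

counts, internal = inventory(SAMPLE)
print(dict(counts), internal)
```

To check a real export, read the saved file's text and pass it to inventory. Comparing the counts of ordered lists, tables, images, and in-page links against what you expect from the PDF gives a rough signal of where manual cleanup may be needed.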
Info
Channel: Lori Kassuba
Views: 184,371
Id: ZTqtPginCEI
Length: 6min 27sec (387 seconds)
Published: Tue May 21 2013