HerGeekness Says: Convert Any File Part III

If you’re new to this series, you may want to start from the beginning:
HerGeekness Says: Convert a File, Any File
HerGeekness Says: Convert Any File Part II

I Have a PDF. Now What?
Okay, you went through whatever rigmarole you had to go through to squeeze a PDF out of the client’s file. That may be all that’s necessary — if you don’t need to edit anything in the client’s file, and the colors are fine, simply import the PDF into your project and place it as artwork. You might need to rasterize the PDF in Photoshop or export it to EPS format from Acrobat Professional if your layout program can’t import PDF files.
More often, though, you want only some of the text and graphics, and you need them in editable format. There are myriad ways to approach the challenge. The ones I describe below work in both Acrobat 8 and 9 (and probably in earlier versions, too, but I don’t have any installed to verify that).
First, to extract all the text in a PDF to an external file, suitable for placing into another document and formatting there, open the PDF in Acrobat Pro and go to the File > Export submenu (Figure 1). One or more of the options there should get you on your way.
Figure 1. A quick way to get all the formatted text out of a PDF and into a single file is to open it in Acrobat and let its Export commands do their job.
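If Acrobat Pro isn’t handy, a similar everything-at-once text dump can be scripted. The sketch below is not Acrobat’s Export command. It assumes the open-source Poppler tool `pdftotext` is installed and on your PATH, and the function names are hypothetical. It also produces plain text only, so any formatting is lost.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for Poppler's pdftotext (assumed to be installed)."""
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical arrangement of the text
        command.append("-layout")
    command += [str(pdf_path), str(txt_path)]
    return command


def export_text(pdf_path, txt_path=None, keep_layout=True):
    """Write all the text in pdf_path to a .txt file and return that file's path."""
    pdf_path = Path(pdf_path)
    # Default to a .txt file next to the PDF, e.g. client.pdf -> client.txt
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
    return txt_path
```

Calling `export_text("client.pdf")` (a hypothetical file name) would leave a `client.txt` next to the PDF, ready to place into a layout and format there.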
To extract the images, don’t use the File > Export > Image command, which makes an image file out of each entire page in the PDF. Instead, choose Advanced > Document Processing > Export All Images, which results in a folder full of all the photos in the document, suitable for editing or re-using elsewhere (Figure 2).
Figure 2. You can extract every raster image in a PDF into separate image files with the Export All Images command, and then manipulate them and place them in any other program file.
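A scripted stand-in for pulling every raster image out of a PDF might look like the sketch below. It is not Acrobat’s Export All Images command. It assumes Poppler’s `pdfimages` tool is installed and on your PATH, and the function and folder names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, output_prefix):
    """Build the argument list for Poppler's pdfimages (assumed to be installed)."""
    # -all writes each embedded image in its native format where it can
    return ["pdfimages", "-all", str(pdf_path), str(output_prefix)]


def export_all_images(pdf_path, output_dir):
    """Extract the embedded raster images into output_dir and list the new files."""
    pdf_path = Path(pdf_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages not found; install Poppler first")
    # pdfimages names its output <prefix>-000.<ext>, <prefix>-001.<ext>, and so on
    prefix = output_dir / pdf_path.stem
    subprocess.run(pdfimages_command(pdf_path, prefix), check=True)
    return sorted(
        p for p in output_dir.iterdir() if p.name.startswith(pdf_path.stem + "-")
    )
```

For example, `export_all_images("newsletter.pdf", "extracted")` would fill a folder named `extracted` with one file per embedded image, much like the folder Acrobat produces.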
You can also open the PDF in Illustrator with its File > Open command, effectively “converting” it to an editable Illustrator file. Illustrator can open and convert only one page of a PDF at a time, but that might be all you need. Use the Selection tool to select the elements you want to isolate, then copy and paste them into separate Illustrator files. Close the PDF without saving changes (Illustrator is not suited for editing PDFs, and it can really mess them up), or work on a copy of the PDF if you don’t trust yourself to stay away from Command/Ctrl-S.
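The work-on-a-copy habit is easy to script so the client’s original never gets opened for editing. This is a minimal sketch using only Python’s standard library. The function names and the `-copy` suffix are hypothetical choices, not anything Illustrator or Acrobat provides.

```python
import shutil
from pathlib import Path


def working_copy_path(pdf_path, suffix="-copy"):
    """Return a sibling path, such as logo-copy.pdf for logo.pdf."""
    pdf_path = Path(pdf_path)
    return pdf_path.with_name(pdf_path.stem + suffix + pdf_path.suffix)


def make_working_copy(pdf_path):
    """Duplicate the PDF and return the path of the duplicate."""
    source = Path(pdf_path)
    target = working_copy_path(source)
    if target.exists():
        # Refuse to clobber an earlier working copy
        raise FileExistsError(f"{target} already exists; not overwriting it")
    shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
    return target
```

Open the file returned by `make_working_copy` in Illustrator, and an accidental save can only touch the duplicate.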
Most often, I need to grab only one element in a PDF — a logo, perhaps, or a raster image. In that case, I use a less drastic approach with my best friend, Acrobat Pro’s TouchUp Object Tool (Figure 3).
Figure 3. The Advanced Editing Toolbar (top) in Acrobat is home to the powerful TouchUp Object tool. To use it, select the tool and then click on the object in the PDF you want to work with. A selection rectangle should appear. Right-click on the selection and choose Edit Object or Edit Image from the contextual menu (middle). Raster images open in Photoshop by default (bottom); vector images and blocks of text open in Illustrator. You can change those defaults in Acrobat’s preferences.
On the Horizon: Round-tripping PDFs
One problem with all of these “extract from PDF” approaches is that the PDF’s page geometry — the layout — can’t be converted along with the text and images. What if a client creates a company newsletter in a program you’ve never heard of? Yes, they can supply it as a PDF, but how is that going to help you convert the sixteen pages of articles, sidebars, rules, and caption treatments to an editable InDesign or QuarkXPress file — a reasonable running start in taking over production?
That’s when you should start looking for PDF conversion utilities that go the other way, from PDF to [your program here].
Recosoft, for example, sells products that convert PDFs to editable, true-to-the-layout-geometry Microsoft Office application formats or to InDesign’s INDD format (Figure 4).
Figure 4. Pages ’08 to InDesign! I exported one of the Pages ’08 newsletter template files (top) to PDF using its own Export to PDF command. Then, in a copy of InDesign CS3 with the PDF2ID plug-in installed, I opened the PDF, which converted to an editable InDesign layout file in the process (middle). It’s a little clunky, but I can fix the minor problems (such as the multiple text frames per column) in far less time than it would take to recreate the layout from scratch. The PDF2ID plug-in even took care of extracting all the graphics and linking them to the layout (bottom).
With PDFToAll, for example, you could create a PDF from Web pages (in Acrobat Pro, choose File > Create PDF > From Web Page), then use the plug-in to convert the PDF to an Excel spreadsheet.
Recosoft and PDFToAll are the only two companies I know of that are tackling the “convert from PDF” challenge, but I’m hoping we’ll see more as the market demands it.
You’ve Vanquished Weirdo Files
With the help of this three-part series, you should be able to handle almost any unusual file format that comes your way.

Anne-Marie “Her Geekness” Concepción is the co-founder (with David Blatner) and CEO of Creative Publishing Network, which produces InDesignSecrets, InDesign Magazine, and other resources for creative professionals. Through her cross-media design studio, Seneca Design & Training, Anne-Marie develops ebooks and trains and consults with companies that want to master the tools and workflows of digital publishing. She has authored over 20 courses on lynda.com on these topics and others. Keep up with Anne-Marie by subscribing to her ezine, HerGeekness Gazette, and contact her by email or on Twitter @amarie.