November 22, 2006

What's in a file?

Confusion abounds when people talk about file formats:
Which one should I use? But my printer told me to always use this format? I heard that the other format isn't good. Illustrator's native file format is PDF. InDesign can read native Illustrator files. EPS is dead to me. Always save your file as EPS. PDF solves all problems. PDF results aren't high quality...

But the real question you should be asking is:
What is actually *IN* a file anyway?

If you understand what a file is, and what's in it, you can answer all of the questions above, and then some. Several years back, I posted something like this to the PrintPlanet CTP list. But it's wonderful information to have in any case. You just might want to print this out, shrink it down, and tattoo it to the back of your hand, for easy reference :)

Illustrator Files
When you save files from Illustrator (either using the Save or the Save As command from the File menu), the three mainstream choices you have are Adobe Illustrator Document (.ai), Illustrator EPS (.eps), and Adobe PDF (.pdf).

- Adobe Illustrator Document (.ai)
This is a native Illustrator format. Only Illustrator is able to read native data. While this data is based on the PDF language specification, it isn't a file that any other application can read (including Acrobat, InDesign, Photoshop, etc.). This format retains ALL editability in your file, and is the one you should always use internally (for working files, archiving, storage, etc.). You may have heard that native .ai files are PDF or that InDesign can read native .ai files. That isn't true. What *IS* true is that when you save a native Illustrator file, Illustrator also includes a PDF 1.4 composite of that file inside the file. So a native .ai file isn't a PDF file; a native .ai file CONTAINS a PDF file.* PDF 1.4 supports transparency, so both the native .ai portion and the PDF 1.4 portion of the file are in an unflattened state. Note also that the native portion of the file is saved for a specific version of Illustrator (each version of Illustrator has its own native format). So when you're saving your file out of Adobe Illustrator CS2 as an Adobe Illustrator Document (.ai), you get a single file that contains:
- Native Illustrator CS2 content (unflattened) - used when file is reopened in Illustrator
- PDF 1.4 content (unflattened) - used when file is opened or placed anywhere else*

*In the Illustrator Options dialog that appears when you save a native .ai file, a checkbox called Create PDF Compatible File, turned on by default, determines whether the PDF portion of the file is included when the file is saved. With the option turned on, your file size grows, but the file can be read by apps like InDesign and Photoshop. Turning it off can cut the file size roughly in half (and speed up save time), but the file can then only be reopened in Illustrator.
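
If you want to see the "a file CONTAINS other formats" idea for yourself, one rough way is to look at the first few bytes of a saved file. The sketch below is only an illustration: the function names are made up for this example, and it relies on the usual file signatures (PDF data starts with %PDF-, PostScript and EPS data starts with %!PS-Adobe, and EPS files with a binary preview start with the bytes C5 D0 D3 C6). It identifies the outer wrapper only; it can't tell you whether a usable PDF composite or native Illustrator data is inside.

```python
def sniff_container(header: bytes) -> str:
    """Guess the outer wrapper of a saved file from its first bytes.

    Hypothetical helper for illustration; it only inspects the signature
    and does not parse the file.
    """
    if header.startswith(b"%PDF-"):
        # PDF files, and .ai files wrapped in a PDF shell, begin this way.
        return "pdf"
    if header.startswith(b"%!PS-Adobe"):
        # Plain-text EPS (and other PostScript-based files) begin this way.
        return "postscript"
    if header.startswith(b"\xc5\xd0\xd3\xc6"):
        # Binary EPS wrapper, used when a TIFF or WMF preview is included.
        return "eps-with-binary-preview"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 16 bytes of a file and classify its wrapper."""
    with open(path, "rb") as f:
        return sniff_container(f.read(16))
```

Calling sniff_file on an .ai file saved with Create PDF Compatible File turned on should report "pdf", which is exactly why InDesign and Photoshop can place it.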

- Illustrator EPS (.eps)
Standing for Encapsulated PostScript, EPS is a format that is supported by a majority of graphics applications. PostScript does not support transparent constructs, so an EPS is a flattened file format. Illustrator can interpret (or parse) EPS content into its own native format when opening files, and it can write or convert its own native format to EPS as well. In that translation, flattening occurs, and general editability is lost as well (effects are expanded, text is broken apart, etc.). Therefore, Illustrator will also include a native .ai version in the file, so that should you ever reopen the file in Illustrator CS2 again, all of your artwork will be fully editable.* So when you're saving your file out of Adobe Illustrator CS2 as an Illustrator EPS (.eps), you get a single file that contains:
- EPS content (flattened) - used when file is opened or placed anywhere else
- Native Illustrator CS2 content (unflattened) - used when file is reopened in Illustrator CS2*

*Remember that Illustrator saves its native content for the version that you specify. So if you save your file out as an EPS file compatible with Illustrator 8, then the native Illustrator data that is saved along with the file is Illustrator 8 data -- a format that didn't support transparency. Also, saving back to pre-CS versions means you're going back to versions that predate the new text engine, and text won't be editable, even if the file is reopened in Illustrator CS2.

- Adobe PDF (.pdf)
While Illustrator's native file format is based on the PDF language specification, there are many constructs that Illustrator uses that aren't supported in PDF directly (reflowable text, styles, effects like 3D, blends, etc.). So when you save your file as a PDF, Illustrator writes its data out so that it can be read in a format that any PDF reader (or app that can place PDF) understands. To retain full editability upon reopening the file in Illustrator, a native CS2 version of the file is also saved inside the file.* So when you're saving your file out of Adobe Illustrator CS2 as an Adobe PDF (.pdf), you get a single file that contains:
- PDF content (flattened if you choose PDF 1.3, unflattened otherwise) - used when file is opened or placed anywhere else
- Native Illustrator CS2 content (unflattened) - used when file is reopened in Illustrator*

*In the Save Adobe PDF dialog that appears when you save a PDF file, a checkbox called Preserve Illustrator Editing Capabilities, turned on by default, determines whether the native CS2 portion of the file is included when the file is saved. With the option turned on, your file size grows, but the file can be reopened in Illustrator. Turning it off can cut the file size roughly in half (and speed up saving time), but the file won't be very editable if reopened in Illustrator. As an aside, we usually send PDF files to clients anyway. We want smaller files so they email faster, and we have no interest in giving the client the ability to open the file for editing in Illustrator, so turning this option off makes a lot of sense. However, when sending files to printers, leaving the option on means that they can reopen the file in Illustrator to make tweaks or adjustments if necessary.
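
To pull the three formats together, here is a small sketch that models what ends up inside each saved file, based only on the descriptions above. The class, function, and parameter names are invented for this example and have nothing to do with how Illustrator actually works internally. It also ignores the legacy-version wrinkle (saving as Illustrator 8, for instance, changes what the native portion can hold).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payload:
    kind: str        # "native", "pdf", or "eps"
    flattened: bool  # True if transparency has been flattened
    used_by: str     # which apps read this portion of the file


def saved_payloads(fmt, pdf_compatible=True, preserve_editing=True,
                   pdf_version="1.4"):
    """Return the portions stored in a file saved from Illustrator CS2.

    fmt is "ai", "eps", or "pdf". The two booleans stand in for the
    Create PDF Compatible File and Preserve Illustrator Editing
    Capabilities checkboxes (both on by default).
    """
    native = Payload("native", flattened=False, used_by="Illustrator")
    if fmt == "ai":
        parts = [native]
        if pdf_compatible:
            # The embedded composite is PDF 1.4, so it stays unflattened.
            parts.append(Payload("pdf", flattened=False, used_by="other apps"))
        return parts
    if fmt == "eps":
        # PostScript has no transparency, so the EPS portion is flattened.
        return [Payload("eps", flattened=True, used_by="other apps"), native]
    if fmt == "pdf":
        # Only PDF 1.3 forces flattening; later versions keep transparency.
        parts = [Payload("pdf", flattened=(pdf_version == "1.3"),
                         used_by="other apps")]
        if preserve_editing:
            parts.append(native)
        return parts
    raise ValueError(f"unknown format: {fmt!r}")
```

For example, saved_payloads("pdf", preserve_editing=False, pdf_version="1.3") returns a single flattened PDF portion, which is the small, non-editable file you might email to a client.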

PDF Versions
While you can save PDF files from Illustrator, there are different versions of PDF. In reality, there are two variables here: versions of Acrobat, and versions of the PDF language specification (referred to by us techie people as PDFL). Just in case you get confused, an easy way to remember which PDFL goes with which version of Acrobat is to add up the digits (PDFL 1.4: 1+4 = Acrobat 5, etc.).
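
The add-up-the-digits trick is simple enough to write down as a tiny function. This is just a sketch of the mnemonic (the function name is made up), and it deliberately refuses anything outside PDF 1.3 through 1.7, since those are the only versions covered here.

```python
def acrobat_version(pdf_version: str) -> int:
    """Map a PDF language version like "1.4" to its Acrobat version.

    Implements the add-the-digits mnemonic, limited to PDF 1.3-1.7
    (Acrobat 4-8), the range discussed in this post.
    """
    major, minor = (int(part) for part in pdf_version.split("."))
    if major != 1 or not 3 <= minor <= 7:
        raise ValueError(f"mnemonic not defined for PDF {pdf_version}")
    return major + minor


print(acrobat_version("1.4"))  # prints 5
```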

- Acrobat 4 (PDFL 1.3) - predates transparency. So in terms of flattening, think of PDF 1.3 as though it were EPS (with a few added benefits). PDF 1.3 files are always flattened. PDF 1.3 does support CMYK and spot colors. This version also introduced smooth shading and digital signatures into PDF.

- Acrobat 5 (PDFL 1.4) - first version of PDF to support transparency. When you save a native .ai file from Illustrator, the PDF that Illustrator embeds so that other apps can read the file is PDF 1.4. Besides support for transparency, this version also introduced XML tagging and metadata support. PDF 1.4 is not a flattened format. The only way to get flattened content into a PDF 1.4 file is to manually flatten the content within Illustrator, on the artboard, before you save it.

- Acrobat 6 (PDFL 1.5) - probably the most popular version of Acrobat Reader and app installations today. Obviously an unflattened format, PDF 1.5 introduced the concept of PDF layers, and allows JPEG2000 compression.

- Acrobat 7 (PDFL 1.6) - Same as 1.5, but adds object-level metadata support and AES encryption.

- Acrobat 8 (PDFL 1.7) - Hot off the press, Acrobat 8 only started shipping a short time ago. Illustrator CS2 can't save in this format, but future versions of Illustrator should be able to.
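
Since each PDF version builds on the one before it, the list above can be treated as a running total. The sketch below encodes only the features mentioned in this post (it is nowhere near a complete feature list for any PDF version), and the names are invented for the example. It answers the practical question: will my transparency be flattened if I save to this version?

```python
# Features each PDF version introduced, as listed in this post only.
INTRODUCED = {
    "1.3": ["smooth shading", "digital signatures"],
    "1.4": ["transparency", "XML tagging", "metadata"],
    "1.5": ["layers", "JPEG2000 compression"],
    "1.6": ["object-level metadata", "AES encryption"],
    "1.7": [],  # nothing specific is listed for Acrobat 8 here
}


def features_up_to(pdf_version: str) -> list:
    """Collect every listed feature available in the given PDF version."""
    if pdf_version not in INTRODUCED:
        raise ValueError(f"unknown PDF version: {pdf_version!r}")
    features = []
    # Dicts keep insertion order, so this walks from 1.3 upward.
    for version, added in INTRODUCED.items():
        features.extend(added)
        if version == pdf_version:
            break
    return features


def must_flatten(pdf_version: str) -> bool:
    """True if saving to this version forces transparency flattening."""
    return "transparency" not in features_up_to(pdf_version)
```

With this table, must_flatten("1.3") is True and must_flatten("1.4") is False, matching the descriptions above.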

PDF Standards
In an effort to achieve some kind of level playing field for PDF files, several standards have been established. These are all regular PDF files, but a PDF can be validated to meet the requirements defined by these standards.

- PDF/X-1a - has become the standard for PDF files within the printing and publishing community. Among other things, PDF/X-1a requires that all fonts are embedded, transparency is flattened, colorspace is CMYK and/or spot, and that the file is PDF 1.3 compatible.

- PDF/X-2 - created specifically for OPI workflows, where hi-res data is swapped in for FPO data at print time. This is problematic in transparency workflows, as hi-res data is needed at flattening time. Highly specialized, this format is used mainly in packaging workflows.

- PDF/X-3 - seen as the "next" step for printing workflows, PDF/X-3's main difference from PDF/X-1a is that RGB data is allowed, on the condition that an intent profile is present. This allows printers to attain greater control over their own color conversions and color integrity, as the conversion from RGB to CMYK happens on their watch -- not the designer's.

- PDF/X-4 (Draft) - still in draft form, the main attribute of PDF/X-4 is that it will allow transparency. This would obviously enable the printer to handle the flattener settings.

- PDF/A - I doubt anyone reading this will ever have use for this, but PDF/A is a standard that has been introduced to assist in the archiving of data. If you think about it, there are some files that we may have worked on 10 years ago, which aren't supported in any apps today. While the file may be sitting on a CD somewhere (or most likely, a SyQuest cartridge), you don't have any app that can open it. PDF/A is a standard that was established to ensure that today's files will be accessible in the future. The largest group of people who utilize this standard are those in the government, and in medical and insurance fields, who need to archive huge amounts of data electronically.
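
To make the idea of "validating" a PDF against a standard a bit more concrete, here is a toy preflight check that covers only the PDF/X-1a and PDF/X-3 rules described above. Everything about it is an assumption for illustration: the function name, the dictionary keys, and the idea that someone has already summarized the document for you. It also assumes PDF/X-3 shares the font, transparency, and PDF 1.3 rules with PDF/X-1a. The real standards contain many more requirements, so treat this as a sketch of the concept and not as a real preflight tool.

```python
def check_pdfx(doc: dict, standard: str) -> list:
    """Return a list of problems; an empty list means the checks passed.

    doc is a hypothetical summary with these keys:
      fonts_embedded (bool), has_live_transparency (bool),
      color_spaces (set of "CMYK", "Spot", "RGB"),
      has_output_intent (bool), pdf_version (str such as "1.3").
    """
    if standard not in ("PDF/X-1a", "PDF/X-3"):
        raise ValueError(f"unsupported standard: {standard!r}")

    problems = []
    if not doc["fonts_embedded"]:
        problems.append("fonts not embedded")
    if doc["has_live_transparency"]:
        problems.append("transparency not flattened")

    # PDF/X-3 additionally allows RGB, but only with an intent profile.
    allowed = {"CMYK", "Spot"}
    if standard == "PDF/X-3" and doc["has_output_intent"]:
        allowed = allowed | {"RGB"}
    extra = sorted(set(doc["color_spaces"]) - allowed)
    if extra:
        problems.append("disallowed color spaces: " + ", ".join(extra))

    major, minor = (int(part) for part in doc["pdf_version"].split("."))
    if (major, minor) > (1, 3):
        problems.append("not PDF 1.3 compatible")
    return problems
```

A file with embedded fonts, flattened transparency, RGB images, and an intent profile would fail this check as PDF/X-1a (because of the RGB data) but pass it as PDF/X-3.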

So that brings us to the end of our discussion. Hopefully we've all learned a little something today. Now go save some files!

20 comments:

Naina Redhu said...

Thanks. I don't know what else to say! I'm still learning how to send correct files to the printer/client. There are just too many variables to allow for one best solution!

printingworld said...

nice site
http://www.printingworld.org

A process in which an image is reproduced on a surface, such as paper. There are five general classes of printing processes:
relief printing, which includes letterpress and flexography; planographic printing, which includes offset lithography,
screenless lithography, collotype, and waterless printing; intaglio, which includes gravure, steel-die, and copper-plate engraving;
stencil and screen printing; and electronic printing, which includes electrostatic, magnetographic, ion or electron deposition, and
ink-jet printing.

O.K. Sauceman said...

This is really helpful and useful. I must admit I'd never realised the obvious point about adding the PDF version numbers together to get the Acrobat version before.
Just one question: when you save a TIFF, it asks if you want Mac or PC encoding. Does this really matter in practice?

Scott Citron said...

Great explanation of an otherwise confusing topic. Thanks, Mordy!

Scott

Mordy Golding said...

Sauceman,

Good question about the Mac/PC setting when saving a TIFF. While I'm aware that Macs and PCs store data in opposite ways (think of reading right to left and left to right), I also think that today's modern operating systems are capable of dealing with both equally. So maybe this setting is for compatibility with older systems.

Anonymous said...

I learned a lot in this article, but I still can't figure out how to cut out something that I've created in Illustrator and paste it into MSWord or Publisher. Will rasterizing the artwork accomplish this?

Mordy Golding said...

Your best bet is to choose File > Save for Microsoft Office.

Dean Kennedy said...

@anonymous: believe it or not, I use Illustrator for MS Word graphic needs quite a bit, for templates, newsletters etc. And I don't have to save or export a graphic (I just keep the .ai file if I need it for later reference).

If there are fonts in the ai file design, I outline them first, just copy to the clipboard and paste straight into MS Word. Voilà! Comes into Word as an inline graphic, straight off the clipboard.

That's using Illustrator CS2 to Word 2003 on a PC, although it used to work fine too with AI 10.

It works perfectly -- and much easier than taking .ai to Photoshop and creating an LZW compressed tif to place in Word.

Generally, MS Word art isn't designed to be extracted again once it is placed (another serious long term downfall against a now sadly insignificant but far superior WordPerfect) -- without some specialised knowledge.

Actually, if you want to do this, I have two ways to do it: (1) enlarge to 100% and then copy to Freehand and then extract from Freehand or (2) use DocRepair utility to "repair" a working .doc file, but ask it to "support embedded images retrieval."

Putranto Sangkoyo said...

Very good blog ! We have a catalogue created in AI and its pages consists of many AI files. Each AI file has 5 to 6 images in it. I am new to AI and is there any way I can batch convert images within AI to JPEG. Do I have to open each single AI file and then pick/select an image object and then convert to JPEG ? Thank you.

Anonymous said...

Yeah, useful explanations.

I'm using Illustrator to fine-tune graphs produced by Matlab in EPS format, before inserting them in a LaTeX document, again as EPS. Original EPS files are around 12 KB, and after just saving as, they are almost 1 MB !!! I understand that an ai representation is embedded in the Illustrator EPS, but this size increase is ridiculous!

Isn't there a way to save the "pure" eps only? Or to strip off the ai part afterwards?

Thanks,
mitch

Watch Beijing 2008 Olympics Online said...

i get a lot of this explanation. thank you for sharing

drwitt said...

Thank you very much - this is very useful :-) What about considering PDF/X-4 as a newer format supporting "real" transparencies?
Regards, Carsten.

Scott said...

"EPS is dead to me" I have to say when I read this it caught me a bit off guard. Almost every logo file we send and receive is EPS. I have, though, noticed the limitations you refer to but just never thought about using anything else. So my question is, what do you recommend be the replacement to eps? Ai files? PDF? Is the industry itself already starting to move to another filetype? It needs to be something that people not using the creative suite can still open.

Mordy Golding said...

PDF is getting there. My "EPS is dead to me" statement obviously applies when you have the option to use a different format. If you need to use EPS in order to support someone else's workflow, then so be it. But where EPS was once the "de facto" format I always used, it's now just a format that I use only when I have to. I stick to native AI and PDF these days.

Scott said...

Thanks for the quick response! I have to say this is one of the most helpful blogs I've run across to date. The more I think about PDF replacing EPS, 1. the more it seems like it would work and 2. the more I'm liking the idea of opening up all the PDF features. One problem we have using EPS is that I'm the only one in my whole company that can view EPS files. So I'm constantly sending EPS files and having to create a JPEG preview to go with them just so my supervisors can see what I'm sending them.

Again, thanks for your willingness to help.

Scott said...

When you save for pdf, it brings up the dialog box asking you to plug in all your export settings. Okay so when you save the .ai file, what settings does it save the "pdf side" at since the only option you have is "save a pdf compatible file"?

Anonymous said...

I recently switched to CS4 and found a problem with my files from the older version. For a project I do weekly, i reuse the previous week's work as my template and just re save under a new name. Trouble is the old version's document was about 22MB and now when I resave in CS4 (with no difference or changes yet made) it doubles in size to about 44MB. Is there something I am missing in the saving process? Any help you can give me would be great as this is starting to cause issues because of the large file size.
Thanks!

Anonymous said...

This blog is very helpful! Thank you. One question: In Illustrator, does saving a file as a PDF (Acrobat 4 compatibility, with Illustrator editing capability on) lower the quality of the file versus saving it as an EPS? Do most vendors prefer PDF or EPS?

Anonymous said...

I'd like to point out that many other programs can read a native .ai file (without included .pdf content).....

I use .ai (ver. 3 or 8 export from CS4) for use in 3D programs to obtain a specific path or paths (including if paths are set as "compound"). In the 3D program one can extrude path(s) to 2.5D (parallel sides/constant thickness 3D) and use these as true 3D objects.

Of course the 3D program does not have the same info for use as does Illustrator, but the file used is indeed a native .ai file format; read for specific uses in 3D programs.

Similarly, other programs' native file formats can be used in other programs; i.e some programs can read a .xls file, others can read and use .doc, and some 3D programs can read a SketchUp native .skp file and see/use all the 3D info.

So be careful to NOT say that any native file format is only for use by the program that created it natively.

Doug.S

Anonymous said...

When saving as a pdf in Illustrator CS5, the default presets are Standard: None, Compatibility: Acrobat 6. Why does it not save as Acrobat 8 if that's the most recent? And does it matter to set the Standard to None or PDF/X-4?