Plot as bitmap in PDF

I am currently working on CGH array results, which involve several plots of tens of thousands of points, and I would like to benefit from both the multiple-page feature of the PDF device and the small file size of the PNG image format.

The problem is that the PDF device stores the plots as vector drawings, so the PDF files are huge and take several minutes to open. I wonder if R can plot as multiple bitmaps embedded in a single PDF file, as I know the PDF format is able to handle it.

Here is a simple example: the PDF file is about 2 MB while the PNG files are about 10 KB each, so I'd like a PDF file of about 20 KB.

png("test%i.png")
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
dev.off()

pdf("test.pdf", onefile=TRUE)
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
dev.off()
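
To check the sizes quoted above on a given machine, the two devices can be compared with file.size(). This is only a minimal sketch: the file names, the temporary directory, and the seed are assumptions made for illustration, and the actual numbers depend on the R version and on the PNG back end.

```r
set.seed(1)            # arbitrary seed, only so repeated runs draw the same points
out_dir <- tempdir()   # assumption: write the test files to a temporary directory

# Same plot on both devices: 20,000 points drawn as "+"
draw <- function() plot(rnorm(2e4), rnorm(2e4), pch = "+", cex = 0.6)

png_file <- file.path(out_dir, "size_test.png")
png(png_file)
draw()
dev.off()

pdf_file <- file.path(out_dir, "size_test.pdf")
pdf(pdf_file)
draw()
dev.off()

# File sizes in kilobytes, one entry per device
sizes_kb <- c(png = file.size(png_file), pdf = file.size(pdf_file)) / 1024
print(round(sizes_kb))
```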

asked Nov 8 '11 at 10:11

I saw this post; it suggests basically what the others said: use a raster. Raster files are lighter and faster to open, but are still far heavier than classic PDF files with embedded PNGs.

It was because of the R version; in R 2.14.0 it works fine. Shame on me.

4 Answers

Use the png driver to create a PNG file of an acceptable resolution. Make your plot to that. Close the png device.

Then use readPNG from package:png to read it in.

Next open a PDF driver, create a blank plot with no margins and bounds at (0,0) (1,1) and draw the png to that using rasterImage. Add extra pages by creating fresh plots. Close PDF driver.

That should give you a PDF with bitmapped versions of the plots. There are a few tricky bits in getting the plots set up right, and the PNG resolution is crucial, but I think the above has all the ingredients.

> png("plot.png")
> makeplot(100000) # simple function that plots 100k points 
> dev.off()
X11cairo 
       2 
> library(png)  # provides readPNG
> plotPNG = readPNG("plot.png")
> pdf("plot.pdf")
> par(mai=c(0,0,0,0))
> plot(c(0,1),c(0,1),type="n")
> rasterImage(plotPNG,0,0,1,1)
> dev.off()

Then check plot.pdf...
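
The steps above can be extended to several pages. The following is a sketch under stated assumptions, not the answerer's exact code: it assumes the png package is installed, draw_full_page() is a hypothetical helper introduced here, and the files are written to a temporary directory. Setting xaxs = "i" and yaxs = "i" removes the default 4% axis padding so the bitmap covers the whole page.

```r
library(png)  # assumption: the 'png' package is installed (provides readPNG)

# Hypothetical helper: draw one bitmap so that it fills the current PDF page
draw_full_page <- function(img) {
  op <- par(mar = c(0, 0, 0, 0), xaxs = "i", yaxs = "i")
  on.exit(par(op))
  plot(c(0, 1), c(0, 1), type = "n", axes = FALSE, xlab = "", ylab = "")
  rasterImage(img, 0, 0, 1, 1)
}

# 1. Render each plot to its own PNG file (test1.png, test2.png)
png_pattern <- file.path(tempdir(), "test%d.png")
png(png_pattern, width = 480, height = 480)
for (i in 1:2) plot(rnorm(2e4), rnorm(2e4), pch = "+", cex = 0.6)
dev.off()

# 2. Read the PNG files back and place one bitmap per PDF page
png_files <- sprintf(png_pattern, 1:2)
pdf_file <- file.path(tempdir(), "test.pdf")
pdf(pdf_file, width = 7, height = 7, onefile = TRUE)
for (f in png_files) draw_full_page(readPNG(f))
dev.off()
```

The PDF page here is square (7 x 7 inches) to match the square PNG files; a different aspect ratio would stretch the bitmaps.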

answered Nov 08, 11:17

I used your code on the "test1.png" image created before; it produced a PDF file of 1362 KB. It seems faster to open, but is still far too heavy. As a comparison, I produced a PDF of 13 KB with OpenOffice.org Draw by including test1.png manually (without compression). - maressyl

Ugh yeah, looks like pdf() uses a really inefficient method of encoding the pixmap. - Spacedman

If I run the generated PDF through 'convert' from ImageMagick it re-encodes the PDF to about a tenth of the size - all you do is "convert file.pdf file2.pdf" and the magick happens. - Spacedman

It seems that the problem was corrected in R 2.14.0. When I run your example in R 2.13.1 I get a PDF of 1362 KB; in R 2.14.0 the PDF file is about 18 KB, which is fine for me. So your solution is fine, but I prefer O'Brien's one as it involves standard R functions and no intermediate file. Thanks again for your help. - maressyl

Here's a solution that gets you close (50kb) to your desired file size (25kb), without requiring you to install LaTeX and/or learn Sweave. (Not that either of those are undesirable in the long-run!)

It uses the grid functions grid.cap() and grid.raster(). More details and ideas are in a recent R-Journal article by Paul Murrell (warning: PDF):

require(grid)
# Make the plots
dev.new()  # Reducing width and height of this device will produce smaller raster files
    plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
    cap1 <- grid.cap()
    plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6, col="red")
    cap2 <- grid.cap()
dev.off()

# Write them to a pdf
pdf("test.pdf", onefile=TRUE)
     grid.raster(cap1)
     plot.new()
     grid.raster(cap2)
dev.off()

The resulting PDF images appear to retain more detail than your files test1.png and test2.png, so you could get even closer to your goal by trimming down their resolution.
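
One crude way to trim the resolution after capture is to keep only every n-th row and column of the captured matrix before drawing it. This is a sketch only: thin_raster() is a hypothetical helper, and a random matrix of colour names stands in for the result of grid.cap(), which returns such a matrix only on devices that support capture. Shrinking the capture device itself, as noted in the code comment above, is the alternative.

```r
library(grid)

# Hypothetical helper: keep every `step`-th row and column of a captured raster
thin_raster <- function(cap, step = 2) {
  cap[seq(1, nrow(cap), by = step), seq(1, ncol(cap), by = step), drop = FALSE]
}

# Stand-in for the matrix of colour names returned by grid.cap()
set.seed(1)
cap <- matrix(sample(c("white", "black"), 400 * 400, replace = TRUE), nrow = 400)

small <- thin_raster(cap, step = 2)   # 400 x 400 becomes 200 x 200

# Draw the thinned raster on a PDF page
pdf_file <- file.path(tempdir(), "thinned.pdf")
pdf(pdf_file)
grid.raster(small, interpolate = FALSE)
dev.off()
```

Dropping pixels this way is lossy and can erase thin symbols such as "+", so the step value needs checking against the actual plots.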

answered Nov 08, 11:23

I tried grid.cap() and grid.raster(), but they also produced huge PDF files so I gave up. Actually, it was because I was not using the latest version of R, in which this solution is exactly what I was looking for. Thanks a lot. - maressyl

Ooh nice. The man himself says "An alternative approach would be produce a PNG file and read that in, but grid.cap() is more convenient for interactive use". I'll be using that in future. The only + for png() is that you can't do grid.cap without an open graphic device, and hence you are stuck to screen resolutions. But neat. - Spacedman

@Spacedman -- Yeah, I was glad for this question, as it gave me an excuse to go read that article. At first I thought there might be a solution using grDevices::cairo_pdf, which has this intriguing note in its help file: "Note that unlike ‘postscript’ and ‘pdf’, ‘cairo_pdf’ and ‘cairo_ps’ sometimes record bitmaps and not vector graphics: a resolution of 72dpi is used." For cairo_ps, you can force that behaviour by using transparency, but I couldn't find an easy way to 'trip the switch' for cairo_pdf. - Josh O'Brien

To include multiple plots in your pdf, set onefile = TRUE.

pdf("test.pdf", onefile = TRUE)
plot(1:5)
plot(6:10)
dev.off()

To make those plots PNGs rather than native PDF plots will require a tiny bit more effort. Create all your plots as PNGs, like so:

png("test%01d.png")
plot(1:5)
plot(6:10)
dev.off()

Then create a LaTeX document that includes those PNGs. You can do that from R by using Sweave (but how to do that is big enough to be its own question). There's a decent introductory example here.

answered Nov 08, 11:14

Thanks. I am not familiar with LaTeX and I work on a Windows workstation where I cannot install it, so I would prefer an R-only solution, but I will keep yours in mind. - maressyl

How about a Sweave solution?

\documentclass[a4paper]{article}
\usepackage[OT1]{fontenc}
\usepackage{Sweave}
\SweaveOpts{pdf = FALSE, eps = FALSE}
\DeclareGraphicsExtensions{.png}

\begin{document}

\title{Highly imaginative title}
\author{romunov}

\maketitle

<<fig = TRUE, png = TRUE, echo = FALSE>>=
    plot(1:10, 1:10)
@

\end{document}

answered Nov 08, 11:16

Thanks a lot for this complete example. As I said, avoiding the pdflatex installation would make my life far easier. If I cannot avoid it, I think I will invest some of my time in Sweave, which seems to be an interesting feature. - maressyl

If you take the plunge, here's a little head start: r-bloggers.com/… - Roman Luštrik
