Available in pre-release 3.7
Usage:
SumatraPDF draw [options] file [pages]See all-options below.
You can use
draw to:- convert PDF to other formats (image, html, svg)
- extract, delete pages
- extract text
Example usage #
Let’s assume you have
foo.pdf with 8 pages.Extract 2nd page as png image #
SumatraPDF draw -o foo-page-2.png -F png foo.pdf 2Rotate pages #
Use
-R 90|180|270 option to rotate pages.Extract each page as PNG image #
SumatraPDF draw -o "foo-%d.png" foo.pdfConvert PDF to PNG #
SumatraPDF draw -o "foo-%d.png" foo.pdfChange output image size #
Each page in a PDF has a given width / height.
When converting to an image (like PNG), you can change the size of the output image:
SumatraPDF draw -o "foo-%d.png" -w 400 -h 800 foo.pdfConvert PNG to PDF #
SumatraPDF draw -o foo.pdf foo.pngConvert PDF to self-contained HTML file #
SumatraPDF draw -o foo.html foo.pdfConvert PDF to self-contained SVG file #
SumatraPDF draw -o foo.svg foo.pdfExtract all text from PDF #
SumatraPDF draw -o foo.txt foo.pdfStructured text #
In PDF text really consists of characters positioned in a page.
If you want to see detailed information about text in PDF, especially for further programmatic processing, you can extract structured text which is: font, glyph (character), position.
Extract structured text from PDF in XML format #
SumatraPDF draw -o foo.stext foo.pdfIn XML format it might look like:
<font name="CharisSIL" size="7.9701">
<char quad="187.4652 295.9985 191.96033 295.9985 187.4652 301.9683 191.96033 301.9683" x="187.4652" y="301.871" bidi="0" color="#000000" alpha="#ff" flags="16" c="d"/>
Here it shows that letter
d in font CharisSIL is at a given x/y position in the page.Extract structured text from PDF in JSON format #
SumatraPDF draw -o foo.stext.json -F stext.json foo.pdfIt’s the same information but in JSON format.
All options #
usage: SumatraPDF draw [options] file [pages]
-p - password
-o - output file name (%d for page number)
-F - output format (default inferred from output file name)
raster: png, pnm, pam, pbm, pkm, pwg, pcl, ps, pdf, j2k
vector: svg, pdf, trace, ocr.trace
text: txt, html, xhtml, stext, stext.json
ocr'd text: ocr.txt, ocr.html, ocr.xhtml, ocr.stext, ocr.stext.json (disabled)
bitmap-wrapped-as-pdf: pclm, ocr.pdf
-q be quiet (don't print progress messages)
-s - show extra information:
m - show memory use
t - show timings
f - show page features
5 - show md5 checksum of rendered image
-R - rotate clockwise (default: 0 degrees)
-r - resolution in dpi (default: 72)
-w - width (in pixels) (maximum width if -r is specified)
-h - height (in pixels) (maximum height if -r is specified)
-f fit width and/or height exactly; ignore original aspect ratio
-b - use named page box (MediaBox, CropBox, BleedBox, TrimBox, or ArtBox)
-B - maximum band_height (pXm, pcl, pclm, ocr.pdf, ps, psd and png output only)
-T - maximum number of rendering threads (banded mode only, limited to number of bands per page)
-W - page width for EPUB layout
-H - page height for EPUB layout
-S - font size for EPUB layout
-U - file name of user stylesheet for EPUB layout
-X disable document styles for EPUB layout
-a disable usage of accelerator file
-c - colorspace (mono, gray, grayalpha, rgb, rgba, cmyk, cmykalpha, filename of ICC profile)
-e - proof icc profile (filename of ICC profile)
-G - apply gamma correction
-I invert colors
-A - number of bits of antialiasing (0 to 8)
-A -/- number of bits of antialiasing (0 to 8) (graphics, text)
-l - minimum stroked line width (in pixels)
-K do not draw text
-KK only draw text
-D disable use of display list
-i ignore errors
-m - limit memory usage in bytes
-L low memory mode (avoid caching, clear objects after each page)
-P parallel interpretation/rendering
-N disable ICC workflow ("N"o color management)
-O - Control spot/overprint rendering
0 = No spot rendering
1 = Overprint simulation (default)
2 = Full spot rendering
-t - Specify language/script for OCR (default: eng) (disabled)
-d - Specify path for OCR files (default: rely on TESSDATA_PREFIX environment variable) (disabled)
-k -{,-} Skew correction options. auto or angle {0=increase size, 1=maintain size, 2=decrease size} (disabled)
-y l List the layer configs to stderr
-y - Select layer config (by number)
-y -{,-}* Select layer config (by number), and toggle the listed entries
-Y List individual layers to stderr
-z - Hide individual layer
-Z - Show individual layer
pages comma separated list of page numbers and ranges