pdf - PDFClown image extraction: images are inverted
I'm working with PDFClown, and I'm trying to extract images from a PDF file. I use the example code provided with the source code, which can be found at http://pdfclown.org:
ImageExtractionSample.java.
The problem is that the images come out negative and flipped horizontally. Does anyone know how to resolve this problem?
Check other PDF files to see whether they also give rotated or flipped images. ImageExtractionSample.java does not check the rotation or the matrix-defined transformations of the image object; it just writes the image content to a file (so it will work for JPG images but not for CCITT-encoded images, for example).
So there are several things to consider when you extract an image from a PDF:
- the image can be rotated using an attached transformation matrix (CTM);
- the image can be rotated/transformed as part of a form that is itself transformed;
- the image can be placed on the page without any transformation while the page itself is rotated;
- the image may have a mask overlaid on top of it (and the mask can also be rotated and transformed);
- a JPG image is stored pretty much as is, but PDF supports other formats too: CCITT compression, LZW-compressed images, etc.
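To illustrate the first point, here is a minimal sketch of how a transformation matrix can be checked for rotation and mirroring. It uses only the JDK's `java.awt.geom.AffineTransform`, not the PDFClown API; the class `CtmInspector` and its methods are hypothetical names, and it assumes you have already read the six matrix operands `[a b c d e f]` from the content stream yourself.

```java
import java.awt.geom.AffineTransform;

final class CtmInspector {

    /** Builds a Java transform from the six PDF matrix operands [a b c d e f]. */
    static AffineTransform fromPdfMatrix(double a, double b, double c,
                                         double d, double e, double f) {
        // AffineTransform's constructor takes (m00, m10, m01, m11, m02, m12),
        // which is the same order as the PDF operands.
        return new AffineTransform(a, b, c, d, e, f);
    }

    /** Angle, in degrees, by which the matrix turns the image's x-axis. */
    static double rotationDegrees(AffineTransform ctm) {
        return Math.toDegrees(Math.atan2(ctm.getShearY(), ctm.getScaleX()));
    }

    /** A negative determinant means the matrix mirrors the image. */
    static boolean isMirrored(AffineTransform ctm) {
        return ctm.getDeterminant() < 0;
    }

    public static void main(String[] args) {
        // Example values only: a quarter turn, and a left-to-right mirror.
        AffineTransform rotated = fromPdfMatrix(0, 100, -100, 0, 100, 0);
        AffineTransform mirrored = fromPdfMatrix(-100, 0, 0, 100, 100, 0);
        System.out.println(rotationDegrees(rotated) + " " + isMirrored(rotated));
        System.out.println(rotationDegrees(mirrored) + " " + isMirrored(mirrored));
    }
}
```

Note that a mirror can be described in more than one way (a horizontal flip is the same as a vertical flip plus a half turn), so treat the angle of a mirrored matrix as a hint rather than a final answer.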
The general suggestion, when you extract a JPG image from a PDF using PDFClown, is that you should flip and rotate the extracted images yourself, as suggested on the SourceForge project discussion page.
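For the specific symptom in the question (negative and flipped horizontally), a post-processing step could look roughly like the sketch below. It works on an already extracted file with plain `java.awt.image` and `javax.imageio`, independent of PDFClown; the class name `ExtractedImageFixer` is hypothetical, and it assumes the extracted file is in a format that `ImageIO` can read.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ExtractedImageFixer {

    /** Returns a copy of src mirrored left-to-right with every color channel inverted. */
    static BufferedImage flipHorizontallyAndInvert(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y) & 0xFFFFFF;      // drop the alpha byte
                out.setRGB(w - 1 - x, y, ~rgb & 0xFFFFFF);  // mirror column, invert channels
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ExtractedImageFixer <input image> <output.png>");
            return;
        }
        BufferedImage src = ImageIO.read(new File(args[0]));
        if (src == null) {
            throw new IOException("Unsupported image format: " + args[0]);
        }
        ImageIO.write(flipHorizontallyAndInvert(src), "png", new File(args[1]));
    }
}
```

This only corrects the two effects described in the question; whether every image in a given PDF needs both corrections depends on the file, so it is worth comparing a few results by eye first.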
If you can point to a particular sample PDF file, it will be easier to suggest a solution.
If you're on Windows, you may use the free PDF Multitool utility to compare the non-transformed and transformed images from the PDF, using the "Extract raw images (without transformation)" option in the images extraction dialog.
Disclaimer: I work for ByteScout; the PDF Multitool utility is free for both commercial and non-commercial purposes.