Extract PDF embedded images using iText.
Hi all, I am trying to extract images from a PDF document using the iText library.
I am able to find the image streams within the whole document, which is loaded as a PdfReader object.
I then try to create an Image instance from each image stream to get information about the images embedded in the PDF document.
However, I am able to create the instance only for JPEG images (.jpg, .jpeg, .jpe):
    Image imageObject = Image.getInstance(image);
It does not work for images in other formats embedded in the PDF document.
Below is the method I use to extract the images from the PDF document.
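// Imports assume iText 5 (com.itextpdf.text.*); iText 2.x uses com.lowagie.text.* instead.
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.PRStream;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfObject;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStream;

public void extractImagesInfo() {
    try {
        PdfReader chartReader = new PdfReader("MyPdf.pdf");
        // Walk every object in the cross-reference table.
        for (int i = 0; i < chartReader.getXrefSize(); i++) {
            PdfObject pdfobj = chartReader.getPdfObject(i);
            if (pdfobj != null && pdfobj.isStream()) {
                PdfStream stream = (PdfStream) pdfobj;
                PdfObject pdfsubtype = stream.get(PdfName.SUBTYPE);
                // System.out.println("Stream subType: " + pdfsubtype);
                if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.IMAGE.toString())) {
                    // Raw stream bytes, still encoded with the stream's filter.
                    byte[] image = PdfReader.getStreamBytesRaw((PRStream) stream);
                    Image imageObject = Image.getInstance(image);
                    System.out.println("Resolution: " + imageObject.getDpiX());
                    System.out.println("Height: " + imageObject.getHeight());
                    System.out.println("Width: " + imageObject.getWidth());
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}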
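
To narrow this down, one thing that could be checked is whether the raw bytes of each image stream start with a known image file signature. The sketch below uses only the JDK; the class ImageSignature and its sniff method are hypothetical helpers, not part of iText. It rests on the assumption that a stream using the /DCTDecode filter holds a complete JPEG file, while streams using other filters (for example /FlateDecode) usually hold compressed pixel data with no image file header, which might explain why only the JPEG streams load.

```java
import java.util.Arrays;

public class ImageSignature {

    // Leading bytes of a few common image file formats.
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PNG  = {(byte) 0x89, 'P', 'N', 'G'};
    private static final byte[] GIF  = {'G', 'I', 'F', '8'};

    /** Returns a format name if data starts with a known signature, else "unknown". */
    public static String sniff(byte[] data) {
        if (startsWith(data, JPEG)) return "jpeg";
        if (startsWith(data, PNG))  return "png";
        if (startsWith(data, GIF))  return "gif";
        return "unknown";
    }

    // True when data is at least as long as prefix and begins with it.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        // Start of a JPEG file versus a typical zlib (deflate) stream header.
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        byte[] zlibData = {0x78, (byte) 0x9C, 0x01, 0x02};
        System.out.println(sniff(jpegHeader));
        System.out.println(sniff(zlibData));
    }
}
```

In the method above, ImageSignature.sniff(image) could be called just before Image.getInstance(image), so that streams reported as "unknown" are skipped or logged together with their /Filter entry instead of throwing.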
Thanks in advance.