private void InitializeIgnoredObjectPath(CleanUpImagesCompareTool.PageImageObjectsPaths cmpPageObjects,
    CleanUpImagesCompareTool.PageImageObjectsPaths outPageObjects) {
    try {
        IList<PdfIndirectReference> cmpIndirects = cmpPageObjects.GetIndirectReferences();
        IList<PdfIndirectReference> outIndirects = outPageObjects.GetIndirectReferences();
        PdfIndirectReference baseCmpIndirect = cmpIndirects[0];
        PdfIndirectReference baseOutIndirect = outIndirects[0];
        // Pair the cmp base reference with its out counterpart.
        ObjectPath baseObjectPath = new ObjectPath(baseCmpIndirect, baseOutIndirect);
        // Walk the remaining indirect references in parallel, moving the base pair forward each step.
        for (int i = 1; i < cmpIndirects.Count; i++) {
            baseObjectPath.ResetDirectPath(cmpIndirects[i], outIndirects[i]);
            baseCmpIndirect = cmpIndirects[i];
            baseOutIndirect = outIndirects[i];
        }
        // Register one ignored path per image, rooted at the last indirect pair.
        foreach (Stack<LocalPathItem> path in cmpPageObjects.GetDirectPaths()) {
            ignoredObjectPaths.Add(new ObjectPath(baseCmpIndirect, baseOutIndirect, path,
                baseObjectPath.GetIndirectPath()));
        }
    } catch (Exception) {
        throw new ArgumentException("Out and cmp pdf documents have different object structure");
    }
}

private IDictionary<int, CleanUpImagesCompareTool.PageImageObjectsPaths> ExtractImagesFromPdf(String pdf,
    String outputPath) {
    using (PdfReader readerPdf = new PdfReader(pdf)) {
        using (PdfDocument pdfDoc = new PdfDocument(readerPdf)) {
            IDictionary<int, CleanUpImagesCompareTool.PageImageObjectsPaths> imageObjectDatas =
                new Dictionary<int, CleanUpImagesCompareTool.PageImageObjectsPaths>();
            for (int i = 1; i <= pdfDoc.GetNumberOfPages(); i++) {
                PdfPage page = pdfDoc.GetPage(i);
                CleanUpImagesCompareTool.PageImageObjectsPaths imageObjectData =
                    new CleanUpImagesCompareTool.PageImageObjectsPaths(page.GetPdfObject().GetIndirectReference());
                Stack<LocalPathItem> baseLocalPath = new Stack<LocalPathItem>();

                // Indirect resources are tracked by reference; direct ones become part of the local path.
                PdfResources pdfResources = page.GetResources();
                if (pdfResources.GetPdfObject().IsIndirect()) {
                    imageObjectData.AddIndirectReference(pdfResources.GetPdfObject().GetIndirectReference());
                } else {
                    baseLocalPath.Push(new DictPathItem(PdfName.Resources));
                }

                PdfDictionary xObjects = pdfResources.GetResource(PdfName.XObject);
                if (xObjects == null) {
                    continue;
                }
                if (xObjects.IsIndirect()) {
                    imageObjectData.AddIndirectReference(xObjects.GetIndirectReference());
                    baseLocalPath.Clear();
                } else {
                    baseLocalPath.Push(new DictPathItem(PdfName.XObject));
                }

                bool isPageToGsExtract = false;
                foreach (PdfName objectName in xObjects.KeySet()) {
                    // Skip every XObject that is not an image stream.
                    if (!xObjects.Get(objectName).IsStream()
                        || !PdfName.Image.Equals(xObjects.GetAsStream(objectName).GetAsName(PdfName.Subtype))) {
                        continue;
                    }
                    PdfImageXObject pdfObject = new PdfImageXObject(xObjects.GetAsStream(objectName));
                    baseLocalPath.Push(new DictPathItem(objectName));
                    if (!useGs) {
                        String extension = pdfObject.IdentifyImageFileExtension();
                        String fileName = outputPath + objectName + "_" + i + "." + extension;
                        CreateImageFromPdfXObject(fileName, pdfObject);
                    } else {
                        isPageToGsExtract = true;
                    }

                    // Copy baseLocalPath through an intermediate stack before storing it for this image.
                    Stack<LocalPathItem> reversedStack = new Stack<LocalPathItem>();
                    reversedStack.AddAll(baseLocalPath);
                    Stack<LocalPathItem> resultStack = new Stack<LocalPathItem>();
                    resultStack.AddAll(reversedStack);
                    imageObjectData.AddLocalPath(resultStack);
                    baseLocalPath.Pop();
                }

                if (useGs && isPageToGsExtract) {
                    String fileName = "Page_" + i;
                    ghostscriptHelper.RunGhostScriptImageGeneration(pdf, outputPath, fileName, i.ToString());
                }

                // Record the rectangles occupied by images on this page.
                CleanUpImagesCompareTool.ImageRenderListener listener =
                    new CleanUpImagesCompareTool.ImageRenderListener();
                PdfCanvasProcessor parser = new PdfCanvasProcessor(listener);
                parser.ProcessPageContent(page);
                ignoredImagesAreas.Put(i, listener.GetImageRectangles());
                imageObjectDatas.Put(i, imageObjectData);
            }
            return imageObjectDatas;
        }
    }
}