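I am using Ghostscript.NET, a handy C# wrapper for Ghostscript functionality. I have a batch of PDF files sent from the client; they are converted to images on an ASP.NET Web API server and returned to the client.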

public static IEnumerable<Image> PdfToImagesGhostscript(byte[] binaryPdfData, int dpi)
{
    List<Image> pagesAsImages = new List<Image>();

    GhostscriptVersionInfo gvi = new GhostscriptVersionInfo(AppDomain.CurrentDomain.BaseDirectory + @"\bin\gsdll32.dll");

    using (var pdfDataStream = new MemoryStream(binaryPdfData))
    using (var rasterizer = new Ghostscript.NET.Rasterizer.GhostscriptRasterizer())
    {
        rasterizer.Open(pdfDataStream, gvi, true);

        for (int i = 1; i <= rasterizer.PageCount; i++)
        {
            Image pageAsImage = rasterizer.GetPage(dpi, dpi, i); // Out of Memory Exception on this line
            pagesAsImages.Add(pageAsImage);
        }
    }
    return pagesAsImages;
}

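This usually works fine (I normally use 500 dpi, which I know is high, but I can reproduce the error even when I drop to 300). However, if the client sends many PDFs (for example, 150 one-page PDFs), it often hits an out-of-memory exception in the Ghostscript.NET rasterizer. How can I get around this? Should this be threaded? If so, how would that work? Would using the 64-bit version of Ghostscript help? Thanks in advance.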

2 Answers

I'm new to this myself, on here looking for techniques.

According to the example in the documentation here, they show this:

for (int page = 1; page <= _rasterizer.PageCount; page++)
{
    var docName = String.Format("Page-{0}.png", page);
    var pageFilePath = Path.Combine(outputPath, docName);
    // GetPage returns a rasterized System.Drawing.Image; page numbers are 1-based.
    var img = _rasterizer.GetPage(desired_x_dpi, desired_y_dpi, page);
    // ImageFormat lives in System.Drawing.Imaging.
    img.Save(pageFilePath, ImageFormat.Png);
    pagesAsImages.Add(img);
}

It looks like you aren't saving your files.

I am still working at getting something similar to this to work on my end as well. Currently, I have 2 methods that I'm going to try, using the GhostscriptProcessor first:

private static void GhostscriptNetProcess(String fileName, String outputPath)
{
    var version = Ghostscript.NET.GhostscriptVersionInfo.GetLastInstalledVersion();
    var source = (fileName.IndexOf(' ') == -1) ? fileName : String.Format("\"{0}\"", fileName);
    var gsArgs = new List<String>();
    gsArgs.Add("-q");
    gsArgs.Add("-dNOPAUSE");
    gsArgs.Add("-dBATCH"); // exit once the input has been processed
    gsArgs.Add("-dNOPROMPT");
    gsArgs.Add("-sDEVICE=pdfwrite");
    gsArgs.Add(String.Format(@"-sOutputFile={0}", outputPath));
    gsArgs.Add(source);
    // Dispose the processor so the native Ghostscript instance is released.
    using (var processor = new Ghostscript.NET.Processor.GhostscriptProcessor(version, false))
    {
        processor.Process(gsArgs.ToArray());
    }
}

This version below is similar to yours, and what I started out using until I started finding other code examples:

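private static void GhostscriptNetRaster(String fileName, String outputPath)
{
    var version = Ghostscript.NET.GhostscriptVersionInfo.GetLastInstalledVersion();
    using (var pdfStream = File.Open(fileName, FileMode.Open, FileAccess.Read))
    using (var rasterizer = new Ghostscript.NET.Rasterizer.GhostscriptRasterizer())
    {
        rasterizer.Open(pdfStream, version, false);
        // Page numbers are 1-based.
        for (int page = 1; page <= rasterizer.PageCount; page++)
        {
            // Dispose each image after saving so bitmaps do not pile up in memory.
            using (var img = rasterizer.GetPage(96, 96, page))
            {
                // outputPath is treated as a directory here: one file per page.
                img.Save(Path.Combine(outputPath, String.Format("Page-{0}.png", page)));
            }
        }
    }
}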

Does that get you anywhere?

answered 2015-12-22T22:14:26.393

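You don't have to rasterize all pages in the same GhostscriptRasterizer instance. Use a disposable rasterizer for each page and collect the results in a List<Image> or List<byte[]>. The example below produces a list of JPEG-encoded byte arrays.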

List<byte[]> result = new List<byte[]>();

for (int i = 1; i <= pdfPagesCount; i++)
{
    using (var pageRasterizer = new GhostscriptRasterizer())
    {
        pageRasterizer.Open(stream, gsVersion, true);

        using (Image tempImage = pageRasterizer.GetPage(dpiX, dpiY, i))
        {
            var encoder = ImageCodecInfo.GetImageEncoders().First(c => c.FormatID == System.Drawing.Imaging.ImageFormat.Jpeg.Guid);
            var encoderParams = new EncoderParameters() { Param = new[] { new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 95L) } };

            using (MemoryStream memoryStream = new MemoryStream())
            {
                tempImage.Save(memoryStream, encoder, encoderParams);
                result.Add(memoryStream.ToArray());
            }
        }
    }
}
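
To see why encoding each page and disposing the bitmap right away matters, here is a rough back-of-the-envelope sketch of the uncompressed bitmap size. The page size (US Letter, 8.5 x 11 in), the 3 bytes per pixel, and the helper name BytesPerPage are assumptions for illustration, not something taken from Ghostscript.NET; row padding is ignored.

```csharp
using System;

public static class BitmapMemoryEstimate
{
    // Rough size in bytes of one uncompressed page bitmap (row padding ignored).
    public static long BytesPerPage(double widthInches, double heightInches, int dpi, int bytesPerPixel)
    {
        long widthPx = (long)Math.Round(widthInches * dpi);
        long heightPx = (long)Math.Round(heightInches * dpi);
        return widthPx * heightPx * bytesPerPixel;
    }

    public static void Main()
    {
        // Assumptions: US Letter pages, 3 bytes per pixel, 150 pages held in memory at once.
        const int pageCount = 150;
        foreach (int dpi in new[] { 300, 500 })
        {
            long perPage = BytesPerPage(8.5, 11.0, dpi, 3);
            double totalGiB = perPage * (double)pageCount / (1024.0 * 1024.0 * 1024.0);
            Console.WriteLine("{0} dpi: {1} bytes per page, about {2:F1} GiB for {3} pages",
                dpi, perPage, totalGiB, pageCount);
        }
    }
}
```

Under these assumptions a page is roughly 25 MB at 300 dpi and 70 MB at 500 dpi, so 150 undisposed page bitmaps would need several gigabytes, while a 32-bit process typically has only about 2 GB of user address space. JPEG-encoded byte arrays are usually far smaller than the raw bitmaps.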

If you don't know the number of pages in the PDF, you can open a rasterizer once and read its PageCount property.
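
A minimal sketch of that page-count step, assuming the PDF is available as a byte array. The class and method names (PdfPageCounter, CountPages) are hypothetical; the Open overload and the PageCount property are the ones already used in the code in this thread.

```csharp
using System.IO;
using Ghostscript.NET;
using Ghostscript.NET.Rasterizer;

public static class PdfPageCounter
{
    // Hypothetical helper: opens the PDF once, only to read how many pages it has.
    public static int CountPages(byte[] pdfBytes, GhostscriptVersionInfo gsVersion)
    {
        using (var stream = new MemoryStream(pdfBytes))
        using (var rasterizer = new GhostscriptRasterizer())
        {
            rasterizer.Open(stream, gsVersion, true);
            return rasterizer.PageCount;
        }
    }
}
```

The returned value can then serve as pdfPagesCount in the per-page loop; using a fresh MemoryStream here keeps the counting step from disturbing whatever stream the loop reads from.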

answered 2020-05-22T13:49:45.730