Export images from a PDF file, or generate a PDF file from images. Based on sharp, PDF.js (for parsing PDFs), and jsPDF (for generating PDFs).

ssnangua/sharp-pdf

sharp-pdf

Export images from a PDF file, or generate a PDF file from images.

Based on sharp, PDF.js (for parsing PDFs), and jsPDF (for generating PDFs).

Install

npm install sharp-pdf

Export images from a PDF file

PDF.sharpsFromPdf(src, options?): Promise<ImageData[]>

  • src GetDocumentParameters - String containing the filesystem path to a PDF file, or a DocumentInitParameters object.
  • options Object (optional)
    • sharpOptions Object (optional) - Sharp constructor options.
    • delay Number (optional) - Number of milliseconds to wait (via setTimeout) after each image is parsed. If you need to show progress in a UI (Electron/NW.js), use this option to avoid blocking the UI. Defaults to -1 (no delay).
    • workerSrc Boolean (optional) - Whether to set GlobalWorkerOptions.workerSrc to pdf.worker.entry. Defaults to false.
    • handler (event, data) => void (optional)
      • "loading" - PDF file loading progress, data is an object containing total number of bytes and loaded number of bytes.
      • "loaded" - PDF file loaded, data is an object containing pages info.
      • "image" - Image parsing complete, data is the ImageData.
      • "skip" - Skip an invalid image.
      • "error" - An image parsing error occurs, data is an object containing the error info.
      • "done" - All images are parsed, data is an array containing all ImageData.
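
The handler events listed above can also be tallied, for example to report how many images were parsed, skipped, or failed once parsing is done. The following is a minimal sketch that relies only on the event names documented here; `createTallyHandler` is a hypothetical helper, not part of sharp-pdf.

```javascript
// Hypothetical helper (not part of sharp-pdf): counts handler events by name.
function createTallyHandler(onDone = () => {}) {
  const counts = { image: 0, skip: 0, error: 0 };
  return function handler(event, data) {
    if (Object.prototype.hasOwnProperty.call(counts, event)) {
      // "image", "skip" and "error" each describe one processed image.
      counts[event] += 1;
    } else if (event === "done") {
      // Hand a copy of the totals to the caller once parsing has finished.
      onDone({ ...counts });
    }
    // "loading" and "loaded" are ignored in this sketch.
  };
}

// Usage sketch (assumes sharp-pdf is installed and ./input.pdf exists):
// const PDF = require("sharp-pdf");
// PDF.sharpsFromPdf("./input.pdf", {
//   handler: createTallyHandler((counts) => console.log(counts)),
// });
```

Keeping the counters inside a closure means one handler can be created per PDF without sharing state between runs.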

Returns Promise<ImageData[]> - Resolves with an array of objects containing the following info:

ImageData

  • image Sharp - Instance of sharp.
  • name String - Image name.
  • width Number - Image width in pixels.
  • height Number - Image height in pixels.
  • channels Number - Number of channels.
  • size Number - Total size of image in bytes.
  • pages Number - Number of pages.
  • pageIndex Number - Page index.
  • pageImages Number - Number of images in page.
  • pageImageIndex Number - Image index in page.
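
const PDF = require("sharp-pdf");

// Save every embedded image: PNG for images with more than three channels (e.g. RGBA), JPEG otherwise.
PDF.sharpsFromPdf("./input.pdf").then((images) => {
  images.forEach(({ image, name, channels }) => {
    const ext = channels > 3 ? ".png" : ".jpg";
    // toFile() returns a promise; report write failures instead of dropping them.
    image.toFile(`./${name}${ext}`).catch(console.error);
  });
});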

// progress
PDF.sharpsFromPdf("./input.pdf", {
  handler(event, data) {
    if (event === "loading") {
      console.log("loading PDF:", (data.loaded / data.total) * 100);
    } else if (event === "loaded") {
      console.log("PDF loaded");
    } else if (event === "image" || event === "skip" || event === "error") {
      console.log("parsing images:", (data.pageIndex / data.pages) * 100);
    } else if (event === "done") {
      console.log("done");
    }
  },
});

// load a password protected PDF
PDF.sharpsFromPdf({
  url: "./input.pdf",
  password: "ssnangua",
});
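
Because each ImageData object carries pageIndex and pageImageIndex, the extracted images can be grouped by the page they came from. The sketch below uses only those two documented fields; `groupByPage` is a hypothetical helper, not part of sharp-pdf.

```javascript
// Hypothetical helper (not part of sharp-pdf): groups ImageData objects by page.
function groupByPage(images) {
  const pages = new Map();
  for (const item of images) {
    if (!pages.has(item.pageIndex)) pages.set(item.pageIndex, []);
    pages.get(item.pageIndex).push(item);
  }
  // Keep the images of each page in their in-page order.
  for (const list of pages.values()) {
    list.sort((a, b) => a.pageImageIndex - b.pageImageIndex);
  }
  return pages;
}

// Usage sketch (assumes sharp-pdf is installed and ./input.pdf exists):
// const PDF = require("sharp-pdf");
// PDF.sharpsFromPdf("./input.pdf").then((images) => {
//   for (const [pageIndex, list] of groupByPage(images)) {
//     console.log(`page index ${pageIndex}: ${list.length} image(s)`);
//   }
// });
```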

Generate a PDF file from images

PDF.sharpsToPdf(images, output, options?): Promise<Object>

  • images Array<Sharp | Object>
    • image Sharp - Sharp instance.
    • options ImageOptions (optional) - Image options.
  • output String | { type, options } - The path to write the PDF file to, or an object containing the jsPDF.output(type, options) arguments.
  • options Object (optional)
    • pdfOptions Object (optional) - jsPDF constructor options
    • imageOptions ImageOptions (optional) - Global image options.
    • autoSize Boolean (optional) - Set each page size to its image size. When enabled, pdfOptions.format and the fit option are ignored. Defaults to false.
    • init (params) => void (optional)
      • params Object
        • doc jsPDF - jsPDF instance.
        • pages Number - Number of images.
        • pageWidth Number - Page width in pixels.
        • pageHeight Number - Page height in pixels.

Returns Promise<Object> - Resolves with an object containing the PDF file size info (when output is a file path), or with the PDF document data (when output is a { type, options } object).
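
The images argument accepts either a bare Sharp instance or an { image, options } object, and per-image options override the global imageOptions. The sketch below shows one way a caller could normalize such a mixed list before inspecting it. `normalizeEntries` is a hypothetical helper; the duck-typing check on `toBuffer` and the shallow merge are assumptions made for illustration, not the library's actual logic.

```javascript
// Hypothetical helper (not part of sharp-pdf): turns a mixed images array
// into a uniform list of { image, options } objects.
function normalizeEntries(images, globalOptions = {}) {
  return images.map((entry) => {
    // Assumption: a Sharp instance exposes toBuffer(); a wrapper object has .image.
    const isBare = typeof entry.toBuffer === "function";
    const image = isBare ? entry : entry.image;
    const own = isBare ? {} : entry.options || {};
    // Per-image options win over the global ones (shallow merge).
    return { image, options: { ...globalOptions, ...own } };
  });
}

// Usage sketch (assumes sharp is installed):
// const sharp = require("sharp");
// normalizeEntries(
//   [sharp("./image1.jpg"), { image: sharp("./image2.jpg"), options: { fit: false } }],
//   { fit: true, margin: 10 }
// );
```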

ImageOptions

  • format String (optional) - Format of image, e.g. 'JPEG', 'PNG', 'WEBP'.
  • x Number (optional) - Image x coordinate in pixels. If omitted, the image is horizontally centered.
  • y Number (optional) - Image y coordinate in pixels. If omitted, the image is vertically centered.
  • width Number (optional) - Image width in pixels. If omitted, the image fills the page when fit is set; otherwise the image's own width is used.
  • height Number (optional) - Image height in pixels. If omitted, the image fills the page when fit is set; otherwise the image's own height is used.
  • compression "NONE" | "FAST" | "MEDIUM" | "SLOW" (optional) - Compression of the generated JPEG. Defaults to "NONE".
  • rotation Number (optional) - Rotation of the image in degrees (0-359). Defaults to 0.
  • fit Boolean (optional) - Fit the image to the page size. Defaults to false.
  • margin Number (optional) - Image margin in pixels. Defaults to 0.
  • handler (params) => void (optional) - Called for each image before it is added to the page. Return false, or resolve with false, to skip the default add-image step and add the image yourself.
    • params Object
      • doc jsPDF - jsPDF instance.
      • pages Number - Number of images.
      • pageWidth Number - Page width in pixels.
      • pageHeight Number - Page height in pixels.
      • index Number - Page index.
      • image Sharp - Sharp instance.
      • options ImageOptions - Image options.
      • imageData Buffer - A buffer containing image data.
      • format String - Format of image, e.g. 'JPEG', 'PNG', 'WEBP'.
      • x Number - Image x coordinate in pixels.
      • y Number - Image y coordinate in pixels.
      • width Number - Image width in pixels.
      • height Number - Image height in pixels.
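
The x, y, width, height, fit, and margin options above interact: omitted coordinates center the image, and fit scales it to the page. The sketch below shows one plausible way to compute such a placement. It is not taken from the sharp-pdf source; it assumes that fit preserves the aspect ratio and that margin applies on all four sides. `computePlacement` is a hypothetical name.

```javascript
// Hypothetical helper (not part of sharp-pdf): computes a centered placement.
// Assumptions: fit keeps the aspect ratio; margin is applied on every side.
function computePlacement({
  imageWidth,
  imageHeight,
  pageWidth,
  pageHeight,
  fit = false,
  margin = 0,
}) {
  let width = imageWidth;
  let height = imageHeight;
  if (fit) {
    // Largest uniform scale that keeps the image inside the margins.
    const scale = Math.min(
      (pageWidth - 2 * margin) / imageWidth,
      (pageHeight - 2 * margin) / imageHeight
    );
    width = imageWidth * scale;
    height = imageHeight * scale;
  }
  // Center the (possibly scaled) image on the page.
  return {
    x: (pageWidth - width) / 2,
    y: (pageHeight - height) / 2,
    width,
    height,
  };
}
```

A handler could compare these values with the x, y, width, and height it receives to check how the library actually places an image.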

const fs = require("fs");
const sharp = require("sharp");
const PDF = require("sharp-pdf");

PDF.sharpsToPdf(
  [
    sharp("./image1.jpg"),
    sharp("./image2.jpg"),
    { image: sharp("./image3.jpg"), options: {} },
  ],
  "./output.pdf"
).then(({ size }) => {
  console.log(size);
});

// options
PDF.sharpsToPdf(
  fs
    .readdirSync("./Comic")
    .map((file) => sharp(`./Comic/${file}`).jpeg({ quality: 20 })),
  "./Comic.pdf",
  {
    pdfOptions: {
      format: "b5",
      encryption: {
        userPassword: "ssnangua",
      },
    },
    imageOptions: {
      format: "JPEG",
      compression: "FAST",
      fit: true,
      handler({ index, pages }) {
        console.log(index + 1, "/", pages);
      },
    },
  }
);

// handler
PDF.sharpsToPdf(
  [
    sharp("./image1.jpg"),
    sharp("./image2.jpg"),
    {
      image: sharp("./image3.jpg"),
      options: {
        // override the global handler
        handler() {},
      },
    },
  ],
  "./output.pdf",
  {
    imageOptions: {
      handler({ doc, ...params }) {
        // add page number
        const { index, pageWidth, pageHeight } = params;
        doc.text(`- ${index + 1} -`, pageWidth / 2, pageHeight - 10, {
          align: "center",
          baseline: "bottom",
        });

        // return or resolve with `false`,
        // will skip the default add image operation,
        // and you can add image by yourself.
        const { imageData, format, x, y, width, height } = params;
        doc.addImage(imageData, format, x, y, width, height);
        return false;
        // or
        // return new Promise(resolve => setTimeout(() => resolve(false), 100));
      },
    },
  }
);

// output types
PDF.sharpsToPdf(
  [ sharp("./image1.jpg") ],
  { type: "arraybuffer" }
).then((arraybuffer) => {
  const buffer = Buffer.from(arraybuffer);
  fs.writeFileSync("output.pdf", buffer);
});
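
The Comic example above passes the result of fs.readdirSync straight to sharp. The returned names are not guaranteed to be in numeric order, so page10.jpg may come before page2.jpg, and the directory may also contain non-image files. The sketch below filters and naturally sorts file names before the page list is built; `sortPageFiles` is a hypothetical helper and the accepted extensions are an assumption.

```javascript
// Hypothetical helper (not part of sharp-pdf): keeps image files only and
// sorts them so that numeric parts compare as numbers (2 before 10).
function sortPageFiles(files) {
  const isImage = (file) => /\.(jpe?g|png|webp)$/i.test(file);
  return files
    .filter(isImage)
    .sort((a, b) =>
      a.localeCompare(b, "en", { numeric: true, sensitivity: "base" })
    );
}

// Usage sketch (assumes sharp and sharp-pdf are installed and ./Comic exists):
// const fs = require("fs");
// const sharp = require("sharp");
// const PDF = require("sharp-pdf");
// const pages = sortPageFiles(fs.readdirSync("./Comic")).map((file) =>
//   sharp(`./Comic/${file}`)
// );
// PDF.sharpsToPdf(pages, "./Comic.pdf", { imageOptions: { fit: true } });
```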

Reference

PDF Export Images

Change Log

0.1.3

  • sharpsFromPdf()
    • Added delay and workerSrc options
    • Added skip event
  • sharpsToPdf()
    • Added autoSize option
    • Supported promise handler
    • Supported output types
