PDF Extract Image
Description
Extracts all images from the PDF files in the file package.
Each image extracted is numbered consecutively in the following format: <Image prefix name>_0001.jpg
A new column is created for the results.
A common use for this function is when a PDF contains small images that need to be kept separately, such as photos from a brochure or graphs from a report.
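The naming scheme described above can be illustrated with a minimal Python sketch. This is not Automation Hero's implementation: the helper name `image_file_name` is hypothetical, and the sketch assumes that numbering starts at 1 and is zero-padded to four digits, as the `_0001` example suggests.

```python
def image_file_name(prefix: str, index: int, extension: str = "jpg") -> str:
    """Build a name like '<prefix>_0001.jpg' for a 1-based image index.

    Assumption: numbering starts at 1 and is zero-padded to four digits.
    """
    if index < 1:
        raise ValueError("index must be 1 or greater")
    return f"{prefix}_{index:04d}.{extension}"


# Names for the first three images extracted with the prefix "brochure".
names = [image_file_name("brochure", i) for i in range(1, 4)]
print(names)
# ['brochure_0001.jpg', 'brochure_0002.jpg', 'brochure_0003.jpg']
```

How the function names images beyond the 9999th is not stated in this page, so the sketch makes no claim about that case.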
- Requires the user to specify the PDF as binary data.
- If the PDF's metadata contains no image information, no images can be extracted. In this case, Automation Hero recommends using the PDF to Image function to capture the desired data.
- Supports PDF files up to 100 MB. Larger files may cause out-of-memory errors by exceeding the default runner memory limit.
- Images compressed with JPEG2000 are not supported by default.
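The size and JPEG2000 limits listed above could be checked before a PDF is passed to the function. The following Python sketch is only a rough pre-check under stated assumptions: `precheck_pdf` is a hypothetical helper, the 100 MB limit is interpreted as 100 MiB, and JPEG2000 content is detected by scanning the raw bytes for the `/JPXDecode` filter name that PDF uses for JPEG2000 streams. The scan is a heuristic and can miss images whose dictionaries are stored inside compressed object streams.

```python
MAX_PDF_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" interpreted as 100 MiB
JPX_FILTER = b"/JPXDecode"         # PDF filter name for JPEG2000 image streams


def precheck_pdf(pdf_bytes: bytes, max_bytes: int = MAX_PDF_BYTES) -> list[str]:
    """Return human-readable warnings for a PDF supplied as binary data."""
    warnings: list[str] = []
    if len(pdf_bytes) > max_bytes:
        warnings.append(
            f"PDF is {len(pdf_bytes)} bytes, above the {max_bytes}-byte limit"
        )
    # Heuristic only: the filter name may be hidden in compressed object streams.
    if JPX_FILTER in pdf_bytes:
        warnings.append("PDF appears to contain JPEG2000 (JPXDecode) images")
    return warnings


# In-memory stand-in for real PDF content, used only for illustration.
sample = b"%PDF-1.7\n<< /Type /XObject /Subtype /Image /Filter /JPXDecode >>"
for message in precheck_pdf(sample):
    print(message)
```

An empty result does not guarantee that extraction will succeed; it only means neither of the two documented limits was detected.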
Use
- PDF Content: Select an argument. (Binary)
- Images prefix name: Enter an image prefix name as the base name for each image file.
- Output field name: Enter an output field name.
The function returns its output as a single record containing a list. Use a function like Flatten List to return each element in the list as a separate record.
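The effect of flattening can be sketched in Python as follows. This is not the Flatten List function itself: the helper `flatten_list`, the field names `source` and `images`, and the element values are all hypothetical placeholders chosen for illustration.

```python
def flatten_list(record: dict, list_field: str) -> list[dict]:
    """Return one record per element of record[list_field].

    All other fields are copied unchanged into every output record.
    """
    return [{**record, list_field: item} for item in record[list_field]]


# A single record whose output field holds a list (placeholder values).
record = {
    "source": "report.pdf",
    "images": ["report_0001.jpg", "report_0002.jpg"],
}

for row in flatten_list(record, "images"):
    print(row)
# {'source': 'report.pdf', 'images': 'report_0001.jpg'}
# {'source': 'report.pdf', 'images': 'report_0002.jpg'}
```

A record whose list is empty produces no output records in this sketch; whether Flatten List behaves the same way is not stated here.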