I needed to convert over 100 PDF files to PNG and make sure the PNG files were compressed and as small as possible. After doing some research and asking ChatGPT, I got it working with the Bash script below. There are online tools that can handle conversions, but I wanted more control over the process, particularly with so many files to convert. To summarise, these were the Linux tools needed to complete this task:

  • install pdftoppm (sudo apt-get install poppler-utils)
  • install convert and identify (sudo apt-get install imagemagick)
  • install pngquant (sudo apt-get install pngquant)
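
Before converting a large batch, it can help to confirm that the tools above are actually on the PATH. Here is a minimal sketch of such a check. The `missing_tools` function name is my own and is not part of any of these packages:

```bash
#!/bin/bash

# Print the name of every command in the argument list that is not on PATH.
# (missing_tools is a hypothetical helper name used only for this sketch.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The conversion script relies on pdftoppm (poppler-utils),
# convert and identify (imagemagick), and pngquant.
missing=$(missing_tools pdftoppm convert identify pngquant)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing tools:" $missing
    echo "Install them with: sudo apt-get install poppler-utils imagemagick pngquant"
else
    echo "All required tools are installed."
fi
```

Running this first avoids finding out halfway through a batch that one of the commands is not installed.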

Some of the requirements were:

  • convert the PDF file to PNG at a max resolution of 2384×3370
  • if the file size was greater than 2 MB then run the compression
  • use a quality of 80 for the compression
  • ability to enter the folder name as an input argument
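
One detail about the max resolution requirement: ImageMagick's -resize 2384x3370 keeps the aspect ratio, so an oversized image is scaled to fit inside that box rather than stretched to exactly 2384×3370. The sketch below estimates the resulting dimensions using plain Bash arithmetic. The `fit_within` function is a hypothetical helper of mine, and ImageMagick's own rounding may differ by a pixel:

```bash
#!/bin/bash

# Print the dimensions an image of WIDTH x HEIGHT would have after being
# scaled down to fit inside MAX_W x MAX_H with its aspect ratio preserved.
# Images that already fit are reported unchanged.
# (fit_within is a hypothetical helper; the rounding is an approximation.)
fit_within() {
    local w=$1 h=$2 max_w=${3:-2384} max_h=${4:-3370}
    if [ "$w" -le "$max_w" ] && [ "$h" -le "$max_h" ]; then
        echo "${w}x${h}"
        return
    fi
    # Compare w/max_w with h/max_h by cross-multiplying to stay in integers.
    if [ $((w * max_h)) -ge $((h * max_w)) ]; then
        # Width is the limiting side.
        echo "${max_w}x$(( (h * max_w + w / 2) / w ))"
    else
        # Height is the limiting side.
        echo "$(( (w * max_h + h / 2) / h ))x${max_h}"
    fi
}

fit_within 4768 6740   # prints 2384x3370
fit_within 3000 3000   # prints 2384x2384
fit_within 1000 1500   # prints 1000x1500
```

The square example shows why the output is not always 2384×3370: only the side that hits its limit first is pinned to the maximum.
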

Here is the full script:

#!/bin/bash

# Check if the folder name is provided as an argument
if [ -z "$1" ]; then
    echo "Usage: $0 <folder_name>"
    exit 1
fi

# Set input folder and output folder based on the argument
input_folder="$1"
output_folder="${input_folder}-new"

# Create output directory if it doesn't exist
mkdir -p "$output_folder"

# Initialize variables
failed_files=()
conversion_summary=()
total_converted=0
max_filesize=2097152  # 2 MB in bytes

# Loop through all PDF files in the input folder
for pdf_file in "$input_folder"/*.pdf; do
    # Skip the unexpanded pattern when the folder contains no PDF files
    [ -e "$pdf_file" ] || continue

    # Get the base filename without extension
    filename=$(basename "$pdf_file" .pdf)
    
    # Display status message before conversion
    echo "Converting $pdf_file to $output_folder/$filename.png..."
    
    # Convert the PDF to PNG using pdftoppm (one PNG file is generated per page)
    # and check whether the conversion was successful
    if pdftoppm -png "$pdf_file" "$output_folder/$filename"; then
        # Rename first page if present (remove the -1 suffix)
        first_page_png="$output_folder/$filename-1.png"
        final_png="$output_folder/$filename.png"
        
        if [ -f "$first_page_png" ]; then
            mv "$first_page_png" "$final_png"
        fi
        
        # Handle all the PNG files (including the renamed one)
        for png_file in "$output_folder/$filename"-*.png "$final_png"; do
            # Check if file exists before processing
            if [ -f "$png_file" ]; then
                # Get the PNG image size and file size
                dimensions=$(identify -format "%wx%h" "$png_file")
                filesize=$(stat -c%s "$png_file")

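                # Check if the dimensions exceed 2384x3370
                width=${dimensions%x*}
                height=${dimensions#*x}
                if [ "$width" -gt 2384 ] || [ "$height" -gt 3370 ]; then
                    # -resize keeps the aspect ratio, so the image is scaled to fit within the box
                    echo "Resizing $png_file to fit within 2384x3370..."
                    convert "$png_file" -resize 2384x3370 "$png_file"
                    echo "Resized: $png_file"
                    # Update file size after resizing so the 2 MB check uses the new size
                    filesize=$(stat -c%s "$png_file")
                fi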
                
                # Recompress only if file size exceeds 2 MB. For PNG, -quality 80 means
                # zlib compression level 8 with filter type 0; the result stays lossless.
                if [ "$filesize" -gt "$max_filesize" ]; then
                    echo "Recompressing $png_file with -quality 80 (File size: $filesize bytes)..."
                    convert "$png_file" -quality 80 "$png_file"
                    # Update filesize after adjusting quality
                    filesize=$(stat -c%s "$png_file")
                fi

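                # Compress PNG using pngquant (lossy palette quantization). With a minimum
                # quality of 80, pngquant leaves the file as-is if it cannot reach that quality.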
                echo "Compressing $png_file using pngquant..."
                pngquant --quality=80-80 --ext .png --force "$png_file"
                
                # Report final dimensions and file size
                dimensions=$(identify -format "%wx%h" "$png_file")
                filesize=$(stat -c%s "$png_file")
                
                echo "Successfully converted: $filename.pdf"
                echo "Output: $png_file | Dimensions: $dimensions | File size: ${filesize} bytes"
                
                # Add to conversion summary
                conversion_summary+=("$png_file | Dimensions: $dimensions | File size: ${filesize} bytes")
                total_converted=$((total_converted + 1))
            fi
        done
    else
        echo "Failed to convert: $filename.pdf"
        failed_files+=("$pdf_file")
    fi
done

# Display summary of failed conversions
if [ ${#failed_files[@]} -ne 0 ]; then
    echo "The following files failed to convert:"
    for failed_file in "${failed_files[@]}"; do
        echo "$failed_file"
    done
else
    echo "All files converted successfully!"
fi

# Summary of all conversions
echo -e "\n--- Conversion Summary ---"
for summary in "${conversion_summary[@]}"; do
    echo "$summary"
done

# Display total number of files converted
echo "Total files converted: $total_converted"
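
One thing to watch in the rename step: the script only looks for a `-1` suffix on the first page. As far as I can tell, recent versions of pdftoppm zero-pad the page number to the number of digits in the total page count, so a 12-page PDF would produce `-01` to `-12` and its first page would keep the suffix. The sketch below shows the naming I would expect. Both the `expected_page_png` helper and the padding rule are assumptions, so check them against the output of your own pdftoppm version:

```bash
#!/bin/bash

# Print the file name pdftoppm is expected to use for a given page.
# Arguments: output prefix, page number, total number of pages.
# Assumption: the page number is zero-padded to the digit count of the total.
# (expected_page_png is a hypothetical helper used only for this sketch.)
expected_page_png() {
    local prefix=$1 page=$2 total=$3
    printf '%s-%0*d.png\n' "$prefix" "${#total}" "$page"
}

expected_page_png report 1 1     # prints report-1.png
expected_page_png report 1 12    # prints report-01.png
expected_page_png report 7 120   # prints report-007.png
```

For single-page PDFs, which is what I was converting, the suffix is always `-1`, so the rename in the script works as intended.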

The end result was very surprising since I thought convert was compressing the file really well and that pngquant wouldn’t do much. Here’s a file size comparison before and after using pngquant:

[Image: PNG compression file size savings (Marco Tran, The Simple Entrepreneur)]

It’s important to note that most of the images saw savings of over 50%. You might be concerned about the quality, given that it was adjusted to 80, but I couldn’t notice any difference. I might modify the requirements to convert all files instead of checking if the file size is greater than 2 MB.
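
If you want to measure the savings yourself, the percentage can be worked out from the byte counts that `stat -c%s` reports before and after compression. This is a small sketch only: the `savings_percent` helper is hypothetical and the byte counts in the example are made up for illustration, not taken from my files:

```bash
#!/bin/bash

# Print the percentage saved when a file shrinks from BEFORE to AFTER bytes,
# rounded down to a whole number. Prints 0 if BEFORE is not positive.
# (savings_percent is a hypothetical helper used only for this sketch.)
savings_percent() {
    local before=$1 after=$2
    if [ "$before" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( (before - after) * 100 / before ))
}

# Made-up example: a 3 MB file that shrinks to about 1.2 MB.
savings_percent 3145728 1258291   # prints 60
```

In the script, the `before` value could be captured with `stat -c%s "$png_file"` just before the pngquant call and the `after` value just after it.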
