@erikvullings
Last active May 4, 2026 19:25

Converting digital DRM protected ebook to PDF and epub for offline reading

I recently bought a digital, DRM-protected ebook. Because it is protected, it cannot be downloaded, so I can only read it in a browser. Since I prefer an e-reader, I looked for legal ways to convert my book to PDF or EPUB.

A simple approach, and a legal one according to ChatGPT, is the following:

  1. Create screenshots of the book pages in the online reader
  2. Bundle the screenshots into a PDF or EPUB

Taking screenshots

Taking screenshots of 300+ pages by hand is no fun, even on a Mac. For each page, I had to:

  • Press CMD + SHIFT + 5 to open the screenshot tool. The first time, select the page area and specify the output folder <OUTPUT_FOLDER>. On subsequent captures, the same area is preselected.
  • Press "ENTER" to save the page.
  • Press "Page Down" to move to the next page.

AppleScript to the rescue:

Press CMD+SPACE, type "Script Editor", and open it.

  1. Create a new script
  2. Paste the code below
  3. Press RUN
  4. Quickly click the browser (so it receives keyboard focus for Page Down); the script waits 3 seconds before it starts
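-- Wait 3 seconds so you can give the browser keyboard focus
delay 3
tell application "System Events"
	-- Number of pages to capture; adjust for your book
	repeat 322 times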
		
		-- Open screenshot tool
		key code 23 using {command down, shift down} -- Cmd+Shift+5
		delay 1
		
		-- Press Enter (take screenshot)
		key code 36
		delay 1
		
		-- Go to next page (Page Down)
		key code 121
		delay 1
		
	end repeat
end tell

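TIP: If the online reader can display pages in landscape format, e.g. by rotating them 90 degrees counter-clockwise, each page fills more of the screen and the screenshots have a higher resolution. The Rotate step under Post-processing turns the pages upright again.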
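
As an alternative to sending keystrokes to the screenshot tool, the same loop can be sketched as a shell script around the macOS screencapture command. This is a minimal sketch under assumptions, not a tested replacement: the names capture_pages and run, the DRY_RUN switch, the example rectangle, and the one-second pause are all made up for illustration. macOS will likely ask for Screen Recording and Accessibility permissions for the terminal.

```shell
#!/bin/sh
# Sketch: capture a fixed screen rectangle once per page, then press Page Down.
# Assumed for illustration (not from the original): function names, DRY_RUN,
# the page_NNN.png naming, and the 1 second pause.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

capture_pages() {
  pages=$1   # number of pages to capture
  rect=$2    # screen rectangle as x,y,width,height
  out=$3     # existing output folder
  i=0
  while [ "$i" -lt "$pages" ]; do
    file=$(printf '%s/page_%03d.png' "$out" "$i")
    # -x: no shutter sound, -R: capture only the given rectangle
    run screencapture -x -R "$rect" "$file"
    # key code 121 is Page Down, the same key the AppleScript sends
    run osascript -e 'tell application "System Events" to key code 121'
    # give the reader time to render the next page
    run sleep 1
    i=$((i + 1))
  done
}

# Example (click the browser during the initial 3 seconds):
#   sleep 3; capture_pages 322 100,100,1200,1900 "$HOME/book"
```

Because the files are written as page_000.png, page_001.png, and so on, the rename step under Post-processing would not be needed with this variant. Running it once with DRY_RUN=1 prints the commands without taking any screenshots.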

Tools you need (on Mac)

brew install imagemagick ocrmypdf ghostscript
brew install --cask calibre
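
Before starting the post-processing, it can help to confirm that the commands used later are actually on the PATH. The following is a small sketch; missing_tools is a hypothetical helper name, and the list of commands is simply the set of programs invoked in the steps below.

```shell
#!/bin/sh
# Sketch: print every command from the argument list that is not on PATH.
# missing_tools is an illustrative helper, not part of any of the tools.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Commands invoked by the post-processing steps of this guide.
missing_tools magick mogrify ocrmypdf gs ebook-convert
```

If the call prints nothing, everything was found; any name it prints still needs to be installed or added to the PATH.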

Post-processing

cd <OUTPUT_FOLDER>

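# Rename to page_000.png, page_001.png, ...
# zsh only (the macOS default shell): (.n) selects regular files in numeric order
i=0; for f in *.png(.n); do mv "$f" "$(printf "page_%03d.png" "$i")"; ((i++)); done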

# Prepare outputs (cropped/ is created inside rotated/)
mkdir -p rotated/cropped

# Rotate 90 degrees clockwise, back to portrait (for pages captured in landscape, see TIP)
mogrify -path rotated/ \
  -rotate 90 \
  -density 300 \
  -format png \
  *.png

cd rotated

# Create regular PDF
magick *.png \
  -units PixelsPerInch \
  -density 300 \
  -resize 1158x1851! \
  input.pdf

# Add searchable text using ocrmypdf
ocrmypdf --optimize 3 --rotate-pages input.pdf ocred.pdf

# Compress
gs -sDEVICE=pdfwrite \
   -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/ebook \
   -dNOPAUSE -dQUIET -dBATCH \
   -sOutputFile=output.pdf \
   ocred.pdf
 
# Do the same for the cropped version (still inside rotated/)
# Adjust the crop geometry WIDTHxHEIGHT+X+Y to your own screenshots
mogrify -path cropped -crop 1158x1851+95+115 +repage *.png

cd cropped

# Create regular PDF
magick *.png \
  -units PixelsPerInch \
  -density 300 \
  -resize 1158x1851! \
  input.pdf

# Add searchable text using ocrmypdf
ocrmypdf --optimize 3 --rotate-pages input.pdf ocred.pdf

# Compress
gs -sDEVICE=pdfwrite \
   -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/ebook \
   -dNOPAUSE -dQUIET -dBATCH \
   -sOutputFile=output.pdf \
   ocred.pdf

# Convert to EPUB using calibre
ebook-convert output.pdf book.epub \
  --enable-heuristics \
  --no-default-epub-cover
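
The combine, OCR, and compress commands above are repeated verbatim for the rotated and the cropped pages. As a sketch of how that duplication could be avoided, the three steps can be wrapped in one function. The names build_pdf and run and the DRY_RUN switch are assumptions for illustration; the flags are the ones already used above.

```shell
#!/bin/sh
# Sketch: the three PDF steps (combine, OCR, compress) as one function.
# build_pdf, run and DRY_RUN are illustrative names, not part of the original.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_pdf() {
  dir=$1   # folder containing the page_NNN.png files
  size=$2  # page size in pixels, e.g. 1158x1851

  # Combine all pages into one 300 DPI PDF; "!" forces the exact size
  run magick "$dir"/*.png -units PixelsPerInch -density 300 \
    -resize "${size}!" "$dir/input.pdf" || return 1

  # Add a searchable text layer
  run ocrmypdf --optimize 3 --rotate-pages \
    "$dir/input.pdf" "$dir/ocred.pdf" || return 1

  # Compress with Ghostscript's /ebook preset
  run gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
    -dNOPAUSE -dQUIET -dBATCH \
    -sOutputFile="$dir/output.pdf" "$dir/ocred.pdf"
}
```

Called from <OUTPUT_FOLDER>, `build_pdf rotated 1158x1851` and `build_pdf rotated/cropped 1158x1851` would then stand in for the two repeated command sequences. Setting DRY_RUN=1 first prints the three commands instead of running them.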

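Alternatively, use docling (pip install docling) to convert the OCRed PDF to Markdown first. The two PYTORCH_MPS_* variables limit how much GPU memory PyTorch may allocate on Apple Silicon: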

PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.8 \
PYTORCH_MPS_LOW_WATERMARK_RATIO=0.5 \
docling --from pdf --to md --image-export-mode embedded ocred.pdf