I recently bought a DRM-protected ebook. Because of the protection, the book cannot be downloaded, so I can only read it in a browser. Since I prefer an e-reader, I looked for legal ways to convert my book to PDF or EPUB.
A simple approach, and according to ChatGPT a legal one, is the following:
- Take screenshots of the book pages in the online reader
- Bundle the screenshots into an EPUB or PDF
Taking screenshots of 300+ pages is no fun, even on a Mac. For every page I had to:
- Press CMD + SHIFT + 5 to open the screenshot tool. The first time, select the page area and set the output folder <OUTPUT_FOLDER>. On subsequent pages, the same area is preselected.
- Press "ENTER" to save the page.
- Press "Page Down" to move to the next page.
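The manual steps above could in principle be scripted. The following is a rough sketch, not something taken from my own workflow: it assumes the built-in macOS tools `screencapture` and `osascript`, that the terminal has been granted Screen Recording and Accessibility permissions, and that you bring the reader window to the front during the initial delay. The helper names, the region, and the delays are placeholders.

```shell
# Sketch only: automate "screenshot, then Page Down" on macOS.
# page_name and capture_pages are hypothetical helper names.

# Build a zero-padded file name such as page_007.png.
page_name() {
  printf 'page_%03d.png' "$1"
}

# Usage: capture_pages OUT_DIR PAGE_COUNT REGION
# REGION is "x,y,width,height" on screen (placeholder values below).
capture_pages() {
  out_dir=$1
  count=$2
  region=$3
  mkdir -p "$out_dir"
  echo "Switch to the reader window now; capturing starts in 5 seconds."
  sleep 5
  i=0
  while [ "$i" -lt "$count" ]; do
    # -x: no shutter sound, -R: capture only the given rectangle
    screencapture -x -R"$region" "$out_dir/$(page_name "$i")"
    # Key code 121 is Page Down; it goes to the frontmost application
    osascript -e 'tell application "System Events" to key code 121'
    sleep 1  # assumed time for the next page to render
    i=$((i + 1))
  done
}

# Example call (not run automatically):
# capture_pages "$HOME/book_pages" 300 "100,100,1200,800"
```

The files come out already named page_000.png, page_001.png, and so on, so a later renaming step would not be needed for them.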
TIP: If the reader can display the page in landscape format (e.g. rotated 90 degrees counter-clockwise), the screenshots have a higher resolution. The rotate step below turns the pages upright again.
# Install the tools
brew install imagemagick ocrmypdf ghostscript
brew install --cask calibre
# Go to the screenshot folder
cd <OUTPUT_FOLDER>
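Before running the remaining steps, it can help to confirm that the installed commands are actually on the PATH. This is a minimal sketch, assuming the Homebrew packages expose `magick`, `mogrify`, `ocrmypdf`, `gs`, and `ebook-convert`; the helper name `missing_tools` is made up for this example.

```shell
# Sketch: print every given command that cannot be found on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call (command names are assumptions about what the packages install):
# missing_tools magick mogrify ocrmypdf gs ebook-convert
```

No output means every listed command was found.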
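# Rename to page_000.png, page_001.png, ... (zsh only: the glob qualifiers "." and "n" select regular files and sort them numerically)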
i=0; for f in *.png(.n); do mv "$f" "$(printf "page_%03d.png" "$i")"; ((i++)); done
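The loop above depends on zsh and renames the files in place. A more cautious variant, sketched here for POSIX shells, copies the screenshots into a separate folder and leaves the originals untouched. It assumes the file names already sort into page order lexicographically, which is not guaranteed for every screenshot naming scheme; `number_pages` is a hypothetical helper name.

```shell
# Sketch: copy screenshots to sequential names instead of renaming in place.
# number_pages is a hypothetical helper; it relies on the shell expanding
# globs in sorted (lexicographic) order.
# Usage: number_pages SOURCE_DIR TARGET_DIR
number_pages() {
  src=$1
  dst=$2
  mkdir -p "$dst"
  i=0
  for f in "$src"/*.png; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    cp "$f" "$dst/$(printf 'page_%03d.png' "$i")"
    i=$((i + 1))
  done
}

# Example call (not run automatically):
# number_pages . numbered
```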
# Prepare outputs
mkdir -p rotated/cropped
# Rotate
mogrify -path rotated/ \
-rotate 90 \
-density 300 \
-format png \
*.png
cd rotated
# Create a regular (uncropped) PDF and add searchable text
magick *.png input.pdf
ocrmypdf --optimize 3 --rotate-pages input.pdf ocred.pdf
# Crop to 1158x1851 px, starting 95 px from the left and 115 px from the top (adjust to your screenshots)
mogrify -path cropped -crop 1158x1851+95+115 +repage *.png
cd cropped
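The crop geometry has the form WIDTHxHEIGHT+X+Y, so it can be derived from the image size and the margins you want to cut off. A small sketch of that arithmetic follows; `crop_geometry` is a hypothetical helper, and the 1348x2081 image size is an assumed example, not the size of my actual screenshots.

```shell
# Sketch: derive an ImageMagick crop geometry (WxH+X+Y) from the image
# size and the margins to remove. crop_geometry is a hypothetical helper.
# Usage: crop_geometry WIDTH HEIGHT LEFT TOP RIGHT BOTTOM
crop_geometry() {
  w=$(( $1 - $3 - $5 ))   # remaining width after left and right margins
  h=$(( $2 - $4 - $6 ))   # remaining height after top and bottom margins
  printf '%dx%d+%d+%d\n' "$w" "$h" "$3" "$4"
}

# Assumed example: a 1348x2081 image with 95 px side margins and
# 115 px top and bottom margins
crop_geometry 1348 2081 95 115 95 115   # prints 1158x1851+95+115
```

The result can be passed directly to the `-crop` option.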
# Concatenate the cropped pages into a single PDF
magick *.png input.pdf
# Add searchable text using ocrmypdf
ocrmypdf --optimize 3 --rotate-pages input.pdf ocred.pdf
# Compress with Ghostscript (the /ebook preset downsamples images to about 150 dpi)
gs -sDEVICE=pdfwrite \
-dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/ebook \
-dNOPAUSE -dQUIET -dBATCH \
-sOutputFile=output.pdf \
ocred.pdf
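To see whether the compression step was worth it, the file sizes before and after can be compared. A minimal sketch, using only `wc`; the helper name `size_reduction` is made up for this example.

```shell
# Sketch: print how much smaller the compressed file is, in whole percent.
# size_reduction is a hypothetical helper name.
# Usage: size_reduction ORIGINAL COMPRESSED
size_reduction() {
  # The arithmetic expansion strips the padding some wc versions add
  before=$(( $(wc -c < "$1") ))
  after=$(( $(wc -c < "$2") ))
  [ "$before" -gt 0 ] || return 1   # avoid dividing by zero
  echo $(( (before - after) * 100 / before ))
}

# Example call (not run automatically):
# size_reduction ocred.pdf output.pdf
```

A negative number means the "compressed" file is larger than the input, in which case keeping ocred.pdf may be the better choice.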
# Convert to EPUB using calibre
ebook-convert output.pdf book.epub \
--enable-heuristics \
--no-default-epub-cover