@Rourke101
Forked from hubgit/README.md
Created February 2, 2018 19:45
Remove metadata from a PDF file, using exiftool and qpdf. Note that embedded objects may still contain metadata.
#!/bin/bash
FILE=example.pdf
# read all tags from the original PDF
#exiftool -all:all "$FILE"
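# remove tags (XMP + metadata) from the PDF; exiftool keeps a backup as "${FILE}_original"
# and only appends an incremental update, so the old values stay recoverable until the file is rewritten
exiftool -all:all= "$FILE"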
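# linearize the file to remove orphan data; qpdf needs an output file,
# so write to a temporary file and move it over the original
qpdf --linearize "$FILE" "$FILE.tmp" && mv "$FILE.tmp" "$FILE"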
# read all remaining tags (including XMP) from the modified PDF
#exiftool -all:all "$FILE"
# read all strings from the modified PDF
#strings "$FILE" > "$FILE.txt"
# read XMP from embedded objects in the modified PDF
#exiftool -extractEmbedded -all:all "$FILE"
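
The commented-out `strings` check above can be narrowed to a quick scan for a few well-known metadata markers. The following is only a rough sketch: the function name `find_metadata_markers` and the list of markers are assumptions for illustration, not part of the original script. It only sees uncompressed text, so metadata inside compressed streams or embedded objects will not show up, and exiftool remains the more reliable check.

```shell
#!/bin/bash
# Hypothetical helper: print each metadata marker that is still visible
# as plain text in the given file. The marker list is an assumption and
# can be extended as needed.
find_metadata_markers() {
  local file=$1
  # -a treats the binary PDF as text, -o prints only the matching part,
  # -E enables alternation; sort -u collapses repeated hits.
  grep -a -o -E '/(Author|Creator|Producer|CreationDate|ModDate)|<x:xmpmeta' "$file" | sort -u
}
```

Calling `find_metadata_markers example.pdf` after the cleanup prints nothing when none of these markers is found in plain text. Any line it does print names a key worth inspecting with exiftool.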