@OrangeJuce82
Created May 21, 2025 06:29
Revisions

  1. OrangeJuce82 renamed this gist May 21, 2025. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  2. OrangeJuce82 created this gist May 21, 2025.
    24 changes: 24 additions & 0 deletions remove_watermark
    @@ -0,0 +1,24 @@
    from pathlib import Path

    from pypdf import PdfWriter
    from pypdf.generic import StreamObject

    books_directory = Path("books")
    export_directory = Path("export")
    export_directory.mkdir(exist_ok=True)  # make sure the output folder exists
    for input_filepath in books_directory.glob("*.pdf"):
        output_filepath = export_directory / input_filepath.name
        print("=====", output_filepath, "=====")
        writer = PdfWriter(clone_from=str(input_filepath))

        for page in writer.pages:
            contents = page._get_contents_as_bytes()  # private pypdf helper; may return None
            # Drop everything from one byte before the first /Watermark marker onward.
            watermark = contents.find(b"/Watermark") if contents else -1
            if watermark > -1:
                stream = StreamObject()
                stream.set_data(contents[:max(watermark - 1, 0)])
                page.replace_contents(stream)

        with open(output_filepath, "wb") as f:
            writer.write(f)
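
The truncation step in the script can be tried on its own, without pypdf or a real PDF. The following is a minimal sketch under stated assumptions: `strip_from_watermark` is a hypothetical helper name that does not appear in the gist, and `sample` is a made-up content stream in which the `/Watermark` marker starts on its own line.

```python
def strip_from_watermark(contents: bytes, marker: bytes = b"/Watermark") -> bytes:
    """Cut contents off one byte before the first marker; return it unchanged if absent.

    Hypothetical helper: it mirrors the slicing used in the script above.
    """
    position = contents.find(marker)
    if position == -1:
        return contents
    # Clamp at 0 so a marker at the very start does not become a [:-1] slice.
    return contents[:max(position - 1, 0)]


# Made-up content stream: page text first, then a marked-content watermark section.
sample = b"BT (Chapter 1) Tj ET\n/Watermark BMC\nBT (DRAFT) Tj ET\nEMC"
print(strip_from_watermark(sample))
```

Because everything after the marker is discarded, this approach only suits files where the watermark is the last thing drawn on each page. Any content placed after it would be lost too.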