davidhavard/gist:10a246aa90f808d5f2fc87134a18a280

Created April 20, 2018 14:35

Recursively extract image URLs from HTML files, sort them, and remove duplicates

gistfile1.txt

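# -h omits the file-name prefix so URLs containing ':' stay intact; sort -u sorts and removes duplicates.
grep -rhoP --include='*.html' '<img\b[^>]*?\ssrc="\K[^"]+' . | sort -u > images.txt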
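
For repeated use, the same pipeline can be wrapped in a small shell function. This is only a sketch: the function name `extract_img_urls` is hypothetical and not part of the gist, it assumes GNU grep built with PCRE support (`-P`), and it only recognizes double-quoted `src` attributes that sit on the same line as their `<img` tag.

```shell
# Hypothetical helper (name is an assumption, not from the gist):
# print the sorted, de-duplicated <img> src URLs found in the *.html
# files under a directory. Defaults to the current directory.
extract_img_urls() {
  # -r recurse, -h no file-name prefix, -o print only the match,
  # -P Perl-compatible regex; \K discards everything matched before it,
  # so only the URL itself is printed.
  grep -rhoP --include='*.html' '<img\b[^>]*?\ssrc="\K[^"]+' "${1:-.}" | sort -u
}
```

Calling `extract_img_urls ./site > images.txt` should write one URL per line. Keeping the search limited to `*.html` also stops grep from reading `images.txt` itself when the output file lives inside the directory being searched.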
