version: '3.2'
services:
  changedetection:
    image: ghcr.io/dgtlmoon/changedetection.io:latest
    container_name: changedetection
    hostname: changedetection
    volumes:
      - changedetection-data:/datastore
    environment:
      - PORT=5000
      - PUID=1000
      - PGID=1000
      # Ensure this URL matches the name of the playwright service
      - PLAYWRIGHT_DRIVER_URL=ws://playwright-chrome:3000/?headless=false
    ports:
      - 5000:5000
    restart: unless-stopped
    depends_on:
      - playwright-chrome
  playwright-chrome:
    hostname: playwright-chrome
    image: ghcr.io/browserless/chrome
    restart: unless-stopped
    environment:
      - SCREEN_WIDTH=1920
      - SCREEN_HEIGHT=1024
      - SCREEN_DEPTH=16
      - ENABLE_DEBUGGER=false
      # Increased connection timeout to reduce chances of timeout errors
      - TIMEOUT=600000 # Now 10 minutes
      # Increased concurrent sessions for better parallel processing
      - CONCURRENT=15
volumes:
  changedetection-data:
Hi, thanks for this. Does this not work for sites using JavaScript?
Upd: Here's the docker-compose.yml I used. Works great. Most sites run without a proxy, only a few require one. No warnings or errors.
https://pastebin.com/pATyMgu0
I also used a proxies.json as described in the manual here: https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration (you can leave this out if you don't use proxies or prefer to set them in the web UI instead).
Edit: Updated the docker-compose.yml. This version has been running without any issues for over a year now.
- If you're running it behind a reverse proxy, comment out the `ports:` section and adjust the networks accordingly.
- Browserless chromium is the way to go. I tried every other compatible browser and this is the only one that works reliably with browser steps and the visual filter.
- You may want to tune `shm_size`, `FETCH_WORKERS`, and `CONCURRENT` to your needs. It runs fine on a cheap VPS, but expect around 500 MB RAM idle. If you have a beefier server, feel free to crank those up.
- Replace `your-very-secure-token` with something randomly generated. I used "openssl rand -hex 32".
- `BASE_URL` is only relevant if you set up notifications and want the correct server name displayed there, so don't worry too much about it.
- The `HEALTH`, `MAX_CPU_PERCENT`, and `MAX_MEMORY_PERCENT` settings are there so browserless doesn't crash under load. Instead of dying, it'll return an error and keep running. Useful on a low-end VPS.
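For anyone who can't open the pastebin, here is a rough sketch of where the settings from the notes above would sit in a compose file. It is not the actual file from the link. The service name `browserless`, the `ghcr.io/browserless/chromium` image tag, the example domain, and every numeric value are assumptions chosen for illustration, so tune them to your own setup.

```yaml
# Illustrative sketch only; all values are placeholders, not the linked file.
services:
  changedetection:
    image: ghcr.io/dgtlmoon/changedetection.io:latest
    environment:
      # The token here must match TOKEN on the browserless service below
      - PLAYWRIGHT_DRIVER_URL=ws://browserless:3000/?token=your-very-secure-token
      # Only affects the server name shown in notifications (hypothetical domain)
      - BASE_URL=https://changedetection.example.com
      # Number of parallel fetch workers (assumed value)
      - FETCH_WORKERS=4
    volumes:
      - changedetection-data:/datastore
    # Comment out this section when running behind a reverse proxy
    ports:
      - "5000:5000"
    restart: unless-stopped
    depends_on:
      - browserless

  browserless:
    image: ghcr.io/browserless/chromium
    # Shared memory available to Chromium (assumed value)
    shm_size: "1gb"
    environment:
      # Replace with a random value, e.g. the output of: openssl rand -hex 32
      - TOKEN=your-very-secure-token
      # Maximum simultaneous browser sessions (assumed value)
      - CONCURRENT=4
      # Reject new sessions above these thresholds instead of crashing
      - HEALTH=true
      - MAX_CPU_PERCENT=90
      - MAX_MEMORY_PERCENT=90
    restart: unless-stopped

volumes:
  changedetection-data:
```

As a rule of thumb, keeping `FETCH_WORKERS` at or below `CONCURRENT` should stop changedetection from requesting more browser sessions than browserless will accept at once.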
Thank you. It is working for most pages I want to monitor, but not for https://www.smythstoys.com/at/de-at
I think the problem is Imperva. Do you know any solution to that problem?
What error are you getting? If it's only one website, I'd recommend setting up a proxy for that specific domain. The site likely doesn't like your IP.
I use Bright Data's datacenter proxies for this. There's an official guide for setting that up. I had similar issues where certain sites wouldn't work without a proxy and this solved it.
If that doesn't work, they also have a Browser API, though it's expensive. I'd only use that for sites that absolutely require it.
It's likely Imperva detecting the scraping pattern rather than blocking your IP specifically.
Try going to Settings → Fetching and set Random jitter seconds ± to 10 (or any value other than default) to add variability between requests.
If that doesn't resolve it, proxies are your only solution.

Shame that I can't add an emoji to a gist, but same here: super helpful (the comments too), thanks!
I'd love an example of how to leverage browserless.io's unblock as part of changedetection too