Cleaning the harvested URLs

Hi,

I'm sure there is an option for this, but I don't know which one or how to do it. We harvest a lot of URLs, and part of the process is deleting the duplicates.

After that I want to delete the URLs that contain certain words, like:

youtube
wiki
CNN
bbc

What I tried was to create a file (I also found the blacklist and edited it), put these words in and run the removal, but the URLs remained, so there is probably something wrong with the way I'm doing it.

It would also be good to know how I can harvest so that URLs containing these unwanted words are not collected in the first place.

thanks again

You want to put these words in a file, 1 per line.
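For example, using the words from your list the file would just contain:

youtube
wiki
cnn
bbc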

Then place your URLs in the harvested URLs grid in the top right quadrant of ScrapeBox.

Then go to Remove / Filter >> Remove URLs Containing entries from, and select your file.
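If it helps to see exactly what that filter does, here is a minimal Python sketch of the same cleanup done outside ScrapeBox. The file names blacklist.txt and harvested_urls.txt are just placeholder assumptions, not files ScrapeBox creates; the script removes duplicates and then drops any URL whose text contains one of the blacklisted words.

# Minimal sketch of the "remove duplicates + remove URLs containing blacklisted words" cleanup.
# File names are placeholders, not anything ScrapeBox itself produces.

def load_blacklist(path):
    # Read one filter word per line, ignoring blank lines.
    with open(path, encoding="utf-8") as f:
        return [line.strip().lower() for line in f if line.strip()]

def clean_urls(urls, blacklist):
    # Remove duplicate URLs, then drop any URL containing a blacklisted word.
    seen = set()
    kept = []
    for url in urls:
        u = url.strip()
        if not u or u.lower() in seen:
            continue
        seen.add(u.lower())
        if any(word in u.lower() for word in blacklist):
            continue
        kept.append(u)
    return kept

if __name__ == "__main__":
    blacklist = load_blacklist("blacklist.txt")  # one word per line
    with open("harvested_urls.txt", encoding="utf-8") as f:
        urls = f.read().splitlines()
    for url in clean_urls(urls, blacklist):
        print(url)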

(Yesterday, 05:17) loopline wrote: You want to put these words in a file, 1 per line. [...]

Perfect, mate. Thank you