There are any number of articles on how to build a search engine. Most of them, however, focus on the infrastructure and the scraper whilst glossing over the ‘discovery’ part: actually finding the information to scrape.
I’d like to build a search engine around RSS feeds. I have experimented with the Bing Search API, but when I limited the search to just RSS feeds using search operators, the results were patchy: RSS feeds I know exist were missing from the results.
Can anyone provide some directions, websites, books or advice based on experience on how to find RSS feeds without users having to submit them?