The smart solution is to generate two sitemaps. The first is for the benefit of your competitors, the second for the benefit of your favorite search engines. In military language, this first map of the site is a feint.
The 'feint' contains the basic structure of your website: your home page, contact us, about us, the main categories. It looks like the real thing and will work fine in the obscure search engines you do not care about. It will also be of no use to your competitors. Let it be indexed so that anyone looking finds it, and give it an obvious name such as sitemap.xml.
Now generate your real sitemap with code. Give it a name such as "product-information-sitemap.xml", so that it is a plausible name, but make it no easier to guess than your password.
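One way to get a name that looks reasonable but is hard to guess is to append a random token. A minimal sketch; the `secret_sitemap_name` helper and the "product-information" prefix are just illustrative assumptions:

```python
import secrets

# Hypothetical helper: build a plausible-looking but hard-to-guess
# sitemap filename by appending a random hex token to a prefix.
def secret_sitemap_name(prefix: str = "product-information") -> str:
    token = secrets.token_hex(8)  # 16 hex characters of randomness
    return f"{prefix}-{token}-sitemap.xml"

name = secret_sitemap_name()
print(name)  # e.g. product-information-3f9c2a1b4d5e6f70-sitemap.xml
```

Generate the name once and keep it in your deployment config; regenerating it on every build would break the URL you submit to Google.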
In your Apache configuration for the sitemap folder, add a header so that search engines can fetch this second sitemap but will not index it:
Header set X-Robots-Tag "noindex"
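A minimal sketch of where that directive could live, assuming the real sitemap is named product-information-sitemap.xml and that mod_headers is enabled; scoping it to that one file keeps the decoy sitemap.xml indexable:

```
<Files "product-information-sitemap.xml">
    Header set X-Robots-Tag "noindex"
</Files>
```

The header stops the sitemap file itself from appearing in search results; it does not stop Googlebot from reading it and crawling the URLs it lists.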
Now write the code to keep it up to date, and consider a third sitemap for your images. Downgrade the real sitemap if necessary to create the feint. Pay attention to the timestamps too: Google does, and this matters if your sitemap is large.
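The "downgrade" step can be as simple as generating both files from the same page list and filtering out the valuable entries for the feint. A minimal sketch; the page list, lastmod dates, and the `public` flag are assumptions standing in for your real catalogue data:

```python
from xml.sax.saxutils import escape

# Assumed page data; in practice this would come from your database.
PAGES = [
    {"loc": "https://www.example.com/", "lastmod": "2024-01-02", "public": True},
    {"loc": "https://www.example.com/about-us", "lastmod": "2024-01-02", "public": True},
    {"loc": "https://www.example.com/products/widget-x", "lastmod": "2024-01-05", "public": False},
]

def build_sitemap(pages):
    """Render a list of pages as sitemap XML, with lastmod timestamps."""
    urls = "\n".join(
        "  <url><loc>{}</loc><lastmod>{}</lastmod></url>".format(
            escape(p["loc"]), p["lastmod"]
        )
        for p in pages
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{urls}\n</urlset>\n"
    )

real = build_sitemap(PAGES)                               # full product sitemap
decoy = build_sitemap([p for p in PAGES if p["public"]])  # the feint
```

Running this on each catalogue change keeps the lastmod values honest, which is exactly what Google looks at when deciding what to recrawl in a large sitemap.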
Now create a cron job to regularly submit your product sitemap to Google. Add something like this to your crontab to submit your real sitemap every week:
0 0 * * 0 wget -q -O /dev/null "https://www.google.com/webmasters/tools/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemaps%2Fproduct-information-sitemap.xml"
Note that the sitemap address in the query string is URL-encoded.
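If you generate that crontab line from a script, the encoding can be produced rather than typed by hand. A minimal sketch using the standard library; the sitemap URL is the same example as above:

```python
from urllib.parse import quote

sitemap = "http://www.example.com/sitemaps/product-information-sitemap.xml"
# safe="" forces ":" and "/" to be encoded too, matching the crontab entry
encoded = quote(sitemap, safe="")
ping = "https://www.google.com/webmasters/tools/ping?sitemap=" + encoded
print(ping)
```

`quote` with the default `safe="/"` would leave the slashes alone; passing `safe=""` gives the fully encoded form that belongs in a query-string parameter.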
You can also gzip your sitemap if size is a problem; search engines accept compressed sitemaps, but check that your web server actually serves the gzipped file with the right headers.
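Writing the compressed file is a one-liner with the standard library. A minimal sketch, assuming the sitemap XML has already been built as a string:

```python
import gzip

# Assumed to be the already-generated sitemap XML.
sitemap_xml = '<?xml version="1.0" encoding="UTF-8"?>\n<urlset></urlset>\n'

# Write the compressed sitemap next to (or instead of) the plain one.
with gzip.open("product-information-sitemap.xml.gz", "wt", encoding="utf-8") as f:
    f.write(sitemap_xml)
```

Submit the .gz URL in the ping if that is the file you publish; the X-Robots-Tag header above must then cover that filename as well.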
Your robots.txt file does not need to be special, but it must not block access to your sitemaps. There is really no need to serve different robots.txt files based on user-agent strings or any other complexity. Simply put your valuable content in an unannounced extra file and submit it to Google from a cron job (rather than waiting for the bot to find it). Simple.
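A minimal sketch of such a robots.txt, assuming the decoy sits at the site root; it advertises only the feint and never mentions the real file:

```
User-agent: *
Disallow:

# Advertise only the decoy; the real sitemap is submitted by cron.
# Do not Disallow /sitemaps/ - Google must still be able to fetch the real file.
Sitemap: http://www.example.com/sitemap.xml
```

The Sitemap line is what casual scrapers and competitors will read; the real sitemap's obscure URL stays known only to you and the engines you ping.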