We have a weird scenario that I would like opinions on. A site for thisdomain.com is being built on Pantheon, using the staging / dev URL thisdomain.pantheon.io *.
The Pantheon dev platform injects a robots.txt file to keep the dev site from being indexed by Google: User-agent: * Disallow: /
Experienced SEOs know that this is not enough to keep something out of Google. At some point during development, a writer accidentally linked to a page on the dev site from the production site, which caused Google to index the thisdomain.pantheon.io domain.
Result: thisdomain.pantheon.io is now stuck in the index and is pushing the production site down to position #23 in Google, even for its own brand query. SEO guy is a sad SEO guy.
We are verified in GSC ** on both dev and production.
The normal advice would be:
Add noindex directives to the page, fetch and wait.
Password-protect the page (403), fetch and wait.
Temporarily redirect the page to production (301), fetch and wait.
Of course, none of these solutions work: robots.txt blocks Googlebot from crawling, so it never sees these 403/301/404/etc. responses, and the page stays in the index. With the robots.txt file "injected" by Pantheon, we are SOL.
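To illustrate why the fixes above never reach the crawler, here is a minimal sketch using Python's standard urllib.robotparser. It only models how a well-behaved crawler reads the injected robots.txt; it is not how Googlebot is actually implemented, and the helper name can_crawl and the path /some-page are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt that Pantheon injects on the dev site (as quoted in the question).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url.

    can_crawl is a hypothetical helper for this illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # /some-page is a placeholder path, not a real page on the site.
    for path in ("/", "/some-page"):
        url = "https://thisdomain.pantheon.io" + path
        print(url, "crawlable:", can_crawl("Googlebot", url))
```

If the crawler is not allowed to request a URL, it never receives the response for it, so a noindex tag, a 403, or a 301 on that URL is invisible to it.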
Do you have any idea of how we could force this out of the index?
* For those who do not use Pantheon: there is no way to change "thisdomain" in the staging URL to something else. We have no control over the robots.txt file and we cannot delete it.
** If your idea is the URL removal tool: it would give us a short-term way to hide the thisdomain.pantheon.io site. However, that would only mask the problem temporarily, and I have recommended against it for now. The removal tool will not work on a 401.