robots.txt for path-info [message #26997]
Wed, 24 August 2005 12:59
!alex
Messages: 23 Registered: February 2004 Location: germany
Here's a suggestion for keeping Google & co. from indexing tons of pages with no real content, using a robots.txt for the path-info theme:
User-agent: *
Disallow: /forum/index.php/r/reply_to/
Disallow: /forum/index.php/u/
Disallow: /forum/index.php/r/quote/
Disallow: /forum/index.php/mv/msg/
Disallow: /forum/index.php/m/
Disallow: /forum/index.php?
Disallow: /forum/index.php/sel/
Disallow: /forum/index.php/r/
Disallow: /forum/index.php/i/
Disallow: /forum/index.php/mn/tree/
Disallow: /forum/index.php/mv/tree/
Disallow: /forum/pdf.php
Disallow: /forum/index.php/pv/
Disallow: /forum/index.php/sp/
Disallow: /forum/index.php/rm/
Disallow: /forum/index.php/a/
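To check a list like this before putting it live, here is a minimal Python sketch. It assumes plain prefix matching on path plus query string, which is how the original robots.txt convention works; individual crawlers may behave differently. The example.com URLs and the `/t/123/` thread path are made up for illustration, and `is_disallowed` and `redundant_rules` are my own helper names.

```python
from urllib.parse import urlsplit

# Disallow prefixes copied from the robots.txt above.
DISALLOW = [
    "/forum/index.php/r/reply_to/",
    "/forum/index.php/u/",
    "/forum/index.php/r/quote/",
    "/forum/index.php/mv/msg/",
    "/forum/index.php/m/",
    "/forum/index.php?",
    "/forum/index.php/sel/",
    "/forum/index.php/r/",
    "/forum/index.php/i/",
    "/forum/index.php/mn/tree/",
    "/forum/index.php/mv/tree/",
    "/forum/pdf.php",
    "/forum/index.php/pv/",
    "/forum/index.php/sp/",
    "/forum/index.php/rm/",
    "/forum/index.php/a/",
]


def is_disallowed(url, rules=DISALLOW):
    """Return True if the URL's path (plus query, if any) starts with a rule.

    Assumption: simple prefix matching, no wildcards and no Allow lines.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(target.startswith(rule) for rule in rules)


def redundant_rules(rules=DISALLOW):
    """Return rules that are already covered by another, shorter prefix."""
    return [r for r in rules if any(o != r and r.startswith(o) for o in rules)]


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for url in (
        "http://example.com/forum/index.php/t/123/",
        "http://example.com/forum/index.php/u/42/",
        "http://example.com/forum/index.php?t=msg&th=1",
    ):
        print(url, "->", "blocked" if is_disallowed(url) else "allowed")
    print("redundant:", redundant_rules())
```

Under that prefix assumption, `redundant_rules()` reports the `/r/reply_to/` and `/r/quote/` lines, since `/forum/index.php/r/` already covers both. They do no harm, but they could be dropped.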
I also use the same list in my htdig config file, like this:
exclude_urls: /cgi-bin/ .cgi /forum/index.php/r/reply_to/ /forum/index.php/u/ /forum/index.php/r/quote/ /forum/index.php/mv/msg/ /forum/index.php/m/ /forum/index.php? /forum/index.php/sel/ /forum/index.php/r/ /forum/index.php/i/ /forum/index.php/mn/tree/ /forum/index.php/mv/tree/ /forum/pdf.php /forum/index.php/pv/ /forum/index.php/sp/ /forum/index.php/rm/ /forum/index.php/a/
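Since the htdig line repeats the robots.txt entries one for one, both can be generated from a single source so they don't drift apart. A rough Python sketch follows; `disallow_paths` and `htdig_exclude_line` are hypothetical helper names, and the robots.txt text is pasted inline only to keep the example self-contained. As far as I know, htdig treats each `exclude_urls` pattern as a substring of the URL, so the same strings can be reused unchanged.

```python
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/index.php/r/reply_to/
Disallow: /forum/index.php/u/
Disallow: /forum/index.php/r/quote/
Disallow: /forum/index.php/mv/msg/
Disallow: /forum/index.php/m/
Disallow: /forum/index.php?
Disallow: /forum/index.php/sel/
Disallow: /forum/index.php/r/
Disallow: /forum/index.php/i/
Disallow: /forum/index.php/mn/tree/
Disallow: /forum/index.php/mv/tree/
Disallow: /forum/pdf.php
Disallow: /forum/index.php/pv/
Disallow: /forum/index.php/sp/
Disallow: /forum/index.php/rm/
Disallow: /forum/index.php/a/
"""


def disallow_paths(robots_txt):
    """Collect the values of all non-empty Disallow lines, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def htdig_exclude_line(paths, extra=("/cgi-bin/", ".cgi")):
    """Build an exclude_urls line: extra patterns first, then the paths."""
    return "exclude_urls: " + " ".join([*extra, *paths])


if __name__ == "__main__":
    print(htdig_exclude_line(disallow_paths(ROBOTS_TXT)))
```

Run as a script, this prints one `exclude_urls:` line that can be pasted into the htdig config. The two `extra` patterns are the ones from my own config and are passed in separately because they are not in robots.txt.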
It would be great to hear if I missed something, or if I should remove some of these directories from the disallow/exclusion list.
With htdig this works pretty well for preventing duplicate listings. For Google, I updated my robots.txt today because a big crawl was putting quite a lot of load on my server ...
Any experiences or suggestions are welcome.
Peace,
Alex
[Updated on: Wed, 24 August 2005 13:00]