FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

robots.txt for path-info [message #26997] Wed, 24 August 2005 08:59
!alex   Germany
Messages: 23
Registered: February 2004
Location: germany
Karma: 0
Junior Member
Here's a suggestion for keeping Google & co. from indexing tons of pages with no real content: a robots.txt for the path-info theme.

User-agent: *
Disallow: /forum/index.php/r/reply_to/
Disallow: /forum/index.php/u/
Disallow: /forum/index.php/r/quote/
Disallow: /forum/index.php/mv/msg/
Disallow: /forum/index.php/m/
Disallow: /forum/index.php?
Disallow: /forum/index.php/sel/
Disallow: /forum/index.php/r/
Disallow: /forum/index.php/i/
Disallow: /forum/index.php/mn/tree/
Disallow: /forum/index.php/mv/tree/
Disallow: /forum/pdf.php
Disallow: /forum/index.php/pv/
Disallow: /forum/index.php/sp/
Disallow: /forum/index.php/rm/
Disallow: /forum/index.php/a/
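
A quick way to sanity-check a list like this is to run a few URLs through it before deploying. The sketch below is only an approximation: it uses plain prefix matching (the original robots exclusion convention) and ignores wildcards, Allow lines and per-agent groups, which real crawlers may treat differently. The sample paths, including the /t/ topic path, are hypothetical and should be replaced with URLs from your own forum.

```python
# Minimal sketch: which paths would the robots.txt above block?
# Assumption: simple prefix matching only, no wildcard or Allow support.

ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/index.php/r/reply_to/
Disallow: /forum/index.php/u/
Disallow: /forum/index.php/r/quote/
Disallow: /forum/index.php/mv/msg/
Disallow: /forum/index.php/m/
Disallow: /forum/index.php?
Disallow: /forum/index.php/sel/
Disallow: /forum/index.php/r/
Disallow: /forum/index.php/i/
Disallow: /forum/index.php/mn/tree/
Disallow: /forum/index.php/mv/tree/
Disallow: /forum/pdf.php
Disallow: /forum/index.php/pv/
Disallow: /forum/index.php/sp/
Disallow: /forum/index.php/rm/
Disallow: /forum/index.php/a/
"""


def disallowed_prefixes(robots_txt):
    """Return all non-empty Disallow values, in file order."""
    prefixes = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        value = value.strip()
        if field.strip().lower() == "disallow" and value:
            prefixes.append(value)
    return prefixes


def is_blocked(url_path, prefixes):
    """True if url_path starts with any of the disallowed prefixes."""
    return any(url_path.startswith(prefix) for prefix in prefixes)


PREFIXES = disallowed_prefixes(ROBOTS_TXT)

# Hypothetical sample paths, not taken from a real forum.
SAMPLES = [
    "/forum/index.php/t/123/",
    "/forum/index.php/m/456/",
    "/forum/index.php/u/7/",
    "/forum/index.php?t=msg&th=123",
    "/forum/pdf.php?th=123",
]

if __name__ == "__main__":
    for path in SAMPLES:
        state = "blocked" if is_blocked(path, PREFIXES) else "allowed"
        print(state, path)
```

If a path you want indexed shows up as blocked, the list is too broad; if a duplicate-content path shows up as allowed, a prefix is missing.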


I also use the same list in my htdig config file, like this:
exclude_urls:           /cgi-bin/ .cgi /forum/index.php/r/reply_to/ /forum/index.php/u/ /forum/index.php/r/quote/ /forum/index.php/mv/msg/ /forum/index.php/m/ /forum/index.php? /forum/index.php/sel/ /forum/index.php/r/ /forum/index.php/i/ /forum/index.php/mn/tree/ /forum/index.php/mv/tree/ /forum/pdf.php /forum/index.php/pv/ /forum/index.php/sp/ /forum/index.php/rm/ /forum/index.php/a/
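
To keep the two lists from drifting apart, the exclude_urls line can be generated from the robots.txt instead of being maintained by hand. This is just a sketch: the function name exclude_urls_line and the short sample input are made up for illustration, and it assumes that every Disallow value can be reused unchanged as an htdig pattern.

```python
# Minimal sketch: derive an htdig exclude_urls line from robots.txt text.
# Assumption: each Disallow value is usable as-is as an htdig pattern.

def exclude_urls_line(robots_txt, extra=("/cgi-bin/", ".cgi")):
    """Build one exclude_urls line: extra patterns first, then Disallow values."""
    patterns = list(extra)
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        value = value.strip()
        # Skip empty values and duplicates so the line stays short.
        if field.strip().lower() == "disallow" and value and value not in patterns:
            patterns.append(value)
    return "exclude_urls: " + " ".join(patterns)


# Shortened sample input; in practice, read the real robots.txt file.
SAMPLE = """\
User-agent: *
Disallow: /forum/index.php/u/
Disallow: /forum/pdf.php
"""

if __name__ == "__main__":
    print(exclude_urls_line(SAMPLE))
```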


It would be great to hear whether I missed something or should remove some of these directories from the disallow/exclusion list.

With htdig this works pretty well for preventing duplicate listings. For Google, I updated my robots.txt today because a big crawl was putting quite a lot of load on my server ...

Any experiences/suggestions welcome.

Peace,
Alex

[Updated on: Wed, 24 August 2005 09:00]


Re: robots.txt for path-info [message #31535 is a reply to message #26997] Thu, 04 May 2006 10:27
matthieu_phpmv   France
Messages: 44
Registered: November 2004
Karma: 0
Member
Here's mine, at phpmyvisites.net/robots.txt:
Quote:


User-agent: *
Disallow: /forums/pdf.php
Disallow: /forums/index.php/m
Disallow: /forums/index.php/sp/
Disallow: /forums/index.php/ef/
Disallow: /forums/index.php/mv/
Disallow: /forums/index.php/r/
Disallow: /forums/index.php/pmm/
Disallow: /forums/index.php/rm/
Disallow: /forums/index.php/sel/
Disallow: /forums/index.php/pv/
Disallow: /forums/index.php/ma/
Disallow: /forums/index.php/u/
Disallow: /forums/index.php/s/
Disallow: /forums/index.php/h/
Disallow: /forums/index.php/i/
Disallow: /forums/index.php/l/
Disallow: /forums/rdf.php


With this file, bots will ONLY spider the messages, which is what I want (no profiles, no PDF versions of messages, etc., only real, unique messages).
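
One thing worth checking in a list like this is whether a short prefix already covers longer ones. The sketch below assumes plain prefix matching (no wildcards); the helper name shadowed is made up for illustration. It reports every rule that is already covered by another rule in the same list.

```python
# Minimal sketch: find Disallow rules already covered by a shorter rule.
# Assumption: crawlers match Disallow values as plain path prefixes.

RULES = [
    "/forums/pdf.php",
    "/forums/index.php/m",
    "/forums/index.php/sp/",
    "/forums/index.php/ef/",
    "/forums/index.php/mv/",
    "/forums/index.php/r/",
    "/forums/index.php/pmm/",
    "/forums/index.php/rm/",
    "/forums/index.php/sel/",
    "/forums/index.php/pv/",
    "/forums/index.php/ma/",
    "/forums/index.php/u/",
    "/forums/index.php/s/",
    "/forums/index.php/h/",
    "/forums/index.php/i/",
    "/forums/index.php/l/",
    "/forums/rdf.php",
]


def shadowed(rules):
    """Map each redundant rule to the first other rule that is a prefix of it."""
    covered = {}
    for rule in rules:
        for other in rules:
            if other != rule and rule.startswith(other):
                covered[rule] = other
                break
    return covered


if __name__ == "__main__":
    for rule, by in shadowed(RULES).items():
        print(rule, "is already covered by", by)
```

Under that assumption, the entry /forums/index.php/m (no trailing slash) also covers the /mv/ and /ma/ entries, and it would block any other path-info section whose name begins with "m" as well. Whether that is intended depends on which sections you want crawled.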

[Updated on: Fri, 05 May 2006 08:09]

