FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Google sitemap of FUDforum [message #28343] Wed, 19 October 2005 02:40
Hurry   India
Messages: 33
Registered: October 2005
Karma: 0
Member
Hello! It would be really great if there were a way to generate a Google sitemap.xml file of our FUDforum's categories, topics and posts. My experience is that all the sitemap.xml files I have loaded at http://www.google.com/webmasters/sitemaps/ get spidered by Google very quickly and regularly. I hope there is some way to do it, or that Ilia may add this feature.
Re: Google sitemap of FUDforum [message #28348 is a reply to message #28343] Wed, 19 October 2005 09:16
Ilia   Canada
Messages: 13241
Registered: January 2002
Karma: 0
Senior Member
Administrator
Core Developer
Use FUDAPI to make this file, or just make it manually.

FUDforum Core Developer
Re: Google sitemap of FUDforum [message #41119 is a reply to message #28348] Sun, 25 May 2008 12:56
Deimos   United Kingdom
Messages: 1
Registered: May 2008
Karma: 0
Junior Member
Hello, can anyone explain how to do this, please, either manually or using FUDAPI?

I tried looking at FUDAPI but couldn't work out how to use it to generate a sitemap.xml file.

Thanks for your help
Re: Google sitemap of FUDforum [message #41155 is a reply to message #41119] Thu, 29 May 2008 10:48
rush2112   United States
Messages: 15
Registered: April 2005
Karma: 0
Junior Member
Here's a little file I wrote to do that:

<?php
include ("GLOBALS.php");

// Connect with mysqli; the old mysql_* functions were removed in PHP 7.
$dbh = mysqli_connect($DBHOST, $DBHOST_USER, $DBHOST_PASSWORD, $DBHOST_DBNAME)
	or die ('No Connect: ' . mysqli_connect_error());

// Latest post time per thread. Adjust the fud26_ prefix to match your install.
$query = "SELECT thread_id, MAX(post_stamp) AS last_post FROM fud26_msg GROUP BY thread_id";
$result = mysqli_query($dbh, $query) or die ("Cannot execute query: $query");

echo "Writing forum sitemap to the file<br><br>";
while ($row = mysqli_fetch_assoc($result)) {
	$thread_id = $row["thread_id"];
	$post_stamp = (int) $row["last_post"];
	// gmdate() formats in UTC, which matches the +00:00 offset written below.
	$post_time = gmdate("H:i:s", $post_stamp);
	$post_date = gmdate("Y-m-d", $post_stamp);
	$filetext = "<url><loc>" . $WWW_ROOT . "index.php/t/$thread_id/</loc>";
	$filetext .= "<lastmod>" . $post_date . "T" . $post_time . "+00:00</lastmod><changefreq>weekly</changefreq></url>\n";
	print $filetext;
}
?>


Just save the code into a php file and place it in your FUDforum directory.

Run the file from a browser, then view the 'Page Source'
You can copy (excluding the first line) and paste the info into your_forum_sitemap.xml file and Google should be happy.
(Make sure you have the site map protocol included at the top of your sitemap file - https://www.google.com/webmasters/tools/docs/en/protocol.html )

I use PATH_INFO style URLs so my output is something like: http://www.mysite.com/forum/index.php/t/3422/
If you don't use PATH_INFO, you will need to change this line:
$filetext = "<url><loc>" . $WWW_ROOT . "index.php/t/$thread_id/</loc>";

to something like this:
$filetext = "<url><loc>" . $WWW_ROOT . "index.php?t=msg&amp;th=" . $thread_id . "&amp;start=0</loc>";


I know it's a bit clunky and one of these days I'll get around to getting the script to write the sitemap file directly.

HTH,

Rush

[Updated on: Thu, 29 May 2008 10:55]
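
The script could also write the file itself instead of relying on copy and paste from the page source. What follows is only a minimal sketch of that idea, assuming the thread rows have already been fetched into an array. The function name build_sitemap_xml and the row keys thread_id and last_post are made up for illustration and are not part of FUDforum.

```php
<?php
// Hypothetical helper (not part of FUDforum): turn thread rows into a complete sitemap document.
function build_sitemap_xml(array $rows, string $wwwRoot): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($rows as $row) {
        // PATH_INFO style URL, escaped so it is safe inside XML.
        $loc = htmlspecialchars($wwwRoot . 'index.php/t/' . (int) $row['thread_id'] . '/', ENT_QUOTES | ENT_XML1, 'UTF-8');
        // gmdate() formats in UTC, so the +00:00 offset is accurate.
        $lastmod = gmdate('Y-m-d\TH:i:s', (int) $row['last_post']) . '+00:00';
        $xml .= "<url><loc>$loc</loc><lastmod>$lastmod</lastmod><changefreq>weekly</changefreq></url>\n";
    }
    return $xml . "</urlset>\n";
}

// Made-up example row; in practice the rows would come from the query above.
$rows = [['thread_id' => 3422, 'last_post' => 0]];
$sitemap = build_sitemap_xml($rows, 'http://www.mysite.com/forum/');
```

Passing $sitemap to file_put_contents() with a path inside the forum directory would then save the file in one step, provided the web server user may write there.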


Re: Google sitemap of FUDforum [message #157838 is a reply to message #41155] Tue, 02 September 2008 16:16
littleking   United States
Messages: 187
Registered: January 2007
Karma: 2
Senior Member
great code, works wonderfully
Re: Google sitemap of FUDforum [message #157862 is a reply to message #157838] Tue, 23 September 2008 19:59
rush2112   United States
Messages: 15
Registered: April 2005
Karma: 0
Junior Member
Thanks!

Glad it worked for ya...

Rush
Re: Google sitemap of FUDforum [message #161170 is a reply to message #157862] Sat, 21 November 2009 04:21
naudefj   South Africa
Messages: 3624
Registered: December 2004
Karma: 17
Senior Member
Administrator
Core Developer
I would appreciate it if you guys could test and improve this solution so that it can be added to the next release.
Re: Google sitemap of FUDforum [message #161265 is a reply to message #161170] Sun, 29 November 2009 15:54
naudefj   South Africa
Messages: 3624
Registered: December 2004
Karma: 17
Senior Member
Administrator
Core Developer
An improved Google sitemap generator was submitted. It can be downloaded from here
Re: Google sitemap of FUDforum [message #162690 is a reply to message #161265] Fri, 02 July 2010 02:36
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
This tool worked great. However, perhaps it should check permissions, so that it validates all the threads as "Guest" or "Registered user". Right now it grabs everything on the site, and for a site such as mine, where 90% of the content is "private" and shouldn't be indexed by Google (since it will just bump into error links), this script isn't overly helpful, I am afraid.

I could of course manually alter the SQL to list the forum IDs that guests have access to, but I think a real solution would be nice. It's above my head though, so I can't help with that one.


Re: Google sitemap of FUDforum [message #162694 is a reply to message #162690] Sat, 03 July 2010 04:22
naudefj   South Africa
Messages: 3624
Registered: December 2004
Karma: 17
Senior Member
Administrator
Core Developer
Will this help?
http://fudforum.svn.sourceforge.net/viewvc/fudforum?view=revision&revision=4975
Re: Google sitemap of FUDforum [message #162696 is a reply to message #162694] Sat, 03 July 2010 06:01
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Yes, if you change
// Limit topics to what the user has access to.
	if ($auth_as_user) {
		$join = 'INNER JOIN fud30_group_cache g1 ON g1.user_id=2147483647 AND g1.resource_id=f.id
				LEFT JOIN fud30_group_cache g2 ON g2.user_id='. $auth_as_user .' AND g2.resource_id=f.id
				LEFT JOIN fud30_mod mm ON mm.forum_id=t.forum_id AND mm.user_id='. $auth_as_user .' ';
		$lmt  = '(mm.id IS NOT NULL OR (COALESCE(g2.group_cache_opt, g1.group_cache_opt) & 2) > 0)';
	} else {
		$join = 'INNER JOIN fud30_group_cache g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
		$lmt  = '(g1.group_cache_opt & 2) > 0';
	}


to this:

// Limit topics to what the user has access to.
        if ($auth_as_user) {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=2147483647 AND g1.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g2 ON g2.user_id='. $auth_as_user .' AND g2.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'mod mm ON mm.forum_id=t.forum_id AND mm.user_id='. $auth_as_user .' ';
                $lmt  = '(mm.id IS NOT NULL OR (COALESCE(g2.group_cache_opt, g1.group_cache_opt) & 2) > 0)';
        } else {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
                $lmt  = '(g1.group_cache_opt & 2) > 0';
        }
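
The point of this change is that a hard-coded fud30_ only matches installs that happen to use that table prefix. As a rough sketch of the same idea, the prefix could be applied through one small helper. The name fud_table is invented here and is not a FUDforum function, and the literal prefix below merely stands in for $GLOBALS['DBHOST_TBL_PREFIX'].

```php
<?php
// Hypothetical helper (not part of FUDforum): prepend the configured table prefix.
function fud_table(string $name, string $prefix): string
{
    return $prefix . $name;
}

// Stand-in value; a real script would read $GLOBALS['DBHOST_TBL_PREFIX'] instead.
$prefix = 'fud30_';

// Same anonymous-user join as in the post above, built with the helper.
$join = 'INNER JOIN ' . fud_table('group_cache', $prefix) . ' g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
$lmt  = '(g1.group_cache_opt & 2) > 0';
```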


Re: Google sitemap of FUDforum [message #162697 is a reply to message #162696] Sat, 03 July 2010 06:17
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Oh yes, there is another slight oversight.
$filetext = "<url>\n";
                if ($FUD_OPT_2 & 32768) {       // USE_PATH_INFO
                        $filetext .= "\t<loc>${WWW_ROOT}index.php/t/${thread_id}/</loc>\n";
                } else {
                        $filetext .= "\t<loc>${WWW_ROOT}index.php?t=msg&amp;th=${thread_id}&amp;start=0</loc>\n";
                }

Should index.php really be hard-coded? Shouldn't it be replaced by ${ROOT} or something? Like below:

$filetext = "<url>\n";
                if ($FUD_OPT_2 & 32768) {       // USE_PATH_INFO
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}t/${thread_id}/${thread_title_SEO}/</loc>\n";
                } else {
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}?t=msg&amp;th=${thread_id}&amp;start=0</loc>\n";
                }




With my SEO tweak, the whole code now looks like this:

(I inner joined the msg table to get the thread subject, so I could strip out all special characters and lowercase it.)

Note: without tweaks to users.inc.t, threads whose subject starts with a number will be interpreted as "&start=20" (20 being the number) and the sitemap link won't work. I fixed this with an is_numeric check in users.inc.t. It would still break on a thread whose subject actually is a number, but I can live with that. Another fix could be to just start the SEO subject with a -.

PLEASE note that my str_replace code is UGLY and should be corrected by someone who is properly skilled with str_replace or regular expressions. I have no clue about that.
#!/usr/bin/php -q
<?php
/**
* copyright            : (C) 2001-2010 Advanced Internet Designs Inc.
* email                : forum(at)prohost(dot)org
* $Id$
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the
* Free Software Foundation; version 2 of the License.
**/

        /* Google sitemap settings. */
        $frequency    = 'weekly';
        $priority     = '0.5';
        $auth_as_user = 0;      // User 0 == anonymous.

        set_time_limit(0);
        ini_set('memory_limit', '128M');
        define('forum_debug', 1);
        unset($_SERVER['REMOTE_ADDR']);

        if (strncmp($_SERVER['argv'][0], '.', 1)) {
                require (dirname($_SERVER['argv'][0]) .'/GLOBALS.php');
        } else {
                require (getcwd() .'/GLOBALS.php');
        }

        fud_use('err.inc');
        fud_use('db.inc');

        // Limit topics to what the user has access to.
        if ($auth_as_user) {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=2147483647 AND g1.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g2 ON g2.user_id='. $auth_as_user .' AND g2.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'mod mm ON mm.forum_id=t.forum_id AND mm.user_id='. $auth_as_user .' ';
                $lmt  = '(mm.id IS NOT NULL OR (COALESCE(g2.group_cache_opt, g1.group_cache_opt) & 2) > 0)';
        } else {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
                $lmt  = '(g1.group_cache_opt & 2) > 0';
        }
        $c = uq('SELECT t.id, t.last_post_date, t.root_msg_id, m.id, m.subject FROM '. $GLOBALS['DBHOST_TBL_PREFIX'] .'thread t '. $join .'
                INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'msg m ON t.root_msg_id = m.id
                WHERE '. $lmt .' ORDER BY t.last_post_date DESC LIMIT 50000');

        echo "Writing sitemap.xml file to ${GLOBALS['WWW_ROOT_DISK']}\n";
        $fh = fopen($GLOBALS['WWW_ROOT_DISK'].'/sitemap.xml', 'w');
        $xmlhead = <<<EOF
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">\n
EOF;
        fwrite($fh, $xmlhead);

        while ($r = db_rowarr($c)) {
                $thread_id = $r[0];
                // $post_stamp = date('H:i:s', $r[1]) .'T'. date('Y-m-d', $r[1]);
                $post_stamp = date('H:i:s\TY-m-d', $r[1]);

                $thread_title_SEO = str_replace(" ","-",$r[4]);
                $thread_title_SEO = strtolower($thread_title_SEO);
                $thread_title_SEO = preg_replace('/[^a-z0-9_]/i', '-', $thread_title_SEO);
                $thread_title_SEO = preg_replace('/_[_]*/i', '-', $thread_title_SEO);
                $thread_title_SEO = str_replace('---', '-', $thread_title_SEO);
                $thread_title_SEO = str_replace('--', '-', $thread_title_SEO);
                $thread_title_SEO = str_replace('-s-', 's-', $thread_title_SEO);
                $thread_title_SEO = str_replace("%","",$thread_title_SEO);

                $filetext = "<url>\n";
                if ($FUD_OPT_2 & 32768) {       // USE_PATH_INFO
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}t/${thread_id}/${thread_title_SEO}/</loc>\n";
                } else {
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}?t=msg&amp;th=${thread_id}&amp;start=0</loc>\n";
                }
                $filetext .= "\t<lastmod>${post_stamp}+00:00</lastmod>\n";
                $filetext .= "\t<changefreq>$frequency</changefreq>\n";
                $filetext .= "\t<priority>$priority</priority>\n";
                $filetext .= "</url>\n";
                fwrite($fh, $filetext);
        }

        fwrite($fh, "</urlset>\n");
        fclose($fh);

        $google = 'www.google.com';
        echo "Notify $google...";
        if($fp = @fsockopen($google, 80)) {
                $req = "GET /webmasters/sitemaps/ping?sitemap=". urlencode($GLOBALS['WWW_ROOT'].'sitemap.xml') ." HTTP/1.1\r\n".
                       "Host: $google\r\n".
                       "User-Agent: FUDforum $FORUM_VERSION\r\n".
                       "Connection: Close\r\n\r\n";
                fwrite($fp, $req);
                while(!feof($fp)) {
                        if( @preg_match('~^HTTP/\d\.\d (\d+)~i', fgets($fp, 128), $m) ) {
                                echo ' status: '. intval($m[1]) ."\n";
                                break;
                        }
                }
                fclose($fp);
        }

        echo "Done!\n";
?>
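
The chain of str_replace and preg_replace calls in the script above could probably be collapsed into one small function. The sketch below is one possible take, not the code that ended up in FUDforum: the name seo_slug is made up, it assumes subjects are plain text, and any non-ASCII characters simply become hyphens. It also applies the leading "-" idea mentioned in the note, so that a slug starting with a digit is not read as a start offset.

```php
<?php
// Hypothetical helper (not part of FUDforum): build a URL-friendly slug from a thread subject.
function seo_slug(string $subject): string
{
    $slug = strtolower($subject);
    // Drop apostrophes so "rush's" becomes "rushs" rather than "rush-s".
    $slug = str_replace("'", '', $slug);
    // Replace every run of characters other than a-z and 0-9 with a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Remove hyphens left over at either end.
    $slug = trim($slug, '-');
    // Prefix slugs that begin with a digit so they are not mistaken for a numeric offset.
    if (preg_match('/^[0-9]/', $slug)) {
        $slug = '-' . $slug;
    }
    return $slug;
}

echo seo_slug('Google sitemap of FUDforum'), "\n"; // prints google-sitemap-of-fudforum
```

Because the character class excludes everything except lowercase letters and digits, underscores, percent signs and repeated hyphens are all handled by the single preg_replace call.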



Re: Google sitemap of FUDforum [message #162698 is a reply to message #162697] Sun, 04 July 2010 04:58
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Just noticed something else:
This line:
$post_stamp = date('H:i:s\TY-m-d', $r[1]);

Should most likely look like this:
$post_stamp = date('TY-m-d\H:i:s', $r[1]);


Re: Google sitemap of FUDforum [message #162699 is a reply to message #162698] Sun, 04 July 2010 05:12
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Also perhaps rip out the T to make it valid

Re: Google sitemap of FUDforum [message #162700 is a reply to message #162699] Sun, 04 July 2010 13:19
naudefj   South Africa
Messages: 3624
Registered: December 2004
Karma: 17
Senior Member
Administrator
Core Developer
The time and T are all valid according to http://www.w3.org/TR/NOTE-datetime
Should probably be date('Y-m-d\TH:i:s', $r[1]);
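
For reference, the format agreed on above can be wrapped in a small function. This is only a sketch: sitemap_lastmod is a made-up name, and it swaps in gmdate() for the date() call used in the thread, on the assumption that the timestamp should be reported in UTC to match the "+00:00" suffix the script appends.

```php
<?php
// Hypothetical helper: format a Unix timestamp as a W3C datetime for a sitemap <lastmod> element.
function sitemap_lastmod(int $timestamp): string
{
    // Date first, then a literal T (escaped so it is not read as the timezone
    // abbreviation), then the time. gmdate() formats in UTC, so +00:00 is accurate.
    return gmdate('Y-m-d\TH:i:s', $timestamp) . '+00:00';
}

echo sitemap_lastmod(0), "\n"; // prints 1970-01-01T00:00:00+00:00
```

With plain date(), the output depends on the server's configured timezone, so a fixed "+00:00" suffix would only be right on servers running in UTC.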
Re: Google sitemap of FUDforum [message #162701 is a reply to message #162700] Sun, 04 July 2010 13:48
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Yeah, just like that, sorry, I was tired, at least we got it right now!

Re: Google sitemap of FUDforum [message #162959 is a reply to message #162701] Thu, 09 September 2010 14:29
slowmo   Singapore
Messages: 22
Registered: July 2009
Karma: 0
Junior Member
Will this work with 3.0.1? What does the final script look like now? Thanks, this looks useful. If I can get the final script, I can cron it ^^
Re: Google sitemap of FUDforum [message #162961 is a reply to message #162959] Thu, 09 September 2010 15:28
naudefj   South Africa
Messages: 3624
Registered: December 2004
Karma: 17
Senior Member
Administrator
Core Developer
Try this version (not the latest, but the latest that should still work with 3.0.1):

http://fudforum.svn.sourceforge.net/viewvc/fudforum/trunk/install/forum_data/scripts/sitemap.php?revision=4984