FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Google sitemap of FUDforum [message #28343] Wed, 19 October 2005 06:40
Hurry   India
Messages: 33
Registered: October 2005
Karma: 0
Member
Hello! It would be really great if there were a way to make a Google sitemap.xml file of our FUDforum's categories, topics and posts. In my experience, all the sitemap.xml files I have loaded at http://www.google.com/webmasters/sitemaps/ get spidered by Google very quickly and regularly. I hope there is some way to do it, or that Ilia may add this feature.
Re: Google sitemap of FUDforum [message #28348 is a reply to message #28343] Wed, 19 October 2005 13:16
Ilia   Canada
Messages: 13241
Registered: January 2002
Karma: 0
Senior Member
Administrator
Core Developer
Use FUDAPI to make this file, or just make it manually.

FUDforum Core Developer
Re: Google sitemap of FUDforum [message #41119 is a reply to message #28348] Sun, 25 May 2008 16:56
Deimos   United Kingdom
Messages: 1
Registered: May 2008
Karma: 0
Junior Member
Hello, can anyone explain how to do this, please, either manually or using FUDAPI?

I looked at FUDAPI but couldn't work out how to use it to generate a sitemap.xml file.

Thanks for your help
Re: Google sitemap of FUDforum [message #41155 is a reply to message #41119] Thu, 29 May 2008 14:48
rush2112   United States
Messages: 15
Registered: April 2005
Karma: 0
Junior Member
Here's a little file I wrote to do that:

<?php
// GLOBALS.php supplies $DBHOST, $DBHOST_USER, $DBHOST_PASSWORD, $DBHOST_DBNAME and $WWW_ROOT.
include ("GLOBALS.php");

// mysqli is used here because the old mysql_* functions were removed in PHP 7.
$dbh = mysqli_connect($DBHOST, $DBHOST_USER, $DBHOST_PASSWORD, $DBHOST_DBNAME) or die ('No Connect: ' . mysqli_connect_error());
// Latest post time per thread. Adjust the fud26_ prefix to match your install.
$query = "SELECT thread_id, MAX(post_stamp) AS last_post FROM fud26_msg GROUP BY thread_id";
$result = mysqli_query($dbh, $query) or die ("Cannot execute query: $query");

echo "Writing forum sitemap to the file<br><br>";
while( $row = mysqli_fetch_assoc($result) ) {
$thread_id = $row["thread_id"];
// gmdate() formats the time in UTC, which matches the fixed +00:00 offset below.
$post_stamp = gmdate('Y-m-d\TH:i:s', (int) $row["last_post"]);
$filetext = "<url><loc>" . $WWW_ROOT . "index.php/t/$thread_id/</loc>";
$filetext .= "<lastmod>" . $post_stamp . "+00:00</lastmod><changefreq>weekly</changefreq></url>\n";
print $filetext;
}
?>


Just save the code into a PHP file and place it in your FUDforum directory.

Run the file from a browser, then view the 'Page Source'.
Copy everything except the first line, paste it into your_forum_sitemap.xml, and Google should be happy.
(Make sure you have the sitemap protocol header included at the top of your sitemap file - https://www.google.com/webmasters/tools/docs/en/protocol.html )

I use PATH_INFO style URLs so my output is something like: http://www.mysite.com/forum/index.php/t/3422/
If you don't use PATH_INFO, you will need to change this line:
$filetext = "<url><loc>" . $WWW_ROOT . "index.php/t/$thread_id/</loc>";


to something like this:
$filetext = "<url><loc>" . $WWW_ROOT . "index.php?t=msg&amp;th=" . $thread_id . "&amp;start=0</loc>";


I know it's a bit clunky and one of these days I'll get around to getting the script to write the sitemap file directly.

HTH,

Rush

[Updated on: Thu, 29 May 2008 14:55]
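
A minimal sketch of what writing the sitemap file directly might look like, so that the copy-and-paste step is not needed. The function name build_sitemap, the sample row and the output path are assumptions made for illustration; they are not part of FUDforum or of Rush's script. It uses PATH_INFO style URLs and wraps the entries in the urlset element that the sitemap protocol requires.

```php
<?php
// Hypothetical helper: build a complete sitemap document from thread rows.
// Each row is ['thread_id' => int, 'last_post' => Unix timestamp].
function build_sitemap(string $wwwRoot, array $rows, string $changefreq = 'weekly'): string
{
    $xml  = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    $xml .= "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n";
    foreach ($rows as $row) {
        $loc = $wwwRoot . 'index.php/t/' . (int) $row['thread_id'] . '/';
        // Escape the URL for XML and format the timestamp in UTC.
        $xml .= '<url><loc>' . htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '</loc>'
              . '<lastmod>' . gmdate('Y-m-d\TH:i:s', (int) $row['last_post']) . '+00:00</lastmod>'
              . '<changefreq>' . $changefreq . "</changefreq></url>\n";
    }
    return $xml . "</urlset>\n";
}

// Sample data standing in for the database result (assumed values).
$rows = [['thread_id' => 3422, 'last_post' => 0]];
$sitemap = build_sitemap('http://www.mysite.com/forum/', $rows);

// Assumed output location; a real script would write into the web root.
file_put_contents(sys_get_temp_dir() . '/your_forum_sitemap.xml', $sitemap);
```

In the script above, the rows would come from the database query instead of the hard-coded array.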


Re: Google sitemap of FUDforum [message #157838 is a reply to message #41155] Tue, 02 September 2008 20:16
littleking   United States
Messages: 187
Registered: January 2007
Karma: 2
Senior Member
great code, works wonderfully
Re: Google sitemap of FUDforum [message #157862 is a reply to message #157838] Tue, 23 September 2008 23:59
rush2112   United States
Messages: 15
Registered: April 2005
Karma: 0
Junior Member
Thanks!

Glad it worked for ya...

Rush
Re: Google sitemap of FUDforum [message #161170 is a reply to message #157862] Sat, 21 November 2009 09:21
naudefj   South Africa
Messages: 3771
Registered: December 2004
Karma: 28
Senior Member
Administrator
Core Developer
I would appreciate it if you guys could test and improve this solution so that it can be added to the next release.
Re: Google sitemap of FUDforum [message #161265 is a reply to message #161170] Sun, 29 November 2009 20:54
naudefj   South Africa
Messages: 3771
Registered: December 2004
Karma: 28
Senior Member
Administrator
Core Developer
An improved Google sitemap generator was submitted. It can be downloaded from here
Re: Google sitemap of FUDforum [message #162690 is a reply to message #161265] Fri, 02 July 2010 06:36
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
This tool worked great. However, perhaps it should check permissions, so that it validates all the threads as "Guest" or "Registered user". Right now it grabs everything on the site. On a site such as mine, where 90% of the content is private and shouldn't be indexed by Google (since it will just bump into error links), this script isn't overly helpful, I am afraid.

I could of course manually alter the SQL to fit the forum IDs that guests have access to, but I think a real solution would be nice. It's above my head though, so I can't help with that one.


Re: Google sitemap of FUDforum [message #162694 is a reply to message #162690] Sat, 03 July 2010 08:22
naudefj   South Africa
Messages: 3771
Registered: December 2004
Karma: 28
Senior Member
Administrator
Core Developer
Will this help?
http://fudforum.svn.sourceforge.net/viewvc/fudforum?view=revision&revision=4975
Re: Google sitemap of FUDforum [message #162696 is a reply to message #162694] Sat, 03 July 2010 10:01
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Yes, if you change
// Limit topics to what the user has access to.
	if ($auth_as_user) {
		$join = 'INNER JOIN fud30_group_cache g1 ON g1.user_id=2147483647 AND g1.resource_id=f.id
				LEFT JOIN fud30_group_cache g2 ON g2.user_id='. $auth_as_user .' AND g2.resource_id=f.id
				LEFT JOIN fud30_mod mm ON mm.forum_id=t.forum_id AND mm.user_id='. $auth_as_user .' ';
		$lmt  = '(mm.id IS NOT NULL OR (COALESCE(g2.group_cache_opt, g1.group_cache_opt) & 2) > 0)';
	} else {
		$join = 'INNER JOIN fud30_group_cache g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
		$lmt  = '(g1.group_cache_opt & 2) > 0';
	}


to this:

// Limit topics to what the user has access to.
        if ($auth_as_user) {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=2147483647 AND g1.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g2 ON g2.user_id='. $auth_as_user .' AND g2.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'mod mm ON mm.forum_id=t.forum_id AND mm.user_id='. $auth_as_user .' ';
                $lmt  = '(mm.id IS NOT NULL OR (COALESCE(g2.group_cache_opt, g1.group_cache_opt) & 2) > 0)';
        } else {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
                $lmt  = '(g1.group_cache_opt & 2) > 0';
        }


Re: Google sitemap of FUDforum [message #162697 is a reply to message #162696] Sat, 03 July 2010 10:17
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Oh yes, there is another slight oversight.
$filetext = "<url>\n";
                if ($FUD_OPT_2 & 32768) {       // USE_PATH_INFO
                        $filetext .= "\t<loc>${WWW_ROOT}index.php/t/${thread_id}/</loc>\n";
                } else {
                        $filetext .= "\t<loc>${WWW_ROOT}index.php?t=msg&amp;th=${thread_id}&amp;start=0</loc>\n";
                }

Should index.php really be hard-coded? Shouldn't it be replaced by ${ROOT} or something? Like below:

$filetext = "<url>\n";
                if ($FUD_OPT_2 & 32768) {       // USE_PATH_INFO
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}t/${thread_id}/${thread_title_SEO}/</loc>\n";
                } else {
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}?t=msg&amp;th=${thread_id}&amp;start=0</loc>\n";
                }




With my SEO tweak the whole code looks like this now:

(I inner joined the msg table to get the thread subject, so I could strip out the special characters and lowercase it.)

Note: without tweaks to users.inc.t, threads whose subjects start with a number will be interpreted as "&start=20" (where 20 is the number) and the sitemap link won't work. I fixed this with an is_numeric check in users.inc.t. It would still break on a thread whose subject actually is a number, but I can live with that. Another fix could be to just start the SEO subject with a "-".

PLEASE note that my str_replace code is UGLY and should be corrected by someone who is properly skilled with str_replace or regular expressions. I have no clue about that.
#!/usr/bin/php -q
<?php
/**
* copyright            : (C) 2001-2010 Advanced Internet Designs Inc.
* email                : forum(at)prohost(dot)org
* $Id$
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the
* Free Software Foundation; version 2 of the License.
**/

        /* Google sitemap settings. */
        $frequency    = 'weekly';
        $priority     = '0.5';
        $auth_as_user = 0;      // User 0 == anonymous.

        set_time_limit(0);
        ini_set('memory_limit', '128M');
        define('forum_debug', 1);
        unset($_SERVER['REMOTE_ADDR']);

        if (strncmp($_SERVER['argv'][0], '.', 1)) {
                require (dirname($_SERVER['argv'][0]) .'/GLOBALS.php');
        } else {
                require (getcwd() .'/GLOBALS.php');
        }

        fud_use('err.inc');
        fud_use('db.inc');

        // Limit topics to what the user has access to.
        if ($auth_as_user) {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=2147483647 AND g1.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g2 ON g2.user_id='. $auth_as_user .' AND g2.resource_id=f.id
                                LEFT JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'mod mm ON mm.forum_id=t.forum_id AND mm.user_id='. $auth_as_user .' ';
                $lmt  = '(mm.id IS NOT NULL OR (COALESCE(g2.group_cache_opt, g1.group_cache_opt) & 2) > 0)';
        } else {
                $join = 'INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'group_cache g1 ON g1.user_id=0 AND g1.resource_id=t.forum_id ';
                $lmt  = '(g1.group_cache_opt & 2) > 0';
        }
        $c = uq('SELECT t.id, t.last_post_date, t.root_msg_id, m.id, m.subject FROM '. $GLOBALS['DBHOST_TBL_PREFIX'] .'thread t '. $join .'
                INNER JOIN '. $GLOBALS['DBHOST_TBL_PREFIX'] .'msg m ON t.root_msg_id = m.id
                WHERE '. $lmt .' ORDER BY t.last_post_date DESC LIMIT 50000');

        echo "Writing sitemap.xml file to ${GLOBALS['WWW_ROOT_DISK']}\n";
        $fh = fopen($GLOBALS['WWW_ROOT_DISK'].'/sitemap.xml', 'w');
        $xmlhead = <<<EOF
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">\n
EOF;
        fwrite($fh, $xmlhead);

        while ($r = db_rowarr($c)) {
                $thread_id = $r[0];
                // $post_stamp = date('H:i:s', $r[1]) .'T'. date('Y-m-d', $r[1]);
                $post_stamp = date('H:i:s\TY-m-d', $r[1]);

                $thread_title_SEO = str_replace(" ","-",$r[4]);
                $thread_title_SEO = strtolower($thread_title_SEO);
                $thread_title_SEO = preg_replace('/[^a-z0-9_]/i', '-', $thread_title_SEO);
                $thread_title_SEO = preg_replace('/_[_]*/i', '-', $thread_title_SEO);
                $thread_title_SEO = str_replace('---', '-', $thread_title_SEO);
                $thread_title_SEO = str_replace('--', '-', $thread_title_SEO);
                $thread_title_SEO = str_replace('-s-', 's-', $thread_title_SEO);
                $thread_title_SEO = str_replace("%","",$thread_title_SEO);

                $filetext = "<url>\n";
                if ($FUD_OPT_2 & 32768) {       // USE_PATH_INFO
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}t/${thread_id}/${thread_title_SEO}/</loc>\n";
                } else {
                        $filetext .= "\t<loc>${WWW_ROOT}${ROOT}?t=msg&amp;th=${thread_id}&amp;start=0</loc>\n";
                }
                $filetext .= "\t<lastmod>${post_stamp}+00:00</lastmod>\n";
                $filetext .= "\t<changefreq>$frequency</changefreq>\n";
                $filetext .= "\t<priority>$priority</priority>\n";
                $filetext .= "</url>\n";
                fwrite($fh, $filetext);
        }

        fwrite($fh, "</urlset>\n");
        fclose($fh);

        $google = 'www.google.com';
        echo "Notify $google...";
        if($fp = @fsockopen($google, 80)) {
                $req = "GET /webmasters/sitemaps/ping?sitemap=". urlencode($GLOBALS['WWW_ROOT'].'sitemap.xml') ." HTTP/1.1\r\n".
                       "Host: $google\r\n".
                       "User-Agent: FUDforum $FORUM_VERSION\r\n".
                       "Connection: Close\r\n\r\n";
                fwrite($fp, $req);
                while(!feof($fp)) {
                        if( @preg_match('~^HTTP/\d\.\d (\d+)~i', fgets($fp, 128), $m) ) {
                                echo ' status: '. intval($m[1]) ."\n";
                                break;
                        }
                }
                fclose($fp);
        }

        echo "Done!\n";
?>
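
On the str_replace chain in the script above: one possible way to tidy it, sketched here as a suggestion only, is a single regular expression that collapses every run of unwanted characters into one hyphen. The function name seo_slug is hypothetical and not part of FUDforum. This sketch does not reproduce the '-s-' to 's-' replacement, does not address subjects that start with a number, and turns non-ASCII letters into hyphens.

```php
<?php
// Hypothetical helper, not part of FUDforum: turn a thread subject into a URL slug.
function seo_slug(string $subject): string
{
    $slug = strtolower($subject);
    // Collapse every run of characters other than a-z and 0-9 into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Drop any hyphen left at the start or the end.
    return trim($slug, '-');
}

echo seo_slug('Google sitemap of FUDforum'), "\n";
```

Because runs are collapsed in one step, the separate replacements for '---' and '--' are no longer needed.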



Re: Google sitemap of FUDforum [message #162698 is a reply to message #162697] Sun, 04 July 2010 08:58
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Just noticed something else:
This line:
$post_stamp = date('H:i:s\TY-m-d', $r[1]);

Should most likely look like this:
$post_stamp = date('TY-m-d\H:i:s', $r[1]);


Re: Google sitemap of FUDforum [message #162699 is a reply to message #162698] Sun, 04 July 2010 09:12
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Also perhaps rip out the T to make it valid

Re: Google sitemap of FUDforum [message #162700 is a reply to message #162699] Sun, 04 July 2010 17:19
naudefj   South Africa
Messages: 3771
Registered: December 2004
Karma: 28
Senior Member
Administrator
Core Developer
The time and T are all valid according to http://www.w3.org/TR/NOTE-datetime
Should probably be date('Y-m-d\TH:i:s', $r[1]);
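
A small sketch of the corrected format, assuming the timestamp is a Unix timestamp as in the script. The function name w3c_lastmod and the sample timestamp are made up for illustration. It uses gmdate() rather than date() because date() formats in the server's time zone, so a hard-coded +00:00 suffix is only accurate when the time is formatted in UTC.

```php
<?php
// Hypothetical helper: format a Unix timestamp as a W3C datetime in UTC.
function w3c_lastmod(int $timestamp): string
{
    // \T escapes the letter T so it is printed literally between date and time.
    return gmdate('Y-m-d\TH:i:s', $timestamp) . '+00:00';
}

$stamp = 1278263940; // assumed sample value
echo w3c_lastmod($stamp), "\n";

// PHP's built-in 'c' format produces the same ISO 8601 string, offset included.
echo gmdate('c', $stamp), "\n";
```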
Re: Google sitemap of FUDforum [message #162701 is a reply to message #162700] Sun, 04 July 2010 17:48
Ernesto   Sweden
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
Yeah, just like that, sorry, I was tired, at least we got it right now!

Re: Google sitemap of FUDforum [message #162959 is a reply to message #162701] Thu, 09 September 2010 18:29
slowmo   Singapore
Messages: 22
Registered: July 2009
Karma: 0
Junior Member
Will this work with 3.0.1? What does the final script look like now? Thanks, this looks useful. If I can get the final script, I can cron it ^^
Re: Google sitemap of FUDforum [message #162961 is a reply to message #162959] Thu, 09 September 2010 19:28
naudefj   South Africa
Messages: 3771
Registered: December 2004
Karma: 28
Senior Member
Administrator
Core Developer
Try this version (not the latest, but the latest that should still work with 3.0.1):

http://fudforum.svn.sourceforge.net/viewvc/fudforum/trunk/install/forum_data/scripts/sitemap.php?revision=4984