FUD Question [message #657] |
Thu, 21 February 2002 16:22 |
B@rt
Messages: 6 Registered: February 2002 Location: Hilversum, The Netherland...
Karma: 0
Junior Member |
Hi all,
I couldn't seem to find a place for FUD user questions, so I'm hoping someone is willing to help me here.. Here's my question:
At this moment, we are running Phorum on a busy community site (100k pageviews/day; > 100k messages in the database). I've been looking for a new forum server and FUD definitely has all the features that I am looking for. BUT....
The fact that so many items are stored on the filesystem instead of in the database (message bodies, user settings) scares the heck out of me. For example, it seems like each thread gets stored in /data/messages. What happens when you have more than, say, 1000 threads? Are they all stored in the same subdirectory? Or does FUD create 'buckets' like for example message 1-500, 501-1000 etc. I'm afraid that I'm not sure if our operating system (FreeBSD) can handle that.
Is there something I've missed while installing? Maybe it's possible to force everything to be stored in the database?
Does anyone else here have experience with running FUD on a large site?
Thx,
B@rt
Cheers,
B@rt
--
Web Monkey, Project Manager, 3D Addict
Re: FUD Question [message #658 is a reply to message #657] |
Thu, 21 February 2002 16:41 |
Ilia
Messages: 13241 Registered: January 2002
Karma: 0
Senior Member Administrator Core Developer |
B@rt wrote on Thu, 21 February 2002 11:22 AM |
The fact that so many items are stored on the filesystem instead of in the database (message bodies, user settings) scares the heck out of me. For example, it seems like each thread gets stored in /data/messages. What happens when you have more than, say, 1000 threads? Are they all stored in the same subdirectory? Or does FUD create 'buckets' like for example message 1-500, 501-1000 etc. I'm afraid that I'm not sure if our operating system (FreeBSD) can handle that.
|
FUDforum stores each thread in a file of its own and keeps all private messages in one file. This is quite fast; in about 90% of cases it will be faster than MySQL because there is less overhead. On a rather old test machine running Linux we've tested with over 100,000 individual threads and it was VERY fast. For message reading we rely on fast sequential seeks, which any operating system heavily optimizes. If you are using an old file system, at over 100,000 threads you may see very small slowdowns, something like 100 microseconds extra per fopen call (there are 1,000,000 microseconds in a second). Also, commonly read threads get faster the more often they are viewed, because UNIX kernels cache the file system in available RAM, so a frequently opened file is likely to be in this RAM cache, effectively making reads from that file instant. In addition, we create database views for threads, which lets you scale to a virtually limitless number of threads without taking a performance hit, unlike any other forum software today.
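To put the 100-microsecond figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes, purely for illustration, that every page view opens exactly one thread file; the function name and that assumption are not from FUDforum itself.

```python
# Rough cost of slower fopen() calls in a very large directory.
# ASSUMPTION (not from the thread): each page view opens one thread file.

MICROSECONDS_PER_SECOND = 1_000_000


def extra_seconds_per_day(pageviews_per_day: int,
                          extra_us_per_open: int,
                          opens_per_view: int = 1) -> float:
    """Total extra time per day, in seconds, spent on slower file opens."""
    total_us = pageviews_per_day * opens_per_view * extra_us_per_open
    return total_us / MICROSECONDS_PER_SECOND


if __name__ == "__main__":
    # 100k pageviews/day (the site described in the question),
    # 100 microseconds of extra latency per open.
    print(extra_seconds_per_day(100_000, 100))  # 10.0
```

Under those assumptions the whole penalty adds up to about ten seconds of extra file-open time spread across an entire day of traffic.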
I am currently running FUDforum on a site with just under 50,000 messages and 1200+ threads. You can take a look for yourself at that forum at: http://www.mediaminer.org/forum/
FUDforum Core Developer
Re: FUD Question [message #660 is a reply to message #659] |
Thu, 21 February 2002 17:03 |
Ilia
Messages: 13241 Registered: January 2002
Karma: 0
Senior Member Administrator Core Developer |
B@rt wrote on Thu, 21 February 2002 11:48 AM | Hi Protoss,
thanks for your answer! I tried your forum and it's quite fast indeed. Still I'm a bit worried: Am I correct in assuming then that on a system with 100,000 threads there will be 100,000 threads stored in *one* directory? That sounds like a 'dirty' solution to me..
B@rt
|
Yes, all of the threads will be stored in a single directory. This won't be a problem, unless you intend to run ls -l in that directory all the time.
It may seem "dirty", but consider the alternative SQL approach. First, there will be a limit on the length of a message, because you'll have to define the maximum length of the field. MySQL does not cache TEXT/BLOB and similar fields because it would be a waste of memory, and since all the data is kept in one file, the kernel will not cache that file unless you have a large amount of available RAM (which won't happen in most cases). When doing an insert, MySQL always sets a lock on the table. So, since all the messages are stored in one table, if many people post at the same time they'll have to wait for all the previous inserts to go through. On a very busy forum that may cause serious delays during message posting.
Another problem is that when you get data from MySQL there is a lot of in-between overhead. MySQL needs to get the data and allocate memory for it, then PHP's mysql module needs to get the data, parse it and allocate memory for it. Then, when you fetch the data for yourself, another copy of the data is made, this time for the PHP script. In the end you have 3 copies of the same data in memory, in addition to various PHP wrappers around the data. This, as you can imagine, takes quite a bit of memory & CPU. When working with files there is no MySQL step: the data is fetched directly into PHP, and absolutely no parsing of the data is done in real time.
Another problem you may encounter is that many file systems (at least most old ones) do not allow a file to be larger than 2 gigabytes. Because all the message data would be kept in one file, on a large forum that would restrict growth and cause all kinds of problems.
I have a question: which file system do you use? I myself have a few FreeBSD boxes with the stock file system that came with the FreeBSD 3.3 release. Although it is slower than Linux's ReiserFS when there are lots & lots of files, it is still quite fast. I have a directory structure for storing images with ~10 million images split across 1024 directories. I've had absolutely no problems with speed while reading files from those directories.
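For readers wondering what "split across 1024 directories" might look like in practice, here is a minimal Python sketch of one common bucketing scheme. The post does not say how the images were actually distributed, so the modulo rule, the four-digit directory names, and the function names are all assumptions made for illustration.

```python
import os

NUM_BUCKETS = 1024  # directory count mentioned in the post; the rest is illustrative


def bucket_path(root: str, file_id: int, name: str) -> str:
    """Map a numeric id to root/<bucket>/<name>, where bucket = id mod 1024."""
    bucket = file_id % NUM_BUCKETS          # ASSUMED rule: simple modulo
    return os.path.join(root, f"{bucket:04d}", name)


def store(root: str, file_id: int, name: str, data: bytes) -> str:
    """Write data into its bucket directory, creating the directory if needed."""
    path = bucket_path(root, file_id, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


if __name__ == "__main__":
    # 1025 mod 1024 == 1, so on a POSIX system this prints images/0001/cat.png
    print(bucket_path("images", 1025, "cat.png"))
```

With roughly 10 million files spread evenly over 1024 buckets, each directory would hold just under 10,000 entries, which keeps every individual directory small.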
The bottom line of my rant is that it is MOST UNLIKELY that you will suffer any performance loss from using FUDforum's way of storing messages.
FUDforum Core Developer
Re: FUD Question [message #668 is a reply to message #657] |
Fri, 22 February 2002 02:27 |
hackie
Messages: 177 Registered: January 2002
Karma: 0
Senior Member Core Developer |
B@rt wrote on Fri, 22 February 2002 1:22 AM | Hi all,
I couldn't seem to find a place for FUD user questions, so I'm hoping someone is willing to help me here.. Here's my question:
At this moment, we are running Phorum on a busy community site (100k pageviews/day; > 100k messages in the database). I've been looking for a new forum server and FUD definitely has all the features that I am looking for. BUT....
The fact that so many items are stored on the filesystem instead of in the database (message bodies, user settings) scares the heck out of me. For example, it seems like each thread gets stored in /data/messages. What happens when you have more than, say, 1000 threads? Are they all stored in the same subdirectory? Or does FUD create 'buckets' like for example message 1-500, 501-1000 etc. I'm afraid that I'm not sure if our operating system (FreeBSD) can handle that.
Is there something I've missed while installing? Maybe it's possible to force everything to be stored in the database?
Does anyone else here have experience with running FUD on a large site?
Thx,
B@rt
|
Well, we think it is better to store the message bodies on the file system for a rather large number of reasons. Prot' listed some of them there; there are more.
Consider, first of all, that the file system caches files, while MySQL does not cache BLOB/TEXT fields. It gets more complicated: consider the overhead of storing such large chunks of data in the database. To retrieve it you would have to transfer all this data over a socket. Sure, you can make it faster by using UNIX sockets, but that is still a huge amount of pointless overhead compared to the file system! There is a disadvantage to file system storage, of course: you can't run the forum web server on one machine and store the message bodies on a different one (unless you use NFS), but we think it's a fine trade-off.
MOREOVER! Early versions of FUDforum did use the DB for storing bodies, but we converted to the FS code you see today for performance reasons. It took us about 15 minutes... So, if you want to convert FUDforum to use the DB to store bodies, it would take you all of about.. oh.. 15 minutes and a small script to read them back in.....
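As a rough idea of what such a "small script to read them back in" could look like, here is a Python sketch. It is not FUDforum code: it assumes, hypothetically, that each body file is named after a numeric thread id, it uses the standard-library sqlite3 module as a stand-in for MySQL, and the msg_body table and its columns are made up for the example.

```python
import sqlite3
from pathlib import Path


def import_bodies(db: sqlite3.Connection, messages_dir: str) -> int:
    """Copy every body file in messages_dir into a table; return rows written.

    ASSUMPTIONS: files are named by numeric thread id (e.g. "42" or "42.txt"),
    bodies are UTF-8 text, and the msg_body table is hypothetical.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS msg_body ("
        "thread_id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    count = 0
    for path in sorted(Path(messages_dir).iterdir()):
        # Skip subdirectories and anything not named after a numeric id.
        if not path.is_file() or not path.stem.isdigit():
            continue
        db.execute(
            "INSERT OR REPLACE INTO msg_body (thread_id, body) VALUES (?, ?)",
            (int(path.stem), path.read_text(encoding="utf-8")),
        )
        count += 1
    db.commit()
    return count
```

A call such as import_bodies(sqlite3.connect("forum.db"), "data/messages") would load the files in one pass; both paths are placeholders, and a real conversion would also have to change the forum's read and write code to use the table.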
In addition, the only things stored in those files are message bodies; the messages themselves are of course in the database.
cc intelligence.c -o intelligence
$ ./intelligence
Segmentation fault
Re: FUD Question [message #2173 is a reply to message #2172] |
Fri, 03 May 2002 16:03 |
Ilia
Messages: 13241 Registered: January 2002
Karma: 0
Senior Member Administrator Core Developer |
To "truly" do full database mode, all data would need to be stored in the database, including file attachments, avatars, smilies etc... No bulletin board does this, since it would be extremely silly to store binary data in the database. Just dumping the message bodies into the DB like vBulletin and phpBB2 do won't accomplish anything beyond making the forum slower. Adding such functionality would simply be a hack to make FUDforum implement the bad functionality of other BB systems, which is why I resist adding it.
On both UNIX and Windows, data on a drive can be shared by many machines using NFS (UNIX) or Samba (Windows); it is common practice to share data like this for other applications. So there is nothing to stop anyone from sharing the data across a network or even the Internet.
You should also realize that THE only major data block stored on disk is the message bodies; all other info (presumably user settings, thread info, etc...) is in the MySQL database, so it is already VERY easily accessible.
If you open the FUD2 code, you can reasonably easily make it write the message bodies to the database rather than to disk. If you are familiar with PHP it should take you no more than 1 hour or so to do that.
FUDforum Core Developer