Re: Ignoring Case on directories [message #171277 is a reply to message #171245] |
Thu, 30 December 2010 00:22 |
jwcarlton
Messages: 76 Registered: December 2010
On Dec 29, 3:56 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> On 12/29/2010 2:15 PM, jwcarlton wrote:
>
>
>
>> On Dec 29, 9:01 am, Captain Paralytic<paul_laut...@yahoo.com> wrote:
>>> On Dec 29, 5:31 am, jwcarlton<jwcarl...@gmail.com> wrote:
>
>>>> On Dec 28, 11:00 pm, Jerry Stuckle<jstuck...@attglobal.net> wrote:
>
>>>> > On 12/28/2010 8:13 PM, jwcarlton wrote:
>
>>>> >> On Dec 28, 5:39 pm, Captain Paralytic<paul_laut...@yahoo.com> wrote:
>>>> >>> On Dec 28, 8:33 pm, jwcarlton<jwcarl...@gmail.com> wrote:
>
>>>> >>>> I think that the answer to this is "no", but I thought I'd ask :-)
>
>>>> >>>> I'm wanting to open a file where the directory path is given by the
>>>> >>>> user. For example:
>
>>>> >>>> if (is_file("/path/to/" . $_GET['directory'] . "/file.txt"))
>>>> >>>>     $example = file("/path/to/" . $_GET['directory'] . "/file.txt");
>>>> >>>> else
>>>> >>>>     // return error
>
>>>> >>>> // please ignore any typos; I just typed this up here for the example
>
>>>> >>>> The thing is, the directory path could be, say,
>>>> >>>> /path/to/SomeDirectory/file.txt, but the user could enter
>>>> >>>> "somedirectory", in which case they would get the error.
>
>>>> >>>> Currently, I keep all of the directory names in a MySQL database, then
>>>> >>>> before opening file.txt, I search for the directory in MySQL (which is
>>>> >>>> case insensitive), then load the path based on the name in the
>>>> >>>> database instead of what's given. But during peak hours, this method
>>>> >>>> can result in several hundred MySQL queries per minute.
>
>>>> >>>> Before this, I just used opendir to load all of the directories into
>>>> >>>> an array on the fly, then did a case insensitive search through the
>>>> >>>> array. But, when I started having 90,000 directories (30,000 in 3
>>>> >>>> separate parent directories), this was considerably slower than using
>>>> >>>> MySQL.
>
>>>> >>>> So, the MySQL search works, but the question is, can PHP do a
>>>> >>>> directory lookup that's case insensitive; and, preferably, return the
>>>> >>>> case-correct directory name?
>
>>>> >>> A few hundred MySQL queries a minute. What's the problem with that?
>
>>>> >>> And if you have the directories in a database, why aren't you
>>>> >>> presenting them to the user to select, as Jerry suggested?
>
>>>> >>> And if there really are good reasons for not doing a mere few hundred
>>>> >>> queries a minute, or offering a list of directories to select from,
>>>> >>> why not just make them all lower case in the first place?
>
>>>> >>> It feels like you are trying to find a way to fix a very poor design.
>>>> >>> The best way to do that is to change the design to something decent.
>
>>>> >> Naturally, this isn't the only script running queries. My average
>>>> >> Apache processes per day is around 500 (although this week, the
>>>> >> average is over 600).
>
>>>> >> I don't think the "why" is terribly relevant to the thread, but the
>>>> >> logic is that the directories represent usernames of registered users.
>>>> >> In addition to other features, this includes the ability for one user
>>>> >> to send a message to another user. Now, they DO have the option of
>>>> >> clicking on that person's username, which resolves the case issue, but
>>>> >> I also have the option for them to simply enter the recipient's
>>>> >> username... which is where I am trying to correct the case. It would
>>>> >> be pointless to force them to choose a username from a list of
>>>> >> 90,000+.
>
>>>> >> If I could do it over, I would have each of these directories created
>>>> >> in lowercase by default. But unfortunately, this system dates back for
>>>> >> about 10 years, and at the time, I had no clue that I would have so
>>>> >> many users (I remember celebrating when we hit 500). One day, I'll
>>>> >> probably go through and revise the entire thing, but for now, I'm
>>>> >> simply trying to find a faster way to find the correct case for the
>>>> >> username.
>
>>>> >> Robert, you're correct that this is running on Linux, so yeah, the
>>>> >> system itself is case sensitive. I was hoping that something like
>>>> >> realpath() or pathinfo() would return a case-corrected directory, but
>>>> >> neither of those do it.
>
>>>> >> Jerry, do you mean to do something other than using opendir to grab
>>>> >> all of the directory names, then sorting through them to find a match?
>>>> >> This turned out to be uber-slow:
>
>>>> >> $dir = array();
>>>> >> // three parent directories, 20k to 30k subdirectories each
>>>> >> foreach (array("/path/to/1/", "/path/to/2/", "/path/to/3/") as $parent) {
>>>> >>     $handle = opendir($parent);
>>>> >>     while (($entry = readdir($handle)) !== false)
>>>> >>         $dir[] = $entry; // collect every directory name
>>>> >>     closedir($handle);
>>>> >> }
>
>>>> >> foreach ($dir as $key) {
>>>> >>     if (strtolower($key) == strtolower($_GET['directory']))
>>>> >>         $found_directory = $key;
>>>> >> }
>
>>>> >> That's just a sample typed up to show the logic, of course, so please
>>>> >> ignore any typos.
>
>>>> > I was considering strcasecmp() - but if you've got 30K directories, you've
>>>> > got more problems than that. You need to look at your architecture.
>
>>>> > --
>>>> > ==================
>>>> > Remove the "x" from my email address
>>>> > Jerry Stuckle
>>>> > JDS Computer Training Corp.
>>>> > jstuck...@attglobal.net
>>>> > ==================
>
>>>> What would you suggest as a better way to handle an unlimited number
>>>> of folders that contain an unlimited number of messages, as well as
>>>> additional files that may be unique to that user (an "ignore list",
>>>> personalized settings, etc)?
>
>>>> The current system creates a directory for each user, then text files
>>>> within that directory for each "folder", and additional text files for
>>>> settings, etc. I originally based this on how the server stores email,
>>>> and the only real limitation I had was when I hit 32,000 directories
>>>> (then quickly discovered that there's a default limit to the number of
>>>> directories allowed).
>
>>>> I don't know how I could accomplish this in a database. Creating a new
>>>> table for each user seems ridiculous, but what's the alternative? A
>>>> column for each "folder" wouldn't work, since each user can create new
>>>> folders for themselves.
>
>>> Now we get to the crux.
>
>>> You are correct that this should all be done in a database.
>
>>> However you DO NOT have a separate table for each user. You have table
>>> of users which has columns for each of their settings. Then you have
>>> other tables in a 1-M relationship containing things like messages.
>>> This is why the "why" is important.
>
>>> What you have described is a pretty standard relational database
>>> application. The way that the application has been designed makes
>>> handling it far more complicated than it really needs to be.
>
>>> Welcome to the world of database applications. You have a lot to
>>> learn, but if you're willing to apply yourself to it, we will assist.
>
>> Traditionally, I work on this website each December. That's usually my
>> "slow" month, so it's the only time that I can devote entirely to this
>> one.
>
>> This year, the time has been spent on making the site more cross-
>> browser compliant, with an emphasis on mobile. And, of course, speed.
>> With server modifications, program modifications, etc, I've been able
>> to increase the speed by about 25%. And just in time; in the last
>> week, the site traffic has jumped by 66%!
>
>> Last December, I spent the time moving everything over to a database,
>> which turned out to be a LOT more work than I'd anticipated. I didn't
>> know that the default settings for MySQL only allowed for 100 queries
>> at once, so almost immediately, everything went south. It took several
>> weeks to get all of that straightened out (I believe that Jerry was a
>> big help during that time), although I still have issues on days when
>> there's an unexpected jump in traffic.
>
>> My time for this year is pretty much up, so it will probably be next
>> year when I work on moving this section over to a database. It will
>> just about require an entire site rebuild on the backend, so the
>> potential for bugs and errors is scary. But at the same time, I'm
>> expecting a 1000% increase in traffic over the next year, so I'm sure
>> it will be necessary.
>
>> Thanks, guys,
>
>> Jason
>
> 100 concurrent requests should be nothing to MySQL - requests shouldn't
> take very long if the database is properly designed and optimized. The
> fact you're running into that logjam indicates you have some performance
> problems. I would recommend you check it out.
>
> --
> ==================
> Remove the "x" from my email address
> Jerry Stuckle
> JDS Computer Training Corp.
> jstuck...@attglobal.net
> ==================

No, no, that was last year's problem :-) You helped me to resolve it
by setting up proper indexes, using SQL_CACHE on recurring queries,
and by modifying my.cnf. I also had to increase the number of allowed
Apache processes (which I had to increase again recently).

Right now, I don't have any real speed issues. But, as I said before,
I'm expecting a significant increase in traffic in the next year, and
this is my best time to optimize everything in advance.
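
For reference, here is a minimal sketch of the case-insensitive lookup asked about at the top of the thread. It is untested against the real directory tree. It turns the requested name into a glob() pattern such as "[sS][oO][mM][eE]..." and returns the name as it is stored on disk. The function name find_directory() and the allowed character set for usernames are assumptions made for the example, not part of the existing site. Note that glob() still has to scan the parent directory internally, so with 30,000 entries per parent this may well be no faster than the indexed MySQL lookup.

```php
<?php
/**
 * Hypothetical helper: find a directory under one of $parents whose name
 * matches $name regardless of letter case. Returns the case-correct name
 * as stored on disk, or null if nothing matches.
 */
function find_directory(array $parents, string $name): ?string
{
    // Assumption: usernames contain only these characters. Rejecting anything
    // else also blocks "../" traversal and glob wildcards in user input.
    if (!preg_match('/\A[A-Za-z0-9_-]+\z/', $name)) {
        return null;
    }

    // Turn "Bob" into "[bB][oO][bB]" so glob() matches any letter case.
    $pattern = preg_replace_callback(
        '/[A-Za-z]/',
        function ($m) {
            return '[' . strtolower($m[0]) . strtoupper($m[0]) . ']';
        },
        $name
    );

    foreach ($parents as $parent) {
        // GLOB_ONLYDIR limits the result to directories.
        $matches = glob(rtrim($parent, '/') . '/' . $pattern, GLOB_ONLYDIR);
        if (!empty($matches)) {
            return basename($matches[0]);
        }
    }
    return null;
}
```

It would be called along the lines of find_directory(array("/path/to/1", "/path/to/2", "/path/to/3"), $_GET['directory']), with a null result treated as the "return error" branch. If two directories differ only by case, this returns whichever one glob() lists first.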