Re: Ignoring Case on directories [message #171245 is a reply to message #171232] |
Wed, 29 December 2010 20:56 |
Jerry Stuckle
Messages: 2598 Registered: September 2010
On 12/29/2010 2:15 PM, jwcarlton wrote:
> On Dec 29, 9:01 am, Captain Paralytic<paul_laut...@yahoo.com> wrote:
>> On Dec 29, 5:31 am, jwcarlton<jwcarl...@gmail.com> wrote:
>>
>>
>>
>>> On Dec 28, 11:00 pm, Jerry Stuckle<jstuck...@attglobal.net> wrote:
>>
>>>> On 12/28/2010 8:13 PM, jwcarlton wrote:
>>
>>>> > On Dec 28, 5:39 pm, Captain Paralytic<paul_laut...@yahoo.com> wrote:
>>>> >> On Dec 28, 8:33 pm, jwcarlton<jwcarl...@gmail.com> wrote:
>>
>>>> >>> I think that the answer to this is "no", but I thought I'd ask :-)
>>
>>>> >>> I'm wanting to open a file where the directory path is given by the
>>>> >>> user. For example:
>>
>>>> >>> if (is_file("/path/to/" . $_GET['directory'] . "/file.txt"))
>>>> >>>     $example = file("/path/to/" . $_GET['directory'] . "/file.txt");
>>>> >>> else
>>>> >>>     // return error
>>
>>>> >>> // please ignore any typos; I just typed this up here for the example
>>
>>>> >>> The thing is, the directory path could be, say,
>>>> >>> /path/to/SomeDirectory/file.txt, but the user could enter
>>>> >>> "somedirectory"; in which case, they would get the error.
>>
>>>> >>> Currently, I keep all of the directory names in a MySQL database, then
>>>> >>> before opening file.txt, I search for the directory in MySQL (which is
>>>> >>> case insensitive), then load the path based on the name in the
>>>> >>> database instead of what's given. But during peak hours, this method
>>>> >>> can result in several hundred MySQL queries per minute.
>>
>>>> >>> Before this, I just used opendir to load all of the directories into
>>>> >>> an array on the fly, then did a case insensitive search through the
>>>> >>> array. But, when I started having 90,000 directories (30,000 in 3
>>>> >>> separate parent directories), this was considerably slower than using
>>>> >>> MySQL.
>>
>>>> >>> So, the MySQL search works, but the question is, can PHP do a
>>>> >>> directory lookup that's case insensitive; and, preferably, return the
>>>> >>> case-correct directory name?
>>
>>>> >> A few hundred MySQL queries a minute. What's the problem with that?
>>
>>>> >> And if you have the directories in a database, why aren't you
>>>> >> presenting them to the user to select, as Jerry suggested?
>>
>>>> >> And if there really are good reasons for not doing a mere few hundred
>>>> >> queries a minute, or offering a list of directories to select from,
>>>> >> why not just make them all lower case in the first place?
>>
>>>> >> It feels like you are trying to find a way to fix a very poor design.
>>>> >> The best way to do that is to change the design to something decent.
>>
>>>> > Naturally, this isn't the only script running queries. My average
>>>> > Apache processes per day is around 500 (although this week, the
>>>> > average is over 600).
>>
>>>> > I don't think the "why" is terribly relevant to the thread, but the
>>>> > logic is that the directories represent usernames of registered users.
>>>> > In addition to other features, this includes the ability for one user
>>>> > to send a message to another user. Now, they DO have the option of
>>>> > clicking on that person's username, which resolves the case issue, but
>>>> > I also have the option for them to simply enter the recipient's
>>>> > username... which is where I am trying to correct the case. It would
>>>> > be pointless to force them to choose a username from a list of
>>>> > 90,000+.
>>
>>>> > If I could do it over, I would have each of these directories created
>>>> > in lowercase by default. But unfortunately, this system dates back for
>>>> > about 10 years, and at the time, I had no clue that I would have so
>>>> > many users (I remember celebrating when we hit 500). One day, I'll
>>>> > probably go through and revise the entire thing, but for now, I'm
>>>> > simply trying to find a faster way to find the correct case for the
>>>> > username.
>>
>>>> > Robert, you're correct that this is running on Linux, so yeah, the
>>>> > system itself is case sensitive. I was hoping that something like
>>>> > realpath() or pathinfo() would return a case-corrected directory, but
>>>> > neither of those do it.
>>
>>>> > Jerry, do you mean to do something other than using opendir to grab
>>>> > all of the directory names, then sorting through them to find a match?
>>>> > This turned out to be uber-slow:
>>
>>>> > // scandir() returns the entry names; opendir() only returns a handle
>>>> > $dirs = array_merge(
>>>> >     scandir("/path/to/1/"), // 30k directories
>>>> >     scandir("/path/to/2/"), // 30k directories
>>>> >     scandir("/path/to/3/")  // between 20k and 30k directories
>>>> > );
>>
>>>> > foreach ($dirs as $name) {
>>>> >     if (strtolower($name) == strtolower($_GET['directory']))
>>>> >         $found_directory = $name;
>>>> > }
>>
>>>> > That's just a sample typed up to show the logic, of course, so please
>>>> > ignore any typos.
>>
>>>> I was considering strcasecmp() - but if you've got 30K directories, you've
>>>> got more problems than that. You need to look at your architecture.
>>
>>>> --
>>>> ==================
>>>> Remove the "x" from my email address
>>>> Jerry Stuckle
>>>> JDS Computer Training Corp.
>>>> jstuck...@attglobal.net
>>>> ==================
>>
>>> What would you suggest as a better way to handle an unlimited number
>>> of folders that contain an unlimited number of messages, as well as
>>> additional files that may be unique to that user (an "ignore list",
>>> personalized settings, etc)?
>>
>>> The current system creates a directory for each user, then text files
>>> within that directory for each "folder", and additional text files for
>>> settings, etc. I originally based this on how the server stores email,
>>> and the only real limitation I had was when I hit 32,000 directories
>>> (then quickly discovered that there's a default limit to the number of
>>> directories allowed).
>>
>>> I don't know how I could accomplish this in a database. Creating a new
>>> table for each user seems ridiculous, but what's the alternative? A
>>> column for each "folder" wouldn't work, since each user can create new
>>> folders for themselves.
>>
>> Now we get to the crux.
>>
>> You are correct that this should all be done in a database.
>>
>> However you DO NOT have a separate table for each user. You have table
>> of users which has columns for each of their settings. Then you have
>> other tables in a 1-M relationship containing things like messages.
>> This is why the "why" is important.
>>
>> What you have described is a pretty standard relational database
>> application. The way that the application has been designed makes
>> handling it far more complicated than it really needs to be.
>>
>> Welcome to the world of database applications. You have a lot to
>> learn, but if you're willing to apply yourself to it, we will assist.
>
> Traditionally, I work on this website each December. That's usually my
> "slow" month, so it's the only time that I can devote entirely to this
> one.
>
> This year, the time has been spent on making the site more
> cross-browser compliant, with an emphasis on mobile. And, of course,
> speed. With server modifications, program modifications, etc, I've been able
> to increase the speed by about 25%. And just in time; in the last
> week, the site traffic has jumped by 66%!
>
> Last December, I spent the time moving everything over to a database,
> which turned out to be a LOT more work than I'd anticipated. I didn't
> know that the default settings for MySQL only allowed for 100 queries
> at once, so almost immediately, everything went south. It took several
> weeks to get all of that straightened out (I believe that Jerry was a
> big help during that time), although I still have issues on days when
> there's an unexpected jump in traffic.
>
> My time for this year is pretty much up, so it will probably be next
> year when I work on moving this section over to a database. It will
> just about require an entire site rebuild on the backend, so the
> potential for bugs and errors is scary. But at the same time, I'm
> expecting a 1000% increase in traffic over the next year, so I'm sure
> it will be necessary.
>
> Thanks, guys,
>
> Jason
100 concurrent requests should be nothing to MySQL - requests shouldn't
take very long if the database is properly designed and optimized. The
fact you're running into that logjam indicates you have some performance
problems. I would recommend you check it out.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================
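
For reference, the original question (a case-insensitive directory lookup that returns the case-correct name) can be sketched in plain PHP without comparing against every entry on each request. This is only a rough illustration, not something anyone in the thread posted: the function names build_directory_index() and find_directory() are made up for this example, it assumes PHP 7.1 or later and ASCII directory names, and the demo runs against a throwaway temp directory rather than the real /path/to/ tree.

```php
<?php
// Sketch only: function names and the directory layout are hypothetical.

/**
 * Build a map of lowercased directory name => real path for every
 * subdirectory of the given parent directories.
 */
function build_directory_index(array $parents): array
{
    $index = [];
    foreach ($parents as $parent) {
        $entries = scandir($parent);
        if ($entries === false) {
            continue; // unreadable parent directory: skip it
        }
        foreach ($entries as $entry) {
            if ($entry === '.' || $entry === '..') {
                continue;
            }
            $path = $parent . DIRECTORY_SEPARATOR . $entry;
            if (is_dir($path)) {
                // Key by the lowercased name, keep the real path as value.
                $index[strtolower($entry)] = $path;
            }
        }
    }
    return $index;
}

/**
 * Return the case-correct path for a user-supplied name, or null if no
 * directory matches.
 */
function find_directory(array $index, string $name): ?string
{
    return $index[strtolower($name)] ?? null;
}

// Demo against a throwaway directory instead of the real tree.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'diridx_' . uniqid();
mkdir($base . DIRECTORY_SEPARATOR . 'SomeDirectory', 0777, true);

$index = build_directory_index([$base]);
$match = find_directory($index, 'somedirectory');
echo ($match === null ? 'not found' : basename($match)), PHP_EOL;

// Clean up the demo directories.
rmdir($base . DIRECTORY_SEPARATOR . 'SomeDirectory');
rmdir($base);
```

Building the index still means reading every parent directory, so with 90,000 entries it would have to be built once and cached (in a file or a shared cache, for example) rather than rebuilt per request. Because only names actually returned by scandir() can match, input such as "../etc" simply yields null. None of this replaces the advice above to move the data into properly designed tables.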