Re: Ignoring Case on directories [message #171232 is a reply to message #171204]
Wed, 29 December 2010 19:15
jwcarlton
On Dec 29, 9:01 am, Captain Paralytic <paul_laut...@yahoo.com> wrote:
> On Dec 29, 5:31 am, jwcarlton <jwcarl...@gmail.com> wrote:
>
>
>
>> On Dec 28, 11:00 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>
>>> On 12/28/2010 8:13 PM, jwcarlton wrote:
>
>>>> On Dec 28, 5:39 pm, Captain Paralytic<paul_laut...@yahoo.com> wrote:
>>>> > On Dec 28, 8:33 pm, jwcarlton<jwcarl...@gmail.com> wrote:
>
>>>> >> I think that the answer to this is "no", but I thought I'd ask :-)
>
>>>> >> I'm wanting to open a file where the directory path is given by the
>>>> >> user. For example:
>
>>>> >> if (is_file("/path/to/" . $_GET['directory'] . "/file.txt"))
>>>> >> $example = FILE("/path/to/" . $_GET['directory'] . "/file.txt");
>
>>>> >> else
>>>> >> // return error
>
>>>> >> // please ignore any typos; I just typed this up here for the example
>
>>>> >> The thing is, the directory path could be, say, /path/to/SomeDirectory/
>>>> >> file.txt, but the user could enter "somedirectory"; in which case,
>>>> >> they would get the error.
>
>>>> >> Currently, I keep all of the directory names in a MySQL database, then
>>>> >> before opening file.txt, I search for the directory in MySQL (which is
>>>> >> case insensitive), then load the path based on the name in the
>>>> >> database instead of what's given. But during peak hours, this method
>>>> >> can result in several hundred MySQL queries per minute.
>
>>>> >> Before this, I just used opendir to load all of the directories into
>>>> >> an array on the fly, then did a case insensitive search through the
>>>> >> array. But, when I started having 90,000 directories (30,000 in 3
>>>> >> separate parent directories), this was considerably slower than using
>>>> >> MySQL.
>
>>>> >> So, the MySQL search works, but the question is, can PHP do a
>>>> >> directory lookup that's case insensitive; and, preferably, return the
>>>> >> case-correct directory name?
>
>>>> > A few hundred MySQL queries a minute. What's the problem with that?
>
>>>> > And if you have the directories in a database, why aren't you
>>>> > presenting them to the user to select as Jerry suggested?
>
>>>> > And if there really are good reasons for not doing a mere few hundred
>>>> > queries a minute, or offering a list of directories to select from,
>>>> > why not just make them all lower case in the first place?
>
>>>> > It feels like you are trying to find a way to fix a very poor design.
>>>> > The best way to do that is to change the design to something decent.
>
>>>> Naturally, this isn't the only script running queries. My average
>>>> Apache processes per day is around 500 (although this week, the
>>>> average is over 600).
>
>>>> I don't think the "why" is terribly relevant to the thread, but the
>>>> logic is that the directories represent usernames of registered users.
>>>> In addition to other features, this includes the ability for one user
>>>> to send a message to another user. Now, they DO have the option of
>>>> clicking on that person's username, which resolves the case issue, but
>>>> I also have the option for them to simply enter the recipient's
>>>> username... which is where I am trying to correct the case. It would
>>>> be pointless to force them to choose a username from a list of
>>>> 90,000+.
>
>>>> If I could do it over, I would have each of these directories created
>>>> in lowercase by default. But unfortunately, this system dates back
>>>> about 10 years, and at the time, I had no clue that I would have so
>>>> many users (I remember celebrating when we hit 500). One day, I'll
>>>> probably go through and revise the entire thing, but for now, I'm
>>>> simply trying to find a faster way to find the correct case for the
>>>> username.
>
>>>> Robert, you're correct that this is running on Linux, so yeah, the
>>>> system itself is case sensitive. I was hoping that something like
>>>> realpath() or pathinfo() would return a case-corrected directory, but
>>>> neither of those does it.
>
>>>> Jerry, do you mean to do something other than using opendir to grab
>>>> all of the directory names, then sorting through them to find a match?
>>>> This turned out to be uber-slow:
>
>>>> $dir = array();
>>>> foreach (array("/path/to/1/", "/path/to/2/", "/path/to/3/") as $parent) {
>>>> $handle = opendir($parent); // 20k to 30k directories in each
>>>> while (($entry = readdir($handle)) !== false)
>>>> $dir[] = $entry; // collect every entry name
>>>> closedir($handle);
>>>> }
>
>>>> foreach ($dir as $key) {
>>>> if (strtolower($key) == strtolower($_GET['directory']))
>>>> $found_directory = $key;
>>>> }
>
>>>> That's just a sample typed up to show the logic, of course, so please
>>>> ignore any typos.
>
>>> I was considering strcasecmp() - but if you've got 30K directories, you've
>>> got more problems than that. You need to look at your architecture.
>
>>> --
>>> ==================
>>> Remove the "x" from my email address
>>> Jerry Stuckle
>>> JDS Computer Training Corp.
>>> jstuck...@attglobal.net
>>> ==================
>
>> What would you suggest as a better way to handle an unlimited number
>> of folders that contain an unlimited number of messages, as well as
>> additional files that may be unique to that user (an "ignore list",
>> personalized settings, etc)?
>
>> The current system creates a directory for each user, then text files
>> within that directory for each "folder", and additional text files for
>> settings, etc. I originally based this on how the server stores email,
>> and the only real limitation I had was when I hit 32,000 directories
>> (then quickly discovered that there's a default limit to the number of
>> directories allowed).
>
>> I don't know how I could accomplish this in a database. Creating a new
>> table for each user seems ridiculous, but what's the alternative? A
>> column for each "folder" wouldn't work, since each user can create new
>> folders for themselves.
>
> Now we get to the crux.
>
> You are correct that this should all be done in a database.
>
> However you DO NOT have a separate table for each user. You have a table
> of users which has columns for each of their settings. Then you have
> other tables in a 1-M relationship containing things like messages.
> This is why the "why" is important.
>
> What you have described is a pretty standard relational database
> application. The way that the application has been designed makes
> handling it far more complicated than it really needs to be.
>
> Welcome to the world of database applications. You have a lot to
> learn, but if you're willing to apply yourself to it, we will assist.
Traditionally, I work on this website each December. That's usually my
"slow" month, so it's the only time that I can devote entirely to this
one.

This year, the time has been spent on making the site more
cross-browser compliant, with an emphasis on mobile. And, of course,
speed. With server modifications, program modifications, etc., I've
been able to increase the speed by about 25%. And just in time; in the
last week, the site traffic has jumped by 66%!

Last December, I spent the time moving everything over to a database,
which turned out to be a LOT more work than I'd anticipated. I didn't
know that the default settings for MySQL only allowed for 100
connections at once, so almost immediately, everything went south. It
took several weeks to get all of that straightened out (I believe that
Jerry was a big help during that time), although I still have issues
on days when there's an unexpected jump in traffic.

My time for this year is pretty much up, so it will probably be next
year when I work on moving this section over to a database. It will
just about require an entire site rebuild on the backend, so the
potential for bugs and errors is scary. But at the same time, I'm
expecting a 1000% increase in traffic over the next year, so I'm sure
it will be necessary.
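
Until that move happens, the case-correction step on its own might be
handled with a lookup table keyed by the lowercased directory name.
What follows is only a minimal sketch under stated assumptions, not the
code this site runs: the function names build_directory_index() and
find_directory() are hypothetical, the parent paths are the placeholders
used earlier in the thread, and the index is assumed to be built once
and cached somewhere (the caching itself is not shown).

```php
<?php
// Hypothetical helper: map each lowercased directory name to its
// case-correct path. Assumed to run once, with the result cached,
// rather than on every request.
function build_directory_index(array $parents): array
{
    $index = [];
    foreach ($parents as $parent) {
        $entries = scandir($parent);
        if ($entries === false) {
            continue; // parent directory could not be read; skip it
        }
        foreach ($entries as $entry) {
            if ($entry === '.' || $entry === '..') {
                continue;
            }
            $path = rtrim($parent, '/') . '/' . $entry;
            if (is_dir($path)) {
                $index[strtolower($entry)] = $path;
            }
        }
    }
    return $index;
}

// Hypothetical helper: return the case-correct path for a user-supplied
// name, or null when no directory matches.
function find_directory(array $index, string $name): ?string
{
    return $index[strtolower($name)] ?? null;
}

// Usage (placeholder paths from the thread):
// $index = build_directory_index(['/path/to/1/', '/path/to/2/', '/path/to/3/']);
// $dir = find_directory($index, $_GET['directory']);
// if ($dir !== null && is_file($dir . '/file.txt')) { /* open it */ }
```

Because the user's input is only ever used as an array key, a value
such as "../something" simply fails to match instead of becoming part
of a path. If two directories differ only by case, the one scanned last
wins in this sketch.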
Thanks, guys,
Jason