FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Home » Imported messages » comp.lang.php » Sort using utf characters in mysql or php ? best solutions...
Show: Today's Messages :: Polls :: Message Navigator
Return to the default flat view Create a new topic Submit Reply
Re: Sort using utf characters in mysql or php ? best solutions... [message #172844 is a reply to message #172828] Sat, 05 March 2011 00:53 Go to previous message
SM is currently offline  SM
Messages: 3
Registered: March 2011
Karma:
Junior Member
On Mar 4, 3:53 am, "Álvaro G. Vicario"
<alvaro.NOSPAMTH...@demogracia.com.invalid> wrote:
> El 04/03/2011 1:58, SM escribi /wrote:
>
>
>
>> Using MySQL, i'm selecting a list of songs in spanish that i would
>> like to sort. Here's a list of names that is returned by the query:
>
>> Decirevilla!
>> Alhambra
>> 123 pasitos
>> frica
>> Arroz
>> Decir
>
>> The sorted list should look like this:
>
>> 123 pasitos
>> frica
>> Alhambra
>> Arroz
>> Decirevilla!
>> Decir
>
> "Decir" should come before " Decirevilla!" but I get your point.
>
>
>
>> After all of the research i've read, i've concluded that there is no
>> reasonable way to achieve this using MySQL. I've tried collation,
>> charset, etc... but there is no way the character , ?, etc... can by
>> sorted accordingly to my desired result. Even the is not sorted the
>> way i want to...
>
> These are two different issues. You can absolutely sort Spanish letters
> at MySQL with the appropriate rules. You just need to assign a proper
> collation to the column: utf8_spanish_ci
>
> CREATE TABLE test (
>         word VARCHAR(50) NULL
> )
> COLLATE='utf8_spanish_ci'
> ENGINE=InnoDB;
>
> INSERT INTO test (word)
> VALUES ('ae'), (' d'), ('Ac'), (' b'), ('aa');
>
> SELECT word FROM test ORDER BY word;
>
> "word"
> "aa"
> " b"
> "Ac"
> " d"
> "ae"
>
> Compare with:
>
> CREATE TABLE test (
>         word VARCHAR(50) NULL
> )
> COLLATE='utf8_bin'
> ENGINE=InnoDB;
>
> INSERT INTO test (word)
> VALUES ('ae'), (' d'), ('Ac'), (' b'), ('aa');
>
> SELECT word FROM test ORDER BY word;
>
> "word"
> "Ac"
> "aa"
> "ae"
> " d"
> " b"
>
> The second issue is that you want to ignore character that are not
> letters or numbers. Well, just remove them from the string before
> comparing. Unluckily, MySQL does not provide regexp based replacement.
> So my advice is that you add a secondary column and populate it with PHP:
>
> CREATE TABLE test (
>         word VARCHAR(50) NULL,
>         sort_order VARCHAR(50) NULL
> )
> COLLATE='utf8_spanish_ci'
> ENGINE=InnoDB;
>
> INSERT INTO test (word, sort_order) VALUES
> (' Decirevilla!', 'Decirevilla'),
> ('Alhambra', 'Alhambra'),
> ('123 pasitos', '123 pasitos'),
> (' frica', ' frica'),
> ('Arroz', 'Arroz'),
> ('Decir', 'Decir');
>
> SELECT word FROM test ORDER BY sort_order;
>
> "word"
> "123 pasitos"
> " frica"
> "Alhambra"
> "Arroz"
> "Decir"
> " Decirevilla!"
>
> In order to populate the secondary column, you can use several
> approaches, from preg_replace()
> (<http://es.php.net/manual/en/function.preg-replace.php>) to
> internationalization functions
> (<http://es.php.net/manual/en/book.intl.php>).
>
> echo preg_replace('/\\W/iu', '', ' Hola, mundo! C mo est usted?');
> // Prints: HolamundoC moest usted
>
> (adjust the regexp to your liking.)
>
>> Question 1: Is this a reasonable conclusion?
>
>> I believe the only way to achieve this is by passing the results to an
>> array in php and then sort the array using a custom function... all
>> this using the function usort (need to sort by value and i don't care
>> about maintaning the key association).
>
> This implies that a query that depends on sort order (e.g., "get first
> 20 items that start with 'B') will need to retrieve the complete table.
>
> [...]
>
>> Question 3: I have no idea how to create the normalize function to
>> sort the names accordingly to my needs. How do i ignore certain
>> characters ( , ?, ', !, ) and how do i replace other characters with
>> the natural equivalent ( ->  A, ->  E, etc..)
>
> You can use iconv() for that but it's probably not necessary:
>
> http://sourcecookbook.com/en/recipes/8/function-to-slugify-strings-in...
>
> --
> --http://alvaro.es- lvaro G. Vicario - Burgos, Spain
> -- Mi sitio sobre programaci n web:http://borrame.com
> -- Mi web de humor satinado:http://www.demogracia.com
> --

that's exactly my point... no collation or charset will work on this
case... I mean, why would you want to sort a character like ¡ or ? or
¿... anyways, that's not the point...

After a day of researching and asking around, i believe that i'm
better off letting MySQL handle the sorting and not the PHP. For that,
i need to build an artificial column in my table that does sort
properly... a column called, say, sortable_title; then ORDER BY
sortable_title... When doing INSERTs and UPDATE's, I will use use a
normalize() function in PHP that produces a list with no punctuation,
all lower case, accents stripped, etc...

Gracias Álvaro for all of your tips and guiding.
[Message index]
 
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic: Want to free hosting
Next Topic: [ANN]VTD-XML 2.10
Goto Forum:
  

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ]

Current Time: Wed Nov 27 02:36:22 GMT 2024

Total time taken to generate the page: 0.05270 seconds