FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Home » Imported messages » comp.lang.php » Sort using utf characters in mysql or php ? best solutions...
Show: Today's Messages :: Polls :: Message Navigator
Return to the default flat view Create a new topic Submit Reply
Re: Sort using utf characters in mysql or php ? best solutions... [message #172828 is a reply to message #172818] Fri, 04 March 2011 08:53 Go to previous messageGo to previous message
alvaro.NOSPAMTHANX is currently offline  alvaro.NOSPAMTHANX
Messages: 277
Registered: September 2010
Karma:
Senior Member
El 04/03/2011 1:58, SM escribió/wrote:
> Using MySQL, i'm selecting a list of songs in spanish that i would
> like to sort. Here's a list of names that is returned by the query:
>
> ¡Decirevilla!
> Alhambra
> 123 pasitos
> África
> Arroz
> Decir
>
> The sorted list should look like this:
>
> 123 pasitos
> África
> Alhambra
> Arroz
> ¡Decirevilla!
> Decir

"Decir" should come before "¡Decirevilla!" but I get your point.

>
>
> After all of the research i've read, i've concluded that there is no
> reasonable way to achieve this using MySQL. I've tried collation,
> charset, etc... but there is no way the character ¡, ?, etc... can by
> sorted accordingly to my desired result. Even the Á is not sorted the
> way i want to...

These are two different issues. You can absolutely sort Spanish letters
at MySQL with the appropriate rules. You just need to assign a proper
collation to the column: utf8_spanish_ci

CREATE TABLE test (
word VARCHAR(50) NULL
)
COLLATE='utf8_spanish_ci'
ENGINE=InnoDB;

INSERT INTO test (word)
VALUES ('ae'), ('Ád'), ('Ac'), ('áb'), ('aa');

SELECT word FROM test ORDER BY word;

"word"
"aa"
"áb"
"Ac"
"Ád"
"ae"

Compare with:

CREATE TABLE test (
word VARCHAR(50) NULL
)
COLLATE='utf8_bin'
ENGINE=InnoDB;

INSERT INTO test (word)
VALUES ('ae'), ('Ád'), ('Ac'), ('áb'), ('aa');

SELECT word FROM test ORDER BY word;

"word"
"Ac"
"aa"
"ae"
"Ád"
"áb"

The second issue is that you want to ignore character that are not
letters or numbers. Well, just remove them from the string before
comparing. Unluckily, MySQL does not provide regexp based replacement.
So my advice is that you add a secondary column and populate it with PHP:

CREATE TABLE test (
word VARCHAR(50) NULL,
sort_order VARCHAR(50) NULL
)
COLLATE='utf8_spanish_ci'
ENGINE=InnoDB;

INSERT INTO test (word, sort_order) VALUES
('¡Decirevilla!', 'Decirevilla'),
('Alhambra', 'Alhambra'),
('123 pasitos', '123 pasitos'),
('África', 'África'),
('Arroz', 'Arroz'),
('Decir', 'Decir');

SELECT word FROM test ORDER BY sort_order;

"word"
"123 pasitos"
"África"
"Alhambra"
"Arroz"
"Decir"
"¡Decirevilla!"


In order to populate the secondary column, you can use several
approaches, from preg_replace()
(<http://es.php.net/manual/en/function.preg-replace.php>) to
internationalization functions
(<http://es.php.net/manual/en/book.intl.php>).

echo preg_replace('/\\W/iu', '', '¡Hola, mundo! ¿Cómo está usted?');
// Prints: HolamundoCómoestáusted

(adjust the regexp to your liking.)


> Question 1: Is this a reasonable conclusion?
>
> I believe the only way to achieve this is by passing the results to an
> array in php and then sort the array using a custom function... all
> this using the function usort (need to sort by value and i don't care
> about maintaning the key association).

This implies that a query that depends on sort order (e.g., "get first
20 items that start with 'B') will need to retrieve the complete table.

[...]

> Question 3: I have no idea how to create the normalize function to
> sort the names accordingly to my needs. How do i ignore certain
> characters (¡, ?, ', !, ¿) and how do i replace other characters with
> the natural equivalent (Á -> A, É -> E, etc..)

You can use iconv() for that but it's probably not necessary:

http://sourcecookbook.com/en/recipes/8/function-to-slugify-strings-in-php



--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--
[Message index]
 
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic: Want to free hosting
Next Topic: [ANN]VTD-XML 2.10
Goto Forum:
  

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ]

Current Time: Fri Sep 20 04:34:28 GMT 2024

Total time taken to generate the page: 0.03918 seconds