FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Re: windows-1251 (Cyr) & MySQL 4.1 [message #32441 is a reply to message #32333] Thu, 29 June 2006 07:51
aigumnov   Russian Federation
Ilia wrote on Wed, 21 June 2006 19:12

A single forum can use multiple languages while a table charset is for all data, so we have to use a neutral setting.


'latin1' is not a neutral setting. A few notes to make this clear.
I think it works as follows:
1) By default, PHP connects as a 'latin1' client. The PHP application and the MySQL server negotiate this charset during mysql_connect. Note that the negotiated charset can be changed later with 'SET NAMES' or 'SET CHARACTER SET'.
2) In this case, if the table's charset is also 'latin1', the MySQL server sees that the client charset matches the table charset and skips any text conversion on input. The effect of this trick is that all data is stored 'as is', without any transformation.
3) Later, when reading data, another PHP script connects to MySQL using the 'latin1' charset again. Because the client encoding matches the server's, no conversion is made on read either, and 'Everything is fine, because FUDFORUM makes the right conversion while taking data from the database' (actually there is no conversion at all; everything depends on the webpage declaring the right charset).
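
The three steps above can be modelled in a few lines of Python. This is only a toy sketch of the behaviour as I understand it, not MySQL's real code: the transfer() helper is hypothetical, and Python's 'latin1' codec stands in for MySQL's 'latin1'.

```python
# Toy model of the pass-through described above (hypothetical helper,
# not MySQL's actual implementation).
def transfer(data: bytes, from_charset: str, to_charset: str) -> bytes:
    """Return the bytes unchanged when both charsets match, else convert."""
    if from_charset == to_charset:
        return data  # no conversion: bytes pass through "as is"
    return data.decode(from_charset).encode(to_charset, errors="replace")

message = "Привет"
posted = message.encode("cp1251")               # bytes sent by a cp1251 web page

stored = transfer(posted, "latin1", "latin1")   # write: client latin1 -> table latin1
fetched = transfer(stored, "latin1", "latin1")  # read: table latin1 -> client latin1

shown = fetched.decode("cp1251")                # browser decodes with the page charset
```

The text survives only because the same cp1251 bytes travel untouched in both directions and the page charset happens to match them.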

If you set another charset on the client side, for example with 'SET NAMES cp1251', the MySQL server performs a 'latin1'->'cp1251' conversion. If the data in the 'latin1' table is really cp1251, this conversion is lossy, and you will get '?' and other garbled characters.
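
A rough Python illustration of why that conversion is lossy. It assumes Python's 'latin1' codec is close enough to MySQL's 'latin1' for the byte range used by Cyrillic letters, and it imitates the server with plain decode/encode calls:

```python
stored = "Привет".encode("cp1251")   # cp1251 bytes sitting in a 'latin1' column

# The server believes the column holds latin1 text...
as_latin1 = stored.decode("latin1")

# ...and converts those characters to the charset requested by SET NAMES cp1251.
# Accented Latin letters do not exist in cp1251, so each one is replaced by '?'.
sent_to_client = as_latin1.encode("cp1251", errors="replace")

print(sent_to_client)   # b'??????'
```

The original bytes are still intact in the table; only the converted copy sent to the client is damaged.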

So, by default, phpMyAdmin cannot display the data correctly, because the recommended phpMyAdmin configuration sets the 'cp1251' charset.

To read data stored in this tricky way, the client charset must match the table charset. To read the default FUDForum table structure in phpMyAdmin, just run 'SET NAMES latin1' and all data will be read 'as is'.

This behaviour leads to more problems when somebody tries to manage the forum data outside the forum's PHP application. For example, on Russian Windows the mysql command-line client uses the cp866 charset by default. Converting between 'cp866' and 'cp1251' is not lossy for Cyrillic text. However, if the data goes into the 'latin1' table as raw cp866 bytes, those records come back out as is (still in cp866) on a cp1251 webpage, which makes the changes unreadable.
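
A small Python sketch of the cp866 case, assuming the bytes really are stored untouched as described above. The variable names are only for illustration:

```python
text = "Привет"

# Typed into a cp866 console and stored untouched in the 'latin1' table.
stored = text.encode("cp866")

# The cp1251 web page decodes those same bytes with the wrong charset.
# errors="replace" guards against bytes that are undefined in cp1251.
on_webpage = stored.decode("cp1251", errors="replace")

# Converting explicitly between the two charsets keeps the text intact.
repaired = stored.decode("cp866").encode("cp1251")
```

Here on_webpage is garbage, while repaired holds the same cp1251 bytes the forum itself would have written.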

Also, some server configurations may override this trick, and then FUDForum will not work at all.

I don't know how FUDForum will work with Unicode (UTF-8) charsets in the database, but if one's forums are truly multilingual, UTF-8 is the only option. All webpages must be Unicode, since it is impossible to show, for example, one message in Chinese and another in Russian on the same page without Unicode webpages.
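
To illustrate the point about mixed languages, here is a short Python check (the sample string is my own): UTF-8 holds Russian and Chinese together, while a single-byte charset such as cp1251 cannot encode the Chinese part at all.

```python
mixed = "Привет 你好"   # Russian and Chinese in the same message

# UTF-8 can represent both scripts and round-trips without loss.
utf8_bytes = mixed.encode("utf-8")
roundtrip_ok = utf8_bytes.decode("utf-8") == mixed

# cp1251 has no Chinese characters, so encoding fails.
try:
    mixed.encode("cp1251")
    fits_cp1251 = True
except UnicodeEncodeError:
    fits_cp1251 = False
```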

If the forums are single-language (and not 'latin1'), I personally think it is worth making some changes to the server setup and/or the FUDForum configuration and source code, just to make the table structure more manageable and consistent with other data in the database. The decision must be made by the forum's owner/administrator.

[Updated on: Thu, 29 June 2006 08:52]
