FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Home » Imported messages » comp.lang.php » Cannot write utf8 data into a utf8 column
Show: Today's Messages :: Polls :: Message Navigator
Switch to threaded view of this topic Create a new topic Submit Reply
Cannot write utf8 data into a utf8 column [message #170746] Thu, 18 November 2010 16:54 Go to next message
Tony Marston is currently offline  Tony Marston
Messages: 57
Registered: November 2010
Karma: 0
Member
This is very strange. When I try this update in my PHP program:

UPDATE email_msg
SET size=771,
message_text='? ?:\n\n ????2010?11?25-26? ? ? ? 12?09-10? ? ? ?12?16-17?
?? ????:\n\n?
????2010???????2011????????????????????????? ????!\n\n?????????????????????????????????????????--?
? ?? ??!\n\n\n????????????:baoming_gz(at)126(dot)com??! ( ?????????!)??!
\n\n\n\n?-?-?-?:0-2-0--3-9-9-2-6-2-7-5?(0)1-3-7-1-9-0-2-5-4-7-6
??? ????\n\n?-?-?-?:baoming_gz(at)126(dot)com\n\n\n????????,?????"??"?"delete"?:tuidin01(at)163(dot)com
??!',
revised_date='2010-11-18 15:48:09',
revised_user='AJM'
WHERE email_id='159'

it fails with the following error:

"Incorrect string value: '\xA0\xA0 \xE6\x88\x91...' for column
'message_text' at row 1"

When I try the SAME update through SQL-Front or phpMyAdmin it works! Why is
this?

--
Tony Marston
http://www.tonymarston.net
http://www.radicore.org
Re: Cannot write utf8 data into a utf8 column [message #170747 is a reply to message #170746] Fri, 19 November 2010 03:25 Go to previous messageGo to next message
Peter H. Coffin is currently offline  Peter H. Coffin
Messages: 245
Registered: September 2010
Karma: 0
Senior Member
["Followup-To:" header set to comp.databases.mysql.]
On Thu, 18 Nov 2010 16:54:47 -0000, Tony Marston wrote:
> This is very strange. When I try this update in my PHP program:
>
> UPDATE email_msg
> SET size=771,
> message_text='? ?:\n\n ????2010?11?25-26? ? ? ? 12?09-10? ? ? ?12?16-17?
> ?? ????:\n\n?
> ????2010???????2011????????????????????????? ????!\n\n?????????????????????????????????????????--?
> ? ?? ??!\n\n\n????????????:baoming_gz(at)126(dot)com??! ( ?????????!)??!
> \n\n\n\n?-?-?-?:0-2-0--3-9-9-2-6-2-7-5?(0)1-3-7-1-9-0-2-5-4-7-6
> ??? ????\n\n?-?-?-?:baoming_gz(at)126(dot)com\n\n\n????????,?????"??"?"delete"?:tuidin01(at)163(dot)com
> ??!',
> revised_date='2010-11-18 15:48:09',
> revised_user='AJM'
> WHERE email_id='159'
>
> it fails with the following error:
>
> "Incorrect string value: '\xA0\xA0 \xE6\x88\x91...' for column
> 'message_text' at row 1"
>
> When I try the SAME update through SQL-Front or phpMyAdmin it works! Why is
> this?

If it were the SAME, it would work the same. MySQL gives this error when
there's invalid UTF-8 byte sequences, like a continuation byte without
starting byte, or a starting byte that is not followed by a continuation
byte... Find out what the hex for what you're trying to stick in
message_text and I bet it won't be what it should be.

--
69. All midwives will be banned from the realm. All babies will be
delivered at state-approved hospitals. Orphans will be placed in
foster-homes, not abandoned in the woods to be raised by creatures
of the wild. --Peter Anspach's Evil Overlord list
Re: Cannot write utf8 data into a utf8 column - SOLVED [message #170752 is a reply to message #170747] Fri, 19 November 2010 16:25 Go to previous messageGo to next message
Tony Marston is currently offline  Tony Marston
Messages: 57
Registered: November 2010
Karma: 0
Member
"Peter H. Coffin" <hellsop(at)ninehells(dot)com> wrote in message
news:slrniebqgp(dot)1g0(dot)hellsop(at)abyss(dot)ninehells(dot)com...
> ["Followup-To:" header set to comp.databases.mysql.]
> On Thu, 18 Nov 2010 16:54:47 -0000, Tony Marston wrote:
>> This is very strange. When I try this update in my PHP program:
>>
>> UPDATE email_msg
>> SET size=771,
>> message_text='? ?:\n\n ????2010?11?25-26? ? ? ? 12?09-10? ? ?
>> ?12?16-17?
>> ?? ????:\n\n?
>> ????2010???????2011?????????????????????????
>> ????!\n\n?????????????????????????????????????????--?
>> ? ?? ??!\n\n\n????????????:baoming_gz(at)126(dot)com??! ( ?????????!)??!
>> \n\n\n\n?-?-?-?:0-2-0--3-9-9-2-6-2-7-5?(0)1-3-7-1-9-0-2-5-4-7-6
>> ???
>> ????\n\n?-?-?-?:baoming_gz(at)126(dot)com\n\n\n????????,?????"??"?"delete"?:tuidin01(at)163(dot)com
>> ??!',
>> revised_date='2010-11-18 15:48:09',
>> revised_user='AJM'
>> WHERE email_id='159'
>>
>> it fails with the following error:
>>
>> "Incorrect string value: '\xA0\xA0 \xE6\x88\x91...' for column
>> 'message_text' at row 1"
>>
>> When I try the SAME update through SQL-Front or phpMyAdmin it works! Why
>> is
>> this?
>
> If it were the SAME, it would work the same. MySQL gives this error when
> there's invalid UTF-8 byte sequences, like a continuation byte without
> starting byte, or a starting byte that is not followed by a continuation
> byte... Find out what the hex for what you're trying to stick in
> message_text and I bet it won't be what it should be.

The error message was reporting a problem with the hex value \xA0 (decimal
160) which represents '&nbsp;' or the non-breaking space. I discovered that
instead of replacing '&nbsp;' with chr(160) that I needed to replace it with
chr(194).chr(160). I don't now why the chr(194) is necessary, but it solves
my problem.

--
Tony Marston
http://www.tonymarston.net
http://www.radicore.org
Re: Cannot write utf8 data into a utf8 column - SOLVED [message #170753 is a reply to message #170752] Fri, 19 November 2010 17:03 Go to previous messageGo to next message
Michael Fesser is currently offline  Michael Fesser
Messages: 215
Registered: September 2010
Karma: 0
Senior Member
.oO(Tony Marston)

> "Peter H. Coffin" <hellsop(at)ninehells(dot)com> wrote in message
> news:slrniebqgp(dot)1g0(dot)hellsop(at)abyss(dot)ninehells(dot)com...
>>
>> If it were the SAME, it would work the same. MySQL gives this error when
>> there's invalid UTF-8 byte sequences, like a continuation byte without
>> starting byte, or a starting byte that is not followed by a continuation
>> byte... Find out what the hex for what you're trying to stick in
>> message_text and I bet it won't be what it should be.
>
> The error message was reporting a problem with the hex value \xA0 (decimal
> 160) which represents '&nbsp;' or the non-breaking space.

Yes, but 0xA0 is no correct UTF-8 sequence. Only the 128 ASCII chars
(0x00-0x7F) can be written as single-byte sequences. All other chars
require 2 or more bytes.

> I discovered that
> instead of replacing '&nbsp;' with chr(160) that I needed to replace it with
> chr(194).chr(160). I don't now why the chr(194) is necessary, but it solves
> my problem.

0xC2 0xA0 is the correct encoding of the non-breaking space as UTF-8:

http://www.fileformat.info/info/unicode/char/a0/index.htm

If you do such replacements in your scripts, it might make sense to
ensure that all your scripts are stored in UTF-8 as well. Then you don't
have to manually create the correct byte sequences.

Micha
Re: Cannot write utf8 data into a utf8 column - SOLVED [message #170756 is a reply to message #170752] Fri, 19 November 2010 22:48 Go to previous message
Thomas 'PointedEars'  is currently offline  Thomas 'PointedEars'
Messages: 701
Registered: October 2010
Karma: 0
Senior Member
Tony Marston wrote:

> "Peter H. Coffin" <hellsop(at)ninehells(dot)com> wrote in message
> news:slrniebqgp(dot)1g0(dot)hellsop(at)abyss(dot)ninehells(dot)com...

It is called attribution _line_, not attribution novel.

>> ["Followup-To:" header set to comp.databases.mysql.]
>> On Thu, 18 Nov 2010 16:54:47 -0000, Tony Marston wrote:
>>> [garbled posting due to missing character encoding declaration]
>>>
>>> it fails with the following error:
>>>
>>> "Incorrect string value: '\xA0\xA0 \xE6\x88\x91...' for column
>>> 'message_text' at row 1"
>>>
>>> When I try the SAME update through SQL-Front or phpMyAdmin it works! Why
>>> is this?

Those frontends are apparently set up to use UTF-8 for the input character
encoding, so whenever you type/paste a character there, it is regarded the
corresponding Unicode character upon form submit.

PHP, on the other hand, uses the character encoding of the source file: If
you create the source file with e.g. notepad.exe on Windows, most certainly
it will have a Windows-125x character encoding by default.

>> If it were the SAME, it would work the same. MySQL gives this error when
>> there's invalid UTF-8 byte sequences, like a continuation byte without
>> starting byte, or a starting byte that is not followed by a continuation
>> byte... Find out what the hex for what you're trying to stick in
>> message_text and I bet it won't be what it should be.
>
> The error message was reporting a problem with the hex value \xA0 (decimal
> 160) which represents '&nbsp;' or the non-breaking space.

Only in ISO-8859-1/Windows-1252. In UTF-8, A0 is one of the aforementioned
continuation bytes. See below.

> I discovered that instead of replacing '&nbsp;' with chr(160) that I
> needed to replace it with chr(194).chr(160). I don't now why the chr(194)
> is necessary, but it solves my problem.

(Ignorance must be bliss.) If you want to grow beyond the script-kiddie
trial-and-error approach (which seldom is successful in programming), you
should want to learn *how* and *why* things work.

chr(194) . chr(160)

"works" *because* chr(194) returns an _ISO-8859-1/Windows-1252_ character at
code point 0xC2¹ ─ Â ─, and chr(160) returns an ISO-8859-1/Windows-125x
character at code point 0xA0 ─ <NBSP> ─, which is, by concatenation,
combined to `Â<NBSP>' which is an ISO-8859-1/Windows-1252 representation of
the byte sequence

C2 A0

In UTF-8, this is a sequence consisting of two UTF-8 code units (hence
UTF-*8*: 8 bits, or 1 byte, per code unit), encoding the character at
Unicode code point U+00A0 (NO-BREAK SPACE). (C2 is one of the start bytes
of a 2-byte sequence.)

So you should have RTFM and called the equivalent of

mysql_query(utf8_encode($query));

instead. See also:

<http://unicode.org/faq/>
<http://en.wikipedia.org/wiki/UTF-8>
<http://people.w3.org/rishida/tools/conversion/>
<http://php.net/chr>
<http://php.net/utf8_encode>


HTH

PointedEars
___________
¹ Neither character is part of ASCII, despite the PHP manual stating that
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
-- Richard Cornford, cljs, <cife6q$253$1$8300dec7(at)news(dot)demon(dot)co(dot)uk> (2004)
  Switch to threaded view of this topic Create a new topic Submit Reply
Previous Topic: store backslash in mysql database
Next Topic: Firewall - NetDefender :
Goto Forum:
  

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ]

Current Time: Thu Nov 28 11:18:07 GMT 2024

Total time taken to generate the page: 0.02768 seconds