Re: json and non UTF8 [message #172210 is a reply to message #172209]
Fri, 04 February 2011 11:52
Luuk
On 04-02-11 12:31, "Álvaro G. Vicario" wrote:
> On 04/02/2011 11:53, Luuk wrote:
>> On 04-02-11 01:40, Jeff Thies wrote:
>>> On 2/3/2011 10:07 AM, Luuk wrote:
>>>> On 03-02-11 15:43, Jeff Thies wrote:
>>>> > I've got some text that contains some odd characters that json_encode
>>>> > is returning nulls on. I'd like to either fix the characters, get rid of
>>>> > them or anything that doesn't give me a null.
>>>> >
>>>> > Jeff
>>>>
>>>> Can you give more detail on these 'odd characters'?
>>>> Maybe show them here with bin2hex()?
>>>> i.e. print bin2hex('abcdef');
>>>> 616263646566
>>>
>>> I was trying to json_encode this and it choked on the bullet.
>>>
>>> Project Feasibility & Development
>>> •Client and end user needs assessment
>>>>
>>>> And in the subject you state 'non UTF8'
>>>> In the docs of 'json_encode' I read:
>>>> "This function only works with UTF-8 encoded data"
>>>
>>> Yeah, that is what I noticed too. It took me a couple hours to figure
>>> it out!
>>
>> You could try to convert this text from UTF-8 to another character-set
>>
>> print (json_encode(iconv('UTF-8', "ISO-8859-15//IGNORE", '•Client')));
>>
>> this prints "Client";
>
> I admit I'm lost in the thread and I'm not really sure about what the
> problem is but JSON _has_ to be Unicode. It's by design:
>
> «A string is a sequence of zero or more Unicode characters, wrapped in
> double quotes, using backslash escapes.»
>
> http://www.json.org/
>
> And, as already mentioned, json_encode() needs UTF-8 input data:
>
> http://php.net/json_encode
>
> If you need ISO-8859-15 at either end, you need to make a conversion
> from ISO-8859-15 to UTF-8 *before* calling json_encode() and/or a
> conversion from UTF-8 to ISO-8859-15 *after* json_decode().
>
> In any case, every time I hear the expression "odd characters" I suspect
> that the character set of data is simply unknown. The first step is to
> learn which one it is.
>
>
Oops, my mistake! I mixed up the character sets.
You are right, the text passed to json_encode() must be encoded in
UTF-8.
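
So the conversion has to go the other way round. A rough sketch of what that could look like, assuming the text arrived as Windows-1252 (where the bullet is the single byte 0x95). That source encoding is only a guess on my part, and to_utf8() is a made-up helper name, not a PHP function:

```php
<?php
// Hypothetical helper (not part of PHP): return a UTF-8 copy of $text.
// ASSUMPTION: anything that is not already valid UTF-8 is Windows-1252.
function to_utf8(string $text, string $assumedSource = 'Windows-1252'): string
{
    // preg_match() with the /u modifier fails on invalid UTF-8 subjects,
    // so a successful match means the string can be left alone.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    $converted = iconv($assumedSource, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $assumedSource");
    }
    return $converted;
}

// The bullet as it would look in Windows-1252 (assumed sample input).
$raw = "\x95Client and end user needs assessment";

echo bin2hex(substr($raw, 0, 1)), "\n";   // 95
$utf8 = to_utf8($raw);
echo bin2hex(substr($utf8, 0, 3)), "\n";  // e280a2
echo json_encode($utf8), "\n";            // "\u2022Client and end user needs assessment"
```

If bin2hex() shows something other than 95 for the bullet, the source is in a different character set and the second argument needs to change accordingly.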
--
Luuk