json and non UTF8 [message #172185] |
Thu, 03 February 2011 14:43 |
Jeff Thies
I've got some text that contains some odd characters that json_encode
is returning nulls on. I'd like to either fix the characters, get rid of
them or anything that doesn't give me a null.
Jeff
|
|
|
Re: json and non UTF8 [message #172187 is a reply to message #172185] |
Thu, 03 February 2011 15:07 |
Luuk
On 03-02-11 15:43, Jeff Thies wrote:
> I've got some text that contains some odd characters that json_encode
> is returning nulls on. I'd like to either fix the characters, get rid of
> them or anything that doesn't give me a null.
>
> Jeff
Can you give more detail on these 'odd characters'?
Maybe show them here with bin2hex()?
E.g. print bin2hex('abcdef');
616263646566
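To illustrate what such a hex dump would reveal, here is a minimal sketch comparing the same bullet in two encodings. The sample strings are hypothetical, and Windows-1252 is only an assumed source encoding; mb_check_encoding() needs the mbstring extension.

```php
<?php
// Hypothetical samples: the same bullet character in two encodings.
$utf8Bullet   = "\xE2\x80\xA2Client"; // U+2022 encoded as UTF-8 (three bytes)
$cp1252Bullet = "\x95Client";         // bullet as one Windows-1252 byte (assumption)

echo bin2hex($utf8Bullet), "\n";   // e280a2436c69656e74
echo bin2hex($cp1252Bullet), "\n"; // 95436c69656e74

// Only the first string is valid UTF-8, so only it is safe for json_encode().
var_dump(mb_check_encoding($utf8Bullet, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($cp1252Bullet, 'UTF-8')); // bool(false)
```

A lone byte such as 95 in the dump is a strong hint that the text is not UTF-8.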
And in the subject you state 'non UTF8'.
In the docs for json_encode() I read:
"This function only works with UTF-8 encoded data"
--
Luuk
|
|
|
Re: json and non UTF8 [message #172188 is a reply to message #172185] |
Thu, 03 February 2011 15:08 |
alvaro.NOSPAMTHANX
On 03/02/2011 15:43, Jeff Thies wrote:
> I've got some text that contains some odd characters that json_encode is
> returning nulls on. I'd like to either fix the characters, get rid of
> them or anything that doesn't give me a null.
Do you mean this?
$input = str_replace(chr(0), '', $input);
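Note that str_replace() processes the whole string in a single call, so no per-character loop is needed. A small sketch with a made-up input (the NUL bytes are placed there purely for illustration):

```php
<?php
// Hypothetical input containing two NUL bytes.
$input = "Pro\0ject Feasi\0bility";

// Every occurrence of chr(0) is removed in one pass.
$clean = str_replace(chr(0), '', $input);

echo $clean, "\n"; // Project Feasibility
```

This only strips NUL bytes; it does not repair text that is in the wrong encoding.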
--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--
|
|
|
Re: json and non UTF8 [message #172202 is a reply to message #172187] |
Fri, 04 February 2011 00:40 |
Jeff Thies
On 2/3/2011 10:07 AM, Luuk wrote:
> On 03-02-11 15:43, Jeff Thies wrote:
>> I've got some text that contains some odd characters that json_encode
>> is returning nulls on. I'd like to either fix the characters, get rid of
>> them or anything that doesn't give me a null.
>>
>> Jeff
>
> Can you give more detail on this 'odd characters'?
> Maybe show them here with bin2hex()?
> i.e. print bin2hex('abcdef');
> 616263646566
I was trying to json_encode this and it choked on the bullet.
Project Feasibility & Development
•Client and end user needs assessment
>
> And in the subject you state 'non UTF8'
> In the docs of 'json_encode' i read:
> "This function only works with UTF-8 encoded data"
Yeah, that is what I noticed too. It took me a couple hours to figure
it out!
What I'm doing is this:
http://paragon360.com/services/auditorium_design/index.html
The links in the sidebar have popups that read the first hundred
characters in the page text. Since I'm not the one entering the content
there could be almost anything in there! Clients!
I'm working my way through Álvaro's idea, although I don't quite
understand how that works. That would seem to me to have to be done one
character at a time.
Thanks,
Jeff
>
|
|
|
Re: json and non UTF8 [message #172208 is a reply to message #172202] |
Fri, 04 February 2011 10:53 |
Luuk
On 04-02-11 01:40, Jeff Thies wrote:
> On 2/3/2011 10:07 AM, Luuk wrote:
>> On 03-02-11 15:43, Jeff Thies wrote:
>>> I've got some text that contains some odd characters that json_encode
>>> is returning nulls on. I'd like to either fix the characters, get rid of
>>> them or anything that doesn't give me a null.
>>>
>>> Jeff
>>
>> Can you give more detail on this 'odd characters'?
>> Maybe show them here with bin2hex()?
>> i.e. print bin2hex('abcdef');
>> 616263646566
>
> I was trying to json_encode this and it choked on the bullet.
>
> Project Feasibility & Development
> •Client and end user needs assessment
>>
>> And in the subject you state 'non UTF8'
>> In the docs of 'json_encode' i read:
>> "This function only works with UTF-8 encoded data"
>
> Yeah, that is what I noticed too. It took me a couple hours to figure
> it out!
You could try to convert this text from UTF-8 to another character set:
print (json_encode(iconv('UTF-8', "ISO-8859-15//IGNORE", '•Client')));
This prints "Client".
--
Luuk
|
|
|
Re: json and non UTF8 [message #172209 is a reply to message #172208] |
Fri, 04 February 2011 11:31 |
alvaro.NOSPAMTHANX
On 04/02/2011 11:53, Luuk wrote:
> On 04-02-11 01:40, Jeff Thies wrote:
>> On 2/3/2011 10:07 AM, Luuk wrote:
>>> On 03-02-11 15:43, Jeff Thies wrote:
>>>> I've got some text that contains some odd characters that json_encode
>>>> is returning nulls on. I'd like to either fix the characters, get rid of
>>>> them or anything that doesn't give me a null.
>>>>
>>>> Jeff
>>>
>>> Can you give more detail on this 'odd characters'?
>>> Maybe show them here with bin2hex()?
>>> i.e. print bin2hex('abcdef');
>>> 616263646566
>>
>> I was trying to json_encode this and it choked on the bullet.
>>
>> Project Feasibility & Development
>> •Client and end user needs assessment
>>>
>>> And in the subject you state 'non UTF8'
>>> In the docs of 'json_encode' i read:
>>> "This function only works with UTF-8 encoded data"
>>
>> Yeah, that is what I noticed too. It took me a couple hours to figure
>> it out!
>
> You could try to convert this text from UTF-8 to another character-set
>
> print (json_encode(iconv('UTF-8', "ISO-8859-15//IGNORE", '•Client')));
>
> this prints "Client";
I admit I'm lost in the thread and I'm not really sure about what the
problem is but JSON _has_ to be Unicode. It's by design:
«A string is a sequence of zero or more Unicode characters, wrapped in
double quotes, using backslash escapes.»
http://www.json.org/
And, as already mentioned, json_encode() needs UTF-8 input data:
http://php.net/json_encode
If you need ISO-8859-15 at either end, you need to make a conversion
from ISO-8859-15 to UTF-8 *before* calling json_encode() and/or a
conversion from UTF-8 to ISO-8859-15 *after* json_decode().
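The order of operations described above can be sketched as follows. The sample text is hypothetical and ISO-8859-15 is assumed to be the real source encoding; "\xA4" is the euro sign in that encoding.

```php
<?php
// Assumption: the stored text is ISO-8859-15, where "\xA4" is the euro sign.
$latin9 = "Price: 10 \xA4";

// Convert to UTF-8 *before* calling json_encode().
$utf8 = iconv('ISO-8859-15', 'UTF-8', $latin9);
$json = json_encode($utf8);
echo $json, "\n"; // "Price: 10 \u20ac"

// Convert back *after* json_decode(), and only if the consumer needs ISO-8859-15.
$decoded   = json_decode($json);
$roundTrip = iconv('UTF-8', 'ISO-8859-15', $decoded);
var_dump($roundTrip === $latin9); // bool(true)
```

If the source encoding is guessed wrongly, iconv() still returns a string, just with the wrong characters, which is why the encoding has to be known first.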
In any case, every time I hear the expression "odd characters" I suspect
that the character set of data is simply unknown. The first step is to
learn which one it is.
--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--
|
|
|
Re: json and non UTF8 [message #172210 is a reply to message #172209] |
Fri, 04 February 2011 11:52 |
Luuk
On 04-02-11 12:31, "Álvaro G. Vicario" wrote:
> On 04/02/2011 11:53, Luuk wrote:
>> On 04-02-11 01:40, Jeff Thies wrote:
>>> On 2/3/2011 10:07 AM, Luuk wrote:
>>>> On 03-02-11 15:43, Jeff Thies wrote:
>>>> > I've got some text that contains some odd characters that
>>>> > json_encode
>>>> > is returning nulls on. I'd like to either fix the characters, get
>>>> > rid of
>>>> > them or anything that doesn't give me a null.
>>>> >
>>>> > Jeff
>>>>
>>>> Can you give more detail on this 'odd characters'?
>>>> Maybe show them here with bin2hex()?
>>>> i.e. print bin2hex('abcdef');
>>>> 616263646566
>>>
>>> I was trying to json_encode this and it choked on the bullet.
>>>
>>> Project Feasibility & Development
>>> •Client and end user needs assessment
>>>>
>>>> And in the subject you state 'non UTF8'
>>>> In the docs of 'json_encode' i read:
>>>> "This function only works with UTF-8 encoded data"
>>>
>>> Yeah, that is what I noticed too. It took me a couple hours to figure
>>> it out!
>>
>> You could try to convert this text from UTF-8 to another character-set
>>
>> print (json_encode(iconv('UTF-8', "ISO-8859-15//IGNORE", '•Client')));
>>
>> this prints "Client";
>
> I admit I'm lost in the thread and I'm not really sure about what the
> problem is but JSON _has_ to be Unicode. It's by design:
>
> «A string is a sequence of zero or more Unicode characters, wrapped in
> double quotes, using backslash escapes.»
>
> http://www.json.org/
>
> And, as already mentioned, json_encode() needs UTF-8 input data:
>
> http://php.net/json_encode
>
> If you need ISO-8859-15 at either end, you need to make a conversion
> from ISO-8859-15 to UTF-8 *before* calling json_encode() and/or a
> conversion from UTF-8 to ISO-8859-15 *after* json_decode().
>
> In any case, every time I hear the expression "odd characters" I suspect
> that the character set of data is simply unknown. The first step is to
> learn which one it is.
>
>
Oops, my mistake! I mixed up the character sets.
You are right: the text passed to json_encode() must be encoded in
UTF-8.
--
Luuk
|
|
|
Re: json and non UTF8 [message #172211 is a reply to message #172202] |
Fri, 04 February 2011 14:25 |
Thomas 'PointedEars'
Jeff Thies wrote:
> I'm working my way through Álvaro's idea ["to make a conversion
> from ISO-8859-15 to UTF-8 before calling json_encode() and/or a
> conversion from UTF-8 to ISO-8859-15 after json_decode()"],
> although I don't quite understand how that works. That would seem
> to me to have to be done one character at a time.
You are mistaken. There are iconv() and mb_detect_encoding() (whereas
character _encoding_ is the proper term/approach to use here, _not_
character set). RTF(ine)M.
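A minimal sketch of converting a whole string at once before encoding it. The helper name toUtf8() is hypothetical, and the fallback to Windows-1252 is only an assumption about where the bullet came from; the mbstring extension and PHP 7 or later are required.

```php
<?php
/**
 * Hypothetical helper: return $text as UTF-8.
 * Assumption: anything that is not valid UTF-8 is in $fallback (Windows-1252 here).
 */
function toUtf8(string $text, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    // Converts the entire string in one call, not character by character.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

$snippet = "\x95Client and end user needs assessment";
echo json_encode(toUtf8($snippet)), "\n";
// "\u2022Client and end user needs assessment"
```

On PHP 7.2 and later, json_encode() also accepts the JSON_INVALID_UTF8_IGNORE and JSON_INVALID_UTF8_SUBSTITUTE flags, which drop or replace invalid bytes instead of failing.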
PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16
|
|
|