json and non UTF8 [message #172185] Thu, 03 February 2011 14:43
Jeff Thies
I've got some text that contains some odd characters that json_encode
is returning nulls on. I'd like to either fix the characters, get rid of
them or anything that doesn't give me a null.

Jeff
Re: json and non UTF8 [message #172187 is a reply to message #172185] Thu, 03 February 2011 15:07
Luuk
On 03-02-11 15:43, Jeff Thies wrote:
> I've got some text that contains some odd characters that json_encode
> is returning nulls on. I'd like to either fix the characters, get rid of
> them or anything that doesn't give me a null.
>
> Jeff

Can you give more detail on these 'odd characters'?
Maybe show them here with bin2hex()?
e.g. print bin2hex('abcdef');
616263646566
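
As a rough illustration of that check, the sketch below dumps the bytes of a bulleted word under two encodings and passes both to json_encode(). The Windows-1252 input is an assumption (the thread never establishes the real encoding); in that code page the bullet is the single byte 0x95, which is not valid UTF-8.

```php
<?php
// Assumption: the problem text is Windows-1252, where the bullet is byte 0x95.
$cp1252 = "\x95Client";
// The same text in UTF-8: the bullet (U+2022) takes three bytes.
$utf8   = "\xE2\x80\xA2Client";

echo bin2hex($cp1252), "\n"; // 95436c69656e74
echo bin2hex($utf8), "\n";   // e280a2436c69656e74

// Invalid UTF-8 makes json_encode() fail; PHP 5.5+ reports it as false.
var_dump(json_encode($cp1252));
echo json_last_error_msg(), "\n";

// Valid UTF-8 encodes normally.
echo json_encode($utf8), "\n"; // "\u2022Client"
```

PHP releases from the time of this thread appear to have reported the same failure as null rather than false, which would match the symptom described above.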

And in the subject you state 'non UTF8'.
In the docs of 'json_encode' I read:
"This function only works with UTF-8 encoded data"

--
Luuk
Re: json and non UTF8 [message #172188 is a reply to message #172185] Thu, 03 February 2011 15:08
alvaro.NOSPAMTHANX
On 03/02/2011 15:43, Jeff Thies wrote:
> I've got some text that contains some odd characters that json_encode is
> returning nulls on. I'd like to either fix the characters, get rid of
> them or anything that doesn't give me a null.

Do you mean this?

$input = str_replace(chr(0), '', $input);


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--
Re: json and non UTF8 [message #172202 is a reply to message #172187] Fri, 04 February 2011 00:40
Jeff Thies
On 2/3/2011 10:07 AM, Luuk wrote:
> On 03-02-11 15:43, Jeff Thies wrote:
>> I've got some text that contains some odd characters that json_encode
>> is returning nulls on. I'd like to either fix the characters, get rid of
>> them or anything that doesn't give me a null.
>>
>> Jeff
>
> Can you give more detail on this 'odd characters'?
> Maybe show them here with bin2hex()?
> i.e. print bin2hex('abcdef');
> 616263646566

I was trying to json_encode this and it choked on the bullet.

Project Feasibility & Development
•Client and end user needs assessment
>
> And in the subject you state 'non UTF8'
> In the docs of 'json_encode' i read:
> "This function only works with UTF-8 encoded data"

Yeah, that is what I noticed too. It took me a couple hours to figure
it out!

What I'm doing is this:

http://paragon360.com/services/auditorium_design/index.html

The links in the sidebar have popups that read the first hundred
characters in the page text. Since I'm not the one entering the content
there could be almost anything in there! Clients!

I'm working my way through Alvaro's idea, although I don't quite
understand how that works. That would seem to me to have to be done one
character at a time.
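
For what it is worth, and assuming "Alvaro's idea" means the str_replace() call suggested earlier, no character-by-character loop is needed: one call scans the whole string. A minimal sketch with a made-up input follows. Note that it only strips NUL bytes and would leave a non-UTF-8 bullet untouched.

```php
<?php
// Hypothetical input containing two NUL bytes.
$input = "Project\0 Feasibility\0";

// One call removes every occurrence of chr(0) in the string.
$clean = str_replace(chr(0), '', $input);

echo strlen($input), ' -> ', strlen($clean), "\n"; // 21 -> 19
echo $clean, "\n";                                 // Project Feasibility
```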

Thanks,
Jeff
Re: json and non UTF8 [message #172208 is a reply to message #172202] Fri, 04 February 2011 10:53
Luuk
On 04-02-11 01:40, Jeff Thies wrote:
> I was trying to json_encode this and it choked on the bullet.
>
> Project Feasibility & Development
> •Client and end user needs assessment

You could try to convert this text from UTF-8 to another character-set

print (json_encode(iconv('UTF-8', "ISO-8859-15//IGNORE", '•Client')));

this prints "Client";

--
Luuk
Re: json and non UTF8 [message #172209 is a reply to message #172208] Fri, 04 February 2011 11:31
alvaro.NOSPAMTHANX
On 04/02/2011 11:53, Luuk wrote:
> You could try to convert this text from UTF-8 to another character-set
>
> print (json_encode(iconv('UTF-8', "ISO-8859-15//IGNORE", '•Client')));
>
> this prints "Client";

I admit I'm lost in the thread and I'm not really sure about what the
problem is but JSON _has_ to be Unicode. It's by design:

«A string is a sequence of zero or more Unicode characters, wrapped in
double quotes, using backslash escapes.»

http://www.json.org/

And, as already mentioned, json_encode() needs UTF-8 input data:

http://php.net/json_encode

If you need ISO-8859-15 at either end, you need to make a conversion
from ISO-8859-15 to UTF-8 *before* calling json_encode() and/or a
conversion from UTF-8 to ISO-8859-15 *after* json_decode().

In any case, every time I hear the expression "odd characters" I suspect
that the character set of data is simply unknown. The first step is to
learn which one it is.
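
A minimal sketch of that order of operations, assuming (not established in the thread) that the source text is Windows-1252, which iconv knows as CP1252. Substitute the real encoding once it is known.

```php
<?php
// Assumption: legacy text in Windows-1252; 0x95 is the bullet in that code page.
$legacy = "\x95Client and end user needs assessment";

// Convert to UTF-8 *before* json_encode().
$utf8 = iconv('CP1252', 'UTF-8', $legacy);
$json = json_encode(['excerpt' => $utf8]);
echo $json, "\n"; // {"excerpt":"\u2022Client and end user needs assessment"}

// Convert back *after* json_decode(), only if the consumer needs the legacy encoding.
$decoded = json_decode($json, true);
$back    = iconv('UTF-8', 'CP1252', $decoded['excerpt']);
var_dump($back === $legacy); // bool(true)
```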


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--
Re: json and non UTF8 [message #172210 is a reply to message #172209] Fri, 04 February 2011 11:52
Luuk
On 04-02-11 12:31, "Álvaro G. Vicario" wrote:
> And, as already mentioned, json_encode() needs UTF-8 input data:
>
> http://php.net/json_encode
>
> If you need ISO-8859-15 at either end, you need to make a conversion
> from ISO-8859-15 to UTF-8 *before* calling json_encode() and/or a
> conversion from UTF-8 to ISO-8859-15 *after* json_decode().

Oops, my mistake! I mixed up the character sets.

You are right, the text that is used in the json_encode() function must
be encoded in the UTF-8 character-set.
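
If the only goal is "anything that doesn't give me a null" and the source encoding cannot be determined, newer PHP offers a fallback that postdates this thread. The sketch below assumes PHP 7.2 or later, where json_encode() accepts the JSON_INVALID_UTF8_IGNORE and JSON_INVALID_UTF8_SUBSTITUTE flags. The input byte is a made-up example, and the bad character is lost rather than repaired.

```php
<?php
// Assumption: PHP 7.2+; "\x95" stands in for any byte that is not valid UTF-8.
$bad = "\x95Client";

// Drop invalid bytes instead of failing.
$ignored = json_encode($bad, JSON_INVALID_UTF8_IGNORE);
echo $ignored, "\n"; // "Client"

// Or replace each invalid sequence with U+FFFD (the replacement character).
$substituted = json_encode($bad, JSON_INVALID_UTF8_SUBSTITUTE);
echo $substituted, "\n";
```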

--
Luuk
Re: json and non UTF8 [message #172211 is a reply to message #172202] Fri, 04 February 2011 14:25
Thomas 'PointedEars'
Jeff Thies wrote:

> I'm working my way through Alvaros idea ["to make a conversion
> from ISO-8859-15 to UTF-8 before calling json_encode() and/or a
> conversion from UTF-8 to ISO-8859-15 after json_decode()"],
> although I don't quite understand how that works. That would seem
> to me to have to be done one character at a time.

You are mistaken. There are iconv() and mb_detect_encoding() (whereas
character _encoding_ is the proper term/approach to use here, _not_
character set). RTF(ine)M.
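
One way those two functions might be combined is sketched below. The helper name ensure_utf8() and the CP1252 fallback are assumptions for illustration, not something from the thread. Detection can confirm that a string is valid UTF-8, but it cannot reliably tell one single-byte encoding from another, so the fallback has to come from knowledge of the data source.

```php
<?php
// Hypothetical helper, not from the thread: return $text as UTF-8.
// $fallback is an assumed source encoding for anything that is not valid UTF-8.
function ensure_utf8($text, $fallback = 'CP1252')
{
    // Strict mode makes mb_detect_encoding() reject invalid byte sequences.
    if (mb_detect_encoding($text, 'UTF-8', true) === 'UTF-8') {
        return $text;
    }
    // Whole-string conversion; no per-character loop is needed.
    return iconv($fallback, 'UTF-8', $text);
}

echo json_encode(ensure_utf8("\x95Client")), "\n";         // "\u2022Client"
echo json_encode(ensure_utf8("\xE2\x80\xA2Client")), "\n"; // "\u2022Client"
```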


PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16