Re: Unicode support [message #180983 is a reply to message #180982] Sat, 30 March 2013 20:05
Christoph Becker
Thomas 'PointedEars' Lahn wrote:

> Christoph Becker wrote:
>
>> I am more concerned about the number of characters the string holds.
>> Say, I want to get the last character:
>>
>> $str = '€';
>> echo $str[2];
>>
>>> Because lack of precision in font reproduction, or even in guaranteeing
>>> which font may be selected, renders the former an 'open' question.
>>>
>>> strlen('€')===3 is in fact the correct answer.
>
> That depends on the character encoding of the source code. 3 is the correct
> answer for an *UTF-8*-encoded U+20AC EURO SIGN character, which is then
> encoded E2 82 AC. [1]

It is the correct answer for a UTF-8 encoded string, as strlen()
returns the length of the string in bytes. But I doubt that this is
what is usually expected, and IMO knowing the number of characters is
more useful for common *PHP* programming, as opposed to low-level C
programming.
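
To make the byte/character difference concrete, here is a minimal sketch.
It assumes the mbstring extension is loaded and that the string is UTF-8
encoded; the variable names are only for illustration.

```php
<?php
// Assumption: mbstring is available and the data is UTF-8.
$str = "\xE2\x82\xAC"; // U+20AC EURO SIGN, written as its UTF-8 bytes E2 82 AC

$bytes = strlen($str);             // counts bytes
$chars = mb_strlen($str, 'UTF-8'); // counts characters (code points)

$lastByte = bin2hex($str[2]);                // only the third byte of the sequence
$lastChar = mb_substr($str, -1, 1, 'UTF-8'); // the last whole character

// Prints: 3 1 ac e282ac
echo $bytes, ' ', $chars, ' ', $lastByte, ' ', bin2hex($lastChar), "\n";
```

So $str[2] yields a lone continuation byte, not the last character, whereas
mb_substr() with an explicit encoding returns the complete character.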

>> I suppose most *higher level languages* define the length of a string as
>> the number of characters the string holds. Cf. ECMAScript's length
>> property and TCL's [string length]. Even PHP's mb_strlen() returns the
>> number of characters.
>
> AISB, unfortunately the “length” property of ECMAScript String instances is
> not a good example in that regard as it does _not_ mean the number of
> characters but the number of 16-bit units (UTF-16 code units, usually).
> That is only less obvious than with UTF-8 because all Unicode characters in
> the BMP can be encoded using only one such unit. (I am working on a
> workaround; you could call it “an mb_strlen() for ECMAScript
> implementations”.)

ECMA-262, Edition 5.1, Section 15.5.5.1, states:

| *length*
|
| The number of characters in the String value represented by this
| String object.

IMO the spec is quite clear here, and so the implementations might be in
error. Anyway, I appreciate your working on a workaround; it may come in
handy. :)
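
For reference, the three counts discussed in this thread (bytes, 16-bit
UTF-16 units, characters) can be compared from PHP. This is only a sketch:
it assumes mbstring and UTF-8 input, and utf16Units() is a hypothetical
helper name, not an existing function.

```php
<?php
// Assumption: mbstring is available; utf16Units() is a made-up helper name.
function utf16Units(string $utf8): int
{
    // UTF-16LE without a BOM uses exactly two bytes per 16-bit code unit.
    return intdiv(strlen(mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8')), 2);
}

$euro = "\xE2\x82\xAC";     // U+20AC EURO SIGN, inside the BMP
$clef = "\xF0\x9D\x84\x9E"; // U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP

foreach ([$euro, $clef] as $s) {
    // Prints "3 bytes, 1 UTF-16 units, 1 code points"
    // then   "4 bytes, 2 UTF-16 units, 1 code points"
    printf("%d bytes, %d UTF-16 units, %d code points\n",
        strlen($s), utf16Units($s), mb_strlen($s, 'UTF-8'));
}
```

For characters in the BMP the unit count and the character count agree;
they differ only once a surrogate pair is needed, which is the case the
quoted text describes for the ECMAScript length property.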

--
Christoph M. Becker