PREG \d vs. [0-9] (was: how to change old ereg?) [message #181943 is a reply to message #181941]
Wed, 26 June 2013 14:42
Christoph Michael Bec
Messages: 207 Registered: June 2013
Thomas 'PointedEars' Lahn wrote:
> Tony Mountifield wrote:
>
>> That's because you have an unescaped / within your regex, so it sees
>> /^M?(([0-9]?)[ ]?([0-9])(/ followed by a ? as a regex modifier.
>
> Good catch. Also, in POSIX Extended Regular Expressions (ERE) this can
> be written more simply as
>
> ^M?(([0-9]?) ?([0-9])(…
>
> and in Perl-Compatible Regular Expressions (PCRE) it can be written more simply as
>
> ^M?((\d?) ?(\d)(…
Isn't the exact interpretation of \d locale dependent? I was not able
to find this information on php.net, and I cannot verify it myself, as
I do not have any locale available whose decimal digits go beyond
0-9. However, at least when one works with UTF-8 encoded strings and
uses the u modifier for the regular expression, \d is not the same as
[0-9]:
>>> $zero = "\xe0\xa5\xa6"; // DEVANAGARI DIGIT ZERO (U+0966)
>>> preg_match('/[0-9]/u', $zero);
0
>>> preg_match('/\d/u', $zero);
1
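
For anyone who wants to try this outside an interactive shell, here is a
minimal sketch of the same check as a standalone script. It assumes a PCRE
build with Unicode support. The helper is_ascii_digits() is a hypothetical
name of my own, not a PHP function; it only illustrates that spelling out
[0-9] is the safe choice when only ASCII digits should be accepted.

```php
<?php
// U+0966 DEVANAGARI DIGIT ZERO, UTF-8 encoded (same bytes as above).
$zero = "\xe0\xa5\xa6";

// Hypothetical helper (not part of PHP): accepts only the ASCII digits 0-9.
// \A and \z anchor the whole string; unlike $, \z does not tolerate a
// trailing newline.
function is_ascii_digits($s)
{
    return preg_match('/\A[0-9]+\z/', $s) === 1;
}

// With the u modifier, \d may match digits outside 0-9.
var_dump(preg_match('/\d/u', $zero));

// The explicit character class is not affected by the u modifier.
var_dump(preg_match('/[0-9]/u', $zero));

var_dump(is_ascii_digits('2013'));
var_dump(is_ascii_digits($zero));
```

If input has to be restricted to 0-9 (for example before an integer
conversion), the explicit class seems the more predictable spelling,
whatever the modifier or locale.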
--
Christoph M. Becker