FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Zip Codes ctype? Pregmatch? [message #182634] Tue, 20 August 2013 13:27
bill
Hi all,

I'm attempting to check for US and Canadian zip codes (postal codes).
The US is easy; mostly just be sure it's five numerics, except for
"00000" and "99999". But Canadian is a different story because:
It consists of alternating alpha and numeric characters (AnAnAn) but
not the entire alphabet. 8 N.A. English letters are not used, namely
D, F, I, O, Q, U, W and Z; put another way, they only use 18 letters in
their postal codes.
I haven't seen a single example in all my research that checks whether
the 1st, 3rd, and 5th characters are alpha and the 2nd, 4th and 6th
characters are numeric.

I've tried preg_match and strpos without success, likely due to my own
weakness with preg_match, and my regex turns into an incredibly long
statement that I'm sure isn't right to put upon the servers; it slows down
even my local server (XAMPP & PHP 5.3 on Win 7).

Might anyone have a better method?

Or know of any functions anywhere that could be modified to be used?
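
For the position check asked about above (letters in the 1st, 3rd and 5th places, digits in the 2nd, 4th and 6th), here is a minimal sketch that avoids preg_match altogether and uses ctype_digit() and strpos() instead. The function name is made up for illustration, the letter list is simply the 18 letters described in the post, and the type declarations assume PHP 7 or later:

```php
<?php
// Hypothetical helper, not from the thread: checks the AnAnAn shape without regex.
function looks_like_canadian_postal_code(string $code): bool
{
    // The 18 letters the post says are in use (D, F, I, O, Q, U, W, Z removed).
    $letters = 'ABCEGHJKLMNPRSTVXY';

    // Ignore case and any spaces, so "k1a 0b1" is treated like "K1A0B1".
    $code = strtoupper(str_replace(' ', '', $code));
    if (strlen($code) !== 6) {
        return false;
    }

    for ($i = 0; $i < 6; $i++) {
        $ch = $code[$i];
        // Index 0, 2, 4 must be an allowed letter; index 1, 3, 5 must be a digit.
        $ok = ($i % 2 === 0)
            ? strpos($letters, $ch) !== false
            : ctype_digit($ch);
        if (!$ok) {
            return false;
        }
    }
    return true;
}
```

This only checks the shape of the code; it says nothing about whether the code is actually in use.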
Re: Zip Codes ctype? Pregmatch? [message #182635 is a reply to message #182634] Tue, 20 August 2013 13:59
Martin Leese
Twayne wrote:
> Hi all,
>
> I'm attempting to check for US and Canadian zip codes (postal codes).
> The US is easy; mostly just be sure it's five numerics and except
> "00000" and "99999". But Canadian is a different story because:
> It consists of alternating alpha and numeric characters (AnAnAn) but
> not the entire alphabet. 8 N.A. English letters are not used, as in
> DFIOQUW AND Z or put another way, they only use 18 letters in their
> postal codes.

Note that postcodes must have a space in
the middle, ie, AnA nAn. In practice,
however, there is so much brain-dead
software in use that believes otherwise, it
would be prudent to make the space optional.

From the Canada Post Addressing Guide:
"Postal codes must be printed in
upper case with the first three
elements separated from the last
three by one space (no hyphens)."
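
A rough sketch of the "make the space optional" advice: accept the code with or without the space, then store it in the upper-case "AnA nAn" form the guide asks for. The function name is hypothetical, the letter class is the 18-letter set from the original question applied to every letter position (an assumption), and the type declarations assume PHP 7.1 or later:

```php
<?php
// Hypothetical helper: returns the code as "AnA nAn", or null if the shape is wrong.
function normalize_canadian_postal_code(string $input): ?string
{
    // Assumption: the same 18 letters are allowed in all three letter positions.
    $letter  = '[ABCEGHJ-NPRSTVXY]';
    $pattern = "/^({$letter}\\d{$letter}) ?(\\d{$letter}\\d)\$/";

    // Upper-case and strip surrounding whitespace before matching.
    $candidate = strtoupper(trim($input));
    if (preg_match($pattern, $candidate, $m) !== 1) {
        return null;
    }

    // Re-insert exactly one space between the two halves.
    return $m[1] . ' ' . $m[2];
}
```

Hyphenated input such as "K1A-0B1" is rejected here rather than repaired; whether to be more forgiving is a design choice.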

--
Regards,
Martin Leese
E-mail: please(at)see(dot)Web(dot)for(dot)e-mail(dot)INVALID
Web: http://members.tripod.com/martin_leese/
Re: Zip Codes ctype? Pregmatch? [message #182636 is a reply to message #182634] Tue, 20 August 2013 14:08
Robert Heller
At Tue, 20 Aug 2013 13:27:50 -0400 Twayne <nobody(at)spamcop(dot)net> wrote:

>
> Hi all,
>
> I'm attempting to check for US and Canadian zip codes (postal codes).
> The US is easy; mostly just be sure it's five numerics and except
> "00000" and "99999".

5+4:

nnnnn-mmmm

Most of the time, just the basic 5 digits is enough, but sometimes the USPS
wants the additional 4 digits as well.

Whether you include the extra 4 digits depends on what you are
using the zip code for. UPS and FedEx, for example, don't use the extra 4
digits, but the USPS does. The extra four digits are important mostly for big
city addresses, where there might be multiple branch POs and/or delivery
routes, etc. for a given post office.


--
Robert Heller -- 978-544-6933 / heller(at)deepsoft(dot)com
Deepwoods Software -- http://www.deepsoft.com/
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
Re: Zip Codes ctype? Pregmatch? [message #182638 is a reply to message #182635] Tue, 20 August 2013 15:01
bill
On 2013-08-20 1:59 PM, Martin Leese wrote:
....
>
> From the Canada Post Addressing Guide:
> "Postal codes must be printed in
> upper case with the first three
> elements separated from the last
> three by one space (no hyphens)."
>

OUCH! Thanks! After more research I found a trail about that; I forgot all
the details, but it seems the space is something special, more than just
esthetics.

Thanks much!
Re: Zip Codes ctype? Pregmatch? [message #182639 is a reply to message #182636] Tue, 20 August 2013 15:03
bill
On 2013-08-20 2:08 PM, Robert Heller wrote:
> At Tue, 20 Aug 2013 13:27:50 -0400 Twayne <nobody(at)spamcop(dot)net> wrote:
>
>>
>> Hi all,
>>
>> I'm attempting to check for US and Canadian zip codes (postal codes).
>> The US is easy; mostly just be sure it's five numerics and except
>> "00000" and "99999".
>
> 5+4:
>
> nnnnn-mmmm
>
> Most of the time, just the basic 5 digits is enough, but sometimes the USPS
> wants the additional 4 digits as well.

Agreed; but in this case it's only to ID a country.
>
> Whether or not you include the extra 4 digits or not depends on what you are
> using the zip code for. UPS and FexEX for example don't use the extra 4
> digits, but the USPS does. The extra four digits are important mostly for big
> city addresses, where there might be multiple branch POs and/or delivery
> routes, etc. for a given post office.
>
>
Thanks,

Twayne`
Re: Zip Codes ctype? Pregmatch? RESOLVED [message #182640 is a reply to message #182634] Tue, 20 August 2013 15:06
bill
As usual, after futzing with a problem for a couple of days, I ended up
finally finding multiple solutions.

Thanks all,

Twayne`


Re: Zip Codes ctype? Pregmatch? [message #182641 is a reply to message #182639] Tue, 20 August 2013 15:36
Robert Heller
At Tue, 20 Aug 2013 15:03:48 -0400 Twayne <nobody(at)spamcop(dot)net> wrote:

>
> On 2013-08-20 2:08 PM, Robert Heller wrote:
>> At Tue, 20 Aug 2013 13:27:50 -0400 Twayne <nobody(at)spamcop(dot)net> wrote:
>>
>>>
>>> Hi all,
>>>
>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>> The US is easy; mostly just be sure it's five numerics and except
>>> "00000" and "99999".
>>
>> 5+4:
>>
>> nnnnn-mmmm
>>
>> Most of the time, just the basic 5 digits is enough, but sometimes the USPS
>> wants the additional 4 digits as well.
>
> Agreed; but in this case it's only to ID a country.

You might need to 'accept' the extra 4 digits, since people are going to enter
them and will be miffed if your page rejects them. Just quietly drop the extra
digits.
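
A minimal sketch of that idea: accept either form, keep only the first five digits. The function name is invented for illustration and the type declarations assume PHP 7.1 or later:

```php
<?php
// Hypothetical helper: accepts "nnnnn", "nnnnn-mmmm" or "nnnnnmmmm"
// and returns just the five-digit part, or null if the input fits none of these.
function five_digit_zip(string $input): ?string
{
    // Group 1 captures the basic ZIP; the optional +4 part is matched but not kept.
    if (preg_match('/^(\d{5})(?:-?\d{4})?$/', trim($input), $m) !== 1) {
        return null;
    }
    return $m[1];
}
```

This does not yet exclude "00000" and "99999"; that would need an extra check.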

>>
>> Whether or not you include the extra 4 digits or not depends on what you are
>> using the zip code for. UPS and FexEX for example don't use the extra 4
>> digits, but the USPS does. The extra four digits are important mostly for big
>> city addresses, where there might be multiple branch POs and/or delivery
>> routes, etc. for a given post office.
>>
>>
> Thanks,
>
> Twayne`
>

--
Robert Heller -- 978-544-6933 / heller(at)deepsoft(dot)com
Deepwoods Software -- http://www.deepsoft.com/
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
Re: Zip Codes ctype? Pregmatch? [message #182643 is a reply to message #182634] Tue, 20 August 2013 19:52
Norman Peelman
On 08/20/2013 01:27 PM, Twayne wrote:
> Hi all,
>
> I'm attempting to check for US and Canadian zip codes (postal codes).
> The US is easy; mostly just be sure it's five numerics and except
> "00000" and "99999". But Canadian is a different story because:
> It consists of alternating alpha and numeric characters (AnAnAn) but
> not the entire alphabet. 8 N.A. English letters are not used, as in
> DFIOQUW AND Z or put another way, they only use 18 letters in their
> postal codes.
> I haven't see a single example in all my research to check if the
> 1st, 3rd, and 5th characters are alpha and th 2nd, 4th and 6th
> characters are numeric.
>
> I've tried preg_match and strpos without succees, likely due to my own
> weakness with preg_match, and regex creates an incredibly long statement
> I'm sure it's not right to put upon the servers; they slow down even my
> local server XAMPP & PHP 5.3 on win 7.
>
> Might anyone have a better method?
>
> Or know of any functions anywhere that could be modified to be used?
>

US Zip code:
[0-9]{5}(-{0,1}[0-9]{4}){0,1}

Canadian zip code (all one line, don't miss the space!):
([A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}[0-9]{1})[A-C,E,G-H,J-N,P,R-T,V,X,Y]{1} {1}([0-9]{1}[A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}[0-9]{1})


--
Norman
Registered Linux user #461062
-Have you been to www.php.net yet?-
Re: Zip Codes ctype? Pregmatch? RESOLVED [message #182644 is a reply to message #182640] Tue, 20 August 2013 19:53
Norman Peelman
On 08/20/2013 03:06 PM, Twayne wrote:
> As usual, after futzing with a problem for a couple days and end up
> finally finding multiple solutions.
>
> Thanks all,
>
> Twayne`
>
>
>

I responded even though you SOLVED... at least let us know what your
solution was.

--
Norman
Registered Linux user #461062
-Have you been to www.php.net yet?-
Re: Zip Codes ctype? Pregmatch? [message #182645 is a reply to message #182641] Tue, 20 August 2013 20:17
bill
On 2013-08-20 3:36 PM, Robert Heller wrote:
> At Tue, 20 Aug 2013 15:03:48 -0400 Twayne <nobody(at)spamcop(dot)net> wrote:
>
>>
>> On 2013-08-20 2:08 PM, Robert Heller wrote:
>>> At Tue, 20 Aug 2013 13:27:50 -0400 Twayne <nobody(at)spamcop(dot)net> wrote:
>>>
>>>>
>>>> Hi all,
>>>>
>>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>>> The US is easy; mostly just be sure it's five numerics and except
>>>> "00000" and "99999".
>>>
>>> 5+4:
>>>
>>> nnnnn-mmmm
>>>
>>> Most of the time, just the basic 5 digits is enough, but sometimes the USPS
>>> wants the additional 4 digits as well.
>>
>> Agreed; but in this case it's only to ID a country.
>
> You might need to 'accept' the extra 4 digits, since people are going to enter
> them and will be mifted if your page rejects it. Just quietly drop the extra
> digits.
>
>>>
>>> Whether or not you include the extra 4 digits or not depends on what you are
>>> using the zip code for. UPS and FexEX for example don't use the extra 4
>>> digits, but the USPS does. The extra four digits are important mostly for big
>>> city addresses, where there might be multiple branch POs and/or delivery
>>> routes, etc. for a given post office.
>>>
>>>
>> Thanks,
>>
>> Twayne`
>>
>
I hear you, but when I specifically ask for a 5-digit US Zip Code if the
person wants to insist on 9, that's not the kind of person I want to
hear from anyway.
Besides, a 9-digit code is a lot more personal information than is
needed and very few people want to give away more information about
themselves than is necessary. A 9 digit zip code can in many cases lead
you right to a person's street address. With a name and an address, it's
only a short step to their ss# being compromised and then the field is
set for full identity theft.

Regards,

Twayne`
Re: Zip Codes ctype? Pregmatch? RESOLVED [message #182646 is a reply to message #182644] Tue, 20 August 2013 20:29
bill
On 2013-08-20 7:53 PM, Norman Peelman wrote:
> On 08/20/2013 03:06 PM, Twayne wrote:

....

>>
>
> I responded even though you SOLVED... at least let us know what your
> solution was.
>

Oh, sorry; guess I was in a hurry. I simply came across a couple of
methods, neither of which was properly coded, and used a combo of the two
methods to assemble mine. If you're interested, here's a crib sheet I
collected:
--------------
From the Canada Post Addressing Guide:
"Postal codes must be printed in
upper case with the first three
elements separated from the last
three by one space (no hyphens)."

=================================================


POSTAL CODES FOR 12 COUNTRIES

<?php
$country_code = "US";
$zip_postal = "11111";

// Patterns are stored without delimiters; "/" and the "i" flag are added at match time.
$ZIPREG = array(
    "US" => "^\d{5}([\-]?\d{4})?$",
    "UK" => "^(GIR|[A-Z]\d[A-Z\d]??|[A-Z]{2}\d[A-Z\d]??)[ ]??(\d[A-Z]{2})$",
    "DE" => "\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})|(?:[6][013-9]\d{3}))\b",
    "CA" => "^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$",
    "FR" => "^(F-)?((2[A|B])|[0-9]{2})[0-9]{3}$",
    "IT" => "^(V-|I-)?[0-9]{5}$",
    // AU: the alternatives are wrapped in one group so that ^ and $ apply to all of them.
    "AU" => "^((0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2}))$",
    "NL" => "^[1-9][0-9]{3}\s?([a-zA-Z]{2})?$",
    "ES" => "^([1-9]{2}|[0-9][1-9]|[1-9][0-9])[0-9]{3}$",
    "DK" => "^([D-d][K-k])?( |-)?[1-9]{1}[0-9]{3}$",
    "SE" => "^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$",
    "BE" => "^[1-9]{1}[0-9]{3}$"
);

// isset() avoids an "undefined index" notice for countries that are not in the table.
if (isset($ZIPREG[$country_code])) {

    if (!preg_match("/" . $ZIPREG[$country_code] . "/i", $zip_postal)) {
        // Validation failed, provided zip/postal code is not valid.
    } else {
        // Validation passed, provided zip/postal code is valid.
    }

} else {

    // Validation not available

}

=======================================================================
OR ...



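function fnValidatePostal($mValue, $sRegion = '')
{
    // Work in lower case throughout (the stray extra parenthesis here was a syntax error).
    $mValue = strtolower($mValue);
    $sFirst = substr($mValue, 0, 1);
    $sRegion = strtolower($sRegion);

    // First letter(s) of the postal code for each province/territory.
    $aRegion = array(
        'nl' => 'a',
        'ns' => 'b',
        'pe' => 'c',
        'nb' => 'e',
        'qc' => array('g', 'h', 'j'),
        'on' => array('k', 'l', 'm', 'n', 'p'),
        'mb' => 'r',
        'sk' => 's',
        'ab' => 't',
        'bc' => 'v',
        'nt' => 'x',
        'nu' => 'x',
        'yt' => 'y'
    );

    // Valid first letter, none of the unused letters anywhere, and the
    // AnA[- ]nAn layout. [a-z] replaces \w, which would also accept digits
    // and underscores in the letter positions.
    if (preg_match('/[abceghjlkmnprstvxy]/', $sFirst) &&
        !preg_match('/[dfioqu]/', $mValue) &&
        preg_match('/^[a-z]\d[a-z][- ]?\d[a-z]\d$/', $mValue))
    {
        if (!empty($sRegion) && array_key_exists($sRegion, $aRegion))
        {
            // A region was given: the first letter must belong to it.
            if (is_array($aRegion[$sRegion]) && in_array($sFirst, $aRegion[$sRegion]))
            {
                return true;
            }
            else if (is_string($aRegion[$sRegion]) && $sFirst == $aRegion[$sRegion])
            {
                return true;
            }
        }
        else if (empty($sRegion))
        {
            // No region given: the shape check alone is enough.
            return true;
        }
    }

    return false;
}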
===================================================
AND

===========================================================

Sounds like a regexp pattern like:

Code:

/^(?:[A-CEGHJ-NPR-TVX][0-9]){3}$/

could be used to validate the form of a given Canadian postal code
based on the description you gave. (Whether or not the postal code is
truly valid/used is, of course, another matter altogether.)


All that being said, I see that Canada Post has an API (and I'm
fairly sure the USPS does, too) ... you might actually check validity
with the code issuing authority at the time of submission....
------------------------



I'm most interested in using the APIs that are available, though, and when
I figure out how to access them programmatically I'll do so. Most of the
places I want to validate postal codes for turn out to have an online API,
seriously relieving me of a lot of code and insulating me from future
changes, although there apparently have been few in the last decade or so.
I'm a neophyte with little experience in these matters yet.

Cheers,

Twayne`
Re: Zip Codes ctype? Pregmatch? [message #182647 is a reply to message #182634] Wed, 21 August 2013 07:25
Jeff North
On Tue, 20 Aug 2013 13:27:50 -0400, in comp.lang.php Twayne
<nobody(at)spamcop(dot)net>
<kv08v0$614$1(at)speranza(dot)aioe(dot)org> wrote:

> | Hi all,
> |
> | I'm attempting to check for US and Canadian zip codes (postal codes).
> | The US is easy; mostly just be sure it's five numerics and except
> | "00000" and "99999". But Canadian is a different story because:
> | It consists of alternating alpha and numeric characters (AnAnAn) but
> | not the entire alphabet. 8 N.A. English letters are not used, as in
> | DFIOQUW AND Z or put another way, they only use 18 letters in their
> | postal codes.
> | I haven't see a single example in all my research to check if the
> | 1st, 3rd, and 5th characters are alpha and th 2nd, 4th and 6th
> | characters are numeric.
> |
> | I've tried preg_match and strpos without succees, likely due to my own
> | weakness with preg_match, and regex creates an incredibly long statement
> | I'm sure it's not right to put upon the servers; they slow down even my
> | local server XAMPP & PHP 5.3 on win 7.
> |
> | Might anyone have a better method?
> |
> | Or know of any functions anywhere that could be modified to be used?

Try this (I found it on the web but can't remember where, and I
haven't tried it out):
^[ABCEGHJ-NPRSTVXY]{1}\d{1}[A-Z]{1}\s?\d{1}[A-Z]{1}\d{1}$
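
One way to try the pattern above is to wrap it in PCRE delimiters and call preg_match(). This is only a sketch; the function name is hypothetical and the type declarations assume PHP 7 or later. Note that, as written, the pattern restricts only the first letter; the 3rd and 5th positions accept any upper-case letter, and lower-case input is rejected:

```php
<?php
// Hypothetical wrapper around the pattern quoted above, with "/" delimiters added.
function matches_suggested_pattern(string $code): bool
{
    $pattern = '/^[ABCEGHJ-NPRSTVXY]{1}\d{1}[A-Z]{1}\s?\d{1}[A-Z]{1}\d{1}$/';
    return preg_match($pattern, $code) === 1;
}

// Prints how a few sample inputs fare; "K1D 0F1" passes because only the
// first letter is checked against the restricted set.
foreach (array('K1A 0B1', 'K1A0B1', 'K1D 0F1', 'D1A 0B1', 'k1a 0b1') as $sample) {
    echo $sample, ': ', matches_suggested_pattern($sample) ? 'match' : 'no match', "\n";
}
```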
Re: Zip Codes ctype? Pregmatch? [message #182648 is a reply to message #182643] Wed, 21 August 2013 08:53
Thomas 'PointedEars'
Norman Peelman wrote:

> On 08/20/2013 01:27 PM, Twayne wrote:
>> I'm attempting to check for US and Canadian zip codes (postal codes).
>> The US is easy; mostly just be sure it's five numerics and except
>> "00000" and "99999". But Canadian is a different story because:
>> It consists of alternating alpha and numeric characters (AnAnAn) but
>> not the entire alphabet. 8 N.A. English letters are not used, as in
>> DFIOQUW AND Z or put another way, they only use 18 letters in their
>> postal codes.
>> I haven't see a single example in all my research to check if the
>> 1st, 3rd, and 5th characters are alpha and th 2nd, 4th and 6th
>> characters are numeric.
>>
>> I've tried preg_match and strpos without succees, likely due to my own
>> weakness with preg_match, and regex creates an incredibly long statement
>> I'm sure it's not right to put upon the servers; they slow down even my
>> local server XAMPP & PHP 5.3 on win 7.
>>
>> Might anyone have a better method?
>>
>> Or know of any functions anywhere that could be modified to be used?
>
> US Zip code:
> [0-9]{5}(-{0,1}[0-9]{4}){0,1}
^^^^^ ^^^^^^^^^^ ^^^^^
In Perl-Compatible Regular Expressions (PCRE), as also used by PHP's preg_*
functions, the following shorthands are available:

- “*” for “{0,}”
- “?” for “{0,1}”
- “+” for “{1,}”
- “\d” for “[0-9]” (includes more numeric characters in “UTF-8 mode”)

Thus, the above expression can be simplified to

\d{5}(-?\d{4})?

However, the specification above says that “00000” and “99999” are _not_
valid U.S. ZIP codes, so to be exact you cannot just use either “[0-9]{5}”
or “\d{5}”; but you would have to use, for example, a zero-width negative
lookahead:

$possibleZips = array('00000', '00001', '99998', '99999');
foreach ($possibleZips as $possibleZip)
{
preg_match('^(?![09]{5})\\d{5}(?:-?\\d{4})?$', $possibleZip, $matches);
var_dump($possibleZip);
var_dump($matches);
}

(thanks to Anubhava: <http://stackoverflow.com/a/9609624/855543>)

> Canadian zip code (all one line, don't miss the space!):
> ([A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}[0-9]{1})[A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}
^ ^^^^^^^^
> {1}([0-9]{1}[A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}[0-9]{1})

“{1}” is superfluous in all regular expression flavours (in BRE the escaped
variant is superfluous). An expression that matches, matches exactly one
time unless a following quantifier says otherwise.

In a character class expression, ranges are _not_ delimited by comma.
A comma there is a *literal* comma instead (just like most other special
characters lose, and “-” gains meaning), and repetitions are ignored:

[A-C,E,G-H,J-N,P,R-T,V,X,Y]

matches the same strings as

[A-CEG-HJ-NPR-TVXY,]

So unless you want to allow commas in ZIP codes, you need to remove them
from the respective character class.
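
The point about the literal comma can be checked with a small sketch like the following (variable names are illustrative only; both patterns are anchored so that they test exactly one character):

```php
<?php
// Same letters in both classes; only the first one contains commas.
$with_commas    = '/^[A-C,E,G-H,J-N,P,R-T,V,X,Y]$/';
$without_commas = '/^[A-CEG-HJ-NPR-TVXY]$/';

// A lone comma is accepted by the first class and rejected by the second.
$comma_accepted_by_first  = preg_match($with_commas, ',');
$comma_accepted_by_second = preg_match($without_commas, ',');

// An allowed letter is accepted by both.
$letter_accepted_by_both = preg_match($with_commas, 'K') === 1
    && preg_match($without_commas, 'K') === 1;

var_dump($comma_accepted_by_first, $comma_accepted_by_second, $letter_accepted_by_both);
```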

Thus, the above expression would have to be changed, and can be simplified
to

^(?:[A-CEG-HJ-NPR-TVXY]\d){3}$

(The “^” makes sure that the second, fourth, and so on, character must be a
digit. Put \s* after it if you want to allow leading whitespace, and likewise
before “$” for trailing whitespace.)
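
A sketch of the whitespace-tolerant variant described in the parenthesis above, with PCRE delimiters added. The function name is hypothetical and the type declarations assume PHP 7 or later; a space in the middle of the code is still not accepted by this pattern:

```php
<?php
// Hypothetical helper: the simplified pattern with \s* after ^ and before $.
function matches_cdn_shape(string $input): bool
{
    return preg_match('/^\s*(?:[A-CEG-HJ-NPR-TVXY]\d){3}\s*$/', $input) === 1;
}
```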

Anyhow, if an expression is repeated and the repetition cannot be handled
with a quantifier as above, then in programming languages like PHP that allow
it, the code is easier to read if you assign the repeated expression to a
variable and have the variable reference expanded:

$cdn_letter = '[A-CEG-HJ-NPR-TVXY]';
$pattern = "^{$cdn_letter}\\d{$cdn_letter}\\d{$cdn_letter}\\d\$";

[In certain programming languages, libraries like my JSX:regexp.js [1] are
useful that allow you to define and use your own character class escape
sequences, eliminating the need for variable expansion: "\\p{cdnLetter}".]

Note that expansion/repetition is semantically different from expression
backreferences:

$pattern2 = "([A-CEG-HJ-NPR-TVXY])\\d\\1\\d\\1\\d";

$pattern would match "A1B2C3"; $pattern2 would match "A1A2A3", but not
"A1B2C3".

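The difference can be seen by running both patterns against the two sample strings. This is a sketch only: delimiters and anchors have been added to both patterns so that they can be passed to preg_match(), and the result variable names are made up:

```php
<?php
$cdn_letter = '[A-CEG-HJ-NPR-TVXY]';

// Expansion: each letter position may hold a different allowed letter.
$pattern  = "/^{$cdn_letter}\\d{$cdn_letter}\\d{$cdn_letter}\\d\$/";

// Backreference: \1 must repeat the exact letter captured by the first group.
$pattern2 = "/^([A-CEG-HJ-NPR-TVXY])\\d\\1\\d\\1\\d\$/";

$expansion_mixed     = preg_match($pattern,  'A1B2C3');  // different letters
$expansion_repeated  = preg_match($pattern,  'A1A2A3');  // same letter three times
$backref_mixed       = preg_match($pattern2, 'A1B2C3');
$backref_repeated    = preg_match($pattern2, 'A1A2A3');

var_dump($expansion_mixed, $expansion_repeated, $backref_mixed, $backref_repeated);
```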

PointedEars
___________
[1] <http://PointedEars.de/scripts/test/regexp> p.
--
Use any version of Microsoft Frontpage to create your site.
(This won't prevent people from viewing your source, but no one
will want to steal it.)
-- from <http://www.vortex-webdesign.com/help/hidesource.htm> (404-comp.)
Re: Zip Codes ctype? Pregmatch? [message #182649 is a reply to message #182648] Wed, 21 August 2013 08:58
Thomas 'PointedEars'
Thomas 'PointedEars' Lahn wrote:

> Norman Peelman wrote:
>> On 08/20/2013 01:27 PM, Twayne wrote:
>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>> The US is easy; mostly just be sure it's five numerics and except
>>> "00000" and "99999". […]
>>
>> US Zip code:
>> [0-9]{5}(-{0,1}[0-9]{4}){0,1}
> ^^^^^ ^^^^^^^^^^ ^^^^^
> In Perl-Compatible Regular Expressions (PCRE), as also used by PHP's
> preg_* functions, the following shorthands are available:
>
> - “*” for “{0,}”
> - “?” for “{0,1}”
> - “+” for “{1,}”
> - “\d” for “[0-9]” (includes more numeric characters in “UTF-8 mode”)
>
> Thus, the above expression can be simplified to
>
> \d{5}(-?\d{4})?
>
> However, the specification above says that “00000” and “99999” are _not_
> valid U.S. ZIP codes, so to be exact you cannot just use either “[0-9]{5}”
> or “\d{5}”; but you would have to use, for example, a zero-width negative
> lookahead:
>
> $possibleZips = array('00000', '00001', '99998', '99999');
> foreach ($possibleZips as $possibleZip)
> {
> preg_match('^(?![09]{5})\\d{5}(?:-?\\d{4})?$', $possibleZip,

Replace this line with

preg_match('/^(?![09]{5})\\d{5}(?:-?\\d{4})?$/', $possibleZip,

(add the delimiter).

> $matches);
> var_dump($possibleZip);
> var_dump($matches);
> }
>
> (thanks to Anubhava: <http://stackoverflow.com/a/9609624/855543>)

--
PointedEars
Re: Zip Codes ctype? Pregmatch? [message #182650 is a reply to message #182649] Wed, 21 August 2013 09:06
Thomas 'PointedEars'
Thomas 'PointedEars' Lahn wrote:

> Thomas 'PointedEars' Lahn wrote:
>> Norman Peelman wrote:
>>> On 08/20/2013 01:27 PM, Twayne wrote:
>>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>>> The US is easy; mostly just be sure it's five numerics and except
>>>> "00000" and "99999". […]
>>>
>>> US Zip code:
>>> [0-9]{5}(-{0,1}[0-9]{4}){0,1}
>> ^^^^^ ^^^^^^^^^^ ^^^^^
>> […]
>> However, the specification above says that “00000” and “99999” are _not_
>> valid U.S. ZIP codes, so to be exact you cannot just use either
>> “[0-9]{5}” or “\d{5}”; but you would have to use, for example, a
>> zero-width negative lookahead:
>>
>> $possibleZips = array('00000', '00001', '99998', '99999');

$possibleZips = array('00000', '00001', '99998', '90909', '99999');

>> foreach ($possibleZips as $possibleZip)
>> {
>> preg_match('^(?![09]{5})\\d{5}(?:-?\\d{4})?$', $possibleZip,
>
> Replace this line with
>
> preg_match('/^(?![09]{5})\\d{5}(?:-?\\d{4})?$/', $possibleZip,
>
> (add the delimiter).

Which is still not correct, because '/[09]{5}/' matches "90909", which is
thus designated not valid. ISTM that alternation is required here:

preg_match('/^(?!0{5}|9{5})\\d{5}(?:-?\\d{4})?$/', $possibleZip);
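
Wrapped in a function, the corrected expression can be checked against the sample values from this thread. A sketch only; the function name is hypothetical and the type declarations assume PHP 7 or later:

```php
<?php
// Hypothetical helper around the alternation-based negative lookahead above.
function is_plausible_us_zip(string $zip): bool
{
    // (?!0{5}|9{5}) rejects a leading "00000" or "99999" only; mixed values
    // such as "90909" pass, unlike with the earlier [09]{5} class.
    return preg_match('/^(?!0{5}|9{5})\d{5}(?:-?\d{4})?$/', $zip) === 1;
}

$possibleZips = array('00000', '00001', '99998', '90909', '99999');
foreach ($possibleZips as $possibleZip) {
    var_dump($possibleZip, is_plausible_us_zip($possibleZip));
}
```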


PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16
Re: Zip Codes ctype? Pregmatch? [message #182651 is a reply to message #182647] Wed, 21 August 2013 18:53
bill
On 2013-08-21 7:25 AM, Jeff North wrote:
> On Tue, 20 Aug 2013 13:27:50 -0400, in comp.lang.php Twayne
> <nobody(at)spamcop(dot)net>
> <kv08v0$614$1(at)speranza(dot)aioe(dot)org> wrote:
>
>> | Hi all,
>> |
>> | I'm attempting to check for US and Canadian zip codes (postal codes).
>> | The US is easy; mostly just be sure it's five numerics and except
>> | "00000" and "99999". But Canadian is a different story because:
>> | It consists of alternating alpha and numeric characters (AnAnAn) but
>> | not the entire alphabet. 8 N.A. English letters are not used, as in
>> | DFIOQUW AND Z or put another way, they only use 18 letters in their
>> | postal codes.
>> | I haven't see a single example in all my research to check if the
>> | 1st, 3rd, and 5th characters are alpha and th 2nd, 4th and 6th
>> | characters are numeric.
>> |
>> | I've tried preg_match and strpos without succees, likely due to my own
>> | weakness with preg_match, and regex creates an incredibly long statement
>> | I'm sure it's not right to put upon the servers; they slow down even my
>> | local server XAMPP & PHP 5.3 on win 7.
>> |
>> | Might anyone have a better method?
>> |
>> | Or know of any functions anywhere that could be modified to be used?
>
> Try this (I found it on the web but can't remember where) and I
> haven't tired it out:
> ^[ABCEGHJ-NPRSTVXY]{1}\d{1}[A-Z]{1}\s?\d{1}[A-Z]{1}\d{1}$
>

lol! I'll be happy to check it out! :) I never did get my own to work,
and that's a bit different from what I used.

Twayne`
Re: Zip Codes ctype? Pregmatch? [message #182652 is a reply to message #182648] Wed, 21 August 2013 19:10
bill
On 2013-08-21 8:53 AM, Thomas 'PointedEars' Lahn wrote:
> Norman Peelman wrote:
>
>> On 08/20/2013 01:27 PM, Twayne wrote:
>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>> The US is easy; mostly just be sure it's five numerics and except
>>> "00000" and "99999". But Canadian is a different story because:

....

>>
>> US Zip code:
>> [0-9]{5}(-{0,1}[0-9]{4}){0,1}
> ^^^^^ ^^^^^^^^^^ ^^^^^
> In Perl-Compatible Regular Expressions (PCRE), as also used by PHP's preg_*
> functions, the following shorthands are available:
>
> - “*” for “{0,}”
> - “?” for “{0,1}”
> - “+” for “{1,}”
> - “\d” for “[0-9]” (includes more numeric characters in “UTF-8 mode”)
>
> Thus, the above expression can be simplified to
>
> \d{5}(-?\d{4})?
>
> However, the specification above says that “00000” and “99999” are _not_
> valid U.S. ZIP codes, so to be exact you cannot just use either “[0-9]{5}”
> or “\d{5}”; but you would have to use, for example, a zero-width negative
> lookahead:
>
> $possibleZips = array('00000', '00001', '99998', '99999');
> foreach ($possibleZips as $possibleZip)
> {
> preg_match('^(?![09]{5})\\d{5}(?:-?\\d{4})?$', $possibleZip, $matches);
> var_dump($possibleZip);
> var_dump($matches);
> }
>
> (thanks to Anubhava: <http://stackoverflow.com/a/9609624/855543>)
>
>> Canadian zip code (all one line, don't miss the space!):
>> ([A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}[0-9]{1})[A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}
> ^ ^^^^^^^^
>> {1}([0-9]{1}[A-C,E,G-H,J-N,P,R-T,V,X,Y]{1}[0-9]{1})
>
> “{1}” is superfluous in all regular expression flavours (in BRE the escaped
> variant is superfluous). An expression that matches, matches exactly one
> time unless a following quantifier says otherwise.
>
> In a character class expression, ranges are _not_ delimited by comma.
> A comma there is a *literal* comma instead (just like most other special
> characters lose, and “-” gains meaning), and repetitions are ignored:
>
> [A-C,E,G-H,J-N,P,R-T,V,X,Y]
>
> matches the same strings as
>
> [A-CEG-HJ-NPR-TVXY,]
>
> So unless you want to allow commas in ZIP codes, you need to remove them
> from the respective character class.
>
> Thus, the above expression would have to be changed, and can be simplified
> to
>
> ^(?:[A-CEG-HJ-NPR-TVXY]\d){3}$
>
> (The “^” makes sure that the second, fourth, aso. character must be a digit.
> Let \s* follow it if you want to allow leading whitespace. Likewise for “$”
> and trailing whitespace.)
>
> Anyhow, if an expression is repeated, and this repetition cannot be handled
> with a quantifier like above, in programming languages like PHP that allow
> this, code is easier readable if you assign the repeated expression to a
> variable, and have the variable reference expanded:
>
> $cdn_letter = '[A-CEG-HJ-NPR-TVXY]';
> $pattern = "^{$cdn_letter}\\d{$cdn_letter}\\d{$cdn_letter}\\d\$";
>
> [In certain programming languages, libraries like my JSX:regexp.js [1] are
> useful that allow you to define and use your own character class escape
> sequences, eliminating the need for variable expansion: "\\p{cdnLetter}".]
>
> Note that expansion/repetition is semantically different from expression
> backreferences:
>
> $pattern2 = "([A-CEG-HJ-NPR-TVXY])\\d\\1\\d\\1\\d";
>
> $pattern would match "A1B2C3"; $pattern2 would match "A1A2A3", but not
> "A1B2C3".
>
>
> PointedEars
> ___________
> [1] <http://PointedEars.de/scripts/test/regexp> p.
>

Woof! A veritable cornucopia of information which I've already dedicated
to a file on my hard drive! I wasn't aware of most of that and it's
going to be really handy soon's I understand it all, for now and the
future.

One slight correction: the Canadian valid letters are:
abc e gh jklmn p rst vxy .

Not sure where it went astray; if you need clarification visit the
Canadian Postal Code reference; don't have the URL itself. Besides, it's
always best to verify ANY information from any source on the 'net.

If you happen to know the Canadian system at all, the fuller breakdown is:
$aRegion = array(
'nl' => 'a',
'ns' => 'b',
'pe' => 'c',
'nb' => 'e',
'qc' => array('g', 'h', 'j'),
'on' => array('k', 'l', 'm', 'n', 'p'),
'mb' => 'r',
'sk' => 's',
'ab' => 't',
'bc' => 'v',
'nt' => 'x',
'nu' => 'x',
'yt' => 'y'
);

Also verifiable at the Canadian Postal website, including a map.
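
For anyone wanting to use that table in code, here is a minimal sketch of a reverse lookup from a postal code's first letter to the region(s) it can belong to. The function name regionsForPostalCode is made up for illustration; only the $aRegion array comes from the list above.

```php
<?php
// Province/territory => first letter(s), copied from the list above.
$aRegion = array(
    'nl' => 'a',
    'ns' => 'b',
    'pe' => 'c',
    'nb' => 'e',
    'qc' => array('g', 'h', 'j'),
    'on' => array('k', 'l', 'm', 'n', 'p'),
    'mb' => 'r',
    'sk' => 's',
    'ab' => 't',
    'bc' => 'v',
    'nt' => 'x',
    'nu' => 'x',
    'yt' => 'y'
);

// Hypothetical helper: return every region whose letter list contains
// the first character of the given code (case-insensitive).
function regionsForPostalCode($code, array $aRegion)
{
    $first = strtolower(substr(trim($code), 0, 1));
    $hits = array();
    foreach ($aRegion as $region => $letters) {
        // (array) turns a single letter into a one-element array.
        if (in_array($first, (array) $letters, true)) {
            $hits[] = $region;
        }
    }
    return $hits;
}

print_r(regionsForPostalCode('K1A 0B1', $aRegion)); // first letter K
print_r(regionsForPostalCode('X0A 0H0', $aRegion)); // X is shared by two territories
```

Note that 'x' maps to both 'nt' and 'nu', so the first letter alone cannot tell those two apart.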

Thanks much!

Twayne`
Re: Zip Codes ctype? Pregmatch? [message #182653 is a reply to message #182652] Wed, 21 August 2013 22:09
BootNic
Messages: 10
Registered: November 2010
Karma: 0
Junior Member
In article <kv3hd4$p9g$1(at)speranza(dot)aioe(dot)org>, Twayne <nobody(at)spamcop(dot)net>
wrote:

[snip]

> One slight correction: the Canadian valid letters are:
> abc e gh jklmn p rst vxy .

The second and third letters may also contain [WZ]

https://maps.google.com/maps?q=B0W+1H0

https://maps.google.com/maps?q=A1W+4Z1

The pattern you posted in: Message-ID: <kv11lq$8eb$1(at)speranza(dot)aioe(dot)org>

[url]
https://groups.google.com/d/msg/comp.lang.php/D-OKMe-iZa4/ARnTsmDHSdIJ
[/url]

"^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])" .
"\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$"

seems to work as it is.


^(?=.{3} ?.{3})(?!.{0,}[DFIOQU]|[WZ])(?:[A-Z] ?\d){3}$

• (?=.{3} ?.{3}) Positive lookahead

◦ string matches [3 characters] [optional space] [3 characters]

▪ characters are restricted in the rest of the expression

▪ remove the question mark to make the space required

• (?!.{0,}[DFIOQU]|[WZ]) Negative lookahead

◦ .{0,}[DFIOQU] string may not contain ‘DFIOQU’

◦ [WZ] string may not start with ‘W’ or ‘Z’

• (?:[A-Z] ?\d){3} Basic pattern repeat 3 times

◦ [any letter A-Z] [optional space] [any digit 0-9]

▪ spaces are restricted (index 3 or none) in the Positive lookahead

▪ letters are restricted in the Negative lookahead
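
A quick way to try the expression is to wrap it in "/" delimiters and feed it a few samples. The sketch below does that; the $strict pattern is a hypothetical variant added only for comparison, not part of the original post. Because "." also matches a space, the positive lookahead does not by itself pin the space to the middle, so an input like "A 1B2C3" still gets through the original expression.

```php
<?php
// The expression from the post above, wrapped in "/" delimiters for preg_match().
$bootnic = '/^(?=.{3} ?.{3})(?!.{0,}[DFIOQU]|[WZ])(?:[A-Z] ?\d){3}$/';

// Hypothetical tighter variant: the optional space is only allowed
// after the third character; letter restrictions are the same.
$strict = '/^(?![WZ])(?!.*[DFIOQU])[A-Z]\d[A-Z] ?\d[A-Z]\d$/';

$samples = array('B0W 1H0', 'A1W4Z1', 'W1A 1A1', 'A1D 1A1', 'A 1B2C3');
foreach ($samples as $code) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf("%-8s bootnic=%d strict=%d\n",
        $code, preg_match($bootnic, $code), preg_match($strict, $code));
}
```

The two patterns agree on the first four samples and differ only on the last one, where the space sits in the wrong place.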

[snip]





--
BootNic Wed Aug 21, 2013 10:09 pm
The human mind treats a new idea the same way the body treats a strange
protein; it rejects it.
*P. B. Medawar*
Re: Zip Codes ctype? Pregmatch? [message #182654 is a reply to message #182651] Wed, 21 August 2013 22:46
Norman Peelman
Messages: 126
Registered: September 2010
Karma: 0
Senior Member
On 08/21/2013 06:53 PM, Twayne wrote:
> On 2013-08-21 7:25 AM, Jeff North wrote:
>> On Tue, 20 Aug 2013 13:27:50 -0400, in comp.lang.php Twayne
>> <nobody(at)spamcop(dot)net>
>> <kv08v0$614$1(at)speranza(dot)aioe(dot)org> wrote:
>>
>>> | Hi all,
>>> |
>>> | I'm attempting to check for US and Canadian zip codes (postal codes).
>>> | The US is easy; mostly just be sure it's five numerics and except
>>> | "00000" and "99999". But Canadian is a different story because:
>>> | It consists of alternating alpha and numeric characters
>>> (AnAnAn) but
>>> | not the entire alphabet. 8 N.A. English letters are not used, as in
>>> | DFIOQUW AND Z or put another way, they only use 18 letters in their
>>> | postal codes.
>>> | I haven't see a single example in all my research to check if the
>>> | 1st, 3rd, and 5th characters are alpha and th 2nd, 4th and 6th
>>> | characters are numeric.
>>> |
>>> | I've tried preg_match and strpos without succees, likely due to my own
>>> | weakness with preg_match, and regex creates an incredibly long
>>> statement
>>> | I'm sure it's not right to put upon the servers; they slow down
>>> even my
>>> | local server XAMPP & PHP 5.3 on win 7.
>>> |
>>> | Might anyone have a better method?
>>> |
>>> | Or know of any functions anywhere that could be modified to be used?
>>
>> Try this (I found it on the web but can't remember where) and I
>> haven't tired it out:
>> ^[ABCEGHJ-NPRSTVXY]{1}\d{1}[A-Z]{1}\s?\d{1}[A-Z]{1}\d{1}$
>>
>
> lol! I'll be happy to check it out! :) I never did get my own to work,
> and that's a bit different from what I used.
>
> Twayne`

I'd say it's a bit different... it doesn't match the rules as given.
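
To make that concrete, a small sketch: the quoted pattern restricts only the first letter, so letters from the excluded list (DFIOQU) are accepted in the third and fifth positions.

```php
<?php
// Pattern quoted above, wrapped in "/" delimiters for preg_match().
$quoted = '/^[ABCEGHJ-NPRSTVXY]{1}\d{1}[A-Z]{1}\s?\d{1}[A-Z]{1}\d{1}$/';

var_dump(preg_match($quoted, 'A1B 2C3')); // all letters from the allowed set
var_dump(preg_match($quoted, 'A1D 2F3')); // D and F are on the excluded list
var_dump(preg_match($quoted, 'D1A 1A1')); // excluded letter in first position
```

The first two calls both report a match; only the third is rejected.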

--
Norman
Registered Linux user #461062
-Have you been to www.php.net yet?-
Re: Zip Codes ctype? Pregmatch? [message #182655 is a reply to message #182653] Thu, 22 August 2013 18:22
bill
Messages: 310
Registered: October 2010
Karma: 0
Senior Member
<QUOTE>
[snip]

> One slight correction: the Canadian valid letters are:
> abc e gh jklmn p rst vxy .

The second and third letters may also contain [WZ]

https://maps.google.com/maps?q=B0W+1H0

https://maps.google.com/maps?q=A1W+4Z1

The pattern you posted in: Message-ID: <kv11lq$8eb$1(at)speranza(dot)aioe(dot)org>

[url]
https://groups.google.com/d/msg/comp.lang.php/D-OKMe-iZa4/ARnTsmDHSdIJ
[/url]

"^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])" .
"\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$"

seems to work as it is.
</QUOTE>

Thanks for that, but this is a perfect example of where using an API beats
other methods of checking postal code validity. I haven't mentioned a
few things because I didn't want to become involved in a long session of
e-mail about their postal codes.
There is a lot of confusion in all these areas in Canada, not the
least of them being the lawsuit over the "list" copyright being
violated so much. I pretty much at this time consider Canada Post to be
the "bible" for information, as loose and lacking as their information
is. It's pretty clear why the general consensus is to use A-Za-z for the
alpha part of the codes.
For instance, in a few pages on their site, it says "New Postal
Codes are added every month." but nowhere is there any information on
what's been added or where either.
And, a lot of postal zones have created their OWN postal codes,
without the blessing of Canada Post, and they work just fine because
that particular area is known to that office and thus delivers to it.
They're even trying to trademark "Canada Post" according to more than a
couple of sources, meaning those carrying their "lists" couldn't say
that's where the codes originated.
Lot of miscellaneous wrong-headedness is going on too but I have to
draw the line and be realistic.
As far as I can find, the post office doesn't even indicate the need for the
space in the postal code. Looking at the images on their website sure isn't
definitive. And at one time it started to be a dash, then
reverted to a space again, and so on.

This is probably the last I'll have to say on this subject. Suffice it
to say I settled on all letters and all digits with a space in the middle.

Regards,

Twayne`
Re: Zip Codes ctype? Pregmatch? [message #182657 is a reply to message #182650] Sat, 24 August 2013 06:22
Curtis Dyer
Messages: 34
Registered: January 2011
Karma: 0
Member
Thomas 'PointedEars' Lahn <PointedEars(at)web(dot)de> wrote:

> Thomas 'PointedEars' Lahn wrote:
>
>> Thomas 'PointedEars' Lahn wrote:
>>> Norman Peelman wrote:
>>>> On 08/20/2013 01:27 PM, Twayne wrote:
>>>> > I'm attempting to check for US and Canadian zip codes
>>>> > (postal codes). The US is easy; mostly just be sure it's
>>>> > five numerics and except "00000" and "99999". […]

<snip>

>> Replace this line with
>>
>> preg_match('/^(?![09]{5})\\d{5}(?:-?\\d{4})?$/',
>> $possibleZip,
>>
>> (add the delimiter).
>
> Which is still not correct, because '/[09]{5}/' matches "90909",
> which is thus designated not valid. ISTM that alternation is
> required here:
>
> preg_match('/^(?!0{5}|9{5})\\d{5}(?:-?\\d{4})?$/',
> $possibleZip);

Perhaps you might also keep invalid postal codes in an array to look
up before utilizing the regular expression.
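
A rough sketch of that idea, with one variation: the regular expression checks the shape first and captures the five-digit part, and that part is then looked up in the reject list, so "00000-1234" is caught as well. The names $invalidZips and isPlausibleUsZip are made up for illustration.

```php
<?php
// Codes the thread names as never valid; extend the list as needed.
$invalidZips = array('00000', '99999');

// Hypothetical helper: format check via regex, then a lookup of the
// five-digit part in the reject list.
function isPlausibleUsZip($zip, array $invalidZips)
{
    // The D modifier keeps "$" from matching before a trailing newline.
    if (!preg_match('/^(\d{5})(?:-?\d{4})?$/D', $zip, $m)) {
        return false;
    }
    return !in_array($m[1], $invalidZips, true);
}

foreach (array('00000', '90909', '12345-6789', '99999') as $zip) {
    var_dump(isPlausibleUsZip($zip, $invalidZips));
}
```

Keeping the exceptions in an array means new ones can be added without touching the pattern.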

<snip>

--
Curtis Dyer
<?$x='<?$x=%c%s%c;printf($x,39,$x,39);?>';printf($x,39,$x,39);?>
Re: Zip Codes ctype? Pregmatch? [message #182658 is a reply to message #182657] Sat, 24 August 2013 06:52
Thomas 'PointedEars'
Messages: 701
Registered: October 2010
Karma: 0
Senior Member
Curtis Dyer wrote:

> Thomas 'PointedEars' Lahn <PointedEars(at)web(dot)de> wrote:
>> Thomas 'PointedEars' Lahn wrote:
>>> Thomas 'PointedEars' Lahn wrote:
>>>> Norman Peelman wrote:
>>>> > On 08/20/2013 01:27 PM, Twayne wrote:
>>>> >> I'm attempting to check for US and Canadian zip codes
>>>> >> (postal codes). The US is easy; mostly just be sure it's
>>>> >> five numerics and except "00000" and "99999". […]
>
> <snip>
>
>>> Replace this line with
>>>
>>> preg_match('/^(?![09]{5})\\d{5}(?:-?\\d{4})?$/',
>>> $possibleZip,
>>>
>>> (add the delimiter).
>>
>> Which is still not correct, because '/[09]{5}/' matches "90909",
>> which is thus designated not valid. ISTM that alternation is
>> required here:
>>
>> preg_match('/^(?!0{5}|9{5})\\d{5}(?:-?\\d{4})?$/',
>> $possibleZip);
>
> Perhaps you might also keep invalid postal codes in an array to look
> up before utilizing the regular expression.

I did that exactly where you snipped.


PointedEars
--
Anyone who slaps a 'this page is best viewed with Browser X' label on
a Web page appears to be yearning for the bad old days, before the Web,
when you had very little chance of reading a document written on another
computer, another word processor, or another network. -- Tim Berners-Lee
Re: Zip Codes ctype? Pregmatch? [message #182660 is a reply to message #182657] Sun, 25 August 2013 10:31
Doug Miller
Messages: 171
Registered: August 2011
Karma: 0
Senior Member
Curtis Dyer <dyer85(at)gmail(dot)com> wrote in news:kva1hr$rc0$1(at)dont-email(dot)me:

> Perhaps you might also keep invalid postal codes in an array to look
> up before utilizing the regular expression.

What happens when new postal codes are added?

Perhaps one might simply send a request to the USPS address validation API to find out if it's
a valid zip code or not. Use the "City-State Lookup" tool described here:

https://www.usps.com/business/web-tools-apis/address-information.htm

I don't know if Canada Post has similar tools.
Re: Zip Codes ctype? Pregmatch? [message #182661 is a reply to message #182657] Sun, 25 August 2013 13:17
bill
Messages: 310
Registered: October 2010
Karma: 0
Senior Member
On 2013-08-24 6:22 AM, Curtis Dyer wrote:
> Thomas 'PointedEars' Lahn <PointedEars(at)web(dot)de> wrote:
>
>> Thomas 'PointedEars' Lahn wrote:
>>
>>> Thomas 'PointedEars' Lahn wrote:
>>>> Norman Peelman wrote:
>>>> > On 08/20/2013 01:27 PM, Twayne wrote:
>>>> >> I'm attempting to check for US and Canadian zip codes
>>>> >> (postal codes). The US is easy; mostly just be sure it's
>>>> >> five numerics and except "00000" and "99999". […]
>
> <snip>
>
>>> Replace this line with
>>>
>>> preg_match('/^(?![09]{5})\\d{5}(?:-?\\d{4})?$/',
>>> $possibleZip,
>>>
>>> (add the delimiter).
>>
>> Which is still not correct, because '/[09]{5}/' matches "90909",
>> which is thus designated not valid. ISTM that alternation is
>> required here:
>>
>> preg_match('/^(?!0{5}|9{5})\\d{5}(?:-?\\d{4})?$/',
>> $possibleZip);
>
> Perhaps you might also keep invalid postal codes in an array to look
> up before utilizing the regular expression.
>
> <snip>
>
That's a great idea!

Thanks!
Re: Zip Codes ctype? Pregmatch? [message #182662 is a reply to message #182660] Sun, 25 August 2013 13:21
bill
Messages: 310
Registered: October 2010
Karma: 0
Senior Member
On 2013-08-25 10:31 AM, Doug Miller wrote:
> Curtis Dyer <dyer85(at)gmail(dot)com> wrote in news:kva1hr$rc0$1(at)dont-email(dot)me:
>
>> Perhaps you might also keep invalid postal codes in an array to look
>> up before utilizing the regular expression.
>
> What happens when new postal codes are added?
>
> Perhaps one might simply send a request to the USPS address validation API to find out if it's
> a valid zip code or not. Use the "City-State Lookup" tool described here:
>
> https://www.usps.com/business/web-tools-apis/address-information.htm
>
> I don't know if Canada Post has similar tools.
>

Someone said there was a CDN API but I haven't gotten to it yet, though
I have seen it. Haven't tried to programmatically access it yet though;
want to finish up what I've been doing.

Thanks
Re: Zip Codes ctype? Pregmatch? [message #182721 is a reply to message #182634] Sat, 31 August 2013 02:17
gordonb.qnlcm
Messages: 1
Registered: August 2013
Karma: 0
Junior Member
> I'm attempting to check for US and Canadian zip codes (postal codes).

For what purpose? The reason I ask is that the type of checking you
do may depend on the reason for the checking.

What do you want to do with the (US) ZIP code? (a) Mail them a
first-class letter, (b) Mail them a package, (c) Use it as verification
for the billing address on a credit card (which is about equivalent
to (a) but the credit card company does the actual mailing), (Note:
validating a credit card transaction against a payment processor
often costs real money, so it is often desirable to verify the Luhn
checksum on the credit card number, check the number of digits vs.
the bank prefix, and validate the address at least for a valid
country and state/province before sending it to the payment processor.)
(d) figure out how far it is for them to drive to one of your stores,
or (e) Use it to figure out the country since it's too hard to ask
for it? If you're checking because some standard requires you to,
name the standard (and preferably the section). If you're checking
because, well, er, um, I dunno, you're supposed to check user input,
aren't you? maybe you need to review what it is you're trying to
accomplish.
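
Since the Luhn checksum came up: a minimal sketch of it in PHP, in case it is useful as a cheap pre-check before anything is sent to a payment processor. The function name passesLuhn is made up for illustration, and this only tests the checksum, not the digit count or bank prefix.

```php
<?php
// Luhn checksum: walking from the rightmost digit, double every second
// digit (subtracting 9 when the result exceeds 9), add everything up,
// and accept when the total is a multiple of 10.
function passesLuhn($number)
{
    $digits = preg_replace('/\D/', '', $number); // drop spaces and dashes
    if ($digits === '') {
        return false;
    }
    $sum = 0;
    $double = false; // the rightmost digit is not doubled
    for ($i = strlen($digits) - 1; $i >= 0; $i--) {
        $d = (int) $digits[$i];
        if ($double) {
            $d *= 2;
            if ($d > 9) {
                $d -= 9;
            }
        }
        $sum += $d;
        $double = !$double;
    }
    return $sum % 10 === 0;
}

var_dump(passesLuhn('4111 1111 1111 1111')); // widely used test number
var_dump(passesLuhn('4111 1111 1111 1112')); // last digit changed
```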

If your requirement is to check because you want to check, you don't
care WHAT you check, you just wanna check! You WANNA! You WANNA!
You WANNA! then I have to wonder if the whole project should be
dropped.

For the USA, there are 5 types of 5-digit ZIP codes (and you
need a database that's kept up to date for determining this):

Unassigned. There are references to 42,000 codes being allocated
(as of 2011), so the majority of the 100,000 possible codes are
unassigned. 00000 and 99999 are permanently unassigned but only
the tip of a very large iceberg. (Usually, unassigned codes are
not assigned a type; they are just left out of the list entirely.)

Standard. These are geographic ZIP codes covering an area, what you'd
normally think of as a ZIP code area.

Post Office. These are ZIP codes with an area covering only part
of the inside of a USPS Post Office (PO boxes, caller services,
etc.) Typically a given PO Box has a unique 9-digit ZIP code. As
a test, I once mailed a letter to "Jeff Snerfelbot" (not anywhere
close to my name) with a 9-digit zip code of a PO box. It arrived.

Unique. Some 5-digit ZIP codes are allocated to a single entity
that generates or receives lots of mail. For example, Wal-Mart
Stores has 72716 and the CIA has 20505. It is rumored that Publisher's
Clearing House has at least 2 5-digit ZIP codes, one for "YES" and
one for "NO".

Military. These are used to route mail to US military forces,
including those stationed outside the country.

The Unassigned ZIP codes are clearly invalid.

The Standard and Unique ZIP codes are valid for most purposes.

If you are planning on using UPS or FedEx, the Post Office ZIP codes
are probably invalid, since they are filled with PO Boxes in USPS
Post Offices.

If you are planning on sending something bulky, the Military ZIP
codes may be invalid. They may have special rules on what you can
send.

A ZIP code may change. I was in 2 zip code splits near Houston,
Texas around Feb. 1976 where the apartment I moved out of and the
house I moved into both changed (5-digit) ZIP codes. 9-digit
codes hadn't been put to use yet.

Private mail box services generally appear in a Standard ZIP code
and UPS or FedEx can probably deliver there if the address looks
like a street address.

A valid address doesn't avoid the possibility that (a) it's an empty
lot, and has always been an empty lot, (b) it's unoccupied, or (c)
it burned down years ago. Someone might rebuild. The Post Office
might eventually invalidate addresses taken over for a freeway
interchange or for which soil erosion and/or a hurricane has placed
it permanently underwater.


You probably should accept a 9-digit ZIP code as well, even if you
ignore the extra 4 digits beyond checking that they are digits,
especially if part of the purpose is country recognition.

If your requirements are to (1) accept what is or may become a valid
code in the near future, and (2) must not require periodic database
updates, then I suggest you figure out what the basic pattern is
and check that. Be liberal in what you accept, as you can't fully
predict the future.

The USA is easy. Rejecting 00000 and 99999 is simple, and you don't
worry about the other 57,998 or so codes that might get used in the
future.

Unless you can find a general specification for Canadian Postal
codes, *NOT* what's currently in use, you're probably better off
allowing A-Z everywhere a letter is currently allowed. Is there
anything around that says that an asterisk won't become part of a
valid Canadian Postal code? or telephone number? There are
supposedly some codes reserved for testing. Those you can exclude.
You might also exclude H0H 0H0, which is reserved for Santa Claus.
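
A sketch of that liberal approach, assuming A-Z is allowed in every letter position and an optional single space in the middle. The names $reserved and isPlausibleCaPostalCode are made up for illustration, and the reserved list holds only the one code mentioned above.

```php
<?php
// Codes to refuse even though they fit the pattern (stored without the space).
$reserved = array('H0H0H0');

// Hypothetical helper: shape check with A-Z in every letter position,
// followed by a lookup in the reserved list.
function isPlausibleCaPostalCode($code, array $reserved)
{
    $code = strtoupper(trim($code));
    // Letter-digit-letter, optional space, digit-letter-digit.
    if (!preg_match('/^[A-Z]\d[A-Z] ?\d[A-Z]\d$/D', $code)) {
        return false;
    }
    return !in_array(str_replace(' ', '', $code), $reserved, true);
}

var_dump(isPlausibleCaPostalCode('K1A 0B1', $reserved));
var_dump(isPlausibleCaPostalCode('H0H 0H0', $reserved));
```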

On the other hand, if your requirements are to (1) eliminate as
many bad codes as possible, and (2) a database subscription to
issued codes is acceptable, and (3) an occasional rejection of brand
new codes is acceptable, then you want to find an API preferably
maintained by someone else to do the checking for you.
Re: Zip Codes ctype? Pregmatch? [message #182723 is a reply to message #182721] Sat, 31 August 2013 10:40
Fiver
Messages: 35
Registered: July 2013
Karma: 0
Member
On 2013-08-31 08:17, Gordon Burditt wrote:
>> I'm attempting to check for US and Canadian zip codes (postal codes).
>
> For what purpose? The reason I ask is that the type of checking you
> do may depend on the reason for the checking.
>
> What do you want to do with the (US) ZIP code?
[..snip..]
> If your requirement is to check because you want to check, you don't
> care WHAT you check, you just wanna check! You WANNA! You WANNA!
> You WANNA! then I have to wonder if the whole project should be
> dropped.

If he had no reason to collect the postal codes at all, he wouldn't have
put them in the form (I hope). As soon as he does collect and store them
somewhere (like a database), for whatever purpose, he needs to do some
basic formal/plausibility checking.

This would be true even if he didn't limit the input to US and Canadian
codes. There are some general assumptions you can make about postal
codes that are valid everywhere in the world. For example

- no valid code will start/end with white space
- codes will never be longer than 20 characters
- codes without any alphanumeric characters are never valid
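
Those three assumptions translate into a very small check. This is only a sketch; the function name isPlausiblePostalCode is made up for illustration, and strlen() counts bytes, which is the same as characters only for single-byte input.

```php
<?php
// Hypothetical helper implementing the three general assumptions above.
function isPlausiblePostalCode($code)
{
    if ($code !== trim($code)) {
        return false; // starts or ends with white space
    }
    if (strlen($code) > 20) {
        return false; // longer than 20 characters (bytes, strictly speaking)
    }
    // At least one ASCII letter or digit must be present.
    return preg_match('/[A-Za-z0-9]/', $code) === 1;
}

var_dump(isPlausiblePostalCode('SW1A 1AA'));
var_dump(isPlausiblePostalCode(' 12345'));
var_dump(isPlausiblePostalCode('---'));
```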

This is just basic data hygiene. I would never store user input without
at least checking the known formal constraints of the field.

The more he knows about the codes, the more he can check. This is where
the formal validation for the US/Canadian codes comes in. The general
format is known. If he has this information, he should use it, if only
to prevent some categories of typos.

Checking if a postal code is actually assigned to a real location at
this point in time, or if a certain delivery method is available to the
location - that's well outside the area of formal checks. If this level
of validation is required, he'll need an API from the delivery service
(or whoever is going to use the code). Then he needs to validate the
code at least twice: once on entry and once before sending out an actual
package.

> The USA is easy. Rejecting 00000 and 99999 is simple, and you don't
> worry about the other 57,998 or so codes that might get used in the
> future.
>
> Unless you can find a general specification for Canadian Postal
> codes, *NOT* what's currently in use, you're probably better off
> allowing A-Z everywhere a letter is currently allowed.

Good advice.

> You might also exclude H0H 0H0, which is reserved for Santa Claus.

Just wondering... why does Santa Claus have a Canadian postal code?
The geographic north pole is outside Canadian territory. If his house was
as close to the north pole as he can get while still remaining on firm
land, he should have a Danish postal code.

regards,
5er
Re: Zip Codes ctype? Pregmatch? [message #182725 is a reply to message #182723] Sat, 31 August 2013 11:56
Norman Peelman
Messages: 126
Registered: September 2010
Karma: 0
Senior Member
On 08/31/2013 10:40 AM, Fiver wrote:
> On 2013-08-31 08:17, Gordon Burditt wrote:
>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>
>
>> You might also exclude H0H 0H0, which is reserved for Santa Claus.
>
> Just wondering... why does Santa Claus have a Canadian postal code?
> The geographic north pole outside Canadian territory. If his house was
> as close to the north pole as he can get while still remaining on firm
> land, he should have a Danish postal code.
>
> regards,
> 5er
>

H0H 0H0 is his Canadian mailing address... there are others.

--
Norman
Registered Linux user #461062
-Have you been to www.php.net yet?-
Re: Zip Codes ctype? Pregmatch? [message #182726 is a reply to message #182723] Sat, 31 August 2013 11:56
Martin Leese
Messages: 23
Registered: June 2012
Karma: 0
Junior Member
Fiver wrote:

> On 2013-08-31 08:17, Gordon Burditt wrote:
....
>> You might also exclude H0H 0H0, which is reserved for Santa Claus.
>
> Just wondering... why does Santa Claus have a Canadian postal code?

So that Canadian children can write to him.

--
Regards,
Martin Leese
E-mail: please(at)see(dot)Web(dot)for(dot)e-mail(dot)INVALID
Web: http://members.tripod.com/martin_leese/
Re: Zip Codes ctype? Pregmatch? [message #182734 is a reply to message #182725] Sat, 31 August 2013 15:23
Welsh Vanner
Messages: 1
Registered: August 2013
Karma: 0
Junior Member
On Sat, 31 Aug 2013 11:56:32 -0400, Norman Peelman wrote:
>
> H0H 0H0 is his Canadian mailing address... there are others.

According to the UK Royal Mail his address is
Santa/Father Christmas,
Santa’s Grotto,
Reindeerland,
SAN TA1

:-)
Re: Zip Codes ctype? Pregmatch? [message #182737 is a reply to message #182721] Sat, 31 August 2013 20:07
bill
Messages: 310
Registered: October 2010
Karma: 0
Senior Member
On 2013-08-31 2:17 AM, Gordon Burditt wrote:
>> I'm attempting to check for US and Canadian zip codes (postal codes).
>
> For what purpose? The reason I ask is that the type of checking you
> do may depend on the reason for the checking.

While I certainly appreciate all the data you posted, I'm aware of most
of it, not all, and my eventual course, which I'm getting around to now,
is to use their API to determine valid codes (or not).

My "real" reason? Well, I've accomplished what I've set out to do for
now and the next step is using APIs where they exist; supply a
zip/postal code and get a response for whether it's valid or not.

>
....

> maybe you need to review what it is you're trying to
> accomplish.
>

My goal is to learn, and what I've learned from this thread is a great
deal about using PHP to handle these matters for postal codes and many
other similar formats that have nothing to do with postal codes. I
believe I have picked up a good deal of the information/experience I
need now, and am better off for it.

....

>
> The Standard and Unique ZIP codes are valid for most purposes.

Agreed.
>
....
>
> A ZIP code may change. I was in 2 zip code splits near Houston,
> Texas around Feb. 1976 where the apartment I moved out of and the
> house I moved into both changed (5-digit) ZIP codes. 9-digit
> codes hadn't been put to use yet.

Which is the draw to using their automated lookups to determine the
validity or not; it's my current bent.

>

....
>
> On the other hand, if your requirements are to (1) eliminate as
> many bad codes as possible, and (2) a database subscription to
> issued codes is acceptable, and (3) an occasional rejection of brand
> new codes is acceptable, then you want to find an API preferably
> maintained by someone else to do the checking for you.

They're freely available for the US and Canada; I've used them manually
and they accomplish my goals for me. I'm very hungry for information
from the learning PHP POV and this has been an excellent thread to that
end. One of the better things is having learned ctype_ ... something
I've never used successfully before and now find it an easy thing to
handle.
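
For the record, a minimal sketch of the kind of ctype_ check meant here, testing only the AnAnAn shape without any regex. The function name hasCanadianShape is made up for illustration; it strips every space before checking, so it does not care where the space was, and ctype_alpha() follows the current locale.

```php
<?php
// Hypothetical helper: checks the letter-digit-letter-digit-letter-digit
// shape with ctype_* functions only.
function hasCanadianShape($code)
{
    $code = str_replace(' ', '', $code); // ignore spaces entirely
    if (strlen($code) !== 6) {
        return false;
    }
    for ($i = 0; $i < 6; $i++) {
        // Even positions (0, 2, 4) must be letters, odd ones digits.
        $ok = ($i % 2 === 0) ? ctype_alpha($code[$i]) : ctype_digit($code[$i]);
        if (!$ok) {
            return false;
        }
    }
    return true;
}

var_dump(hasCanadianShape('K1A 0B1'));
var_dump(hasCanadianShape('1A1 A1A'));
```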

Again, thanks for all that information; I appreciate a post like this
and don't mind saying so.

Regards,

Twayne`
>
Re: Zip Codes ctype? Pregmatch? [message #182738 is a reply to message #182725] Sat, 31 August 2013 20:21
bill
Messages: 310
Registered: October 2010
Karma: 0
Senior Member
On 2013-08-31 11:56 AM, Norman Peelman wrote:
> On 08/31/2013 10:40 AM, Fiver wrote:
>> On 2013-08-31 08:17, Gordon Burditt wrote:
>>>> I'm attempting to check for US and Canadian zip codes (postal codes).
>>>
>>
>>> You might also exclude H0H 0H0, which is reserved for Santa Claus.
>>
>> Just wondering... why does Santa Claus have a Canadian postal code?
>> The geographic north pole outside Canadian territory. If his house was
>> as close to the north pole as he can get while still remaining on firm
>> land, he should have a Danish postal code.
>>
>> regards,
>> 5er
>>
>
> H0H 0H0 is his Canadian mailing address... there are others.
>

True; every major city seems to have an address and a lot more just make
up the zip codes so they seem logical to kids, and advertised locally at
that. The post office actually just looks at who it's addressed to; if
it's Santa, Kringle et al, it goes to the various bags they set aside
for them. Here you can even get letters to answer yourself if there's an
address to go along with it. They keep lists of who's 'naughty and nice'
and it's legally the registered person's responsibility to not be stupid
with their letters. Here there's even a form letter template that has to
be used. There are even charity "bags" where, once signed up, you can
anonymously send gifts to kids in need, and a lot more. It works neatly
here since we're a small rural community; I don't know how other, larger
cities work it. AFAIK there has never been a miscreant in the process;
here, at least. Oh, and they have to be sent in special envelopes, too,
that are donated for the purpose. I've done print runs for them several
times.

It reaffirms the good in people; as long as it's a full registration and
responsibility oriented.

Cheers, won't be long & it'll be here!

Twayne`
Re: Zip Codes ctype? Pregmatch? RESOLVED [message #184246 is a reply to message #182646] Mon, 16 December 2013 05:02
Moon Elf
Messages: 6
Registered: December 2013
Karma: 0
Junior Member
On 2013-08-21, Twayne <nobody(at)spamcop(dot)net> wrote:
> On 2013-08-20 7:53 PM, Norman Peelman wrote:
>> On 08/20/2013 03:06 PM, Twayne wrote:
>
> ...
>
>>>
>>
>> I responded even though you SOLVED... at least let us know what your
>> solution was.
>>
>
> Oh, sorry; guess I was in a hurry. I simply came across a couple of
> methods, neither of which was properly coded, and use a combo of the two
> methods to assemble mine. If you're interested, here's a crimp sheet I
> collected:
> --------------
> From the Canada Post Addressing Guide:
> "Postal codes must be printed in
> upper case with the first three
> elements separated from the last
> three by one space (no hyphens)."
>
> =================================================
>
>
> POSTAL CODES FOR 12 COUNTRIES
>
> <?php
> $country_code="US";
> $zip_postal="11111";
>
> $ZIPREG=array(
> "US"=>"^\d{5}([\-]?\d{4})?$",
> "UK"=>"^(GIR|[A-Z]\d[A-Z\d]??|[A-Z]{2}\d[A-Z\d]??)[ ]??(\d[A-Z]{2})$",
> "DE"=>"\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})|(?:[6][013-9]\d{3}))\b",
> "CA"=>"^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$",
> "FR"=>"^(F-)?((2[A|B])|[0-9]{2})[0-9]{3}$",
> "IT"=>"^(V-|I-)?[0-9]{5}$",
> "AU"=>"^(0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2})$",
> "NL"=>"^[1-9][0-9]{3}\s?([a-zA-Z]{2})?$",
> "ES"=>"^([1-9]{2}|[0-9][1-9]|[1-9][0-9])[0-9]{3}$",
> "DK"=>"^([D-d][K-k])?( |-)?[1-9]{1}[0-9]{3}$",
> "SE"=>"^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$",
> "BE"=>"^[1-9]{1}[0-9]{3}$"
> );
>

A faster algorithm would be to use regexes which use .+ .* with a unique
fingerprint. The above code is grinding your system probably.

I am sure tutorials such as Mastering Regular Expressions 2nd ed. would help
out.

> if ($ZIPREG[$country_code]) {
>
> if (!preg_match("/".$ZIPREG[$country_code]."/i",$zip_postal)){
> //Validation failed, provided zip/postal code is not valid.
> } else {
> //Validation passed, provided zip/postal code is valid.
> }
>
> } else {
>
> //Validation not available
>
> }
>
> =======================================================================
> OR ...
>
>
>
> function fnValidatePostal($mValue, $sRegion = '')
> {
> $mValue = strtolower($mValue);
> $sFirst = substr($mValue, 0, 1);
> $sRegion = strtolower($sRegion);
>
> $aRegion = array(
> 'nl' => 'a',
> 'ns' => 'b',
> 'pe' => 'c',
> 'nb' => 'e',
> 'qc' => array('g', 'h', 'j'),
> 'on' => array('k', 'l', 'm', 'n', 'p'),
> 'mb' => 'r',
> 'sk' => 's',
> 'ab' => 't',
> 'bc' => 'v',
> 'nt' => 'x',
> 'nu' => 'x',
> 'yt' => 'y'
> );
>
> if (preg_match('/[abceghjlkmnprstvxy]/', $sFirst) &&
> !preg_match('/[dfioqu]/', $mValue) &&
> preg_match('/^\w\d\w[- ]?\d\w\d$/', $mValue))
> {
> if (!empty($sRegion) && array_key_exists($sRegion, $aRegion))
> {
> if (is_array($aRegion[$sRegion]) && in_array($sFirst,
> $aRegion[$sRegion]))
> {
> return true;
> }
> else if (is_string($aRegion[$sRegion]) && $sFirst ==
> $aRegion[$sRegion])
> {
> return true;
> }
> }
> else if (empty($sRegion))
> {
> return true;
> }
> }
>
> return false;
> }

Usually you do not test vars unless you don't know at runtime what they
are, which is not good coding practice.

> ===================================================
> AND
>
> ===========================================================
>
> Sounds like a regexp pattern like:
>
> Code:
>
> /^(?:[A-CEGHJ-NPR-TVX][0-9]){3}$/
>
> could be used to validate the form of a given Canadian postal code
> based on the description you gave. (Whether or not the postal code is
> truly valid/used is, of course, another matter altogether.)
>
>
> All that being said, I see that Canada Post has an API (and I'm
> fairly sure the USPS does, too) ... you might actually check validity
> with the code issuing authority at the time of submission....
> ------------------------
>
>
>
> I'm most interested in using the APIs that are available though as when
> I figure out how to access them programmatically I'll do so. Most of the
> places I want to validate postal codes turn out to have an online API,
> seriously relieving me of a lot of code and nullifying the possibility
> of future changes, although there apparently have been few in the last
> decade or so.
> I'm a neophyte with little experience in these matters yet.
>
> Cheers,
>
> Twayne

PHP is procedural if you want, which I like, but you need to be clever in what
you write, procedurally or not.

ME

--
Member of the DR rogue circle.
Search and you will find.
Re: Zip Codes ctype? Pregmatch? RESOLVED [message #184249 is a reply to message #184246] Mon, 16 December 2013 07:27
Doug Miller is currently offline  Doug Miller
Messages: 171
Registered: August 2011
Karma: 0
Senior Member
Moon Elf <moonelf(at)moonelfsystem(dot)net> wrote in news:slrnlatjrr(dot)dlu(dot)moonelf(at)sigtrans(dot)org:

> On 2013-08-21, Twayne <nobody(at)spamcop(dot)net> wrote:
[...]
>> <?php
>> $country_code="US";
>> $zip_postal="11111";
>>
>> $ZIPREG=array(
>> "US"=>"^\d{5}([\-]?\d{4})?$",
[...]
>
> A faster algorithm would be to use regexes which use .+ .* with a unique
> fingerprint. The above code is grinding your system probably.
>
> I am sure tutorials such as Mastering Regular Expressions 2nd ed. would help
> out.

Never mind that -- the bigger problem is that it's just plain wrong. Comparing to a RegEx can
determine only if a postal code has the correct *format*, not if it is actually a valid code.

According to this,
>
>> if ($ZIPREG[$country_code]) {
>>
>> if (!preg_match("/".$ZIPREG[$country_code]."/i",$zip_postal)){
>> //Validation failed, provided zip/postal code is not valid.
>> } else {
>> //Validation passed, provided zip/postal code is valid.
>> }

00000-0000 is a "valid" US postal code. It's not. It's correctly *formatted*, but its contents do
not correspond to a valid ZIP+4.

The check also fails for 99999; it matches the RegEx, but is not a valid ZIP code. Same
problem with 11111, 22222, 33333, and 54321 -- and thousands of others. Out of the
100,000 possible 5-digit zip codes, fewer than 42,000 are actually in use, but this algorithm
will say that all 100,000 of them are valid.
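
A quick way to see this for yourself, as a minimal sketch: run the quoted US pattern against the codes named above. The pattern and the /i flag are copied from the quoted code; the loop and the variable names are mine, for illustration only.

```php
<?php
// US pattern copied from the quoted $ZIPREG array. Single quotes are used
// so the backslashes reach the regex engine unchanged.
$usPattern = '/^\d{5}([\-]?\d{4})?$/i';

// Codes that are well-formed but, as discussed above, not real codes.
$samples = array('00000-0000', '99999', '11111', '22222', '33333', '54321');

$accepted = array();
foreach ($samples as $code) {
    // preg_match() returns 1 on a match and 0 when there is no match.
    if (preg_match($usPattern, $code) === 1) {
        $accepted[] = $code;
    }
}

// Report how many of the samples passed the format-only check.
echo count($accepted), " of ", count($samples), " accepted\n";
```

Every sample passes, because the pattern only looks at the shape of the string.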

And since only 42K out of 100K 5-digit zip codes are actually in use, *at most* 420 million of
the one billion possible ZIP+4 codes can be valid, making *at least* 580 million *more*
invalid ZIP+4 codes that this algorithm will incorrectly declare to be valid.
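
Spelled out, the arithmetic behind those figures looks like this. The 42,000 is the upper bound quoted above; the variable names are just for illustration.

```php
<?php
$fiveDigitTotal = 100000; // every 5-digit string, 00000 through 99999
$fiveDigitInUse = 42000;  // upper bound from the text above
$plusFourPerZip = 10000;  // every 4-digit suffix, 0000 through 9999

// All possible ZIP+4 strings: 100,000 x 10,000.
$zip4Total = $fiveDigitTotal * $plusFourPerZip;

// At most this many ZIP+4 codes can sit on an in-use 5-digit code.
$zip4MaxValid = $fiveDigitInUse * $plusFourPerZip;

// So at least this many well-formed ZIP+4 strings cannot be valid.
$zip4MinInvalid = $zip4Total - $zip4MaxValid;

echo number_format($zip4Total), "\n";
echo number_format($zip4MaxValid), "\n";
echo number_format($zip4MinInvalid), "\n";
```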

It's pretty likely that similar problems exist for the other 11 nations as well.
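
One way around this is to treat the RegEx as a first step only: check the format, then look the code up in a list of codes that are actually in use. A minimal sketch follows. The function name isKnownZip() and the three-entry $knownZips array are made up for illustration; a real list would have to come from the postal authority's data or API, which I am not showing here.

```php
<?php
/**
 * Hypothetical helper: returns true only if $zip is well-formed AND its
 * 5-digit part appears in the supplied list of in-use codes.
 */
function isKnownZip($zip, array $knownZips)
{
    // Step 1: format check (5 digits, optional ZIP+4 suffix).
    // The D modifier stops $ from matching before a trailing newline.
    if (preg_match('/^(\d{5})(?:-?\d{4})?$/D', $zip, $m) !== 1) {
        return false;
    }
    // Step 2: existence check, on the 5-digit part only.
    return in_array($m[1], $knownZips, true);
}

// Stand-in data for the example, not a real data set.
$knownZips = array('10001', '60601', '90210');

var_dump(isKnownZip('90210', $knownZips));      // well-formed and listed
var_dump(isKnownZip('90210-1234', $knownZips)); // only the 5-digit part is looked up
var_dump(isKnownZip('00000-0000', $knownZips)); // well-formed but not listed
var_dump(isKnownZip('9021', $knownZips));       // malformed
```

Note that this sketch still cannot say whether a particular +4 suffix exists; that needs the full ZIP+4 data.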