Re: extracting the root domain from a URL [message #171666 is a reply to message #171644]
Fri, 14 January 2011 23:41
Thomas 'PointedEars'
Messages: 701 Registered: October 2010
Mike wrote:
> I thought .com, .asia, etc. was the "TLD".
It is, although that includes a trailing dot that is usually omitted.
> What would you call the 'site.com' or 'site.co.uk' portion of the url?
That is the second-level domain (which is only loosely related to URLs; this
is about DNS). BTW, the "domain root", if any, would be the trailing dot of
any domain name, which is usually not written.
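To illustrate the levels, here is a minimal Python sketch that splits a
domain name into its labels and numbers them from the right, so that level 1
is the TLD. The function name `domain_levels` is made up for this example,
and it does no validation of the input.

```python
def domain_levels(name: str) -> list[tuple[int, str]]:
    """Return (level, label) pairs counted from the right; level 1 is the TLD."""
    # Drop the trailing dot (the DNS root) if it was written out.
    labels = name.rstrip(".").split(".")
    return [(level, label) for level, label in enumerate(reversed(labels), start=1)]


for level, label in domain_levels("www.site.co.uk."):
    print(level, label)
```

With this counting, 'site' in 'site.com' sits at the second level, while
'site' in 'site.co.uk' sits at the third.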
> Regardless of the name, can you suggest an effective and accurate way to
> extract it?
Yes.
> In the domain names that use an extra part of the name (e.g.,
> 'site.com.tw' or 'site.co.uk'),
That is the third-level domain; people often use the umbrella term
"sub-level domain" instead.
> I've only ever seen 'com' and 'co' used that way.
There are several others.
> I guess I could check if that center part is 'com' or
> 'co' rather than checking to see if it's strlen() > 3.
Bad idea.
> Though, I'm not sure if that's all folks are using. I wonder if there are
> any published conventions for it?
Each top-level domain has an assigned authority, usually called a NIC
(Network Information Center), which writes the book on the domain names
under its control. Most if not all of them have a website where you can
find those rules. IANA (Internet Assigned Numbers Authority) maintains a
list of the registered TLDs and their assigned authorities; you can find it
on their website. Really, STFW.
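Once the registry rules have been collected, the extraction itself can be
done as a longest-suffix match. The following Python sketch assumes a
hand-maintained set `KNOWN_SUFFIXES`, which is hypothetical and deliberately
incomplete (it only holds the suffixes mentioned in this thread); the
function name `registrable_domain` is likewise made up for illustration.

```python
from typing import Optional
from urllib.parse import urlsplit

# Illustrative only: a real list must be built from the rules that the
# individual registries publish. These entries are just the ones from this thread.
KNOWN_SUFFIXES = {"com", "asia", "co.uk", "com.tw"}


def registrable_domain(url: str) -> Optional[str]:
    """Return the known suffix plus one more label, or None if nothing matches."""
    # urlsplit() only finds a host if the URL has a scheme or starts with "//".
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    # Try the longest candidate suffix first so that 'co.uk' wins over a
    # shorter match; a bare suffix without a further label is rejected.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in KNOWN_SUFFIXES:
            return ".".join(labels[i - 1:])
    return None


print(registrable_domain("http://www.site.co.uk/index.php"))
print(registrable_domain("http://site.com/"))
```

Checking against a list rather than guessing from label length or from
'com'/'co' is the point: a name whose suffix is not in the list yields None
instead of a wrong answer.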
PointedEars
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
-- Richard Cornford, cljs, <cife6q$253$1$8300dec7(at)news(dot)demon(dot)co(dot)uk> (2004)