FUDforum: comp.lang.php » mysql dynamic binding and pass-by-ref deprecated


Re: UTF-8 charset [message #180854 is a reply to message #180850]

Fri, 22 March 2013 00:17

Thomas 'PointedEars'

M. Strobel wrote:

> Am 21.03.2013 15:31, schrieb Adrian Tuddenham:
>> M. Strobel <sorry_no_mail_here(at)nowhere(dot)dee> wrote:
>>> Am 21.03.2013 00:13, schrieb The Cat in the Hat:
>>>> Christoph Becker <cmbecker69(at)gmx(dot)de> wrote in
>>>> news:kide60$df1$1(at)speranza(dot)aioe(dot)org:
>>>> > The Cat in the Hat wrote:
>>>> >> How about omitting the smart quotes in your posts?
>>>> >>
>>>> >> Come back after you've figured out how to configure your newsreader
>>>> >> properly.
>>>> >
>>>> > Thomas' message was properly encoded as UTF-8. Isn't that acceptable
>>>> > in this NG?
>>>>
>>>> Usenet was designed to use ASCII, not UTF-8. At best, it's poor
>>>> netiquette in *any* NG.
>>>
>>> Utter nonsense.

ACK.

>> There are some computers that cannot read UTF-8.

That would be computers and software older than 20 years now. UTF-8, one of
the encodings for the character set specified by the Unicode Standard, was
devised in 1992 CE, and version 2.0 of the Standard was published in 1996.
All reasonably modern operating
systems, in effect all commonly used ones, support Unicode and provide
Unicode-capable fonts. Many have made a Unicode encoding their default
encoding; for example, NTFS encodes filenames using UTF-16, and UTF-8 has
been the default locale encoding on GNU/Linux systems for several years now.

Thus, a major criticism of PHP is that as of version 5.4 it still has no
native Unicode support, while other popular programming languages on the
Web, like ECMAScript implementations and Python, have.

>> Always use the lowest common denominator if you want to communicate
>> effectively without excluding anyone.

Insisting on using *ancient* and therefore inherently *insecure* hardware
and software is no excuse for claiming something mindbogglingly absurd like
that using UTF-8 encoding would be “poor netiquette in *any* NG”. First of
all, netiquette (network etiquette) is concerned less with the technical
aspects of messages than with the behavior that people exhibit towards each
other through them, which makes this clearly a case of “the pot calling the
kettle black”.

Speaking of netiquette, though, it *is* considered polite (on Usenet) to
introduce oneself with one's full name. People posting under unnecessarily
abbreviated names, pseudonyms, and plain nicknames, like “The Cat in the
Hat”, are usually frowned upon, laughed at, or simply ignored by
longtime regulars (like me). The same applies to people who violate
technical standards or quasi-standards such as RFC 5536, along with the AUPs
of their providers, by using non-addresses in address header field values:

<http://www.zedat.fu-berlin.de/NetNews-Regeln>
<http://www.kirchwitz.de/~amk/dni/netiquette> (based on a Big 8 original)

> More than 90% of the text is still readable if your usenet client only
> displays ASCII.
>
> In the anglophone world people easily come to think ASCII is sufficient,
> but in a global world even 16-bit Unicode (Basic Multilingual Plane) is
> not.

Those are good points, however:

> Some languages use UCS-16 internally, and it is not enough for all uses
> and users.

This is a common misconception. There is no “UCS-16” and there never was.
At most, there is or was (depending on how you look at it) UCS-2 (Universal
Character Set, 2-byte form), to whose character set the Unicode character
set from Unicode 2.0 on is equivalent, code point for code point.

The underlying misconception here is confusing character set with character
encoding, probably also common because of the “charset” parameter of
Internet messages, which despite its name declares the *character encoding*
of a message.
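
The difference can be made concrete with a minimal Python 3 sketch. The
variable names are made up for this example; the codec names are the ones
Python's standard library ships with. It takes one character from the
Unicode character set and encodes it in three different ways:

```python
# One abstract character from the Unicode character set ...
char = "\u20ac"            # EURO SIGN
code_point = ord(char)     # its position in the character set: 0x20AC

# ... and three different character encodings of that same code point.
encoded = {
    name: char.encode(name)
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

print(f"U+{code_point:04X}")
for name, data in encoded.items():
    print(f"{name:<9} {data.hex()}")
```

The code point stays the same throughout; only the byte sequence changes.
A declaration such as “charset=UTF-8” in a message header names one of
those byte-level encodings, not the character set.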

UCS-2 is/was a standard for a *character set along with a specific, 16-bit
encoding* to encode all characters in the Basic Multilingual Plane (BMP),
and *only* those. This and the overhead for simple (purely US-ASCII-based)
texts led to the later success of the competing standard, Unicode.

See also: <http://en.wikipedia.org/wiki/UCS-16> (properly redirected to the
UCS article, with an explanation why “UCS-16” is just wrong.)

*UTF*-16 (Unicode Transformation Format, 16-bit) is *another* encoding where
each *code unit* of a sequence for a character has 16 bits; a character may
be encoded using more than one code unit (the same is true for the other
UTFs). The character set thus encoded is the Unicode character set, and
both are defined in the Unicode Standard. That character set comprises, and
its transformation formats can encode, a lot more than just the BMP,
although of the subsets of Unicode the BMP remains the best supported one,
also due to pre-installed font support. At the moment, there are 1'114'112
code points in Unicode (U+0000 to U+10FFFF), so in theory a text encoded
with one of the UTFs could contain 1'114'112 different characters. (In
practice, the 2048 code points from U+D800 to U+DFFF in the BMP are reserved
as surrogates, which UTF-16 uses in pairs to encode characters beyond the
65536 code points of the BMP while keeping the encoding relatively simple
and backwards-compatible.)
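
As a rough illustration of code units versus characters, here is a small
Python 3 sketch. The helper utf16_code_units() and the constant names are
hypothetical, introduced only for this example. It splits the UTF-16
encoding of a string into its 16-bit code units and redoes the arithmetic
on the code point counts:

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16 (big-endian)."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u00e4"         # U+00E4, inside the BMP
astral_char = "\U0001F600"  # U+1F600, outside the BMP

units_bmp = utf16_code_units(bmp_char)        # one code unit
units_astral = utf16_code_units(astral_char)  # two code units: a surrogate pair

# Code point arithmetic for the ranges mentioned in the text.
TOTAL_CODE_POINTS = 0x10FFFF + 1              # U+0000 to U+10FFFF
SURROGATE_CODE_POINTS = 0xDFFF - 0xD800 + 1   # U+D800 to U+DFFF
ENCODABLE_CODE_POINTS = TOTAL_CODE_POINTS - SURROGATE_CODE_POINTS

print([f"{u:04X}" for u in units_bmp])
print([f"{u:04X}" for u in units_astral])
print(TOTAL_CODE_POINTS, SURROGATE_CODE_POINTS, ENCODABLE_CODE_POINTS)
```

A pure 16-bit encoding like UCS-2 has no way to represent the second
character at all, which is the limitation described above.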

> So UTF-8 is THE solution. There must be some progress sometime.

ACK.

> ASCII, 7-bit encodings and octal are left behind.

US-ASCII *is* a 7-bit encoding. “Octal”?

The most important thing about UTF-8 is that it is equivalent to US-ASCII
for code points below U+0080 (one 8-bit code unit per character, the MSB is
always 0, encodes the same characters as in US-ASCII). Therefore, UTF-8,
through Unicode, allows texts with the greatest possible range of characters
(all written languages, even some extinct ones, common symbols, punctuation
etc.) while using the least amount of memory when this range is _not_ fully
used.
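
Both properties are easy to check. The following Python 3 sketch, with
sample strings and variable names chosen arbitrarily for this example,
compares the US-ASCII and UTF-8 encodings of a pure-ASCII text and measures
how many bytes UTF-8 needs for characters further up in the code point
range:

```python
ascii_text = "Hello, Usenet!"

# For code points below U+0080, the UTF-8 bytes are the US-ASCII bytes.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
msb_clear = all(byte < 0x80 for byte in ascii_text.encode("utf-8"))

# UTF-8 spends more bytes only on characters that need them.
samples = ["A", "\u00e4", "\u20ac", "\U0001F600"]  # U+0041, U+00E4, U+20AC, U+1F600
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

print(same_bytes, msb_clear)
print(utf8_lengths)
```

A text that stays within US-ASCII therefore costs exactly one byte per
character in UTF-8, and software that only understands US-ASCII can still
read it unchanged.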

Actually, it should not be necessary to explain all this to people
subscribed to a programming newsgroup, and to Web developers in particular,
but there you are.

<http://www.joelonsoftware.com/articles/Unicode.html>
<http://unicode.org/faq/>

HTH

PointedEars
--
> If you get a bunch of authors […] that state the same "best practices"
> in any programming language, then you can bet who is wrong or right...
Not with javascript. Nonsense propagates like wildfire in this field.
-- Richard Cornford, comp.lang.javascript, 2011-11-14
