encoding [message #186326]
Thu, 26 June 2014 01:50
Tear my hair out time.
I have a csv file that contains text strings that I wish to display in a
web page.
The csv file is utf-8, and the text strings include the British pound
symbol encoded as two bytes 0xc2/0xa3.
I call setlocale() before reading the csv file, which I hope means that the
csv file is read as utf-8.
Then I feed the string through htmlentities() before adding it to the web
page.
However, the web page that arrives at the client has Â£
instead of just £.
I'm not sure where it's going wrong, partly because right now I may be
too tired to work out where and how I can inspect the string without
character encodings getting in the way.
If I print_r the data that has been read in to the web page, that shows
ok, but at that point it's still utf-8, not an html entity.
The following is at http://www.sined.co.uk/tmp/pound.php and seems to
demonstrate the issue:
$str1 = "\xc2\xa3";
$str2 = htmlentities( $str1 );
echo <<< EOT
I'm not sure how to fix this. Ideas anyone?
Denis McMahon, denismfmcmahon(at)gmail(dot)com
Re: encoding [message #186328 is a reply to message #186327]
Thu, 26 June 2014 07:20
Denis McMahon wrote:
> htmlentities( $string, ENT_COMPAT, "UTF-8" );
> Not sure if I actually need the setlocale or not. Seems to work without
In PHP < 5.4 the 3rd parameter defaults to 'ISO-8859-1', so setting it
explicitly to the encoding of $string is important whenever $string may
contain non-ASCII characters. For instance:
htmlentities("\xC3\xA4", ENT_COMPAT, 'ISO-8859-1');
// => '&Atilde;&curren;'
htmlentities("\xC3\xA4", ENT_COMPAT, 'UTF-8');
// => '&auml;'
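
The same effect can be reproduced with the pound sign from the original post. What follows is only a minimal sketch: the variable names $pound, $wrong and $right are made up for illustration. bin2hex() is used to look at the raw bytes so that no character encoding gets in the way, and the charset is passed explicitly so the result should not depend on the default of the PHP version in use.

```php
<?php
// The pound sign as the two UTF-8 bytes mentioned in the original post.
$pound = "\xC2\xA3";

// Inspect the raw bytes as hex, independent of any display encoding.
echo bin2hex($pound), "\n"; // c2a3

// Declaring the bytes as ISO-8859-1 converts each byte on its own,
// which is where the stray A-circumflex comes from.
$wrong = htmlentities($pound, ENT_COMPAT, 'ISO-8859-1');
echo $wrong, "\n"; // &Acirc;&pound;

// Declaring the real encoding yields a single entity.
$right = htmlentities($pound, ENT_COMPAT, 'UTF-8');
echo $right, "\n"; // &pound;
```

If the page itself is served as UTF-8, htmlspecialchars() with the same 'UTF-8' argument may be enough, since the pound sign can then be sent as-is.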
Christoph M. Becker