Home » Imported messages » comp.lang.php » reading files with accents in the filename from PHP
Re: reading files with accents in the filename from PHP [message #183116 is a reply to message #183107] Wed, 09 October 2013 20:54
Thomas 'PointedEars'
Thomas Mlynarczyk wrote:

> Erwin Moller schrieb:
>> How can PHP open files on the local filesystem that contain certain
>> characters, like umlauts, accents, etc?
>
> $path = __DIR__ . '\Eugène.txt';
> var_dump( PHP_VERSION, file_exists( $path ) );
>
> Works on my Windows XP, PHP 5.4.8, *if* the PHP file is stored in ANSI
> (="Windows") encoding.

There is no “ANSI encoding”. Usually “ANSI encoding” means Windows-1252
[0]. It would be either coincidence or strange if this worked, because
FAT32 uses the “OEM character set”, i. e. one of the various IBM code pages
(437 for English), and NTFS uses UTF-16 [1]. The letter “è” has
Windows-1252 code 0xE8, IBM437/IBM850 code 0x8A, and Unicode code point
U+00E8 [2] (encoded in UTF-16 as the code unit 0x00E8 [3]). So the
Windows-1252 byte differs from the OEM code, and it is only one half of
the UTF-16 code unit; if it works with “ANSI”, something must convert the
file name before it reaches the file system.

[0] <http://en.wikipedia.org/wiki/Windows-1252>
[1] <http://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx>
[2] <http://en.wikipedia.org/wiki/Western_Latin_character_sets_(computing)#Comparison_table>
[3] <http://rishida.net/tools/conversion/>
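
The byte values of “è” in these encodings can be checked directly from PHP.
The following is only a minimal sketch: it assumes the iconv extension is
available and that the underlying iconv library knows the charset names
used here; encoded_hex() is a made-up helper name, not something from the
thread.

```php
<?php
// Hypothetical helper (not from the thread): hex dump of a UTF-8 string
// after converting it to another charset with iconv().
function encoded_hex($utf8, $charset)
{
    return strtoupper(bin2hex(iconv('UTF-8', $charset, $utf8)));
}

// "è" (U+00E8) given as explicit UTF-8 bytes, so the result does not
// depend on the encoding in which this script itself is saved.
$e_grave = "\xC3\xA8";

foreach (array('Windows-1252', 'CP850', 'UTF-16BE', 'UTF-16LE') as $charset) {
    printf("%-12s %s\n", $charset, encoded_hex($e_grave, $charset));
}
// Prints:
// Windows-1252 E8
// CP850        8A
// UTF-16BE     00E8
// UTF-16LE     E800
```

Writing the character as "\xC3\xA8" instead of a literal “è” keeps the
check independent of how the script file itself is encoded.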

BTW, you want to upgrade soon:

<http://blogs.technet.com/b/security/archive/2013/08/15/the-risk-of-running-windows-xp-after-support-ends.aspx>

> Doesn't work if stored in UTF8.

The file, or the string?

> So I suspect it's an encoding issue: the accented character "è" is stored
> as \xE8 in the file system,

Yes, it is, but only with NTFS.

> but if your script is UTF8, then your $path will contain \xC3\xA8 instead.

Which cannot work with NTFS.

> With an "explicit" $path = __DIR__ . "\Eug\xE8ne.txt" it works when the
> script is UTF8.

But only with NTFS and compatible filesystems.
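
If the script has to stay in UTF-8, one possible workaround is to convert
the path to a single-byte code page before handing it to the file
functions, instead of hard-coding "\xE8". This is only a sketch under
assumptions: that the PHP build in question passes file names through as
bytes in the system code page, that this code page is Windows-1252, and
that iconv is available. path_for_filesystem() is a made-up name.

```php
<?php
// Hypothetical helper: convert a UTF-8 path to the single-byte code page
// that is assumed to be in effect for file names (here: Windows-1252).
function path_for_filesystem($utf8Path, $codePage = 'Windows-1252')
{
    $converted = iconv('UTF-8', $codePage, $utf8Path);
    // iconv() returns false if the conversion fails, e.g. when a
    // character has no equivalent in the target code page.
    if ($converted === false) {
        throw new RuntimeException("Cannot represent path in $codePage");
    }
    return $converted;
}

// "\xC3\xA8" is "è" in UTF-8; after conversion the path contains "\xE8".
$path = path_for_filesystem(__DIR__ . "\\Eug\xC3\xA8ne.txt");
var_dump(file_exists($path));
```

Whether file_exists() then returns true still depends on the file being
there and on the assumptions above holding for the system at hand.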


PointedEars
--
Anyone who slaps a 'this page is best viewed with Browser X' label on
a Web page appears to be yearning for the bad old days, before the Web,
when you had very little chance of reading a document written on another
computer, another word processor, or another network. -- Tim Berners-Lee