DOMDocument loadHTML and double UTF8 encode [message #170254] Fri, 22 October 2010 22:09
roger21
Hi,

I use DOMDocument loadHTML to parse pages from this forum:
http://forum.hardware.fr/ (the forum is in UTF-8). My problem is that
some pages are correctly seen as UTF-8, like this one:
http://forum.hardware.fr/hfr/Hardware/liste_sujet-1.htm
and some are not, like this one:
http://forum.hardware.fr/hfr/HardwarePeripheriques/liste_sujet-1.htm
so the text from the second kind ends up doubly UTF-8 encoded.

So I test whether the page is doubly encoded and, if it is, I utf8_decode
the text values I want. But there are side effects: for example, the euro
sign is not doubly encoded, so it turns into garbage when utf8_decoded,
and I don't know whether other characters behave the same way.

So I am lost (and frustrated). Any idea how I should handle all this?
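
The workaround described above, and its euro-sign side effect, can be sketched roughly as follows. This is only an illustration under stated assumptions, not a confirmed fix: it assumes the double encoding comes from UTF-8 bytes being misread as ISO-8859-1 (the post does not say so), it uses mb_convert_encoding() from the mbstring extension as a stand-in for utf8_decode(), and the helper name undoDoubleUtf8 is hypothetical. Instead of decoding the whole string, it decodes only runs of characters that could be double-encoded bytes, so a correctly encoded euro sign is left alone.

```php
<?php
// Hypothetical helper (not from the original post): undo one layer of
// double UTF-8 encoding, leaving correctly encoded characters untouched.
// Assumption: the double encoding came from UTF-8 bytes read as ISO-8859-1.
function undoDoubleUtf8(string $text): string
{
    $fixed = preg_replace_callback(
        // Runs of code points U+0080..U+00FF: the only ones that
        // bytes misread as ISO-8859-1 can turn into.
        '/[\x{0080}-\x{00FF}]{2,}/u',
        function (array $m): string {
            // Map each code point back to the single byte it came from.
            $bytes = mb_convert_encoding($m[0], 'ISO-8859-1', 'UTF-8');
            // Keep the result only if those bytes form valid UTF-8.
            return mb_check_encoding($bytes, 'UTF-8') ? $bytes : $m[0];
        },
        $text
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8 input).
    return $fixed ?? $text;
}

// "café 5 €" where the é is double-encoded and the euro sign is not.
$input = "caf\xC3\x83\xC2\xA9 5 \xE2\x82\xAC";

// Whole-string decode, as in the post: the é is repaired, but the euro
// sign has no ISO-8859-1 equivalent and is replaced by a substitute character.
$naive = mb_convert_encoding($input, 'ISO-8859-1', 'UTF-8');

// Selective decode: only the double-encoded run is rewritten.
$repaired = undoDoubleUtf8($input);

echo bin2hex($naive), "\n";
echo bin2hex($repaired), "\n";
```

This is a heuristic and can misfire: a genuine sequence of two or more Latin-1 characters that happens to form valid UTF-8 bytes would be rewritten, and a double-encoded character directly next to a genuine Latin-1 character would be left unrepaired.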