Re: Processing accented characters submitted from forms [message #184490 is a reply to message #184487]
Fri, 03 January 2014 12:53
JohnT
Messages: 16 Registered: April 2011
On Fri, 03 Jan 2014 12:37:27 +0000, Ben Bacarisse wrote:
> JohnT <john-sospam(at)jtresponse(dot)co(dot)uk> writes: <snip>
>> We're already using iso-8859-1 for the whole website. It will be a lot
>> of work to change all that, so I guess we'll have to put up with the
>> odd Turkish I causing problems.
>
> It's not clear (to me at least) what's happening to the data, but as far
> as any normal set of HTML pages are concerned (PHP generated or
> otherwise) you don't have to put up with a dotted I causing problems on
> an ISO-8859-1 encoded page. You can represent any Unicode character in
> a page using character entities (browser and font support is always an
> issue but not nowadays for anything as ordinary as İ).
I think it must be the browser that is encoding the character, because İ
is not supported by ISO-8859-1.
It arrives in the request data as the HTML numeric entity (&#304;), as
that is the only way it can be transmitted in that charset.
This causes two issues:
First, as I always HTML-encode user-entered data before display, the
entity gets encoded twice. I'll have to add the 'disable double encode'
flag throughout my code :-)
Secondly, the value will be added to the database as the entity code, so
it will break searching the database etc...
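
To show what I mean by the double encoding, here is a minimal sketch. The
sample value is made up, and I'm assuming htmlspecialchars() is the
function doing the encoding and that the page charset is passed
explicitly:

```php
<?php
// Assumed request value: the browser has replaced the dotted capital I
// (U+0130) with a numeric entity because the form charset is ISO-8859-1.
$submitted = '&#304;stanbul & Ankara';

// Default behaviour: the "&" that starts the existing entity is escaped
// again, so the entity text itself shows up on the page.
$double = htmlspecialchars($submitted, ENT_QUOTES, 'ISO-8859-1');

// Fourth argument (double_encode) set to false: existing entities are
// left alone, while a bare "&" is still escaped.
$single = htmlspecialchars($submitted, ENT_QUOTES, 'ISO-8859-1', false);

echo $double, "\n"; // &amp;#304;stanbul &amp; Ankara
echo $single, "\n"; // &#304;stanbul &amp; Ankara
```

One catch with the flag: if a user really does type the text "&amp;"
into a field, it will be displayed as "&", so it is a trade-off rather
than a clean fix.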
I think the proper fix would be to convert to UTF-8.
But that's a lot of work. For now, I think I'll just manually
transliterate the codes that cause issues.
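
Something along these lines is what I have in mind. It is only a sketch:
the function name and the table are my own, and the table would need
extending as other characters turn up.

```php
<?php
// Hypothetical lookup table: numeric entities that browsers substitute
// for characters outside ISO-8859-1, mapped to plain ASCII stand-ins.
$translit = array(
    '&#304;' => 'I', // U+0130 capital I with dot above
    '&#305;' => 'i', // U+0131 small dotless i
    '&#350;' => 'S', // U+015E capital S with cedilla
    '&#351;' => 's', // U+015F small s with cedilla
    '&#286;' => 'G', // U+011E capital G with breve
    '&#287;' => 'g', // U+011F small g with breve
);

// Hypothetical helper: replace every known entity in one pass.
// Anything not in the table is left untouched.
function transliterate_entities($value, array $map)
{
    return strtr($value, $map);
}

$clean = transliterate_entities('&#304;stanbul', $translit);
echo $clean, "\n"; // Istanbul
```

Run on the request data before it is stored, this would keep the entity
codes out of the database, at the cost of losing the original spelling.
It also cannot tell a browser-generated entity from one a user typed in
literally.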
JohnT