FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Problems charset [message #163184] Tue, 05 October 2010 12:52
INVY (Italy), Member
Example (imported from a newsgroup):

http://newsgroups.cyberspazio.org/t/76596/

1st post = UTF-8

2nd post = ISO

How do I convert ISO to UTF-8?
Re: Problems charset [message #163224 is a reply to message #163184] Sun, 10 October 2010 18:41
INVY (Italy)
I added these debug lines to /include/nntp.inc:

echo "FORUM CHARSET=[". $GLOBALS['CHARSET'] ."]\n";
echo "RAW SUBJ=[". $this->subject ."]\n";
echo "BODY=[". $this->body ."]\n";
$this->subject = htmlspecialchars(trim(decode_header_value($this->headers['subject'])));
echo "NEW SUBJ=[". $this->subject ."]\n";


LOG:

E:\Programmi\wamp\bin\php\php5.2.9-2>php E:\Programmi\wamp\www\fudforum\scripts\nntp.php 1
Importing free.it.storia.medioevo message 1059
FORUM CHARSET=[utf-8]
RAW SUBJ=[]
BODY=[Io sono un moderatore di un ng culturale della gerarchia it.* e ti posso
assicurare che nessuno si Þ mai lamentato di una eventuale nostra censura
(siamo 4 comoderatori): basta seguire le regole del manifesto (votato da
tutti gli utenti all'atto della formazioen del ng moderato) e non insultare
nessuno. Per quanto riguarda spam, troll e altre "simpatiche" cose del
genere, purtroppo esso compare in modo massiccio nei ng non moderati, e
l'appello all'abuse dei provider non dÓ quasi mai esito, tanto che per lungo
tempo la gerarchia it.* Þ stata preclusa agli utenti TIM, proprio perchÚ
moltisimi spammer impostavano da quel provider, il cui abuse non dava alcun
segno di vita...
...........................

Þ
Ó
Þ
Ú

Solution?
Re: Problems charset [message #163225 is a reply to message #163224] Mon, 11 October 2010 01:56
naudefj (South Africa), Senior Member, Administrator, Core Developer
You need to find the post (with all its headers) to see if it properly declared its character set.
If not, this is not a FUDforum problem and there would be nothing we could do about it.
If it is FUDforum's fault, you can post the raw message here or mail it to me for investigation.
Re: Problems charset [message #163227 is a reply to message #163225] Mon, 11 October 2010 04:21
INVY (Italy)
200 News.GigaNews.Com
AUTHINFO USER ****
381 more authentication required
AUTHINFO PASS *******
281 News.GigaNews.Com
Group free.it.storia.medioevo
211 5489 1059 6547 free.it.storia.medioevo
ARTICLE 1059
220 1059 <y0GLa(dot)131176$Ny5(dot)3721365(at)twister2(dot)libero(dot)it>
Path: news5.aus1.giganews.com!firehose2!nntp4!intern1.nntp.aus1.giganews.com!border1.nntp.aus1.giganews.com!nntp.giganews.com!news-out.tin.it!news-in.tin.it!nntp.infostrada.it!twister2.libero.it.POSTED!not-for-mail
From: "Vlad[]" <joe_falchettoTOGLI(at)bigfoot(dot)com>
Newsgroups: free.it.storia.medioevo
References: <aFsAa(dot)34484$lK4(dot)1063136(at)twister1(dot)libero(dot)it> <Xns9387CD33758F2cguascobiella(at)195(dot)31(dot)190(dot)132> <dRsAa(dot)34416$Ny5(dot)1078349(at)twister2(dot)libero(dot)it> <Xns9387CEE1C997Bcguascobiella(at)195(dot)31(dot)190(dot)131> <5%sAa(dot)34560$lK4(dot)1064569(at)twister1(dot)libero(dot)it>
Subject: Re: ? ? ?
Lines: 30
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Message-ID: <y0GLa(dot)131176$Ny5(dot)3721365(at)twister2(dot)libero(dot)it>
Date: Sun, 29 Jun 2003 18:14:54 GMT
NNTP-Posting-Host: 151.24.144.251
X-Complaints-To: abuse(at)libero(dot)it
X-Trace: twister2.libero.it 1056910494 151.24.144.251 (Sun, 29 Jun 2003 20:14:54
 MET DST)
NNTP-Posting-Date: Sun, 29 Jun 2003 20:14:54 MET DST
Organization: [Infostrada]
Xref: intern1.nntp.aus1.giganews.com free.it.storia.medioevo:1059

Io sono un moderatore di un ng culturale della gerarchia it.* e ti posso
assicurare che nessuno si Þ mai lamentato di una eventuale nostra censura
(siamo 4 comoderatori): basta seguire le regole del manifesto (votato da
tutti gli utenti all'atto della formazioen del ng moderato) e non insultare
nessuno. Per quanto riguarda spam, troll e altre "simpatiche" cose del
genere, purtroppo esso compare in modo massiccio nei ng non moderati, e
l'appello all'abuse dei provider non dÓ quasi mai esito, tanto che per lungo
tempo la gerarchia it.* Þ stata preclusa agli utenti TIM, proprio perchÚ
moltisimi spammer impostavano da quel provider, il cui abuse non dava alcun
segno di vita...
Re: Problems charset [message #163232 is a reply to message #163227] Mon, 11 October 2010 11:40
naudefj (South Africa)
The first message has "Content-Type: text/plain; charset=windows-1252".
So, FUDforum converted it from windows-1252 to UTF-8.

The second message doesn't declare its character set.
As a result, FUDforum doesn't know how to treat it and assumes the character set of your forum (thus, no conversion is done).
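
For readers who want to see the behaviour described above in isolation, here is a minimal sketch, assuming PHP 7 or later with the iconv extension. It is not FUDforum's code: the function names `detect_declared_charset()` and `body_to_utf8()` and the `$fallback` parameter are hypothetical, made up for this illustration. It reads the charset from the Content-Type header when one is declared and otherwise uses a fallback that the administrator chooses.

```php
<?php
// Hypothetical helpers for illustration only; these are not FUDforum functions.

// Return the charset declared in the Content-Type header, or $fallback if none is declared.
function detect_declared_charset(array $headers, string $fallback): string
{
    if (isset($headers['content-type'])
        && preg_match('!charset="?([^";\s]+)!i', $headers['content-type'], $m)) {
        return strtolower($m[1]);
    }
    return strtolower($fallback);
}

// Convert a message body to UTF-8, using the declared charset or the fallback.
function body_to_utf8(string $body, array $headers, string $fallback = 'windows-1252'): string
{
    $charset = detect_declared_charset($headers, $fallback);
    if ($charset === 'utf-8') {
        return $body; // already in the target encoding, nothing to convert
    }
    // iconv() returns false when it cannot convert; keep the original bytes in that case.
    $converted = iconv($charset, 'UTF-8', $body);
    return $converted === false ? $body : $converted;
}

// Usage: an article without a Content-Type header, body stored as windows-1252 bytes.
$raw = "si \xE8 mai"; // \xE8 is "è" in windows-1252
echo body_to_utf8($raw, array()), "\n";
```

The fallback is a guess by design. windows-1252 suits a mostly Italian group posted from Outlook Express, while a Russian group would more likely need koi8-r, so a wrong fallback will still produce garbled text.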
Re: Problems charset [message #164402 is a reply to message #163232] Fri, 04 February 2011 01:19
rover_scan, Member

Quote:
The second message doesn't declare its character set.

In nntp.inc v 1.86 I solved the problem (http://fudforum.org/forum/index.php?t=msg&th=119040&goto=161544&#msg_161544) by falling back to KOI8-R if the charset is not specified:

if (isset($this->headers['content-type']) && preg_match('!charset="?([^"]+?)"?(;|\s|$)!', $this->headers['content-type'], $m)) {
	$charset = $m[1];
} else {
	// $charset = $GLOBALS['CHARSET'];	// original line
	$charset = 'koi8-r';			// my change
}

How do I make the same change in nntp.inc revision 5075? Where is the function parse_msgs()? :)

[Updated on: Fri, 04 February 2011 01:29]

Re: Problems charset [message #164419 is a reply to message #164402] Fri, 04 February 2011 07:35
naudefj (South Africa)
What is 5075?

Function parse_msgs() is in compiler.inc. However, it is not related to the current topic! Why do you ask?
Re: Problems charset [message #164421 is a reply to message #164419] Fri, 04 February 2011 07:47
rover_scan

Quote:
What is 5075?

<?php
/**
* copyright            : (C) 2001-2010 Advanced Internet Designs Inc.
* email                : forum(at)prohost(dot)org
* $Id: nntp.inc 5075 2010-11-15 17:59:45Z naudefj $
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the
* Free Software Foundation; version 2 of the License.
**/


Quote:
it is not related to the current topic


I have the same problem.
Re: Problems charset [message #164423 is a reply to message #164421] Fri, 04 February 2011 07:55
naudefj (South Africa)
Please use the provided workaround.
Re: Problems charset [message #164431 is a reply to message #164423] Fri, 04 February 2011 21:35
rover_scan

Quote:
Please use the provided workaround

I do not quite understand you (I am relying on http://translate.google.ru :)
I'll try to describe the problem.
It concerns how the encoding of a message is determined. In version 3.00 this was done in nntp.inc v 1.86, in the function parse_msgs(). If no encoding was specified (Outlook Express does this), I set $charset = 'koi8-r'; and everything worked.
In version 3.02, where is the encoding determined? Where is parse_msgs()? (I found it, but things have changed! :) What do I do? If the encoding is not specified, where do I insert $charset = 'koi8-r';?
Re: Problems charset [message #164434 is a reply to message #164431] Fri, 04 February 2011 22:55
naudefj (South Africa)
I think I understand the problem.
The workaround doesn't work for FUDforum 3.0.2 because we moved the MIME code into a separate file.
Try to edit "include/mime_decode.inc" and change:

$this->headers['__other_hdr__']['content-type']['charset'] = 'utf-8';

to
$this->headers['__other_hdr__']['content-type']['charset'] = 'koi8-r';

Re: Problems charset [message #164437 is a reply to message #164434] Sat, 05 February 2011 07:18
rover_scan

It does not work as it should: the subject is not decoded.
http://sochiconfa.ru/index.php/f/1/S=aeb61e9aec5dc19b45b7992ba406449f
I tried to edit the function decode_header_value($val) in scripts_common.inc,
but I do not understand how it works there. In version 3.00 everything was easier :)
Re: Problems charset [message #164443 is a reply to message #164437] Sat, 05 February 2011 23:13
naudefj (South Africa)
So, what are you saying, the body was correctly decoded but the subject wasn't?

BTW: Your link doesn't work. It points to an empty 3.0.0 forum.
Re: Problems charset [message #164444 is a reply to message #164443] Sat, 05 February 2011 23:39
rover_scan

Quote:
So, what are you saying, the body was correctly decoded but the subject wasn't?

Yes

Quote:
BTW: Your link doesn't work. It points to an empty 3.0.0 forum.


I downgraded :)
Re: Problems charset [message #164488 is a reply to message #164444] Wed, 09 February 2011 21:45
rover_scan

I partially solved the problem.
In the file scripts_common.inc, change

if (function_exists('iconv_mime_decode')) {
		return iconv_mime_decode(trim($val), 2);

to
	if (function_exists('iconv_mime_decode')) {
		return iconv_mime_decode(trim($val), 2, $GLOBALS['CHARSET']);


In the file mime_decode.inc:
	function fetch_useful_headers() {
		include_once "detect_cyr_charset.php";	// charset detection script (in attachment)

		$detected = detect_cyr_charset($this->headers['subject']);
		if ($detected == "i") {
			// The script reports ISO: nothing to decode.
			$this->subject = htmlspecialchars(trim($this->headers['subject']));
		} elseif ($detected == "m") {
			// The script reports MAC: I don't know how to handle it.
			$this->subject = "Decode subject failed :(";
		} else {
			$this->subject = decode_string(htmlspecialchars(trim($this->headers['subject'])), $this->headers['content-transfer-encoding'], $this->headers['__other_hdr__']['content-type']['charset']);
		}



I do not quite understand how this convoluted code works, but it works! :)
About 80% of the $this->headers['subject'] values are decoded correctly.

Is there a way to make it better?
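
As an illustration of the scripts_common.inc change, here is a minimal sketch, assuming PHP 7 or later with the iconv extension. The wrapper name `decode_subject_to_utf8()` is hypothetical and is not a FUDforum function. The third argument of `iconv_mime_decode()` names the charset of the result; when it is omitted, PHP falls back to its configured internal encoding, which need not match the forum charset.

```php
<?php
// Hypothetical wrapper for illustration only; not the FUDforum implementation.
function decode_subject_to_utf8(string $raw): string
{
    // ICONV_MIME_DECODE_CONTINUE_ON_ERROR is the constant behind the literal 2 used in the thread.
    $decoded = iconv_mime_decode(trim($raw), ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    // On failure, fall back to the trimmed raw header rather than dropping the subject.
    return $decoded === false ? trim($raw) : $decoded;
}

// Usage: a subject that declares its charset as a MIME encoded-word.
echo decode_subject_to_utf8('=?ISO-8859-1?Q?perch=E9?='), "\n";
```

This only helps when the subject is a MIME encoded-word that names its own charset. A subject sent as raw 8-bit bytes with no declaration carries no charset information, so it still needs a guessed fallback or a detection script like the one above.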

[Updated on: Wed, 09 February 2011 22:15]

Re: Problems charset [message #164505 is a reply to message #164488] Fri, 11 February 2011 23:15
naudefj (South Africa)
I don't mind adding it. The question is, will it also work for other users, or are there cases where it will break?
Re: Problems charset [message #164509 is a reply to message #164505] Sat, 12 February 2011 03:19
rover_scan

Do not add it. This solution only covers my particular case and will not work correctly with other languages. It is not universal. Maybe some part of it will be useful to you.

(via http://translate.google.ru :/)

[Updated on: Sat, 12 February 2011 03:19]

Re: Problems charset [message #183396 is a reply to message #164509] Tue, 22 October 2013 03:19
naudefj
For future reference, see solution at
http://fudforum.org/forum/index.php?t=msg&goto=183395