FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

PDF from HTML topic is poor [message #28990] Wed, 23 November 2005 17:32
PaulC   United States
Messages: 13
Registered: November 2005
Location: Silicon Valley
Karma: 0
Junior Member
When generating a PDF from a topic in HTML format, the extended character entities (&nbsp;, &lt;, &#8220;, &#8221;, etc.) are not converted to actual characters; they are shown in their raw form.

<hr> tags also seem to have been stripped. Most paragraphs, too.

Is the article converted to plain text before the PDF is generated? The PDF created bears little resemblance to the topic viewed online.

This isn't that important to our use of FUDforum, but it's a cool feature and it's a shame it doesn't work well in this circumstance.

[Updated on: Wed, 23 November 2005 17:33]

Re: PDF from HTML topic is poor [message #28992 is a reply to message #28990] Thu, 24 November 2005 00:38
Ilia   Canada
Messages: 13241
Registered: January 2002
Karma: 0
Senior Member
Administrator
Core Developer
A PDF document does not understand HTML, so the forum has to convert things into more or less equivalent plain text.

The entities issue is resolved by the following patch:
http://cvs.prohost.org/c/index.cgi/FUDforum/chngview?cn=7447
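
For readers who want to see roughly what an HTML to plain-text conversion with entity decoding involves, here is a minimal sketch. It is not the patch linked above and not FUDforum's PHP code; it assumes Python's standard library, and the function name html_to_plain_text and the sample string are made up for illustration.

```python
import html
import re


def html_to_plain_text(fragment: str) -> str:
    """Rough HTML to plain-text conversion (illustrative only)."""
    # Turn <hr> into a dashed rule so it is not silently lost.
    text = re.sub(r"(?i)<hr\s*/?>", "\n" + "-" * 40 + "\n", fragment)
    # Turn <br> and closing </p> tags into line breaks.
    text = re.sub(r"(?i)<br\s*/?>|</p\s*>", "\n", text)
    # Drop any remaining tags.
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities last, so an escaped "&lt;b&gt;" is not mistaken for a tag.
    return html.unescape(text)


# Hypothetical input using the entities mentioned in this thread.
sample = "<p>&#8220;Quoted&#8221;&nbsp;text &lt;here&gt; &amp; more</p><hr>"
plain = html_to_plain_text(sample)
```

Decoding entities after the tags are stripped is the important ordering choice here: done the other way round, text such as &lt;here&gt; would turn into something that looks like a tag and get removed.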


FUDforum Core Developer
Re: PDF from HTML topic is poor [message #29003 is a reply to message #28992] Thu, 24 November 2005 15:33
PaulC
Not quite there yet... please see attached.
  • Attachment: testmail.zip
    (Size: 30.38KB, Downloaded 885 times)
Re: PDF from HTML topic is poor [message #29005 is a reply to message #29003] Thu, 24 November 2005 16:41
Ilia
Well, I imported a message and made a PDF out of it. The embedded image was removed, but that's expected. There is a link to the attachment in the PDF.

No bugs there...

Re: PDF from HTML topic is poor [message #29010 is a reply to message #29005] Thu, 24 November 2005 17:01
PaulC
Sorry, should've been more clear.

As an example, take the first snippet of text from the text/html part of the email:
Let=E2=80=99s face it,

the result in the PDF is:
Letâ€™s face it,


The =E2=80=99 sequence is correctly converted to a single quote (’) when viewed through the web interface, but in the PDF it's converted to three characters: a-circumflex, the euro symbol, and the trademark symbol.
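
The behaviour described here can be reproduced outside the forum. The sketch below assumes Python's standard library, not anything FUDforum does internally; it decodes the quoted-printable snippet and then interprets the same three bytes once as UTF-8 and once as Windows-1252 (cp1252). The choice of cp1252 is an assumption based on the euro and trademark symbols, which that code page maps to bytes 0x80 and 0x99.

```python
import quopri

# The quoted-printable snippet quoted in this thread.
encoded = b"Let=E2=80=99s face it,"

# Quoted-printable decoding yields raw bytes, not characters.
raw = quopri.decodestring(encoded)  # b'Let\xe2\x80\x99s face it,'

# Read as UTF-8, the three bytes E2 80 99 form one character, U+2019.
as_utf8 = raw.decode("utf-8")

# Read as Windows-1252, each byte becomes its own character:
# a-circumflex, euro sign, trademark sign.
as_cp1252 = raw.decode("cp1252")

print(raw)
print(len(as_utf8), len(as_cp1252))  # character counts differ by two
```

The bytes are identical in both cases; only the charset used to interpret them differs.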
Re: PDF from HTML topic is poor [message #29011 is a reply to message #29010] Thu, 24 November 2005 17:06
Ilia
My browser shows it as 3 characters, which is what you have there. The reason SOME browsers show it correctly is that they are set to auto-detect UTF-8.

Bottom line is that you have 3 bytes of text and the PDF shows you 3 bytes of text.
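
As a rough illustration of what "auto-detect" means, the sketch below tries UTF-8 first and falls back to a single-byte code page when the bytes are not valid UTF-8. It is a deliberately simplified stand-in, assuming Python and a hypothetical helper named guess_decode; real browser detection is more involved and this is not how any particular browser implements it.

```python
def guess_decode(data: bytes):
    """Return (text, charset), preferring UTF-8 when the bytes are valid UTF-8."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Fallback charset is an assumption; cp1252 is common on Windows.
        return data.decode("cp1252", errors="replace"), "cp1252"


# The three bytes from this thread are valid UTF-8, so they decode to one character.
quote_text, quote_charset = guess_decode(b"\xe2\x80\x99")

# A lone 0xE9 byte is not valid UTF-8, so the fallback is used instead.
cafe_text, cafe_charset = guess_decode(b"caf\xe9")
```

Without detection of this kind, a viewer that assumes a single-byte charset shows each of the three bytes as a separate character.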

Re: PDF from HTML topic is poor [message #29024 is a reply to message #29011] Fri, 25 November 2005 06:35
PaulC
On Windows XP, IE 6 shows it as a single quote, as does Firefox, and Outlook shows a single quote in the original email.

I don't know what browser you're using, but the majority of my users will be using IE6 and/or Firefox under Windows XP. I'm guessing that's true for most forums.

Paul
Re: PDF from HTML topic is poor [message #29038 is a reply to message #29024] Sun, 27 November 2005 16:19
Ilia
I use Firefox 1.0.7 on WinXP and I do not see a single quote. You probably have yours set to auto-detect UTF-8.