FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Yahoo Groups -> mbox -> maillist.php -> missing posts [message #15672] Tue, 30 December 2003 20:34
srchild (United Kingdom)
I'm looking at converting a Yahoo Groups group to FUDforum, so I'm trying to transfer the existing archive.

I've collected the archive from Yahoo using this script:

http://www.lpthe.jussieu.fr/~zeitlin/yahoo2mbox.html

The archive is now in mbox format, and it appears to be valid mbox: for example, if I view it with Elm, it shows the correct number of messages and they are all readable.
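As a rough cross-check of the message count that Elm reports, the separators in the archive can be counted directly. This is only a sketch: the function name `count_mbox_messages` is made up for illustration, and it assumes the common mbox convention that a message starts at a line beginning with "From " that is either the first line of the file or directly follows a blank line. Other tools may apply different rules.

```shell
# Hypothetical helper: count mbox message separators in the file "$1".
# Assumption: a separator is a line starting with "From " that is the
# first line of the file or directly follows a blank line.
count_mbox_messages() {
    awk 'BEGIN { prev_blank = 1; n = 0 }
         /^From / && prev_blank { n++ }
         { prev_blank = ($0 == "") }
         END { print n }' "$1"
}
```

Running `count_mbox_messages archive` and comparing the result with the number of posts that end up in the forum gives a quick measure of how many messages went missing.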

I load it into FUD 2.5.2 using:

cat archive | formail -s /path/to/php /path/to/maillist.php 1

I use 'Slow Reply Match' to recreate the threads, subject mangling to remove the [listname], and body mangling to remove some of the advertising dross. The messages that do get imported all look good.

But it only loads about half of the messages, and the rest go missing for no obvious reason. It's not just dying early: it skips messages from early on in the archive. I've examined the archive file and can see no clues as to why some messages are imported and others are not. Some are email postings and some were posted from the website. A user might have some messages imported whilst others by the same user are not.

I've experimented with tidying up the archive file manually (removing adverts and wrapped Received lines from the first few messages, that sort of thing). I also tried reordering the first few messages: one that loaded fine when it was first in the archive no longer loaded when I moved it to second place.
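One way to narrow this down might be to take formail out of the picture: split the archive into one file per message and feed each file to the importer separately, noting the result for each. The sketch below is untested against FUDforum. The names `split_mbox` and `feed_each` are invented for illustration, the splitter assumes a separator is a "From " line at the start of the file or after a blank line, and it is an assumption that maillist.php signals a rejected message through its exit status.

```shell
# Hypothetical helper: split mbox "$1" into "$2"/msg.00001, msg.00002, ...
# Assumption: a separator is a line starting with "From " that is the
# first line of the file or directly follows a blank line.
split_mbox() {
    mkdir -p "$2" &&
    awk -v dir="$2" '
        BEGIN { prev_blank = 1; n = 0 }
        /^From / && prev_blank {
            if (out != "") close(out)
            n++
            out = sprintf("%s/msg.%05d", dir, n)
        }
        out != "" { print > out }
        { prev_blank = ($0 == "") }
    ' "$1"
}

# Hypothetical helper: run a command once per split message, with the
# message on stdin, and print "file exit-status" for each one.
feed_each() {
    dir=$1; shift
    for f in "$dir"/msg.*; do
        "$@" < "$f" > /dev/null 2>&1
        printf '%s %s\n' "$f" "$?"
    done
}
```

Usage would be along the lines of `split_mbox archive split` followed by `feed_each split /path/to/php /path/to/maillist.php 1`. The zero-padded file names keep the messages in archive order, which matters for thread matching. Note that this performs a real import, so it is best tried against a test forum.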

I've tried feeding the archive through formail:

cat archive | formail > archive2

and it appears to quote lots (all?) of the From_ lines except the first one, whereas I thought it was supposed to quote only bogus From_ lines. So perhaps there is a problem with my archive, and formail is not breaking it up properly? (But note that Elm can read it properly.)
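To see whether the archive contains From_ lines that might confuse a splitter, it may help to list the ones that sit in odd places. This sketch (the function name `list_suspect_from_lines` is made up) prints every line that starts with "From " but does not follow a blank line, on the assumption that a genuine separator always does. It says nothing about the exact rules formail applies, which may differ.

```shell
# Hypothetical helper: print "LINENO: text" for every line in "$1" that
# starts with "From " but does NOT follow a blank line (and is not the
# first line of the file). Such lines are candidates for body text that
# looks like a separator, or separators missing their preceding blank line.
list_suspect_from_lines() {
    awk 'BEGIN { prev_blank = 1 }
         /^From / && !prev_blank { printf "%d: %s\n", NR, $0 }
         { prev_blank = ($0 == "") }' "$1"
}
```

If `list_suspect_from_lines archive` prints real separators (ones with an address and a date), the archive is probably missing the blank line between some messages.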

I've found some fragments of text in messages/msg_1 but can't see how to interpret them. Maybe there are clues in there?

Anyone got any clues for me?

Thanks


Simon Child