FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Major XML-aggregation Confusion [message #163901] Thu, 09 December 2010 19:27
wittrs (United States)
Messages: 134
Registered: August 2009
Karma: 0
Senior Member
I'm dum cornfused.

For the life of me, I cannot figure out the sorcery of the board's XML aggregator (or whatnot). I'm trying to do two things: (a) pick up an XML feed of a blog; and (b) syndicate a forum of my own board. Let's take each of these separately.

IMPORTING A FEED

1. I put the feed url in and created a "rule," just as one might for an email-list capture.

2. I entered the path to the PHP executable in the job-administration part of the board. It accepted the path when I pressed "set." Next, I scheduled the job to run every minute.

3. I then pressed "run now," and it told me the job was scheduled to run in the background.

4. Nothing happened after several minutes. I pressed "log report" (or something) and it told me that the job either wasn't executed or nothing was captured.

5. I then reset the date on the rule. It told me that all the old messages would be captured. I repeated 3-4. Nothing happened. (It's as if I'm not concentrating enough when I flick the wand).

SYNDICATING MY OWN FEED

I pressed "syndicate this feed" on one of my threads. All it does is download secret xml sorcery code into a textfile. What do I do with it? I was under the impression that if I pressed "syndicate feed," that a new page would come up with the last 15 entries or so. How do I get my own board forum to display its feeds on a webpage?

Yours flunking wizardry.
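
A minimal sketch of one way to turn the downloaded syndication XML into a listing for a web page, assuming the board emits an Atom feed (the thread does not say which format the "syndicate" link produces). The function name `latest_entries_html`, the sample feed, and the example.org URLs are made up for illustration; this is not FUDforum code.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical stand-in for the XML file the "syndicate" link downloads.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My board</title>
  <entry>
    <title>First topic &amp; replies</title>
    <link rel="alternate" type="text/html" href="http://example.org/post/1" />
  </entry>
</feed>"""


def latest_entries_html(feed_xml, limit=15):
    """Return an HTML <ul> that links to the first `limit` entries of an Atom feed."""
    root = ET.fromstring(feed_xml)
    rows = []
    for entry in root.findall(ATOM + "entry")[:limit]:
        title = entry.findtext(ATOM + "title", default="(untitled)")
        url = ""
        for link in entry.findall(ATOM + "link"):
            # Atom treats a missing rel attribute as rel="alternate".
            if link.get("rel", "alternate") == "alternate":
                url = link.get("href", "")
                break
        rows.append('<li><a href="%s">%s</a></li>' % (html.escape(url), html.escape(title)))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(latest_entries_html(SAMPLE_FEED))
```

The returned string can be pasted into any HTML page. A feed reader or a browser feed plugin does the same job without custom code.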
Re: Major XML-aggregation Confusion [message #163903 is a reply to message #163901] Fri, 10 December 2010 09:29
naudefj (South Africa)
Messages: 3624
Registered: December 2004
Karma: 17
Senior Member
Administrator
Core Developer
Your steps 1-5 are spot on.
Check if you have "disabled functions" in the System Info ACP.
Also, try to run the script from command line to see what happens.
Re: Major XML-aggregation Confusion [message #163905 is a reply to message #163903] Fri, 10 December 2010 10:55
wittrs
... It worked when I ran it directly from my cron tab, like all other jobs. There is nothing disabled in the System Info.

I think this has something to do with the peculiar problems I have been having with my host-service running php files that start with #!. I'm getting a "500: internal error" (something like that) message every time mail is successfully piped. I went to see if I had a similar error here, but I'm not getting cron emails right now.
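
A rough sketch of how the #! line of a script could be sanity-checked, assuming a Unix-like host. The helper name `check_shebang` is hypothetical, and the two problems it looks for (a missing interpreter, and Windows-style CRLF line endings after the interpreter path) are only common causes of this kind of failure, not a diagnosis of the host problem described here.

```python
import os


def check_shebang(script_path):
    """Inspect the first line of a script and return (interpreter, problem).

    `problem` is an empty string when nothing obviously wrong was found.
    """
    with open(script_path, "rb") as handle:
        first_line = handle.readline()

    if not first_line.startswith(b"#!"):
        return None, "no #! line"

    parts = first_line[2:].split()
    if not parts:
        return None, "empty #! line"
    interpreter = parts[0].decode("utf-8", "replace")

    # A trailing carriage return becomes part of the interpreter name.
    if first_line.endswith(b"\r\n"):
        return interpreter, "CRLF line ending after the interpreter path"
    if not (os.path.isfile(interpreter) and os.access(interpreter, os.X_OK)):
        return interpreter, "interpreter not found or not executable"
    return interpreter, ""
```

Calling `check_shebang` on each script that the cron job or the mail pipe runs would show whether the path after #! actually exists on the host.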

Anyway, I think the problem is somewhere on my end.

As always, I appreciate the help.
Re: Major XML-aggregation Confusion [message #163906 is a reply to message #163905] Fri, 10 December 2010 11:40
wittrs
Frank:

I have noticed one possible error. When I run the cron job myself, the feed is imported. HOWEVER, it only imports the body of the message, not the links. There is no link to the original website (the one I am importing), nor any of the links that the body text refers to.

Here is an example:

FEED:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:thr="http://purl.org/syndication/thread/1.0">
    <title>Legal Theory Blog</title>
    <link rel="self" type="application/atom+xml" href="http://lsolum.typepad.com/legaltheory/atom.xml" />
    <link rel="hub" href="http://hubbub.api.typepad.com/" />
    <link rel="alternate" type="text/html" href="http://lsolum.typepad.com/legaltheory/" />
    <id>tag:typepad.com,2003:weblog-104663</id>
    <updated>2010-12-10T11:19:00-06:00</updated>
    <subtitle>&quot;All the theory that fits.&quot;Lawrence B. Solum</subtitle>
    <generator uri="http://www.typepad.com/">TypePad</generator>
    <entry>
        <title>Snyder on the Judicial Genealogy of John Roberts</title>
        <link rel="alternate" type="text/html" href="http://lsolum.typepad.com/legaltheory/2010/12/snyder-on-the-judicial-genealogy-of-john-roberts.html" />
        <link rel="replies" type="text/html" href="http://lsolum.typepad.com/legaltheory/2010/12/snyder-on-the-judicial-genealogy-of-john-roberts.html" />
        <id>tag:typepad.com,2003:post-6a00d8341bf68d53ef0147e08f21c6970b</id>
        <published>2010-12-10T11:19:00-06:00</published>
        <updated>2010-12-10T11:19:00-06:00</updated>
        <summary>Brad Snyder (University of Wisconsin Law School) has posted The Judicial Genealogy (and Mythology) of John Roberts: Clerkships from Gray to Brandeis to Friendly to Roberts (Ohio State Law Journal, Vol. 71, No. 1149, 2010) on SSRN. Here is the...</summary>
        <author>
            <name>Lawrence Solum</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-US" xml:base="http://lsolum.typepad.com/legaltheory/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>Brad Snyder (University of Wisconsin Law School) has posted <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1722362">The Judicial Genealogy (and Mythology) of John Roberts: Clerkships from Gray to Brandeis to Friendly to Roberts</a> (Ohio State Law Journal, Vol. 71, No. 1149, 2010) on SSRN.  Here is the abstract:</p>
<ul>
During his Supreme Court nomination hearings, John Roberts idealized and mythologized the first judge he clerked for, Second Circuit Judge Henry Friendly, as the sophisticated judge-as-umpire. Thus far on the Court,Roberts has found it difficult to live up to his Friendly ideal, particularly in several high-profile cases. This Article addresses the influence of Friendly on Roberts and judges on law clerks by examining the roots of Roberts's distinguished yet unrecognized lineage of former clerks: Louis Brandeis's clerkship with Horace Gray, Friendly's clerkship with Brandeis, and Roberts's clerkships with Friendly and Rehnquist. Labeling this lineage a judicial genealogy, this Article reorients clerkship scholarship away from clerks' influences on judges to judges' influences on clerks. It also shows how Brandeis, Friendly, and Roberts were influenced by their clerkship experiences and how they idealized their judges. By laying the clerkship experiences and career paths of Brandeis, Friendly, and Roberts side-by-side in detailed primary source accounts, this Article argues that judicial influence on clerks is more professional than ideological and that the idealization of judges and emergence of clerkships as must-have credentials contribute to a culture of judicial supremacy.
</ul></div>
</content> 


Here is what I get:

Quote:
Brad Snyder (University of Wisconsin Law School) has posted The Judicial Genealogy (and Mythology) of John Roberts: Clerkships from Gray to Brandeis to Friendly to Roberts (Ohio State Law Journal, Vol. 71, No. 1149, 2010) on SSRN. Here is the abstract: During his Supreme Court nomination hearings, John Roberts idealized and mythologized the first judge he clerked for, Second Circuit Judge Henry Friendly, as the sophisticated judge-as-umpire. Thus far on the Court,Roberts has found it difficult to live up to his Friendly ideal, particularly in several high-profile cases. This Article addresses the influence of Friendly on Roberts and judges on law clerks by examining the roots of Roberts's distinguished yet unrecognized lineage of former clerks: Louis Brandeis's clerkship with Horace Gray, Friendly's clerkship with Brandeis, and Roberts's clerkships with Friendly and Rehnquist. Labeling this lineage a judicial genealogy, this Article reorients clerkship scholarship away from clerks' influences on judges to judges' influences on clerks. It also shows how Brandeis, Friendly, and Roberts were influenced by their clerkship experiences and how they idealized their judges. By laying the clerkship experiences and career paths of Brandeis, Friendly, and Roberts side-by-side in detailed primary source accounts, this Article argues that judicial influence on clerks is more professional than ideological and that the idealization of judges and emergence of clerkships as must-have credentials contribute to a culture of judicial supremacy.
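
A small sketch, using only the Python standard library, of where the links live in an entry like the one quoted above. Atom allows content with type="xhtml" to carry inline XHTML, so the entry link sits in an href attribute and the body links sit on nested a elements. Reading only the text of the content element gives a link-free result like the quoted import, although whether FUDforum does exactly that is an assumption. The function name `entry_links` is hypothetical and the entry is shortened from the feed above.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# Shortened copy of the entry quoted in the post.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Snyder on the Judicial Genealogy of John Roberts</title>
  <link rel="alternate" type="text/html" href="http://lsolum.typepad.com/legaltheory/2010/12/snyder-on-the-judicial-genealogy-of-john-roberts.html" />
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml"><p>Brad Snyder has posted <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1722362">The Judicial Genealogy (and Mythology) of John Roberts</a> on SSRN.</p></div>
  </content>
</entry>"""


def entry_links(entry_xml):
    """Return (permalink, links found in the body, body text without markup)."""
    entry = ET.fromstring(entry_xml)

    permalink = None
    for link in entry.findall("atom:link", NS):
        # A missing rel attribute means rel="alternate" in Atom.
        if link.get("rel", "alternate") == "alternate":
            permalink = link.get("href")
            break

    body_links = [a.get("href") for a in entry.iterfind(".//xhtml:a", NS)]

    content = entry.find("atom:content", NS)
    raw_text = "".join(content.itertext()) if content is not None else ""
    plain_text = " ".join(raw_text.split())
    return permalink, body_links, plain_text


if __name__ == "__main__":
    for part in entry_links(ENTRY):
        print(part)
```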
Re: Major XML-aggregation Confusion [message #163909 is a reply to message #163901] Fri, 10 December 2010 22:28
wittrs
... fascinating.

I have noticed the following: this protocol does result in links being captured:

 &lt;a href=&quot;http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1595044&quot;&gt;Claim Construction and Technical Training: An Empirical Study of the Reversal Rates of Technically Trained Judges in Patent Claim Construction Cases&lt;/a&gt;


But the normal http://linkname protocol does NOT.

The feed I am trying to capture had the former for one of its entries, but the latter for all else.

Thoughts?
Re: Major XML-aggregation Confusion [message #163910 is a reply to message #163909] Fri, 10 December 2010 22:57
naudefj
The content part in XML must always be escaped. The syntax for HTML and XML is quite similar. If the HTML is not escaped, programs might not know the difference between the two.

http://en.wikipedia.org/wiki/XML#Escaping
There are five predefined entities: &lt; represents "<", &gt; represents ">", &amp; represents "&", &apos; represents ', and &quot; represents "
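
As a quick illustration of those five entities, here is a sketch using `escape` and `unescape` from Python's `xml.sax.saxutils`. By default these handle only the ampersand and the two angle brackets, so the two quote characters are passed in through the optional entities argument. The wrapper names `xml_escape` and `xml_unescape` are made up for this example.

```python
from xml.sax.saxutils import escape, unescape

# The two quote entities are not handled by default and must be supplied.
QUOTES_TO_ENTITIES = {'"': "&quot;", "'": "&apos;"}
ENTITIES_TO_QUOTES = {"&quot;": '"', "&apos;": "'"}


def xml_escape(text):
    """Replace the five special characters with their predefined entities."""
    return escape(text, QUOTES_TO_ENTITIES)


def xml_unescape(text):
    """Turn the five predefined entities back into plain characters."""
    return unescape(text, ENTITIES_TO_QUOTES)


if __name__ == "__main__":
    markup = '<a href="http://example.org/">Terms & Conditions</a>'
    escaped = xml_escape(markup)
    print(escaped)
    print(xml_unescape(escaped) == markup)
```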
Re: Major XML-aggregation Confusion [message #163914 is a reply to message #163901] Sat, 11 December 2010 09:52
wittrs
... alright, I get the idea that &lt;a href (etc.) is escaped, and that <a href> is not. But are you saying that there is no way to make it so that, if someone has <a href> in their feed, it can be imported as &lt; (etc.)? Wouldn't a regex sort of thing accomplish that? I notice there isn't a body mangling option in the page that creates the xml rule -- so, if someone were to attempt such a thing, would it go directly in the xml-aggregation php file? (Not sure I'm handy enough to try that, but in the future, who knows?).
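
For what it's worth, here is a sketch of that conversion done with an XML parser rather than a regex, which tends to be more robust on nested markup. It assumes the content element holds a single XHTML div, as in the feed quoted earlier. The function name `xhtml_content_to_escaped` and the sample are hypothetical, and this is not how FUDforum's aggregator is written.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical content element shaped like the one in the quoted feed.
CONTENT = (
    '<content type="xhtml" xmlns="http://www.w3.org/2005/Atom">'
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<p>See <a href="http://example.org/x">this paper</a>.</p>'
    '</div></content>'
)


def xhtml_content_to_escaped(content_xml):
    """Serialize the inline XHTML of an Atom content element as escaped text."""
    content = ET.fromstring(content_xml)
    div = content[0]   # assumes the XHTML div is the first child element
    div.tail = None    # drop whitespace that follows the div

    # Strip the "{namespace}" prefix so tags serialize as plain HTML names.
    for element in div.iter():
        if isinstance(element.tag, str) and element.tag.startswith("{"):
            element.tag = element.tag.split("}", 1)[1]

    markup = ET.tostring(div, encoding="unicode")
    return escape(markup)


if __name__ == "__main__":
    print(xhtml_content_to_escaped(CONTENT))
```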

Thanks as always! (This XML thing looks nifty; I can import all of my blogs into my forum).
Re: Major XML-aggregation Confusion [message #163915 is a reply to message #163914] Sat, 11 December 2010 11:36
naudefj
If a feed is invalid, you cannot expect FUDforum or any other XML aggregator to correctly interpret it.
Rather contact the owner of the dysfunctional feed and ask them to fix it.
There is a handy validator at http://www.w3schools.com/xml/xml_validator.asp
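
A local check along the same lines can be sketched with Python's standard XML parser. Note that this tests only whether the document is well-formed XML. It does not check the document against the Atom or RSS rules, so a feed can pass here and still be unusual. The function name `well_formed` is made up for the example.

```python
import xml.etree.ElementTree as ET


def well_formed(xml_text):
    """Return (True, "") if the text parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return False, str(error)
    return True, ""


if __name__ == "__main__":
    print(well_formed("<feed><entry></entry></feed>"))
    print(well_formed("<feed><entry></feed>"))
```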
Re: Major XML-aggregation Confusion [message #163917 is a reply to message #163915] Sat, 11 December 2010 13:32
wittrs
Hi again Frank. Is there any way in the future to have the linked-title of each entry captured? Take a look at this, which is a capture of the defective feed we have been talking about:

http://ludwig.squarespace.com/storage/feed.png

Note two things: (1) it has the titles of each entry and a link; and (2) the defective links from the body of the feed are escaped. I'm trying to find a way so that if I capture a feed, the author's specific webpage gets appropriately credited. I do realize that I could place a signature below each item that takes you to the author's main page. But I was hoping in the future to take you to the exact page that the feed entry is quoting.

Sorry to be a pain in the rear. Just passing along a suggestion for the future perhaps. Appreciate your patience.

[Updated on: Sat, 11 December 2010 13:34]

Re: Major XML-aggregation Confusion [message #163918 is a reply to message #163917] Sat, 11 December 2010 14:23
Ernesto (Sweden)
Messages: 413
Registered: August 2005
Karma: 0
Senior Member
I do not understand what you mean, wittrs.
This is the content of a typical RSS feed:

<title>This is the title of the post/newsitem</title>
<link>This is a link to the single post/newsitem</link>
<description>This is description</description>
<pubDate>Sat, 11 Dec 2010 22:12:00 +0000</pubDate>
<author>Author(at)mail(dot)com</author>

This is also what the forum's RSS parser grabs: title, URL, content, date and author. This is what the feed contains, this is all you can add, and this is what the forum will grab.

The linked title is always captured: the title is set as the post title and the URL is inserted in the post.

It does take you to exactly the page the feed is "quoting".

The feed you screenshotted (http://lsolum.typepad.com/legaltheory/index.rdf) is a valid feed.
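
To make the field list above concrete, here is a sketch that reads those five elements from a single RSS item with Python's standard library. The item content, the example.org values, and the function name `read_item` are placeholders for illustration and are not taken from the forum's parser.

```python
import xml.etree.ElementTree as ET

FIELDS = ("title", "link", "description", "pubDate", "author")

# Placeholder item with the same five elements as the example above.
ITEM = """<item>
  <title>This is the title of the post/newsitem</title>
  <link>http://example.org/post/1</link>
  <description>This is description</description>
  <pubDate>Sat, 11 Dec 2010 22:12:00 +0000</pubDate>
  <author>author@example.org</author>
</item>"""


def read_item(item_xml):
    """Return a dict of the five fields; a missing element becomes an empty string."""
    item = ET.fromstring(item_xml)
    return {name: item.findtext(name, default="") for name in FIELDS}


if __name__ == "__main__":
    for name, value in read_item(ITEM).items():
        print(name, "=", value)
```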


Re: Major XML-aggregation Confusion [message #163920 is a reply to message #163901] Sat, 11 December 2010 20:27
wittrs
Hello Ernesto!

Good to hear from you.

Something must be wrong. Importing this exact feed into my forum results in nothing like this. There is no link to the original page. All there is, is this:

http://ludwig.squarespace.com/storage/tempfeed.png

Is the xml aggregator not accessing the parser? Some of these files start with a #! line that points to the PHP binary, and I fear I may not have exactly the right #! line inserted. But I'm not sure. Any thoughts about why mine looks like this?
Re: Major XML-aggregation Confusion [message #163921 is a reply to message #163920] Sat, 11 December 2010 23:05
naudefj
You can include a link to the original article by using {link} in the signature.
For details, see XML Aggregation.
Re: Major XML-aggregation Confusion [message #163930 is a reply to message #163921] Sun, 12 December 2010 08:14
wittrs
... not working:

http://ludwig.squarespace.com/storage/temp1.png

What file is it that would have created and stored the {link}? I think that file is not getting accessed.

[Updated on: Sun, 12 December 2010 08:24]

Re: Major XML-aggregation Confusion [message #163931 is a reply to message #163930] Sun, 12 December 2010 16:03
wittrs
Hi guys.

Here's a quick update. The {link} in the signature line only fails to work for one feed. It's working fine for the other. The one it isn't working for is the one we've been discussing. I guess, ultimately, it has something to do with that particular feed (for some reason).

As I continue to investigate, I would appreciate any thoughts you have.

The php file constructs the signature line fine, it's just that {link} appears to contain no information whatsoever.

Regards and thanks.
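
One way a {link} placeholder could come out empty for an Atom feed while working for an RSS feed, sketched purely as an illustration: RSS puts the URL in the text of the link element, whereas Atom puts it in an href attribute and leaves the element empty. The thread does not establish that this is the cause here, and the function names and sample data below are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

RSS_ITEM = "<item><title>T</title><link>http://example.org/rss-post</link></item>"
ATOM_ENTRY = (
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>T</title>'
    '<link rel="alternate" href="http://example.org/atom-post" /></entry>'
)


def find_link(node):
    """Find the first link element, with or without the Atom namespace."""
    element = node.find("link")
    if element is None:
        element = node.find(ATOM + "link")
    return element


def link_from_text(node):
    """Naive reader: only looks at the element text, as RSS uses it."""
    element = find_link(node)
    if element is None:
        return ""
    return (element.text or "").strip()


def link_from_href_or_text(node):
    """Reader that prefers the Atom href attribute and falls back to the text."""
    element = find_link(node)
    if element is None:
        return ""
    return element.get("href") or (element.text or "").strip()


def render_signature(template, link):
    """Fill the {link} placeholder of a signature template."""
    return template.replace("{link}", link)


if __name__ == "__main__":
    entry = ET.fromstring(ATOM_ENTRY)
    print(render_signature("Source: {link}", link_from_text(entry)))
    print(render_signature("Source: {link}", link_from_href_or_text(entry)))
```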
Re: Major XML-aggregation Confusion [message #163932 is a reply to message #163931] Sun, 12 December 2010 21:24
naudefj
That particular feed may not contain links.
Can you post the URL so we can have a look?
Re: Major XML-aggregation Confusion [message #163934 is a reply to message #163932] Sun, 12 December 2010 23:22
wittrs
Here's the feed: http://lsolum.typepad.com/legaltheory/atom.xml

Note that the message board doesn't even seem to pick up the html, even though it is set to accept html: http://seanwilson.org/sworg/index.php?t=msg&th=777&start=0&S=6ac99f162c4925d54d61f47e18d31b48

(I have another feed on the same board that is capturing everything perfectly)
Re: Major XML-aggregation Confusion [message #163941 is a reply to message #163901] Mon, 13 December 2010 20:24
wittrs
... well, I am encountering difficulties with another feed. This feed only imports one entry, and it leaves the date of the XML aggregation rule as being 1970 (as if the process was only partially executed).

Feed: http://www.poliscijobrumors.com/rss.php?topic=28977

Here's a copy of what was imported: http://seanwilson.org/forum/index.php?t=msg&th=3024&start=0&S=337d4d6f4c76c7ea0a017becac0e385f
Re: Major XML-aggregation Confusion [message #163942 is a reply to message #163901] Mon, 13 December 2010 22:45
wittrs
... just a quick update. I had to turn this feed off. It kept downloading the same single message.

Imagine a feed with 12 messages, X(1) through X(12). Instead of downloading 12 messages, my board only received X(1). And each time the cron job ran, it received X(1) again. And again.

What appears to have happened is that the iteration failed (is that the right word?). It wouldn't go through the process of extracting all of them. And it seemed to exit the program early, because the date on the aggregation rule remained 1970.

Any help you can think of would be great. I'm going to look into it myself tomorrow.
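
To illustrate why a rule date stuck at 1970 would lead to the same message arriving on every run, here is a sketch of the usual bookkeeping in a feed importer: import the items newer than the stored date, then move the stored date forward. If the run stops before the date is saved, the next run starts from the epoch again. This is a generic illustration under that assumption, not FUDforum's actual logic, and all names in it are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_pub_date(value):
    """Parse an RFC 822 style date string; return None if it cannot be parsed."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def import_new(items, last_imported=EPOCH):
    """Return (titles to import, new stored date) for (title, date string) pairs."""
    imported = []
    newest = last_imported
    for title, raw_date in items:
        when = parse_pub_date(raw_date)
        if when is None:
            # Skip an item with an unreadable date instead of aborting the run.
            continue
        if when > last_imported:
            imported.append(title)
            newest = max(newest, when)
    return imported, newest


if __name__ == "__main__":
    feed = [
        ("X(1)", "Mon, 13 Dec 2010 10:00:00 +0000"),
        ("X(2)", "Mon, 13 Dec 2010 11:00:00 +0000"),
    ]
    first_run, stored = import_new(feed)
    second_run, stored = import_new(feed, stored)
    print(first_run, second_run)
```

When the stored date is carried over, the second run imports nothing. Calling `import_new(feed)` again without it repeats the first import, which matches the symptom described above.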