Re: DOMDocument HTML problem [message #177105 is a reply to message #177104] Tue, 21 February 2012 19:14
Jerry Stuckle
On 2/21/2012 2:00 PM, Aaron Gray wrote:
> Hi, I am trying to take a "summary" of a number of characters of an
> HTML fragment, which has no DOCTYPE, HTML, or BODY elements and may be
> incomplete and unbalanced, and convert it to a balanced fragment, still
> without DOCTYPE, HTML, or BODY, using DOMDocument and friends.
>
> This is what I have got (it works without an Error 500 using the php
> command line tool):
>
> ~~~~
> <?php
>
> $html = '<div>
> <p><a href="#test">foo</a></p>
> <hr>
> <br>
> <div>name</div>
> ';
>
> $dom = new DOMDocument();
> $newdom = new DOMDocument();
>
> $dom->loadHTML($html);
>
> $Elements = $dom->childNodes;
>
> foreach ( $Elements as $Element ) {
>
>     $NewElement = $newdom->createElement($Element->nodeName);
>
>     if ($Element->attributes)
>         foreach($Element->attributes as $attribute)
>             $NewElement->setAttribute($attribute->name, $attribute->value);
>
>     if ($Element->childNodes)
>         foreach($Element->childNodes as $child)
>             $NewElement->appendChild( $newdom->importNode($child, true));
>
>     $newdom->appendChild( $NewElement);
> }
>
> echo $newdom->saveHTML();
> ?>
> ~~~~
>
> It balances the unbalanced <div>, but it seems to be adding a duplicate
> <HTML></HTML> at the beginning of the output.
>
> Also I need to get the fragment of the code like a JavaScript DOM
> innerHTML without the <HTML> and <BODY> tags.
>
> Many thanks in advance.
>
> Hope you can help,
>
> Aaron
>
>
>

If your document is not well formed, DOMDocument has to guess at what
the document is supposed to represent. The results of that guess are
likely to be unpredictable.

This doesn't mean you need DOCTYPE, <head>, <body>, etc., but things
like unbalanced <div> tags are likely to cause problems.
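
As for the duplicate <HTML></HTML>: my best guess is that it comes from looping over $dom->childNodes. After loadHTML() the document's children normally include a DOCTYPE node whose nodeName is "html", so createElement() turns it into an empty <html> element ahead of the real one. A minimal sketch of one way to get innerHTML-style output instead is to serialize only the children of <body>. The function name balanceFragment() is made up for illustration, and this assumes PHP 7 or later (passing a node to saveHTML() needs at least 5.3.6):

~~~~
<?php
// Hypothetical helper, not from the original post: parse a possibly
// unbalanced HTML fragment and return it re-serialized without the
// DOCTYPE, <html>, and <body> wrappers that loadHTML() adds.
function balanceFragment(string $html): string
{
    // Nothing to parse; loadHTML() rejects empty input.
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The parser wraps the fragment in <html><body>...</body></html>.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body> on its own, like innerHTML.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html = '<div>
<p><a href="#test">foo</a></p>
<hr>
<br>
<div>name</div>
';

echo balanceFragment($html);
~~~~

The parser still decides for itself where the missing </div> belongs, so check the output against what you actually intended.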

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================