String Replacement [message #172225] |
Sun, 06 February 2011 15:49 |
Alec
I am not sure where to start on this one.
I have a string as follows: '<text
identifier="3ecebe84-8ddd-475e-86ca-300b5298fc34">
<value><![CDATA[12345]]></value>'
The text identifier always remains the same (i.e. 3ece.....) for each
article. The string is much longer, containing several different
identifiers. I would like to be able to jump to the one shown and
then extract the CDATA value that follows it (i.e. 12345). The length of
this value may change, so I need a way to extract the text between the
second '[' and the first ']'. I hope that makes sense.
Any advice would be greatly appreciated.
Alec
Re: String Replacement [message #172226 is a reply to message #172225] |
Sun, 06 February 2011 17:35 |
Michael Fesser
.oO(Alec)
> I am not sure where to start on this one.
>
> I have a string as follows '<text
> identifier="3ecebe84-8ddd-475e-86ca-300b5298fc34">
> <value><![CDATA[12345]]></value>'
>
> The text identifier always remains the same ie. 3ece..... for each
> article. The string is much longer containing several different
> identifiers. I would like to be able to jump to the one shown, and
> then extract the CDATA value shown after ie. 12345. The length of this
> may change, so I need a way to extract the text between the second '['
> and first ']' . I hope that makes sense.
A regular expression might help, something like
/CDATA\[([^\]]+)/
It should match on the last '[' and then capture all following chars
which are not ']'.
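As a rough sketch of how that pattern could be used with preg_match(), here it is with the identifier prepended so that the match starts at the wanted entry. The function name regexExtractCdata() and the sample string are made up for illustration, and the sketch assumes the value itself never contains a ']'.

```php
<?php
// Hypothetical helper (not from the thread): returns the CDATA content that
// follows the <text> element whose identifier starts with $idPrefix, or null.
function regexExtractCdata(string $string, string $idPrefix): ?string
{
    // Anchor on the identifier, then capture everything after "CDATA["
    // up to (but not including) the first "]".
    $pattern = '/identifier="' . preg_quote($idPrefix, '/') . '[^"]*"\s*>'
             . '\s*<value>\s*<!\[CDATA\[([^\]]+)/';
    if (preg_match($pattern, $string, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Made-up sample with two identifiers; only the second one is wanted.
$sample = '<text identifier="11111111-0000-0000-0000-000000000000">'
        . '<value><![CDATA[other]]></value></text>'
        . '<text identifier="3ecebe84-8ddd-475e-86ca-300b5298fc34">' . "\n"
        . '<value><![CDATA[12345]]></value></text>';

print(regexExtractCdata($sample, '3ecebe84') . "\n");
```

With the sample above this prints 12345. If the surrounding markup varies (extra attributes, a different child element), the pattern would need adjusting.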
But since this looks like XML, using DOM or SimpleXML might be an option
as well. Usually it's easier and the cleaner way to work with XML using
the appropriate tools.
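A minimal SimpleXML sketch, assuming the complete string is well-formed XML. The <article> wrapper element, the second <text> entry and the function name simpleXmlExtractValue() are assumptions for illustration only, and the identifier prefix is assumed to contain no double quotes.

```php
<?php
// Hypothetical helper (not from the thread): returns the content of <value>
// inside the first <text> whose identifier starts with $idPrefix, or null.
function simpleXmlExtractValue(string $xml, string $idPrefix): ?string
{
    // LIBXML_NOCDATA makes the parser treat CDATA sections as plain text.
    $doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($doc === false) {
        return null; // not well-formed
    }
    $nodes = $doc->xpath('//text[starts-with(@identifier, "' . $idPrefix . '")]/value');
    if (!$nodes) {
        return null; // no match (or XPath error)
    }
    return (string) $nodes[0];
}

// Made-up sample document; the real structure may differ.
$sample = '<article>'
        . '<text identifier="11111111-0000-0000-0000-000000000000">'
        . '<value><![CDATA[other]]></value></text>'
        . '<text identifier="3ecebe84-8ddd-475e-86ca-300b5298fc34">'
        . '<value><![CDATA[12345]]></value></text>'
        . '</article>';

print(simpleXmlExtractValue($sample, '3ece') . "\n");
```

Unlike the regular expression, this does not depend on the exact whitespace or attribute order in the markup.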
Micha
Re: String Replacement [message #172230 is a reply to message #172225] |
Mon, 07 February 2011 08:17 |
Pavel Lepin
Alec wrote:
> I am not sure where to start on this one.
Read up on XML processing.
> I have a string as follows '<text
> identifier="3ecebe84-8ddd-475e-86ca-300b5298fc34">
> <value><![CDATA[12345]]></value>'
>
> The text identifier always remains the same ie. 3ece..... for each
> article. The string is much longer containing several different
> identifiers. I would like to be able to jump to the one shown, and
> then extract the CDATA value shown after ie. 12345. The length of this
> may change, so I need a way to extract the text between the second '['
> and first ']' . I hope that makes sense.
Assuming your actual data is well-formed XML (the above snippet is not):
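<?php
// $string is assumed to hold the complete, well-formed XML document.
$dom = new DOMDocument;
$dom->loadXML($string);
$xpath = new DOMXPath($dom);
// '//' matches <text> at any depth, not only as the root element.
$query = '//text[starts-with(@identifier, \'3ece\')]/value/text()';
$nodelist = $xpath->evaluate($query);
if ($nodelist->length === 1)
{
    // Exactly one match: print the CDATA content.
    print($nodelist->item(0)->nodeValue."\n");
}
else
{
    die('invalid query result: '.
        '['.strval($nodelist->length).'] nodes returned'."\n");
}
?>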
--
Pavel Lepin
Re: String Replacement [message #172249 is a reply to message #172230] |
Tue, 08 February 2011 06:39 |
Alec
Firstly, many thanks to both of you for the quick replies.
XML processing is new to me, but I will try to get my head around
this.
Alec
Re: String Replacement [message #172252 is a reply to message #172249] |
Tue, 08 February 2011 12:38 |
Yuri Subach
Pavel gave an example of using a DOM parser for XML. Alternatively, you can try a SAX parser. It can look a bit more complex to program, but it is fast and memory-efficient (critical for large XML). A SAX parser can also process the part of an incorrectly formatted document that comes before the first error, whereas DOM normally refuses to load such a document at all.
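A rough sketch of the SAX approach using PHP's XML Parser extension (xml_parser_create() and friends). The function name saxExtractValue(), the <article> wrapper and the sample data are assumptions for illustration; the sketch requires PHP 7.1 or later and does not handle nested <text> elements.

```php
<?php
// Hypothetical helper (not from the thread): streams through $xml and returns
// the character data of <value> inside the first <text> element whose
// identifier starts with $idPrefix, or null if none was seen.
function saxExtractValue(string $xml, string $idPrefix): ?string
{
    $inText  = false; // inside a matching <text> element
    $inValue = false; // inside its <value> child
    $done    = false; // first match fully read
    $found   = null;  // accumulated character data

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$inText, &$inValue, &$found, &$done, $idPrefix) {
            if ($done) {
                return;
            }
            if ($name === 'text' && strpos($attrs['identifier'] ?? '', $idPrefix) === 0) {
                $inText = true;
            } elseif ($name === 'value' && $inText) {
                $inValue = true;
                $found = '';
            }
        },
        function ($p, $name) use (&$inText, &$inValue, &$done) {
            if ($name === 'value' && $inValue) {
                $inValue = false;
                $done = true;
            } elseif ($name === 'text') {
                $inText = false;
            }
        }
    );

    // Character data (including CDATA) may arrive in several chunks.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$inValue, &$found) {
            if ($inValue) {
                $found .= $data;
            }
        }
    );

    xml_parse($parser, $xml, true);
    return $found;
}

// Made-up sample document; the real structure may differ.
$sample = '<article>'
        . '<text identifier="11111111-0000-0000-0000-000000000000">'
        . '<value><![CDATA[other]]></value></text>'
        . '<text identifier="3ecebe84-8ddd-475e-86ca-300b5298fc34">'
        . '<value><![CDATA[12345]]></value></text>'
        . '</article>';

print(saxExtractValue($sample, '3ece') . "\n");
```

For a very large document the string could be fed to xml_parse() in pieces rather than in one call, which is where the memory advantage over DOM comes from.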