I'm working with a message board database that already has a bunch of YouTube links in the comments, and I'm trying to replace all of the links with a new alternate.
I've written the regex to find the ID and replace the link correctly, but it only works with the first link that it finds. How do I make it work with all of the matching links in the string?
On Jul 9, 9:50 am, Jason C <jwcarl...@gmail.com> wrote:
> I'm working with a message board database that already has a bunch of YouTube links in the comments, and I'm trying to replace all of the links with a new alternate.
>
> The existing strings are like:
>
> $this_comment = '<a href="" target="_new">http://www.youtube.com/watc...vidid</a><br><br><a href="" target="_new">http://www.youtube.com/watc...vidid_2</a>';
>
> Notice that this string has 2 separate YouTube links.
>
> If you're not familiar, YouTube has several possible link formats, so using parse_url() doesn't really work:
>
> youtube.com/v/{vidid}
> youtube.com/vi/{vidid}
> youtube.com/?v={vidid}
> youtube.com/?vi={vidid}
> youtube.com/watch?v={vidid}
> youtube.com/watch?vi={vidid}
> youtu.be/{vidid}
> youtube.com/v/{vidid}?feature=autoshare&version=3&autohide=1&autoplay=1
>
> I've written the regex to find the ID and replace the link correctly, but it only works with the first link that it finds. How do I make it work with all of the matching links in the string?
>
> Here's what I have:
>
> // Fetch the VIDID
> $this_id = preg_replace("#.*?<a href=\"http://.*?youtu\.*?be[\.com]*/[watch]*[\?]*(v/|v=|vi/|vi=)*(.*?)[&.+]*\" target=\"_new\">.*?<\/a>.*#",
> "$2", $this_comment);
>
> // I'm not sure why preg_replace isn't catching the extra variables;
> // I thought the [&.+]* would do this? Either way, this is a workaround:
> list($this_id) = explode("&", $this_id);
>
> // Replace link
> if ($this_id) {
> $new_link = "Example replacement: $this_id";
>
> $this_comment = preg_replace("#<a href=\"http://.*?youtu\.*?be[\.com]*/[watch]*[\?]*(v/|v=|vi/|vi=)*" . $this_id . "[&.+]*\" target=\"_new\">.*?<\/a>#",
> "$new_link", $this_comment);
Well I'm very confused by all this. First of all, why are you using
preg_replace to extract the vidid? I would have thought that a job
better suited to preg_match.
Next, in your string assigned to $this_comment, the first vidid is
different to the other 2, so why are you expecting $this_id to match
all of them?
On Mon, 9 Jul 2012 01:50:39 -0700 (PDT), Jason C wrote:
> I'm working with a message board database that already has a bunch of
> YouTube links in the comments, and I'm trying to replace all of the
> links with a new alternate.
>
> The existing strings are like:
>
> $this_comment = '<a href=""
> target="_new">http://www.youtube.com/watc...vidid</a><br><br><a
> href=""
> target="_new">http://www.youtube.com/watc...vidid_2</a>';
>
> Notice that this string has 2 separate YouTube links.
>
> If you're not familiar, YouTube has several possible link formats, so
> using parse_url() doesn't really work:
>
> youtube.com/v/{vidid}
> youtube.com/vi/{vidid}
> youtube.com/?v={vidid}
> youtube.com/?vi={vidid}
> youtube.com/watch?v={vidid}
> youtube.com/watch?vi={vidid}
> youtu.be/{vidid}
> youtube.com/v/{vidid}?feature=autoshare&version=3&autohide=1&autoplay=1
>
>
> I've written the regex to find the ID and replace the link correctly,
> but it only works with the first link that it finds. How do I make it
> work with all of the matching links in the string?
That's the drawback to using preg_replace() for this. You can't capture
all the bits you want to extract because you *must* enumerate them.
preg_match_all() returns an array of matches, which is what you want if
you don't know how many you're going to get back going in.
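As a rough sketch of that idea (not the pattern from the original post): the sample comment, the IDs, and the simplified pattern below are all assumptions made up for illustration. The only point is that preg_match_all() fills one sub-array per capture group, so every ID in the string comes back in a single call.

```php
<?php
// Hypothetical sample comment; the URLs and IDs are made up for illustration.
$this_comment = '<a href="http://www.youtube.com/watch?v=abc123" target="_new">first</a><br><br>'
              . '<a href="http://youtu.be/xyz789" target="_new">second</a>';

// Simplified pattern (an assumption, not the one from the thread).
// Group 1 captures the run of ID characters; [^"]* swallows any trailing parameters.
$pattern = '#<a href="http://(?:www\.)?(?:youtube\.com/(?:watch)?\??vi?[/=]|youtu\.be/)'
         . '([\w-]+)[^"]*" target="_new">.*?</a>#';

// $count is the number of full matches; $matches[1] lists what group 1 captured in each.
$count = preg_match_all($pattern, $this_comment, $matches);

print_r($matches[1]);
```

With the default flag, $matches[0] holds the full matched links and $matches[1] holds the IDs, in the order they appear in the string.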
> Here's what I have:
>
> // Fetch the VIDID
> $this_id = preg_replace("#.*?<a
> href=\"http://.*?youtu\.*?be[\.com]*/[watch]*[\?]*(v/|v=|vi/|vi=)*(.*?
> )[&.+]*\" target=\"_new\">.*?<\/a>.*#", "$2", $this_comment);
                                           ^^ -- enumerated result
>
> // I'm not sure why preg_replace isn't catching the extra variables;
> // I thought the [&.+]* would do this? Either way, this is a workaround:
> list($this_id) = explode("&", $this_id);
Define "catching" in this context. If you want it back, you need to
paren-tag it so it goes into an enumerated output slot.
On Monday, July 9, 2012 8:44:06 AM UTC-4, Captain Paralytic wrote:
> Well I'm very confused by all this. First of all, why are you using
> preg_replace to extract the vidid? I would have thought that a job
> better suited to preg_match.
Probably just a lack of knowledge on my part. I thought preg_match only tested whether the regex matched (true or false), and that preg_replace was then used to do the actual replacing.
From Peter's reply, I don't think either of them is the right function. But for the sake of my own learning, how would I have modified my script (catching only one link) to use preg_match instead of preg_replace?
And if preg_replace works, what's the advantage of preg_match? Speed?
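For the single-link case, a minimal sketch with preg_match() might look like the following. The sample string, the ID, and the simplified pattern are assumptions for illustration, not the script from the thread. preg_match() returns 1 or 0 and, when given a third argument, also fills it with the captured groups, so the pattern no longer needs the leading and trailing .* padding that the preg_replace() version used to throw away the rest of the string.

```php
<?php
// Hypothetical sample; the URL and ID are made up for illustration.
$this_comment = 'Look: <a href="http://www.youtube.com/v/abc123?feature=autoshare&version=3" target="_new">clip</a>';

// Simplified pattern (an assumption): group 1 is the video ID.
$pattern = '#<a href="http://(?:www\.)?(?:youtube\.com/(?:watch)?\??vi?[/=]|youtu\.be/)'
         . '([\w-]+)[^"]*" target="_new">#';

$this_id = null;
if (preg_match($pattern, $this_comment, $m) === 1) {
    // $m[0] is the whole matched text, $m[1] is the first capture group.
    $this_id = $m[1];
}

echo $this_id, "\n";
```

The practical difference is less about speed and more about intent: preg_match() stops at the first match and hands back the captures directly, while a failed preg_replace() silently returns the whole input string, which is easy to mistake for an ID.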
> Next, in your string assigned to $this_comment, the first vidid is
> different to the other 2, so why are you expecting $this_id to match
> all of them?
No, that was the point; they're not going to match, so I need to modify the script to replace ALL of the existing links with the ID that's in that link.
That's why I turned to you guys. My only thought was to put the script in a function, then use a while() loop to keep running the function until there were no more links. I couldn't get it to work, though, and I didn't like the idea of using a loop on it, anyway, so I thought you guys might have a better suggestion.
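One loop-free possibility, sketched under assumptions: preg_replace() already replaces every match in the subject by default, and a backreference such as $1 in the replacement refers to the ID captured in that particular match. So if the pattern matches one whole link at a time (without the surrounding .* padding), a single call can rewrite all of them. The sample comment, IDs, and simplified pattern here are made up for illustration.

```php
<?php
// Hypothetical sample comment with two different video IDs.
$this_comment = '<a href="http://www.youtube.com/watch?v=abc123&feature=share" target="_new">one</a><br><br>'
              . '<a href="http://youtu.be/xyz789" target="_new">two</a>';

// Simplified pattern (an assumption): matches one complete link, group 1 is its ID.
$pattern = '#<a href="http://(?:www\.)?(?:youtube\.com/(?:watch)?\??vi?[/=]|youtu\.be/)'
         . '([\w-]+)[^"]*" target="_new">.*?</a>#';

// Every matching link is replaced; $1 is filled in per match.
$new_comment = preg_replace($pattern, 'Example replacement: $1', $this_comment);

echo $new_comment, "\n";
```

If the replacement needs real logic rather than a fixed template, preg_replace_callback() takes the same pattern and passes each match's groups to a function instead.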
On Monday, July 9, 2012 9:29:15 AM UTC-4, Peter H. Coffin wrote:
> That's the drawback to using preg_replace() for this. You can't capture
> all the bits you want to extract because you *must* enumerate them.
> preg_match_all() returns an array of matches, which is what you want if
> you don't know how many you're going to get back going in.
Thanks, I'll modify the script this evening with preg_match_all(). :-D
>> // I'm not sure why preg_replace isn't catching the extra variables;
>> // I thought the [&.+]* would do this? Either way, this is a workaround:
>> list($this_id) = explode("&", $this_id);
>
> Define "catching" in this context. If you want it back, you need to
> paren-tag it so it goes into an enumerated output slot.
No, I mean that it's not removing the additional variables. So, this:
On Monday, July 9, 2012 9:29:15 AM UTC-4, Peter H. Coffin wrote:
> That's the drawback to using preg_replace() for this. You can't capture
> all the bits you want to extract because you *must* enumerate them.
> preg_match_all() returns an array of matches, which is what you want if
> you don't know how many you're going to get back going in.
Just a note for anyone else reading this later, preg_match_all() did work perfectly. I changed:
This gives me a multidimensional array of $matches, where $matches[2] is the array that holds the values from $2.
So after finding the array, it's a simple matter of putting the second preg_replace() in a foreach loop:
foreach ($matches[2] as $this_id) {
    // I'm still not sure why $this_id is keeping the other params
    list($this_id) = explode("&", $this_id);
    $this_comment = preg_replace(...);
}
Thanks for the help, Peter! If you happen to see the error I'm making with the extra params (forcing me to use explode to get rid of them), I'd appreciate any insight. The workaround is working, though, so it's not a big deal... just sloppy, I guess.
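On the extra params, a likely explanation: [&.+] is a character class, so it matches only the literal characters "&", "." and "+", not "an ampersand followed by anything". Since it cannot consume the letters in "feature=autoshare", the lazy (.*?) before it has to stretch all the way to the closing quote for the match to succeed. The sketch below uses a made-up link and deliberately cut-down patterns to show the difference. Excluding the delimiter characters from the capture group is one way to stop at the ID.

```php
<?php
// Hypothetical link; the ID and parameters are made up for illustration.
$href = '<a href="http://www.youtube.com/v/abc123?feature=autoshare&version=3" target="_new">';

// Same shape as in the thread: lazy group followed by a character class.
// The class cannot match "?feature=...", so the group swallows it all.
preg_match('#/v/(.*?)[&.+]*" target#', $href, $loose);

// Alternative: the group stops at the first "&", "?" or quote,
// and [^"]* consumes whatever parameters follow.
preg_match('#/v/([^&?"]+)[^"]*" target#', $href, $tight);

echo $loose[1], "\n";
echo $tight[1], "\n";
```

Note that in this example the parameters start with "?" rather than "&", so splitting on "&" alone would still leave "?feature=autoshare" attached to the ID.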