Re: file_get_contents doesn’t access one URL [message #179609 is a reply to message #179608]
Tue, 13 November 2012 01:13
Chuck Anderson
Jerry Stuckle wrote:
> On 11/12/2012 4:01 PM, Chuck Anderson wrote:
>> Jerry Stuckle wrote:
>>> On 11/12/2012 8:38 AM, Denis McMahon wrote:
>>>> On Sat, 10 Nov 2012 21:19:29 +0100, Luuk wrote:
>>>>
>>>> > On 10-11-2012 21:06, Charlie wrote:
>>>> >> http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> >>
>>>> >> The above is a legitimate URL that I can access by copying it into the
>>>> >> address field of my browser. However, as an argument to
>>>> >> file_get_contents I get the error message,
>>>> >>
>>>> >> Warning:
>>>> >> file_get_contents(http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx)
>>>> >> [function.file-get-contents]: failed to open stream: HTTP request
>>>> >> failed! HTTP/1.1 404 Not Found
>>>>
>>>> > When I visit the page and debug this site using Fiddler*, I see a 404
>>>> > too.
>>>>
>>>> Firefox sees a web page. Wget sees a 301 redirect followed by a 404:
>>>>
>>>> $ wget http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> --2012-11-12 13:28:57-- http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> Resolving www.philadelphia.careerboard.com (www.philadelphia.careerboard.com)... 64.74.112.101
>>>> Connecting to www.philadelphia.careerboard.com (www.philadelphia.careerboard.com)|64.74.112.101|:80... connected.
>>>> HTTP request sent, awaiting response... 301 Moved Permanently
>>>> Location: http://philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx [following]
>>>> --2012-11-12 13:28:57-- http://philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> Resolving philadelphia.careerboard.com (philadelphia.careerboard.com)... 64.74.112.101
>>>> Connecting to philadelphia.careerboard.com (philadelphia.careerboard.com)|64.74.112.101|:80... connected.
>>>> HTTP request sent, awaiting response... 404 Not Found
>>>> 2012-11-12 13:28:57 ERROR 404: Not Found.
>>>>
>>>> Rgds
>>>>
>>>> Denis McMahon
>>>>
>>>
>>> Denis,
>>>
>>> Not quite. Firefox gets the same 301 and 404. But FF loads some
>>> JavaScript (must be part of the 404 error page) and continues
>>> processing from there.
>>>
>>> They are obviously doing what they can to prevent people from doing
>>> just what the OP is trying to do.
>>>
>>
>> I have JavaScript disabled and the page still loads in Firefox.
>>
>> Could it have anything to do with using an HttpOnly cookie?
>>
>> HTTP/1.1 404 Not Found
>> Date: Mon, 12 Nov 2012 20:56:50 GMT
>> ...
>> Set-Cookie: ASP.NET_SessionId=kyj2do654xkl2auwhtsjrvyo; path=/; HttpOnly
>> ...
>>
>> From http://php.net/manual/en/function.setcookie.php
>>
>> When TRUE the cookie will be made accessible only through the HTTP
>> protocol. This means that the cookie won't be accessible by scripting
>> languages, such as JavaScript. It has been suggested that this setting
>> can effectively help to reduce identity theft through XSS attacks
>> (although it is not supported by all browsers), but that claim is often
>> disputed. Added in PHP 5.2.0. TRUE or FALSE
>>
>
> I didn't actually try it with JavaScript disabled - I just looked at
> the headers. But it obviously is an attempt to not allow just what
> you're trying to do.
>
> Do you have the owner's permission to access the page via a script?
> If so, the webmaster should be able to help you.
>
I am not the OP. I am merely speculating on how it is done.
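
For anyone who wants to see from PHP what the server actually sends, here is a minimal sketch. It rests on a guess I have not confirmed: that the site returns the page body together with the 404 status, which would explain why a browser shows a page while file_get_contents gives up. The function names status_codes() and fetch_with_headers() and the User-Agent string are made up for illustration. ignore_errors is the HTTP context option that makes file_get_contents return the body even on a failure status, and $http_response_header is the array the HTTP wrapper fills in the calling scope. The type declarations need PHP 7 or later.

```php
<?php
// Hypothetical helper: collect the status code of every response in a
// redirect chain from the header lines PHP gathers for an HTTP request.
function status_codes(array $headers): array
{
    $codes = [];
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 301 Moved Permanently".
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $codes[] = (int) $m[1];
        }
    }
    return $codes;
}

// Hypothetical helper: fetch a URL without letting a 4xx/5xx status
// abort the read, and return the body together with the raw headers.
function fetch_with_headers(string $url): array
{
    $context = stream_context_create([
        'http' => [
            'ignore_errors' => true, // hand back the body even on a 404
            // Assumption: an arbitrary browser-like User-Agent string.
            'user_agent'    => 'Mozilla/5.0 (compatible; test script)',
        ],
    ]);

    // Returns false only if the request itself could not be made.
    $body = @file_get_contents($url, false, $context);

    // Set in this scope by the HTTP wrapper; absent if nothing was received.
    $headers = $http_response_header ?? [];

    return [
        'body'    => $body,
        'headers' => $headers,
        'codes'   => status_codes($headers),
    ];
}
```

If the server treats PHP the same way it treated wget, the 'codes' entry for the URL in the original post should show the 301 followed by the 404, and 'body' would hold whatever came back with the 404, if anything.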
--
*****************************
Chuck Anderson • Boulder, CO
http://cycletourist.com
Turn Off, Tune Out, Drop In
*****************************