Re: file_get_contents doesn’t access one URL [message #179608 is a reply to message #179607]
Mon, 12 November 2012 22:13
Jerry Stuckle
Messages: 2598 Registered: September 2010
On 11/12/2012 4:01 PM, Chuck Anderson wrote:
> Jerry Stuckle wrote:
>> On 11/12/2012 8:38 AM, Denis McMahon wrote:
>>> On Sat, 10 Nov 2012 21:19:29 +0100, Luuk wrote:
>>>
>>>> On 10-11-2012 21:06, Charlie wrote:
>>>> > http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> >
>>>> > The above is a legitimate URL that I can access by copying it into the
>>>> > address field of my browser. However, as an argument to
>>>> > file_get_contents I get the error message,
>>>> >
>>>> > Warning:
>>>> > file_get_contents(http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx)
>>>> > [function.file-get-contents]: failed to open stream:
>>>> > HTTP request failed! HTTP/1.1 404 Not Found
>>>
>>>> When I visit the page AND debug this site using Fiddler*, I see a 404
>>>> too.
>>>
>>> Firefox sees a web page. Wget sees a 301 redirect followed by a 404:
>>>
>>> $ wget http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>> --2012-11-12 13:28:57-- http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>> Resolving www.philadelphia.careerboard.com (www.philadelphia.careerboard.com)... 64.74.112.101
>>> Connecting to www.philadelphia.careerboard.com (www.philadelphia.careerboard.com)|64.74.112.101|:80... connected.
>>> HTTP request sent, awaiting response... 301 Moved Permanently
>>> Location: http://philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx [following]
>>> --2012-11-12 13:28:57-- http://philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>> Resolving philadelphia.careerboard.com (philadelphia.careerboard.com)... 64.74.112.101
>>> Connecting to philadelphia.careerboard.com (philadelphia.careerboard.com)|64.74.112.101|:80... connected.
>>> HTTP request sent, awaiting response... 404 Not Found
>>> 2012-11-12 13:28:57 ERROR 404: Not Found.
>>>
>>> Rgds
>>>
>>> Denis McMahon
>>>
>>
>> Denis,
>>
>> Not quite. Firefox gets the same 301 and 404. But FF loads some
>> Javascript (must be part of the 404 error page) and continues
>> processing from there.
>>
>> They are obviously doing what they can to prevent people from doing
>> just what the op is trying to do.
>>
>
> I have Javascript disabled and the page still loads in Firefox.
>
> Could it have anything to do with using an HttpOnly cookie?
>
> HTTP/1.1 404 Not Found
> Date: Mon, 12 Nov 2012 20:56:50 GMT
> ...
> Set-Cookie: ASP.NET_SessionId=kyj2do654xkl2auwhtsjrvyo; path=/; HttpOnly
> ...
>
> From http://php.net/manual/en/function.setcookie.php
>
> When TRUE the cookie will be made accessible only through the HTTP
> protocol. This means that the cookie won't be accessible by scripting
> languages, such as JavaScript. It has been suggested that this setting
> can effectively help to reduce identity theft through XSS attacks
> (although it is not supported by all browsers), but that claim is often
> disputed. Added in PHP 5.2.0. TRUE or FALSE
>
I didn't actually try it with JavaScript disabled; I just looked at the
headers. But the site is obviously trying to prevent exactly what you're
trying to do.

Do you have the owner's permission to access the page via a script? If
so, the webmaster should be able to help you.
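
For anyone who does have that permission and wants to see what the server sends along with the 404, here is a minimal sketch. By default the PHP HTTP wrapper makes file_get_contents() return false on a 4xx status; the `ignore_errors` context option asks it to return the body anyway, and `$http_response_header` exposes the headers of each response in the redirect chain. The function names `fetch_with_status()` and `status_chain()` and the user-agent string are made up for illustration. Nothing in this thread establishes that the body returned this way is the page the browser shows.

```php
<?php
// Hypothetical helper: pull the numeric status codes out of a header list
// such as $http_response_header (one "HTTP/x.y NNN ..." line per response).
function status_chain(array $headers): array
{
    $codes = [];
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $codes[] = (int) $m[1];
        }
    }
    return $codes;
}

// Hypothetical helper: fetch a URL without treating 4xx/5xx as a failure.
function fetch_with_status(string $url): array
{
    $context = stream_context_create([
        'http' => [
            'ignore_errors'   => true,  // return the body even on a 404
            'follow_location' => 1,     // follow redirects such as the 301 in the wget trace
            'max_redirects'   => 5,
            'user_agent'      => 'example-fetcher/0.1', // assumed value, not from the thread
            'timeout'         => 10,
        ],
    ]);

    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the calling scope;
    // it stays unset if the connection itself failed.
    $headers = $http_response_header ?? [];

    return [
        'statuses' => status_chain($headers),
        'body'     => $body === false ? null : $body,
    ];
}
```

Calling `fetch_with_status()` on the URL from the original post and printing `$result['statuses']` should show whether PHP sees the same 301-then-404 sequence that wget reported, and `$result['body']` holds whatever content came with the final response.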
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================