Re: file_get_contents doesn’t access one URL [message #179615 is a reply to message #179609]
Tue, 13 November 2012 13:49
Jerry Stuckle
Messages: 2598 Registered: September 2010
On 11/12/2012 8:13 PM, Chuck Anderson wrote:
> Jerry Stuckle wrote:
>> On 11/12/2012 4:01 PM, Chuck Anderson wrote:
>>> Jerry Stuckle wrote:
>>>> On 11/12/2012 8:38 AM, Denis McMahon wrote:
>>>> > On Sat, 10 Nov 2012 21:19:29 +0100, Luuk wrote:
>>>> >
>>>> >> On 10-11-2012 21:06, Charlie wrote:
>>>> >>> http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> >>>
>>>> >>> The above is a legitimate URL that I can access by copying it into the
>>>> >>> address field of my browser. However, as an argument to
>>>> >>> file_get_contents I get the error message,
>>>> >>>
>>>> >>> Warning:
>>>> >>> file_get_contents(http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx)
>>>> >>> [function.file-get-contents]: failed to open stream: HTTP request failed!
>>>> >>> HTTP/1.1 404 Not Found
>>>> >
>>>> >> When I visit the page AND debug this site using Fiddler*, I see a 404
>>>> >> too.
>>>> >
>>>> > Firefox sees a web page. Wget sees a 301 redirect followed by a 404:
>>>> >
>>>> > $ wget http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> > --2012-11-12 13:28:57-- http://www.philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> > Resolving www.philadelphia.careerboard.com (www.philadelphia.careerboard.com)... 64.74.112.101
>>>> > Connecting to www.philadelphia.careerboard.com (www.philadelphia.careerboard.com)|64.74.112.101|:80... connected.
>>>> > HTTP request sent, awaiting response... 301 Moved Permanently
>>>> > Location: http://philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx [following]
>>>> > --2012-11-12 13:28:57-- http://philadelphia.careerboard.com/job/3167962-MUMPS~2FCache-DBA.aspx
>>>> > Resolving philadelphia.careerboard.com (philadelphia.careerboard.com)... 64.74.112.101
>>>> > Connecting to philadelphia.careerboard.com (philadelphia.careerboard.com)|64.74.112.101|:80... connected.
>>>> > HTTP request sent, awaiting response... 404 Not Found
>>>> > 2012-11-12 13:28:57 ERROR 404: Not Found.
>>>> >
>>>> > Rgds
>>>> >
>>>> > Denis McMahon
>>>> >
>>>>
>>>> Denis,
>>>>
>>>> Not quite. Firefox gets the same 301 and 404. But FF loads some
>>>> JavaScript (must be part of the 404 error page) and continues
>>>> processing from there.
>>>>
>>>> They are obviously doing what they can to prevent people from doing
>>>> just what the OP is trying to do.
>>>>
>>>
>>> I have Javascript disabled and the page still loads in Firefox.
>>>
>>> Could it have anything to do with using an HttpOnly cookie?
>>>
>>> HTTP/1.1 404 Not Found
>>> Date: Mon, 12 Nov 2012 20:56:50 GMT
>>> ...
>>> Set-Cookie: ASP.NET_SessionId=kyj2do654xkl2auwhtsjrvyo; path=/; HttpOnly
>>> ...
>>>
>>> From http://php.net/manual/en/function.setcookie.php
>>>
>>> When TRUE the cookie will be made accessible only through the HTTP
>>> protocol. This means that the cookie won't be accessible by scripting
>>> languages, such as JavaScript. It has been suggested that this setting
>>> can effectively help to reduce identity theft through XSS attacks
>>> (although it is not supported by all browsers), but that claim is often
>>> disputed. Added in PHP 5.2.0. TRUE or FALSE
>>>
>>
>> I didn't actually try it with JavaScript disabled - I just looked at
>> the headers. But it obviously is an attempt to not allow just what
>> you're trying to do.
>>
>> Do you have the owner's permission to access the page via a script? If
>> so, the webmaster should be able to help you.
>>
>
> I am not the OP. I am merely speculating on how it is done.
>
Not only is it off-topic in this newsgroup, but I don't like to discuss
security implementations (especially someone else's) in a public forum.
Sorry.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================
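
For anyone who only wants to see what the server is sending back (the 301
followed by the 404 in the wget output quoted above), here is a minimal
sketch. It assumes PHP 7 or later with the HTTP stream wrapper enabled;
status_chain() and fetch_with_status() are hypothetical helper names, not
anything from the thread. The ignore_errors context option makes
file_get_contents return the response body even when the status is 4xx/5xx,
and $http_response_header holds the header lines of every hop when redirects
are followed.

```php
<?php
// Hypothetical helper: pull the status code of every hop out of the
// header lines that PHP collects in $http_response_header.
function status_chain(array $headerLines): array
{
    $codes = [];
    foreach ($headerLines as $line) {
        // Status lines look like "HTTP/1.1 301 Moved Permanently".
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $codes[] = (int) $m[1];
        }
    }
    return $codes;
}

// Hypothetical helper: fetch a URL and keep the body even on a 4xx/5xx.
function fetch_with_status(string $url): array
{
    $context = stream_context_create([
        'http' => [
            'ignore_errors' => true, // return the body instead of failing on 404
            'timeout'       => 10,   // seconds; an arbitrary choice
        ],
    ]);

    // @ suppresses the warning if the connection itself fails.
    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper sets $http_response_header in the calling scope;
    // it stays unset when no response was received at all.
    $headers = $http_response_header ?? [];

    return ['codes' => status_chain($headers), 'body' => $body];
}
```

Call fetch_with_status() with a URL you have permission to fetch, then print
implode(' -> ', $result['codes']) to see the redirect chain. The body of an
error response, if any, is in $result['body'].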