
Talk:BrewWiki.com

From WikiApiary, monitoring the MediaWiki universe

Blocking user agent

When User:Bumble Bee tries to connect to this wiki to collect statistics, the request fails with a 403 Forbidden error: the server is blocking it. This is what the same request looks like from curl:

% curl -A "Python-urllib/2.7 (WikiApiary; Bumble Bee; +http://wikiapiary.com/wiki/User:Bumble_Bee)" "http://www.brewwiki.com/api.php?action=query&meta=siteinfo&siprop=statistics&format=json"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--< html xmlns="http://www.w3.org/1999/xhtml">-->
<head>
<title>HTTP Error 403</title>
</head>
<body>
<h1>Error 403</h1>
<p>We're sorry, but we could not fulfill your request for
/api.php?action=query&meta=siteinfo&siprop=statistics&format=json on this server.</p>
<p>You do not have permission to access this server.</p>
<p>Your technical support key is: <strong>40f4-39e2-17f4-e8c8</strong></p>
<p>You can use this key to <a href="http://www.ioerror.us/bb2-support-key?key=40f4-39e2-17f4-e8c8">fix this problem yourself</a>.</p>
<p>If you are unable to fix the problem yourself, please contact <a href="mailto:brewwiki+nospam@nospam.brewwiki.com">brewwiki at brewwiki.com</a> and be sure to provide the technical support key shown above.</p>

If the request is repeated from curl without specifying a user agent it works:

% curl "http://www.brewwiki.com/api.php?action=query&meta=siteinfo&siprop=statistics&format=json"
{"query":{"statistics":{"pages":293,"articles":140,"edits":12864,"images":159,"users":6255,"activeusers":22,"admins":4,"jobs":0}}}

The Bumble Bee part of the user agent is clearly not what triggers the block, as this working curl request shows:

% curl -A "(WikiApiary; Bumble Bee; +http://wikiapiary.com/wiki/User:Bumble_Bee)" "http://www.brewwiki.com/api.php?action=query&meta=siteinfo&siprop=statistics&format=json" 
{"query":{"statistics":{"pages":293,"articles":140,"edits":12864,"images":159,"users":6255,"activeusers":22,"admins":4,"jobs":0}}}

The 403 block is keyed on the Python urllib signature:

% curl -A "Python-urllib/2.7" "http://www.brewwiki.com/api.php?action=query&meta=siteinfo&siprop=statistics&format=json"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--< html xmlns="http://www.w3.org/1999/xhtml">-->
<head>
<title>HTTP Error 403</title>
</head>
<body>
<h1>Error 403</h1>
<p>We're sorry, but we could not fulfill your request for
/api.php?action=query&meta=siteinfo&siprop=statistics&format=json on this server.</p>
<p>You do not have permission to access this server.</p>
<p>Your technical support key is: <strong>40f4-39e2-17f4-e8c8</strong></p>
<p>You can use this key to <a href="http://www.ioerror.us/bb2-support-key?key=40f4-39e2-17f4-e8c8">fix this problem yourself</a>.</p>
<p>If you are unable to fix the problem yourself, please contact <a href="mailto:brewwiki+nospam@nospam.brewwiki.com">brewwiki at brewwiki.com</a> and be sure to provide the technical support key shown above.</p>

The message in the 403 response suggests that this block was put in place by the ISP, not the wiki.

This makes me wonder whether the user agent for Bumble Bee should drop the urllib signature and identify itself only as Bumble Bee, with no description of the technology used.
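As a rough illustration of that idea, here is a minimal sketch of building the statistics request with an explicit user agent. It is written against Python 3's urllib.request rather than the Python 2.7 urllib the bot was using at the time, and the names BUMBLE_BEE_UA and build_stats_request are hypothetical, not taken from Bumble Bee's actual code. urllib only adds its default "Python-urllib/x.y" header when the request does not already carry a User-Agent.

```python
import urllib.request

# Assumption: reuse the descriptive suffix that worked from curl above,
# without the "Python-urllib/2.7" prefix that triggers the 403.
BUMBLE_BEE_UA = "(WikiApiary; Bumble Bee; +http://wikiapiary.com/wiki/User:Bumble_Bee)"


def build_stats_request(api_url, user_agent=BUMBLE_BEE_UA):
    """Return a Request for the siteinfo statistics query with an explicit User-Agent."""
    query = "action=query&meta=siteinfo&siprop=statistics&format=json"
    return urllib.request.Request(
        api_url + "?" + query,
        headers={"User-Agent": user_agent},
    )


req = build_stats_request("http://www.brewwiki.com/api.php")
# urllib stores header names capitalized, so the key is "User-agent".
print(req.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen(req) would then send it with this header instead of the library default.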

The block here can also be bypassed simply by changing "Python" to "python" in the user agent, so the match appears to be case-sensitive:

% curl -A "python-urllib/2.7" "http://www.brewwiki.com/api.php?action=query&meta=siteinfo&siprop=statistics&format=json"
{"query":{"statistics":{"pages":293,"articles":140,"edits":12864,"images":159,"users":6255,"activeusers":22,"admins":4,"jobs":0}}}
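The curl results above are consistent with a simple case-sensitive match on the string "Python-urllib". The sketch below only checks that guess against the observed outcomes. The function looks_blocked is a hypothetical model, not the server's real rule, which is not visible from the outside.

```python
# Outcomes copied from the curl runs above: True means the server returned 403.
OBSERVED = {
    "Python-urllib/2.7 (WikiApiary; Bumble Bee; +http://wikiapiary.com/wiki/User:Bumble_Bee)": True,
    "(WikiApiary; Bumble Bee; +http://wikiapiary.com/wiki/User:Bumble_Bee)": False,
    "Python-urllib/2.7": True,
    "python-urllib/2.7": False,
}


def looks_blocked(user_agent):
    """Hypothetical rule: case-sensitive substring match on 'Python-urllib'."""
    return "Python-urllib" in user_agent


# Collect every observation the guessed rule fails to explain.
mismatches = [ua for ua, blocked in OBSERVED.items() if looks_blocked(ua) != blocked]
print(mismatches)
```

An empty list means the guess explains every observation so far. It does not prove the server uses this rule.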

Thingles (talk) 20:01, 18 January 2013 (UTC)

Hmm, I think it will just be a matter of time before the ISP blocks "python" as well. I think the user agent information should be dropped. All the interesting information is provided on Bumble Bee's page, and it would also save a couple of KB in the server log. However, I guess it is a matter of good manners to make Bumble Bee respect the robots exclusion standard, i.e. a rule such as:

 User-agent: Bumble Bee
 Disallow: /

--kgh (talk) 10:18, 19 January 2013 (UTC)
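For reference, a minimal sketch of how a bot could honor such a rule using Python 3's standard urllib.robotparser. The rules are parsed from an inline string so the example needs no network access. A real bot would presumably fetch the wiki's own robots.txt instead, and the agent name "SomeOtherBot" is made up for contrast.

```python
import urllib.robotparser

# The rule from the comment above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Bumble Bee
Disallow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

api_url = "http://www.brewwiki.com/api.php"

# Bumble Bee is excluded from the whole site by this rule.
bee_allowed = parser.can_fetch("Bumble Bee", api_url)
# An agent with no matching entry (hypothetical name) is not affected.
other_allowed = parser.can_fetch("SomeOtherBot", api_url)

print(bee_allowed, other_allowed)
```

The bot would skip collection for any wiki where can_fetch returns False for its own name.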

Error

Bumble Bee encountered the following error:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 22:05, 20 January 2013 (UTC)
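The error text posted by the bot matches the string form of urllib's HTTPError, which is "HTTP Error", the status code, and the reason phrase sent by the server. The sketch below builds such an error by hand, using Python 3's urllib.error and no network access, to show where "Bad Behavior" would come from. The function describe_failure is hypothetical and not part of Bumble Bee.

```python
import email.message
import urllib.error


def describe_failure(err):
    """Return the one-line message a collector might post for an HTTP failure."""
    return str(err)


# Simulated error: status 403 with the reason phrase seen in the bot's reports.
simulated = urllib.error.HTTPError(
    "http://www.brewwiki.com/api.php",  # request URL
    403,                                # status code
    "Bad Behavior",                     # reason phrase from the status line
    email.message.Message(),            # empty response headers
    None,                               # no response body
)

message = describe_failure(simulated)
print(message)
```

In a real collector the same string would come from catching urllib.error.HTTPError around the urlopen call.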

Error Collecting Statistics

An error was encountered while attempting to collect statistics:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 22:09, 20 January 2013 (UTC)

Error Collecting Statistics

An error was encountered while attempting to collect statistics:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 22:17, 20 January 2013 (UTC)

Error Collecting Statistics

An error was encountered while attempting to collect statistics:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 22:32, 20 January 2013 (UTC)

Error Collecting Statistics

An error was encountered while attempting to collect statistics:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 22:47, 20 January 2013 (UTC)

Error Collecting Statistics

An error was encountered while attempting to collect statistics:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 23:02, 20 January 2013 (UTC)

Error Collecting Statistics

An error was encountered while attempting to collect statistics:

 HTTP Error 403: Bad Behavior

Bumble Bee (talk) 23:17, 20 January 2013 (UTC)