<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="index" title="Text File" newcontext="true">
<p>
Date: Mon, 31 Oct 2005 07:26:24 -0600
From: John Goerzen &lt;jgoerzen@complete.org&gt;
To: gopher@complete.org
Subject: [gopher] Re: Bot update
Message-ID: &lt;20051031132624.GA10012@excelhustler.com&gt;
References: &lt;20051031034851.GA30223@katherina.lan.complete.org&gt;
 &lt;4365D8DB.40508@route-add.net&gt;
</p>
<p>On Mon, Oct 31, 2005 at 09:42:03AM +0100, Alessandro Selli wrote:
&gt; John Goerzen wrote:
&gt; &gt; Here&#x27;s an update on the gopher bot:
&gt; &gt;
&gt; &gt; There is currently 28G of data archived representing 386,315
&gt; &gt; documents.  1.3 million documents remain to be visited, from
&gt; &gt; approximately 20 very large Gopher servers.  I believe, then, that the
&gt; &gt; majority of gopher servers have been cached by this point.  3,987
&gt; &gt; different servers are presently represented in the archive.
&gt;
&gt;    Amazing.  I dare say: too good to be true!
</p>
<p>Yes, you&#x27;re right. Sigh.
</p>
<p>&gt; Are you definitively, positively sure about all this stuff being served
&gt; by so many active Gopher servers?
</p>
<p>I forgot to take into account that the bot creates a directory for the
data from a given server before it tries to connect to it.  So it tried
to connect to 3,987 servers.
</p>
<p>Actually, I received documents from 216 servers.  Sigh.
</p>
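<p>To illustrate the difference between the two counts, here is a minimal
sketch.  It assumes a hypothetical archive layout with one top-level
directory per server; the function name count_servers and that layout are
illustrative assumptions, not the bot's actual code.  A directory that
exists but holds no files stands for a server that was tried but never
answered.
</p>
<p>
```python
import os
import tempfile


def count_servers(archive_root):
    """Return (attempted, answered) for a one-directory-per-server archive.

    Assumption: each immediate subdirectory of archive_root is named after
    one server and is created before the connection attempt.
    """
    attempted = 0
    answered = 0
    for entry in sorted(os.listdir(archive_root)):
        server_dir = os.path.join(archive_root, entry)
        if not os.path.isdir(server_dir):
            continue
        # The directory exists, so a connection was at least attempted.
        attempted += 1
        # Count the server as answered only if some file was saved under it.
        if any(files for _root, _dirs, files in os.walk(server_dir)):
            answered += 1
    return attempted, answered


if __name__ == "__main__":
    # Tiny throwaway archive: one server with a document, one empty directory.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a.example", "sub"))
        os.makedirs(os.path.join(root, "b.example"))
        with open(os.path.join(root, "a.example", "sub", "doc.txt"), "w") as f:
            f.write("hello")
        print(count_servers(root))
```
</p>
<p>Counting directories alone gives the "attempted" figure; only the
second number says how many servers actually returned documents.
</p>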
<p>So far, the top server in terms of number of selectors downloaded is
serpiente.dgsca.unam.mx with over 57,000.  But many of the top servers
are still being crawled.
</p>
<p>-- John
</p>
</card>
</wml>
