Date: Mon, 31 Oct 2005 07:26:24 -0600
From: John Goerzen
To: gopher@complete.org
Subject: [gopher] Re: Bot update
Message-ID: <20051031132624.GA10012@excelhustler.com>
In-Reply-To: <4365D8DB.40508@route-add.net>
References: <20051031034851.GA30223@katherina.lan.complete.org> <4365D8DB.40508@route-add.net>

On Mon, Oct 31, 2005 at 09:42:03AM +0100, Alessandro Selli wrote:
> John Goerzen wrote:
> > Here's an update on the gopher bot:
> >
> > There is currently 28G of data archived representing 386,315
> > documents.  1.3 million documents remain to be visited, from
> > approximately 20 very large Gopher servers.  I believe, then, that the
> > majority of gopher servers have been cached by this point.  3,987
> > different servers are presently represented in the archive.
>
> Amazing.  I dare say: too good to be true!

Yes, you're right.  sigh.

> Are you definitively, positively sure about all this stuff being served
> by so many active Gopher servers?

I forgot to take into account that the bot creates a directory for the
data from a given server before it tries to connect to it.  So it tried
to connect to 3,987 servers.  Actually, I received documents from 216
servers.  Sigh.

So far, the top server in terms of number of selectors downloaded is
serpiente.dgsca.unam.mx with over 57,000.  But many of the top servers
are still being crawled.

-- John
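
For illustration, the overcount described above (a directory exists for
every server the bot tried, not just for those that returned documents)
can be sketched as follows.  This is only a minimal sketch, assuming a
hypothetical archive layout with one directory per server under a single
root; the bot's actual layout and code are not shown in this message,
and the name count_servers is made up for the example.

```python
import os


def count_servers(archive_root):
    """Return (attempted, with_documents) for a one-directory-per-server archive.

    Assumption: each immediate subdirectory of archive_root corresponds to
    one server the crawler tried to contact (hypothetical layout).
    """
    attempted = 0
    with_documents = 0
    for entry in os.scandir(archive_root):
        if not entry.is_dir():
            continue
        # The directory exists as soon as a connection is attempted.
        attempted += 1
        # Count the server as "received" only if some file was actually
        # stored somewhere beneath its directory.
        if any(files for _path, _dirs, files in os.walk(entry.path)):
            with_documents += 1
    return attempted, with_documents
```

Counting the subdirectories alone gives the number of servers attempted;
only the second figure corresponds to servers that actually returned
documents.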