<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="index" title="Text File" newcontext="true">
<p>
Date: Sun, 30 May 2004 19:07:59 -0400
From: Tim Fraser &lt;tfraser@cs.umd.edu&gt;
To: gopher@complete.org
Subject: [gopher] Re: Cicada Incomplete Gopher Census
</p>
<p>ck&gt; Actually, you can see the Floodgap census here
</p>
<p>Thanks for updating the floodgap directory!  It was browsing through
this directory and cool sites like quux.org (to name just one) that
got me interested in Gopher again.  I think the &quot;new gopher servers
since 1999&quot; directory is an especially interesting feature, since it
highlights new growth.
</p>
<p>ck&gt; After the V-2 cleanup this weekend, it has pared itself down to
ck&gt; 255 unique hosts and a database of about 1.8 million selectors.
</p>
<p>OK, I found only 154, so I clearly have a bug.  My selector counts
seem very low, too.  I&#x27;m not sure it&#x27;s worth debugging given that the
floodgap index is updating again, but just in case I get bored: my
spider is supposed to follow only selectors with type 1 or 11.  Are
there other directory types that I should follow?
</p>
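<p>For reference, the directory-following rule described above can be
sketched roughly as follows.  This is only an illustration, not the
actual spider: it assumes the plain RFC 1436 menu layout (a one-character
item type, then tab-separated display string, selector, host and port),
and the names DIRECTORY_TYPES, parse_menu and directories_to_follow are
hypothetical.  Any further item types worth following would simply be
added to the set.
</p>
<p>
```python
# Sketch only: DIRECTORY_TYPES, parse_menu and directories_to_follow are
# hypothetical names, not taken from the spider discussed in this thread.
DIRECTORY_TYPES = {"1"}  # RFC 1436 item type "1" marks a directory (menu)


def parse_menu(text):
    """Parse raw Gopher menu text into (type, display, selector, host, port) tuples."""
    items = []
    for line in text.splitlines():
        if line == ".":  # a lone dot ends the menu
            break
        if not line:
            continue
        # After the one-character type come tab-separated fields; some
        # servers append extras, so keep only the first four.
        fields = line[1:].split("\t")[:4]
        if len(fields) != 4 or not fields[3].strip().isdigit():
            continue  # skip malformed lines rather than guessing
        display, selector, host, port = fields
        items.append((line[0], display, selector, host, int(port)))
    return items


def directories_to_follow(text):
    """Return only the menu items whose type is in DIRECTORY_TYPES."""
    return [item for item in parse_menu(text) if item[0] in DIRECTORY_TYPES]
```
</p>
<p>Keeping the followed types in one set means a missed directory type
is a one-line change rather than a change to the parsing logic.
</p>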
<p>tf&gt; my primitive spider had been automatically banned
ck&gt; It was? I don&#x27;t remember blocking any IP addresses ...
</p>
<p>Perhaps I was mistaken.  After using another machine to read point 4
in the floodgap terms of service (the one about automatically blocking
the netblocks of spiders and robots), I just assumed that was the
cause without any real proof and left it at that.
</p>
<p>How does floodgap&#x27;s Veronica-2 spider limit the load it places on
sites?  Does it check for a robots.txt file, or some similar
mechanism?
</p>
<p>- Tim Fraser
</p>
</card>
</wml>
