<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="index" title="Text File" newcontext="true">
<p>
Date: Fri, 6 Oct 2006 00:55:33 -0500
From: John Goerzen &lt;jgoerzen@complete.org&gt;
To: gopher@complete.org
Subject: [gopher] The archive
</p>
<p>First off, thanks to everyone who has expressed interest in this.
I have your emails and will get back to you.  I&#x27;ve been rather busy
lately due to the birth of our first baby [1] and our upcoming move in
about a week.  So it will likely be some time before I actually get
anything sent off.
</p>
<p>I realized also that quux.org had never been included in the run,
since it was large and I could populate it from local backups, which I
have now done.
</p>
<p>I&#x27;d also like to document the directory structure.  It is, roughly:
</p>
<p>gopher-arch/gopher/hostname/portnumber/selector
</p>
<p>Where the selector is a Gopher menu, you will see it exist as a
directory with a file named .gophermap within it.  This file contains
the raw Gopher menu file that was sent over by the server.  This
should be easily usable by PyGopherd and Bucktooth with only minor
modifications.
</p>
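<p>A minimal sketch of how the layout above could be navigated.  It
assumes the stored .gophermap holds ordinary tab-separated menu lines
(item type and display string, selector, host, port) as in RFC 1436;
the function names archive_path, parse_menu_line and read_menu are
hypothetical and not part of the archive or of any server.
</p>
<p><![CDATA[
```python
import os

GOPHERMAP = ".gophermap"


def archive_path(root, host, port, selector):
    """Map a host, port and selector onto root/gopher/host/port/selector."""
    # Drop leading slashes so os.path.join keeps the prefix components.
    relative = selector.lstrip("/")
    return os.path.join(root, "gopher", host, str(port), relative)


def parse_menu_line(line):
    """Split one raw menu line into its fields, or return None."""
    fields = line.rstrip("\r\n").split("\t")
    # Skip anything that is not a regular item, such as the "." terminator.
    if len(fields) < 4 or not fields[0] or not fields[3].strip().isdigit():
        return None
    return {
        "type": fields[0][:1],        # single-character item type
        "display": fields[0][1:],     # text shown to the user
        "selector": fields[1],
        "host": fields[2],
        "port": int(fields[3]),
    }


def read_menu(directory):
    """Parse the .gophermap file stored inside an archived menu directory."""
    path = os.path.join(directory, GOPHERMAP)
    # latin-1 never fails to decode, which suits menus of unknown encoding.
    with open(path, encoding="latin-1") as handle:
        return [item for item in map(parse_menu_line, handle) if item]
```
]]></p>
<p>For example, read_menu(archive_path("gopher-arch", "quux.org", 70, ""))
would list the items of that host's top-level menu, provided the
directory is present in the unpacked archive.
</p>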
<p>I have run a duplicate file detector across the entire archive.  Any
duplicate files in it are hardlinked together.  This saved about 10G
of space.  If you&#x27;re on Windows, expect the archive to consume about
10G more when unpacked than it would on a Unix system, since the
hardlinked files will end up as separate copies.
</p>
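<p>A minimal sketch of the hardlinking idea described above, assuming
files are compared by size and SHA-256 digest.  This is not the tool
that was actually run across the archive; file_digest and
hardlink_duplicates are hypothetical names.
</p>
<p><![CDATA[
```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hardlink_duplicates(root):
    """Hardlink files with identical content; return the bytes saved."""
    first_seen = {}  # (size, digest) mapped to the first path with that content
    saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            key = (size, file_digest(path))
            original = first_seen.setdefault(key, path)
            if original == path or os.path.samefile(original, path):
                continue  # first copy, or already linked to it
            temporary = path + ".dedup-tmp"
            os.link(original, temporary)  # make the link under a spare name
            os.replace(temporary, path)   # then swap it over the duplicate
            saved += size
    return saved
```
]]></p>
<p>Hashing every file is the simplest approach; grouping by size first
and hashing only files that share a size would avoid reading most of
the data on a large tree.
</p>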
<p>I also have a dump of the PostgreSQL database behind the robot (10M
compressed, 200M uncompressed, 1.2G when loaded into PostgreSQL).  I
will toss that on the DVD as well for anyone that&#x27;s interested.
</p>
<p>The DVDs will be generated with:
</p>
<p>tar -cvf - gopher-arch/ | bzip2 -9 | split -d -b 4200m - gopher-arch.tar.bz2.
</p>
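<p>To make the slicing step concrete, here is a rough Python sketch of
what the split stage of the pipeline above does: it cuts a byte stream
into fixed-size pieces named with two-digit numeric suffixes, as
split -d does.  The name split_stream is hypothetical, and unlike
split this sketch does not stop once the two-digit suffixes run out.
</p>
<p><![CDATA[
```python
def split_stream(source, prefix, slice_bytes, buffer_bytes=1 << 20):
    """Write source to prefix00, prefix01, ... of at most slice_bytes each.

    source is any binary file object; the list of slice names is returned.
    """
    names = []
    index = 0
    while True:
        remaining = slice_bytes
        output = None
        while remaining > 0:
            chunk = source.read(min(buffer_bytes, remaining))
            if not chunk:
                break  # end of input
            if output is None:
                # Create the slice lazily so no empty file is left behind.
                name = "%s%02d" % (prefix, index)
                output = open(name, "wb")
                names.append(name)
            output.write(chunk)
            remaining -= len(chunk)
        if output is None:
            return names  # input ended exactly on a slice boundary
        output.close()
        if remaining > 0:
            return names  # input ended inside this slice
        index += 1
```
]]></p>
<p>With split, the size 4200m means 4200 * 1024 * 1024 bytes, so the
equivalent call would pass slice_bytes=4200 * 1024 * 1024 and the
prefix "gopher-arch.tar.bz2." (including the trailing dot).
</p>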
<p>That is, each DVD will contain a slice of the tar&#x27;d+bzipped
directory.  If you are going to get a set of DVDs, you can read them
in, and simply:
</p>
<p>cat gopher-arch.tar.bz2.* | bzcat | tar -xvf -
</p>
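<p>For those without cat and bzcat at hand, the same reassembly can be
sketched with Python's standard tarfile module.  This is only an
illustration under the assumption that all slices have been copied
into one directory; SliceReader and extract_slices are hypothetical
names.
</p>
<p><![CDATA[
```python
import glob
import os
import tarfile


class SliceReader:
    """Present several slice files, in order, as one binary stream."""

    def __init__(self, paths):
        self._paths = list(paths)
        self._current = None

    def read(self, size=-1):
        parts = []
        while size != 0:
            if self._current is None:
                if not self._paths:
                    break  # every slice has been consumed
                self._current = open(self._paths.pop(0), "rb")
            data = self._current.read(size)
            if data:
                parts.append(data)
                if size > 0:
                    size -= len(data)
            else:
                self.close()  # this slice is exhausted, move to the next
        return b"".join(parts)

    def close(self):
        if self._current is not None:
            self._current.close()
            self._current = None


def extract_slices(pattern, destination):
    """Join the slices matching pattern and unpack them into destination."""
    # Sorting mirrors the order in which the shell expands the wildcard.
    paths = sorted(glob.glob(pattern))
    os.makedirs(destination, exist_ok=True)
    reader = SliceReader(paths)
    # Use the safer "data" extraction filter where this Python supports it.
    options = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    try:
        # "r|bz2" reads a bzip2-compressed tar as a forward-only stream.
        with tarfile.open(fileobj=reader, mode="r|bz2") as archive:
            archive.extractall(destination, **options)
    finally:
        reader.close()
    return paths
```
]]></p>
<p>A call such as extract_slices("gopher-arch.tar.bz2.*", ".") would
then play the role of the pipeline above, reading the slices one
after another without first joining them on disk.
</p>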
<p>Some gopher servers do not use the slash as a path separator in the
selector.  Those servers will have a huge number of files/directories
in their top level, possibly thousands.  You will need an efficient
modern filesystem to extract all of them in their entirety, but there
aren&#x27;t many such servers.
</p>
<p>I will get back to everyone once I have the time to send out the DVDs.
</p>
<p>[1] http://changelog.complete.org/posts/545-The-News.html
</p>
</card>
</wml>
