<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="index" title="Text File" newcontext="true">
<p>
From: Cameron Kaiser &lt;spectre@stockholm.ptloma.edu&gt;
Message-Id: &lt;200111051539.HAA07810@stockholm.ptloma.edu&gt;
Subject: [gopher] Large indexing systems
To: gopher@complete.org
Date: Mon, 5 Nov 2001 07:39:10 -0800 (PST)
</p>
<p>Soliciting suggestions:
</p>
<p>sfWAIS has crapped out on Veronica-2&#x27;s final database. (When the pedal hits
the metal ...) Apparently it can&#x27;t cope with a dictionary that size -- when
it comes to the final merge, it dies with a file seek error. Some hasty
calculations suggest that disk space is not the problem.
</p>
<p>Does anyone have experience with a good indexing system for very large
numbers of documents? I tried Isearch, which was developed by people connected
with the WAIS project, but it doesn&#x27;t like the ancient g++ on this system, and
this system doesn&#x27;t like newer g++&#x27;s :-) In any case, there&#x27;s no guarantee it
doesn&#x27;t suffer from the same problem.
</p>
<p>I have a few ideas for developing my own indexer for large numbers of
documents, and I did some simulations with a rough version and got some hopeful
numbers back w.r.t. disk space utilisation and search latency. However, going on
to develop this fully would unnecessarily delay the release of the last V-2
database, since I would have to write something to build the new search index
and then rewrite VISHNU and Veronica-2 to talk to it. So, any suggestions from
the floor?
</p>
<p>--
----------------------------- personal page: http://www.armory.com/~spectre/ --
 Cameron Kaiser, Point Loma Nazarene University * ckaiser@stockholm.ptloma.edu
-- Please dispose of this message in the usual manner. -- Mission: Impossible -
</p>
</card>
</wml>
