From: Benn Newman
Date: Wed, 23 Aug 2006 19:08:59 -0500
To: gopher@complete.org
Subject: [gopher] Re: Gopherspace archive

Cameron Kaiser wrote:
>> If you would read the paper
>> (We interrupt this e-mail for an MLA-ish citation!
>> Lesk, M. E., "Some Applications of Inverted Indexes on the UNIX System."
>> Murray Hill, New Jersey: A really long time ago
>> We now return to our e-mail)
>> you would find out that there is a maximum number of keys per file/entry
>> (or it is supposed to anyway, documentation not meeting reality adds
>> spice to life (or something like that)).
>
> How would that apply, though? If you're only indexing by file*name*, that's
> one thing, but if you were doing a full-text index, then your number of keys
> is determined by the contents of the files, not the number of files
> themselves. Unless I'm not understanding what you would allow to be
> searchable, which is possible. :)
>

No idea, right, I think we are both confused! Ask me again when I'm not so
sleeepp.....

Through my tiredness, I realised a mistake: it is not a full-text index; it
is a partial-text index. It takes certain keywords (chosen manually or
automagically, by default a maximum of one hundred) from a file or files and
makes an index. Another program searches that index.

So in summary, I would try to make a ``partial-text'' index of the archive.
I would then (or perhaps while the index is being built!) make a front end
(a mole) to the program that searches the index (using sed or awk). Then, if
we find someone to host it (the archive), we could have it on-line (i.e. in
Gopherspace, or on the Web, or whatever (hytelnet!)). (I never knew how fun
parentheses were!)

The index could also be distributed with BitTorrent.

-- 
Benn Newman
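
For anyone who wants to see the partial-text idea in concrete form, here is a rough Python sketch. It is not the program from the Lesk paper and not the sed/awk front end mentioned above. It only shows the general shape: take a capped number of keys from each file, build an inverted index, and let a separate function search it. The names `pick_keys`, `build_index` and `search` are made up for illustration, and the "automagic" rule used here (most frequent words of three or more letters) is an assumption, since the message does not say how keys are chosen.

```python
import re
from collections import Counter, defaultdict

MAX_KEYS = 100  # default cap on keys per file, as mentioned in the message


def pick_keys(text, max_keys=MAX_KEYS):
    """Pick up to max_keys keywords from one file's text.

    Assumption: "automagic" selection means the most frequent words
    of three or more letters. The real tool may choose differently.
    """
    words = re.findall(r"[a-z]{3,}", text.lower())
    return [word for word, _count in Counter(words).most_common(max_keys)]


def build_index(docs, max_keys=MAX_KEYS):
    """Build an inverted index mapping each key to the files that contain it.

    docs is a dict of {file name: file text}.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for key in pick_keys(text, max_keys):
            index[key].add(name)
    return index


def search(index, key):
    """Return the sorted file names indexed under key (empty list if none)."""
    return sorted(index.get(key.lower(), ()))
```

A file is only findable by the keys that made it under the cap, which is what separates this from a full-text index: with `max_keys=1`, a file containing "gopher gopher menu" is indexed under "gopher" but not under "menu".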