Table of Contents

Preface

Using this Manual
Content Overview
Conventions
Guide to Other Documentation

Chapter 1 Overview

Verity Spider Features
Supports Web Standards
Indexing Other Data Sources
Restart Options
State Maintenance Through a Persistent Store
Performance Improvements
Proxy Handling Improvements
Verity Spider License
What's New for Verity Spider
Meta Collections No Longer Supported
Improved Control
Options Changes for Verity Spider V3.6
Discontinued Command-Line Options
Verity Spider V3.5
Verity Spider V3.6
New Command-Line Options
Improved Verity Spider Workflow
Updating Collections
Upgrading to Verity Spider V3.6
Verity Spider V3.1 and GUI Spider Collections
Verity Spider V3.5 Collections
Verity Spider V3.6 Collections

Chapter 2 Verity Spider Reference

Verity Spider Syntax
Discontinued Command-Line Options
New Command-Line Options
Reference of Command-line Options
Setting MIME Types
Syntax Restrictions
MIME Types and Web Crawling
MIME Types and File System Indexing
Indexing Unknown MIME Types
Known MIME Types for File System Indexing

Chapter 3 Key Features

Authenticating Secure Paths
Compromised Security
Limiting the Spider
During Web Crawling
During Directory Walking
Meta Tag Indexing
Adding a Field Definition
Prefix Mapping
Verity Spider Indexing Options
Using the Control File
Prefix Mapping Examples
Collection Servicers
Notes and Considerations
Shared Library Environment Variable
Persistent Searches and Squeezes
Process Configuration
Using collsvc with the Verity Spider
Running the Verity Spider
Running collsvc
Collection Servicer Arguments
Collection Servicer Examples
Verity Spider Reporting
vsdb Arguments
vsdb Examples
The Use of Last-Modified Date
How Last-Modified is Used
New Documents
Dynamic Documents
How Last-Modified is Determined
Using a Custom Last-Modified Value
Providing a Value for Last-Modified
Overriding an Existing Last-Modified Value
Working with Proxy Servers
Specifying a Proxy Server
Authenticating Proxy Servers
Specifying Authentication Information
Specifying Hosts for Direct Access

Chapter 4 Indexing Examples

Indexing Workflow
Indexing Topic Sets
Reparsing a Site
Omitting Optimization
Indexing Virtual Hosts
Updating Only Certain Documents
Custom Value for Last-Modified Date
An Intranet with Secure Hosts
Web Sites and Proxy Servers
Restarting an Interrupted Job
Adding to an Existing Collection
Refreshing a Collection
Including Previously Dropped Documents
Resynchronizing a Persistent Store
File Systems

Appendix A Error Messages

Appendix B Upgrading Collections

Using extract and mkvdk
Extracting and Reindexing
Example
Updating Information Server
Deleting Meta-collection Files
Extract Utility Limitation
Using a Perl Script
How to Get Perl
Using the Sample Perl Script
meta2uni.pl Arguments
Editing meta2uni.pl

Copyright © 1998, Verity, Inc. All rights reserved.