<?xml version="1.0" encoding="UTF-8"?>
|
||
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
|
||
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
|
||
<appendix>
|
||
<title>Sphinx manpages</title>
|
||
|
||
<refentry id="indexer">
|
||
<refmeta>
|
||
<refentrytitle>indexer</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
|
||
<refmiscinfo class="manual">Sphinxsearch</refmiscinfo>
|
||
|
||
<refmiscinfo class="version">2.0.2</refmiscinfo>
|
||
</refmeta>
|
||
|
||
<refnamediv>
|
||
<refname>indexer</refname>
|
||
|
||
<refpurpose>Sphinxsearch fulltext index generator</refpurpose>
|
||
</refnamediv>
|
||
|
||
<refsynopsisdiv>
|
||
<cmdsynopsis>
|
||
<command>indexer</command>
|
||
|
||
<arg choice="opt">--config <replaceable>CONFIGFILE</replaceable></arg>
|
||
|
||
<arg choice="opt">--rotate</arg>
|
||
|
||
<group choice="opt">
|
||
<arg choice="plain">--noprogress</arg>
|
||
|
||
<arg choice="plain">--quiet</arg>
|
||
</group>
|
||
|
||
<group choice="opt">
|
||
<arg choice="plain">--all</arg>
|
||
|
||
<arg choice="plain"><replaceable>INDEX</replaceable></arg>
|
||
|
||
<arg choice="plain"><replaceable>...</replaceable></arg>
|
||
</group>
|
||
</cmdsynopsis>
|
||
|
||
<cmdsynopsis>
|
||
<command>indexer</command>
|
||
|
||
<arg choice="plain">--buildstops
|
||
<replaceable>OUTPUTFILE</replaceable></arg>
|
||
|
||
<arg choice="plain"><replaceable>COUNT</replaceable></arg>
|
||
|
||
<arg choice="opt">--config <replaceable>CONFIGFILE</replaceable></arg>
|
||
|
||
<group choice="opt">
|
||
<arg choice="plain">--noprogress</arg>
|
||
|
||
<arg choice="plain">--quiet</arg>
|
||
</group>
|
||
|
||
<group choice="opt">
|
||
<arg choice="plain">--all</arg>
|
||
|
||
<arg choice="plain"><replaceable>INDEX</replaceable></arg>
|
||
|
||
<arg choice="plain"><replaceable>...</replaceable></arg>
|
||
</group>
|
||
</cmdsynopsis>
|
||
|
||
<cmdsynopsis>
|
||
<command>indexer</command>
|
||
|
||
<arg choice="plain">--merge
|
||
<replaceable>MAIN_INDEX</replaceable></arg>
|
||
|
||
<arg choice="plain"><replaceable>DELTA_INDEX</replaceable></arg>
|
||
|
||
<arg choice="opt">--config <replaceable>CONFIGFILE</replaceable></arg>
|
||
|
||
<arg choice="opt">--rotate</arg>
|
||
|
||
<group choice="opt">
|
||
<arg choice="plain">--noprogress</arg>
|
||
|
||
<arg choice="plain">--quiet</arg>
|
||
</group>
|
||
</cmdsynopsis>
|
||
</refsynopsisdiv>
|
||
|
||
<refsect1>
|
||
<title>Description</title>
|
||
|
||
<para>Sphinx is a collection of programs that aim to provide high
|
||
quality fulltext search.</para>
|
||
|
||
<para><command>indexer</command> is the first of the two principal tools
that make up Sphinx. Invoked either directly from the command line or as
part of a larger script, <command>indexer</command> is solely
responsible for gathering the data that will be searchable.</para>
|
||
|
||
<para>The calling syntax for indexer is as follows:</para>
|
||
|
||
<programlisting>$ indexer [OPTIONS] [indexname1 [indexname2 [...]]]</programlisting>
|
||
|
||
<para>Essentially, you list the different possible indexes (that
you will later make available to search) in
<filename>sphinx.conf</filename>, so when calling
<command>indexer</command> you need, at a minimum, to tell it which
index (or indexes) you want to build.</para>
|
||
|
||
<para>If <filename>sphinx.conf</filename> contained details on 2
|
||
indexes, <emphasis>mybigindex</emphasis> and
|
||
<emphasis>mysmallindex</emphasis>, you could do the following:</para>
|
||
|
||
<programlisting>$ indexer mybigindex
|
||
$ indexer mysmallindex mybigindex</programlisting>
|
||
|
||
<para>In the configuration file,
<filename>sphinx.conf</filename>, you specify one or more indexes for
your data. You might call <command>indexer</command> to reindex one of
them ad hoc, or you can tell it to process all indexes. You are not
limited to calling just one or all at once; you can always pick any
combination of the available indexes.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Options</title>
|
||
|
||
<para>The majority of the options for <command>indexer</command> are
given in the configuration file. However, there are some options you
might need to specify on the command line as well, as they can affect
how the indexing operation is performed. These options are:</para>
|
||
|
||
<variablelist remap="IP">
|
||
<varlistentry>
|
||
<term><option>--all</option></term>
|
||
|
||
<listitem>
|
||
<para>Tells <command>indexer</command> to update every index
|
||
listed in <filename>sphinx.conf</filename>, instead of listing
|
||
individual indexes. This would be useful in small configurations,
|
||
or cron-type or maintenance jobs where the entire index set will
|
||
get rebuilt each day, or week, or whatever period is best.</para>
|
||
|
||
<para>Example usage:</para>
|
||
|
||
<programlisting>$ indexer --config /home/myuser/sphinx.conf --all</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--buildstops</option>
|
||
<replaceable>outfile.txt</replaceable>
|
||
<replaceable>NUM</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Reviews the index source, as if it were indexing the data,
and produces a list of the terms that are being indexed. In other
words, it produces a list of all the searchable terms that are
becoming part of the index. Note that it does not update the index in
question; it simply processes the data 'as if' it were indexing,
including running queries defined with
<emphasis>sql_query_pre</emphasis> or
<emphasis>sql_query_post</emphasis>.
<filename>outfile.txt</filename> will contain the list of
words, one per line, sorted by frequency with the most frequent first,
and <emphasis>NUM</emphasis> specifies the maximum number of words
that will be listed; if it is large enough to encompass every word
in the index, only as many words as the index contains will be
returned. Such a
dictionary list could be used for client application features
around "Did you mean..." functionality, usually in conjunction
with <option>--buildfreqs</option>, below.</para>
|
||
|
||
<para>Example:</para>
|
||
|
||
<programlisting>$ indexer myindex --buildstops word_freq.txt 1000</programlisting>
|
||
|
||
<para>This would produce a file in the current directory,
<filename>word_freq.txt</filename>, with the 1,000 most common
words in 'myindex', ordered by most common first. Note that the
file will pertain to the last index indexed when specified with
multiple indexes or <option>--all</option> (i.e. the last one
listed in the configuration file).</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--buildfreqs</option></term>
|
||
|
||
<listitem>
|
||
<para>Used together with <option>--buildstops</option> (and is
ignored if <option>--buildstops</option> is not specified). As
<option>--buildstops</option> provides the list of words used
within the index, <option>--buildfreqs</option> adds the number of
times each word occurs in the index, which is useful in establishing
whether certain words should be considered stopwords because they are
too prevalent. It also helps with developing "Did you mean..."
features where you need to know how much more common a given word is
compared to another, similar one.</para>
|
||
|
||
<para>Example:</para>
|
||
|
||
<programlisting>$ indexer myindex --buildstops word_freq.txt 1000 --buildfreqs</programlisting>
|
||
|
||
<para>This would produce the <filename>word_freq.txt</filename> as
|
||
above, however after each word would be the number of times it
|
||
occurred in the index in question.</para>
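<para>The frequency list can then be filtered for stopword candidates.
The following is only a sketch: it assumes that each line of the file
holds a word followed by whitespace and its count, and the function name
<emphasis>stopword_candidates</emphasis> and the threshold are
hypothetical. Check the layout of the file your build actually produces
before relying on it.</para>

<programlisting># Hypothetical helper: print every word whose count is at least MIN_COUNT.
# Assumes lines of the form "word count" (whitespace-separated).
stopword_candidates() {
    freq_file=$1
    min_count=$2
    awk -v min="$min_count" 'NF >= 2 { if ($2 + 0 >= min + 0) print $1 }' "$freq_file"
}

# Example: list words seen at least 10000 times.
# stopword_candidates word_freq.txt 10000</programlisting>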
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--config</option> <replaceable>CONFIGFILE</replaceable>,
|
||
<option>-c</option> <replaceable>CONFIGFILE</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Use the given file as configuration. Normally,
<command>indexer</command> will look
for <filename>sphinx.conf</filename> in the installation directory
(e.g. <filename>/usr/local/sphinx/etc/sphinx.conf</filename> if
installed into <filename>/usr/local/sphinx</filename>), followed
by the current directory you are in when calling indexer from the
shell. This option is most useful in shared environments where the binary
files are installed somewhere like
<filename>/usr/local/sphinx/</filename> but you want to give
users the ability to make their own custom Sphinx set-ups, or
if you want to run multiple instances on a single server. In cases
like those you could allow them to create their own
<filename>sphinx.conf</filename> files and pass them to
<command>indexer</command> with this option.</para>
|
||
|
||
<para>For example:</para>
|
||
|
||
<programlisting>$ indexer --config /home/myuser/sphinx.conf myindex</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--dump-rows</option>
|
||
<replaceable>FILE</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Dumps rows fetched by SQL source(s) into the specified file,
in a MySQL-compatible syntax. The resulting dumps are an exact
representation of the data as received by <command>indexer</command>
and help to reproduce indexing-time issues.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--merge</option> <replaceable>DST-INDEX</replaceable>
|
||
<replaceable>SRC-INDEX</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Physically merge two indexes together. For example, if you
have a main+delta scheme, where the main index rarely changes but
the delta index is rebuilt frequently,
<option>--merge</option> would be used to combine the two. The
operation moves from right to left: the contents of
<replaceable>SRC-INDEX</replaceable> get examined and physically
combined with the contents of <replaceable>DST-INDEX</replaceable>,
and the result is left in <replaceable>DST-INDEX</replaceable>. In
pseudo-code, it might be expressed as:
<replaceable>DST-INDEX</replaceable> +=
<replaceable>SRC-INDEX</replaceable></para>
|
||
|
||
<para>An example:</para>
|
||
|
||
<programlisting>$ indexer --merge main delta --rotate</programlisting>
|
||
|
||
<para>In the above example, where main is the master, rarely
modified index, and delta is the more frequently modified one, you
might use the above to call <command>indexer</command> to combine
the contents of the delta into the main index and rotate the
indexes.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--merge-dst-range</option>
|
||
<replaceable>ATTR</replaceable> <replaceable>MIN</replaceable>
|
||
<replaceable>MAX</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Apply the given range filter while merging. Specifically, as
the merge is applied to the destination index (as part of
<option>--merge</option>; the option is ignored if
<option>--merge</option> is not specified),
<command>indexer</command> will also filter the documents ending
up in the destination index, and only documents that pass
the given filter will end up in the final index. This could be
used, for example, in an index where there is a 'deleted'
attribute, where 0 means 'not deleted'. Such an index could be
merged with:<programlisting>$ indexer --merge main delta --merge-dst-range deleted 0 0</programlisting></para>
|
||
|
||
<para>Any documents marked as deleted (value 1) would be removed
|
||
from the newly-merged destination index. It can be added several
|
||
times to the command line, to add successive filters to the merge,
|
||
all of which must be met in order for a document to become part of
|
||
the final index.</para>
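<para>Several filters can be combined in a single call. The following is
a minimal sketch, assuming a second attribute named 'hidden' next to
'deleted'; that attribute, the wrapper function name and the
configuration path are all hypothetical.</para>

<programlisting># Hypothetical wrapper: merge delta into main, keeping only documents
# whose 'deleted' and 'hidden' attributes are both 0.
# 'hidden' is an assumed attribute used purely for illustration.
merge_visible_docs() {
    config=$1
    indexer --config "$config" --merge main delta \
        --merge-dst-range deleted 0 0 \
        --merge-dst-range hidden 0 0 \
        --rotate
}

# Example:
# merge_visible_docs /home/myuser/sphinx.conf</programlisting>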
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--merge-killlists</option>,
|
||
<option>--merge-klists</option></term>
|
||
|
||
<listitem>
|
||
<para>Used together with <option>--merge</option>. Usually when
merging, <command>indexer</command> uses the kill-list of the source index
(i.e., the one being merged into the destination) as a filter to wipe out the
matching docs from the destination index. At the same time, the
kill-list of the destination itself isn't touched at all. When
using <option>--merge-killlists</option> (or its shorter form,
<option>--merge-klists</option>), <command>indexer</command>
will not filter the dst-index docs with the src-index kill-list, but it
will merge their kill-lists together, so the final result index
will have a kill-list containing the merged source
kill-lists.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--noprogress</option></term>
|
||
|
||
<listitem>
|
||
<para>Don't display progress details as they occur; instead, the
final status details (such as documents indexed, speed of indexing
and so on) are only reported at completion of indexing. In
instances where the script is not being run on a console (or
'tty'), this will be on by default.</para>
|
||
|
||
<para>Example usage:</para>
|
||
|
||
<programlisting>$ indexer --rotate --all --noprogress</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--print-queries</option></term>
|
||
|
||
<listitem>
|
||
<para>Prints out SQL queries that indexer sends to the database,
|
||
along with SQL connection and disconnection events. That is useful
|
||
to diagnose and fix problems with SQL sources.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--quiet</option></term>
|
||
|
||
<listitem>
|
||
<para>Tells <command>indexer</command> not to output anything,
|
||
unless there is an error. Again, most used for cron-type, or other
|
||
script jobs where the output is irrelevant or unnecessary, except
|
||
in the event of some kind of error.</para>
|
||
|
||
<para>Example usage:</para>
|
||
|
||
<programlisting>$ indexer --rotate --all --quiet</programlisting>
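<para>A scheduled job built around this might look like the sketch
below. It assumes that <command>indexer</command> exits with a nonzero
status when indexing fails; the function name and the message are
hypothetical.</para>

<programlisting># Hypothetical nightly job: rebuild every index quietly and report failure.
# Assumes indexer returns a nonzero exit status on error.
nightly_reindex() {
    config=$1
    indexer --config "$config" --rotate --all --quiet || {
        echo "indexer failed for $config"
        return 1
    }
}

# Example:
# nightly_reindex /home/myuser/sphinx.conf</programlisting>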
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--rotate</option></term>
|
||
|
||
<listitem>
|
||
<para>Used for rotating indexes. Unless you have the situation
|
||
where you can take the search function offline without troubling
|
||
users, you will almost certainly need to keep search running
|
||
whilst indexing new documents. <option>--rotate</option> creates a
|
||
second index, parallel to the first (in the same place, simply
|
||
including <filename>.new</filename> in the filenames). Once
|
||
complete, <command>indexer</command> notifies
|
||
<command>searchd</command> via sending the
|
||
<emphasis>SIGHUP</emphasis> signal, and <command>searchd</command>
|
||
will attempt to rename the indexes (renaming the existing ones to
|
||
include <filename>.old</filename> and renaming the
|
||
<filename>.new</filename> to replace them), and then start serving
|
||
from the newer files. Depending on the setting of
|
||
<option>seamless_rotate</option>, there may be a slight delay in
|
||
being able to search the newer indexes.</para>
|
||
|
||
<para>Example usage:</para>
|
||
|
||
<programlisting>$ indexer --rotate --all</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--sighup-each</option></term>
|
||
|
||
<listitem>
|
||
<para>Useful when you are rebuilding many big indexes and want
each one rotated into <command>searchd</command> as soon as
possible. With <option>--sighup-each</option>,
<command>indexer</command> will send a <emphasis>SIGHUP</emphasis>
signal to <command>searchd</command> after successfully completing
the work on each index. (The default behavior is to send a single
<emphasis>SIGHUP</emphasis> after all the indexes are
built.)</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--verbose</option></term>
|
||
|
||
<listitem>
|
||
<para>Guarantees that every row that caused problems indexing
|
||
(duplicate, zero, or missing document ID; or file field IO issues;
|
||
etc) will be reported. By default, this option is off, and problem
|
||
summaries may be reported instead.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Author</title>
|
||
|
||
<para>Andrey Aksenoff (<email>shodan@sphinxsearch.com</email>). This
manual page is written by Alexey Vinogradov
(<email>klirichek@sphinxsearch.com</email>), using the one written by
Christian Hofstaedtler (<email>ch+debian-packages@zeha.at</email>) for the <emphasis
remap="B">Debian</emphasis> system (but may be used by others).
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License, Version 2 or any later
version published by the Free Software Foundation.</para>
|
||
|
||
<para>On Debian systems, the complete text of the GNU General Public
|
||
License can be found in
|
||
<filename>/usr/share/common-licenses/GPL</filename>.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>See also</title>
|
||
|
||
<para><citerefentry>
|
||
<refentrytitle>searchd</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>search</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>indextool</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>spelldump</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry></para>
|
||
|
||
<para id="docref">Sphinx and its programs are documented fully by the
|
||
<emphasis remap="I">Sphinx reference manual</emphasis> available in
|
||
<filename>/usr/share/doc/sphinxsearch</filename>.</para>
|
||
</refsect1>
|
||
</refentry>
|
||
|
||
<refentry id="searchd">
|
||
<refmeta>
|
||
<refentrytitle>searchd</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
|
||
<refmiscinfo class="manual">Sphinxsearch</refmiscinfo>
|
||
|
||
<refmiscinfo class="version">2.0.2</refmiscinfo>
|
||
</refmeta>
|
||
|
||
<refnamediv>
|
||
<refname>searchd</refname>
|
||
|
||
<refpurpose>Sphinxsearch network daemon.</refpurpose>
|
||
</refnamediv>
|
||
|
||
<refsynopsisdiv>
|
||
<cmdsynopsis>
|
||
<command>searchd</command>
|
||
|
||
<arg choice="opt">--config <replaceable>CONFIGFILE</replaceable></arg>
|
||
|
||
<arg choice="opt">--cpustats</arg>
|
||
|
||
<arg choice="opt">--iostats</arg>
|
||
|
||
<arg choice="opt">--index <replaceable>INDEX</replaceable></arg>
|
||
|
||
<arg choice="opt">--port <replaceable>PORT</replaceable></arg>
|
||
</cmdsynopsis>
|
||
|
||
<cmdsynopsis>
|
||
<command>searchd</command>
|
||
|
||
<arg choice="plain">--status</arg>
|
||
|
||
<arg choice="opt">--config <replaceable>CONFIGFILE</replaceable></arg>
|
||
|
||
<arg choice="opt">--pidfile <replaceable>PIDFILE</replaceable></arg>
|
||
</cmdsynopsis>
|
||
|
||
<cmdsynopsis>
|
||
<command>searchd</command>
|
||
|
||
<arg choice="plain">--stop</arg>
|
||
|
||
<arg choice="opt">--config <replaceable>CONFIGFILE</replaceable></arg>
|
||
|
||
<arg choice="opt">--pidfile <replaceable>PIDFILE</replaceable></arg>
|
||
</cmdsynopsis>
|
||
</refsynopsisdiv>
|
||
|
||
<refsect1>
|
||
<title>Description</title>
|
||
|
||
<para>Sphinx is a collection of programs that aim to provide high
|
||
quality fulltext search.</para>
|
||
|
||
<para><command>searchd</command> is the second of the two principal tools
that make up Sphinx. It is the part of the system which
actually handles searches; it functions as a server and is responsible
for receiving queries, processing them and returning a dataset back to
the different APIs for client applications.</para>
|
||
|
||
<para>Unlike <command>indexer</command>, <command>searchd</command> is
not designed to be run from a regular script or called directly from the
command line, but instead either as a daemon to be called from
<emphasis>init.d</emphasis> (on Unix/Linux type systems) or
as a service (on Windows-type systems). As a result, not all of the command line
options will always apply; some are build-dependent.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Options</title>
|
||
|
||
<para>These programs follow the usual GNU command line syntax, with long
options starting with two dashes (`--').</para>
|
||
|
||
<para>The options available to searchd on all builds are:</para>
|
||
|
||
<variablelist remap="IP">
|
||
<varlistentry>
|
||
<term><option>--config</option>
|
||
<replaceable>CONFIGFILE</replaceable>, <option>-c</option>
|
||
<replaceable>CONFIGFILE</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Tell <command>searchd</command> to use the given file as its
|
||
configuration, just as with <command>indexer</command>.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--console</option></term>
|
||
|
||
<listitem>
|
||
<para>Force <command>searchd</command> into console mode;
|
||
typically it will be running as a conventional server application,
|
||
and will aim to dump information into the log files (as specified
|
||
in <filename>sphinx.conf</filename>). Sometimes though, when
|
||
debugging issues in the configuration or the daemon itself, or
|
||
trying to diagnose hard-to-track-down problems, it may be easier
|
||
to force it to dump information directly to the console/command
|
||
line from which it is being called. Running in console mode also
|
||
means that the process will not be forked (so searches are done in
|
||
sequence) and logs will not be written to. (It should be noted
|
||
that console mode is not the intended method for running
|
||
searchd.)</para>
|
||
|
||
<para>You can invoke it as such:</para>
|
||
|
||
<programlisting>$ searchd --config /home/myuser/sphinx.conf --console</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--cpustats</option></term>
|
||
|
||
<listitem>
|
||
<para>Provides an actual CPU time report (in addition to wall
time) in both the query log file (for every given query) and the status
report (aggregated). It depends on the
<emphasis>clock_gettime()</emphasis> system call and might
therefore be unavailable on certain systems.</para>
|
||
|
||
<para>You might start searchd thus:</para>
|
||
|
||
<programlisting>$ searchd --config /home/myuser/sphinx.conf --cpustats</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--help</option>, <option>-h</option>,
|
||
<option>--?</option>, <option>-?</option></term>
|
||
|
||
<listitem>
|
||
<para>List all of the parameters that can be called in your
|
||
particular build of <command>searchd</command>.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--index</option> <replaceable>INDEX</replaceable>,
|
||
<option>-i</option> <replaceable>INDEX</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Serve only the specified index. Like
|
||
<option>--port</option>, this is usually for debugging purposes;
|
||
more long-term changes would generally be applied to the
|
||
configuration file itself.</para>
|
||
|
||
<para>Usage example:</para>
|
||
|
||
<programlisting>$ searchd --index myindex</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--iostats</option></term>
|
||
|
||
<listitem>
|
||
<para>Used in conjunction with the logging options (the
|
||
<option>query_log</option> will need to have been activated in
|
||
<filename>sphinx.conf</filename>) to provide more detailed
|
||
information on a per-query basis as to the input/output operations
|
||
carried out in the course of that query, with a slight performance
|
||
hit and of course bigger logs. Further details are available under
|
||
the query log format section.</para>
|
||
|
||
<para>You might start searchd thus:</para>
|
||
|
||
<programlisting>$ searchd --config /home/myuser/sphinx.conf --iostats</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--listen</option>, <option>-l</option> <replaceable>(
|
||
address ":" port | port | path ) [ ":" protocol
|
||
]</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Works like <option>--port</option>, but allows you to specify
not only the port but the full address, as an IP address and port or a
Unix-domain socket path, that <command>searchd</command> will
listen on. In other words, you can specify either an IP address (or
hostname) and port number, or just a port number, or a Unix socket
path. If you specify a port number but not the address, searchd will
listen on all network interfaces. A Unix path is identified by a
leading slash. As the last parameter you can also specify a protocol
handler (listener) to be used for connections on this socket.
Supported protocol values are 'sphinx' (Sphinx 0.9.x API protocol)
and 'mysql41' (MySQL protocol used since 4.1 up to at least
5.1).</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--logdebug</option>, <option>--logdebugv</option>, <option>--logdebugvv</option></term>
|
||
|
||
<listitem>
|
||
<para>Enable additional debug output in the daemon log. Should
only be needed rarely, to assist with debugging issues that could
not be easily reproduced on request. <option>--logdebug</option>
causes the daemon to emit general debug messages.
<option>--logdebugv</option> and <option>--logdebugvv</option>
enable 'verbose' and 'very verbose' debug info respectively. The last could
really flood your log file.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--pidfile</option>
|
||
<replaceable>PIDFILE</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Explicitly state a PID file, where the process information
|
||
is stored regarding <command>searchd</command>, used for
|
||
inter-process communications (for example,
|
||
<command>indexer</command> will need to know the PID to contact
|
||
<command>searchd</command> for rotating indexes). Normally,
|
||
<command>searchd</command> would use a PID if running in regular
|
||
mode (i.e. not with <option>--console</option>), but it is
|
||
possible that you will be running it in console mode whilst the
|
||
index is being updated and rotated, for which a PID file will be
|
||
needed.</para>
|
||
|
||
<para>Example:</para>
|
||
|
||
<programlisting>$ searchd --config /home/myuser/sphinx.conf --pidfile /home/myuser/sphinx.pid</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--port</option> <replaceable>PORT</replaceable>,
|
||
<option>-p</option> <replaceable>PORT</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Specify the <emphasis>port</emphasis> that
<command>searchd</command> should listen on, usually for debugging
purposes. This will usually default to <option>9312</option>, but
sometimes you need to run it on a different port. Specifying it on
the command line will override anything specified in the
configuration file. The valid range is 0 to 65535, but ports
numbered below 1024 usually require a privileged account in
order to run. See also the <option>--listen</option> option, which
gives you more control over the listening address.</para>
|
||
|
||
<para>An example of usage:</para>
|
||
|
||
<programlisting>$ searchd --port 9313</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--status</option></term>
|
||
|
||
<listitem>
|
||
<para>Query the status of a running <command>searchd</command> instance,
using the connection details from the (optionally) provided
configuration file. It will try to connect to the running instance
using the first configured UNIX socket or TCP port. On success, it
will query for a number of status and performance counter values
and print them. You can use the <emphasis>Status()</emphasis> API call
to access the very same counters from your application.</para>
|
||
|
||
<para>Examples:</para>
|
||
|
||
<programlisting>$ searchd --status
|
||
$ searchd --config /home/myuser/sphinx.conf --status</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--stop</option></term>
|
||
|
||
<listitem>
|
||
<para>Asynchronously stop <command>searchd</command>, using the
|
||
details of the PID file as specified in the
|
||
<filename>sphinx.conf</filename> file, so you may also need to
|
||
confirm to <command>searchd</command> which configuration file to
|
||
use with the <option>--config</option> option. NB, calling
|
||
<option>--stop</option> will also make sure any changes applied to
|
||
the indexes with <emphasis>UpdateAttributes()</emphasis> will be
|
||
applied to the index files themselves.</para>
|
||
|
||
<para>Example:</para>
|
||
|
||
<programlisting>$ searchd --config /home/myuser/sphinx.conf --stop</programlisting>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--stopwait</option></term>
|
||
|
||
<listitem>
|
||
<para>Synchronously stop <command>searchd</command>.
<option>--stop</option> essentially tells the running instance to
exit (by sending it a <emphasis>SIGTERM</emphasis>) and then
immediately returns. <option>--stopwait</option> will also attempt
to wait until the running <command>searchd</command> instance
actually finishes the shutdown (e.g. saves all the pending
attribute changes) and exits.</para>
|
||
|
||
<para>Example:</para>
|
||
|
||
<programlisting>$ searchd --config /home/myuser/sphinx.conf --stopwait</programlisting>
|
||
|
||
<para>Possible exit codes are as follows:</para>
|
||
|
||
<itemizedlist>
|
||
<listitem>
|
||
<para>0 on success;</para>
|
||
</listitem>
|
||
|
||
<listitem>
|
||
<para>1 if connection to running <command>searchd</command>
|
||
daemon failed;</para>
|
||
</listitem>
|
||
|
||
<listitem>
|
||
<para>2 if daemon reported an error during shutdown;</para>
|
||
</listitem>
|
||
|
||
<listitem>
|
||
<para>3 if daemon crashed during shutdown.</para>
|
||
</listitem>
|
||
</itemizedlist>
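<para>A calling script can branch on these codes. The sketch below maps
each documented exit status to a message; the function name and the
wording of the messages are hypothetical.</para>

<programlisting># Hypothetical helper: translate the exit status of "searchd --stopwait"
# into a message, using the codes listed above.
describe_stopwait_status() {
    case "$1" in
        0) echo "searchd stopped cleanly" ;;
        1) echo "could not connect to the running searchd" ;;
        2) echo "searchd reported an error during shutdown" ;;
        3) echo "searchd crashed during shutdown" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}

# Example:
# searchd --config /home/myuser/sphinx.conf --stopwait
# describe_stopwait_status $?</programlisting>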
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--strip-path</option></term>
|
||
|
||
<listitem>
|
||
<para>Strip the path names from all the file names referenced from
|
||
the index (<emphasis>stopwords</emphasis>,
|
||
<emphasis>wordforms</emphasis>, <emphasis>exceptions</emphasis>,
|
||
etc). This is useful for picking up indexes built on another
|
||
machine with possibly different path layouts.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Signals</title>
|
||
|
||
<para>Last but not least, like any other daemon,
<command>searchd</command> supports a number of signals.</para>
|
||
|
||
<para><variablelist>
|
||
<varlistentry>
|
||
<term>SIGTERM</term>
|
||
|
||
<listitem>
|
||
<para>Initiates a clean shutdown. New queries will not be
|
||
handled; but queries that are already started will not be
|
||
forcibly interrupted.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term>SIGHUP</term>
|
||
|
||
<listitem>
|
||
<para>Initiates index rotation. Depending on the value of the
<option>seamless_rotate</option> setting, new queries might be
briefly stalled; clients will receive temporary errors.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term>SIGUSR1</term>
|
||
|
||
<listitem>
|
||
<para>Forces reopen of searchd log and query log files, letting
|
||
you implement log file rotation.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist></para>
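<para>A minimal sketch of sending these signals from a shell. The PID
file and log file paths are assumptions for illustration; use the
locations given by the <option>pid_file</option> and log settings in
your own configuration:</para>

<programlisting># Assumed PID file location; adjust to your pid_file setting.
PID=$(cat /var/run/searchd.pid)

# Ask searchd to rotate its indexes.
kill -HUP "$PID"

# Move the current log aside (hypothetical path), then make searchd reopen it.
mv /var/log/searchd.log /var/log/searchd.log.1
kill -USR1 "$PID"

# Request a clean shutdown.
kill -TERM "$PID"</programlisting>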
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Author</title>
|
||
|
||
<para>Andrey Aksenoff (<email>shodan@sphinxsearch.com</email>). This
manual page is written by Alexey Vinogradov
(<email>klirichek@sphinxsearch.com</email>), using the one written by
Christian Hofstaedtler (<email>ch+debian-packages@zeha.at</email>) for the
<emphasis remap="B">Debian</emphasis> system (but may be used by others).
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License, Version 2 or any later
version published by the Free Software Foundation.</para>
|
||
|
||
<para>On Debian systems, the complete text of the GNU General Public
|
||
License can be found in
|
||
<filename>/usr/share/common-licenses/GPL</filename>.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>See also</title>
|
||
|
||
<para><citerefentry>
|
||
<refentrytitle>indexer</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>search</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>indextool</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry></para>
|
||
|
||
<para>Sphinx and its programs are documented fully by the <emphasis
|
||
remap="I">Sphinx reference manual</emphasis> available in
|
||
<filename>/usr/share/doc/sphinxsearch</filename>.</para>
|
||
</refsect1>
|
||
</refentry>
|
||
|
||
<refentry id="search">
|
||
<refmeta>
|
||
<refentrytitle>search</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
|
||
<refmiscinfo class="manual">Sphinxsearch</refmiscinfo>
|
||
|
||
<refmiscinfo class="version">2.0.2</refmiscinfo>
|
||
</refmeta>
|
||
|
||
<refnamediv>
|
||
<refname>search</refname>
|
||
|
||
<refpurpose>Sphinxsearch command-line index query</refpurpose>
|
||
</refnamediv>
|
||
|
||
<refsynopsisdiv>
|
||
<cmdsynopsis>
|
||
<command>search</command>
|
||
|
||
<arg choice="opt">OPTIONS</arg>
|
||
|
||
<arg choice="plain" rep="norepeat">word1</arg>
|
||
|
||
<group choice="opt">
|
||
<arg choice="plain" rep="norepeat">word2 <arg>word3
|
||
<arg>...</arg></arg></arg>
|
||
</group>
|
||
</cmdsynopsis>
|
||
</refsynopsisdiv>
|
||
|
||
<refsect1>
|
||
<title>Description</title>
|
||
|
||
<para>Sphinx is a collection of programs that aim to provide high
|
||
quality fulltext search.</para>
|
||
|
||
<para><command>search</command> is one of the helper tools within the
Sphinx package. Whereas <command>searchd</command> is responsible for
searches in a server-type environment, <command>search</command> is
aimed at testing the index quickly from the command line, without
building a framework to make the connection to the server and process
its response.</para>
|
||
|
||
<para>Note: <command>search</command> is not intended to be deployed as
|
||
part of a client application; it is strongly recommended you do not
|
||
write an interface to <command>search</command> instead of
|
||
<command>searchd</command>, and none of the bundled client APIs support
|
||
this method. (In any event, <command>search</command> will reload files
|
||
each time, whereas <command>searchd</command> will cache them in memory
|
||
for performance.)</para>
|
||
|
||
<para>That said, many types of query that you could build in the APIs
can also be made with <command>search</command>. For very complex
searches, however, it may be easier to construct them using a small
script and the corresponding API. Additionally, some newer features may
be available in the <command>searchd</command> system that have not yet
been brought into <command>search</command>.</para>
|
||
|
||
<para>When calling <command>search</command>, it is not necessary to
|
||
have <command>searchd</command> running; simply make sure that the
|
||
account running the <command>search</command> program has read access to
|
||
the configuration file and the index files.</para>
|
||
|
||
<para>The default behaviour is to search for
<emphasis>word1</emphasis> (AND <emphasis>word2</emphasis> AND
<emphasis>word3</emphasis>... as specified) in all fields of all indexes
given in the configuration file. In the API, this would be equivalent to
passing <option>SPH_MATCH_ALL</option> to
<command>SetMatchMode</command>, and specifying * as the indexes to
query as part of Query.</para>
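<para>For instance, assuming a configuration file at
<filename>/home/myuser/sphinx.conf</filename> (an illustrative path), the
following invocation would look for documents containing both words in
all indexes:</para>

<programlisting>$ search --config /home/myuser/sphinx.conf word1 word2</programlisting>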
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Options</title>
|
||
|
||
<para>There are many options available to
|
||
<command>search</command>.</para>
|
||
|
||
<para>Firstly, the general options:</para>
|
||
|
||
<variablelist remap="IP">
|
||
<varlistentry>
|
||
<term><option>--config</option> <replaceable>CONFIGFILE</replaceable>,
|
||
<option>-c</option> <replaceable>CONFIGFILE</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Use the given file as its configuration, just as with
|
||
<command>indexer</command>.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--index</option> <replaceable>INDEX</replaceable>,
|
||
<option>-i</option> <replaceable>INDEX</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Limit searching to the specified index only; normally
|
||
<command>search</command> would attempt to search all of the
|
||
physical indexes listed in <filename>sphinx.conf</filename>, not
|
||
any distributed ones.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--stdin</option></term>
|
||
|
||
<listitem>
|
||
<para>Accept the query from the standard input, rather than the
|
||
command line. This can be useful for testing purposes whereby you
|
||
could feed input via pipes and from scripts.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
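<para>As a sketch of the general options, assuming an index named
<emphasis>test1</emphasis> and the configuration path shown (both
hypothetical), the first command restricts the search to one index and
the second reads the query from standard input:</para>

<programlisting>$ search --config /home/myuser/sphinx.conf --index test1 word1 word2
$ echo "word1 word2" | search --config /home/myuser/sphinx.conf --stdin</programlisting>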
|
||
|
||
<para>Options for setting matches:</para>
|
||
|
||
<variablelist>
|
||
<varlistentry>
|
||
<term><option>--any</option>, <option>-a</option></term>
|
||
|
||
<listitem>
|
||
<para>Changes the matching mode to match any of the words as part
|
||
of the query (word1 OR word2 OR word3). In the API this would be
|
||
equivalent to passing <option>SPH_MATCH_ANY</option> to
|
||
<command>SetMatchMode</command>.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--phrase</option>, <option>-p</option></term>
|
||
|
||
<listitem>
|
||
<para>Changes the matching mode to match all of the words as part
|
||
of the query, and do so in the phrase given (not including
|
||
punctuation). In the API this would be equivalent to passing
|
||
<option>SPH_MATCH_PHRASE</option> to
|
||
<command>SetMatchMode</command>.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--boolean</option>, <option>-b</option></term>
|
||
|
||
<listitem>
|
||
<para>Changes the matching mode to <emphasis>Boolean
matching</emphasis>. Note that when using Boolean syntax on the
command line, you may need to escape the operator symbols (with a
backslash) so that the shell does not interpret them; for example, on
a Unix/Linux system an unescaped ampersand would send the
<command>search</command> process to the background. Alternatively,
pass the query with <option>--stdin</option>, described above. In
the API this would be equivalent to passing
<option>SPH_MATCH_BOOLEAN</option> to
<command>SetMatchMode</command>.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--ext</option>, <option>-e</option></term>
|
||
|
||
<listitem>
|
||
<para>Changes the matching mode to <emphasis>Extended
|
||
matching</emphasis>. In the API this would be equivalent to
|
||
passing <option>SPH_MATCH_EXTENDED</option> to
|
||
<command>SetMatchMode</command>, and it should be noted that use
|
||
of this mode is being discouraged in favour of Extended2,
|
||
below.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--ext2</option>, <option>-e2</option></term>
|
||
|
||
<listitem>
|
||
<para>Changes the matching mode to <emphasis>Extended matching,
|
||
version 2</emphasis>. In the API this would be equivalent to
|
||
passing <option>SPH_MATCH_EXTENDED2</option> to
|
||
<command>SetMatchMode</command>, and it should be noted that use
|
||
of this mode is being recommended in favour of Extended, due to
|
||
being more efficient and providing other features.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--filter</option> <replaceable>&lt;attr&gt;</replaceable> <replaceable>&lt;v&gt;</replaceable>,
<option>-f</option> <replaceable>&lt;attr&gt;</replaceable> <replaceable>&lt;v&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>Filters the results so that only documents where the given
attribute (attr) matches the given value (v) are returned. For example,
<option>--filter</option> <replaceable>deleted</replaceable>
<replaceable>0</replaceable> only matches documents with an
attribute called 'deleted' whose value is 0. You can also add
multiple filters on the command line by specifying
<option>--filter</option> multiple times; however, if you apply a
second filter to an attribute, it will override the first defined
filter.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
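<para>The matching options above might be used as follows. This is only
a sketch: the configuration path is illustrative, and
<emphasis>deleted</emphasis> is the example attribute from the
<option>--filter</option> description. The Boolean query is quoted so
that the shell does not interpret the ampersand:</para>

<programlisting>$ CONF=/home/myuser/sphinx.conf
$ search --config "$CONF" --any word1 word2
$ search --config "$CONF" --phrase word1 word2
$ search --config "$CONF" --boolean "word1 &amp; word2"
$ search --config "$CONF" --filter deleted 0 word1</programlisting>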
|
||
|
||
<para>Options for handling the results:</para>
|
||
|
||
<variablelist>
|
||
<varlistentry>
|
||
<term><option>--limit</option> <replaceable>&lt;count&gt;</replaceable>,
<option>-l</option> <replaceable>&lt;count&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>limits the total number of matches returned to the number given.
If a 'group' is specified, this will be the number of grouped
results. This defaults to 20 results if not specified (as do the
APIs).</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--offset</option> <replaceable>&lt;count&gt;</replaceable>,
<option>-o</option> <replaceable>&lt;count&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>offsets the result list by the number of places set by the
|
||
count; this would be used for pagination through results, where if
|
||
you have 20 results per 'page', the second page would begin at
|
||
offset 20, the third page at offset 40, etc.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--group</option> <replaceable>&lt;attr&gt;</replaceable>,
<option>-g</option> <replaceable>&lt;attr&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>specifies that results should be grouped together based on
|
||
the attribute specified. Like the GROUP BY clause in SQL, it will
|
||
combine all results where the attribute given matches, and return
|
||
a set of results where each returned result is the best from each
|
||
group. Unless otherwise specified, this will be the best match on
|
||
relevance.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--groupsort</option> <replaceable>&lt;expr&gt;</replaceable>,
<option>-gs</option> <replaceable>&lt;expr&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>instructs that when results are grouped with
<option>--group</option>, the expression given in
<replaceable>&lt;expr&gt;</replaceable> shall determine the order
of the groups. Note, this does not specify which is the best item
within the group, only the order in which the groups themselves
shall be returned.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--sortby</option> <replaceable>&lt;clause&gt;</replaceable>,
<option>-s</option> <replaceable>&lt;clause&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>specifies that results should be sorted in the order listed
|
||
in <replaceable>&lt;clause&gt;</replaceable>. This allows you to
|
||
specify the order you wish results to be presented in, ordering by
|
||
different columns. For example, you could say
|
||
<option>--sortby</option> <replaceable>"@weight DESC entrytime
|
||
DESC"</replaceable> to sort entries first by weight (or relevance)
|
||
and where two or more entries have the same weight, to then sort
|
||
by the time with the highest time (newest) first. You will usually
|
||
need to put the items in quotes (<option>--sortby</option>
|
||
<replaceable>"@weight DESC"</replaceable>) or use commas
|
||
(<option>--sortby</option>
|
||
<replaceable>@weight,DESC</replaceable>) to avoid the items being
|
||
treated separately. Additionally, like the regular sorting modes,
|
||
if <option>--group</option> (grouping) is being used, this will
|
||
state how to establish the best match within each group.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--sortexpr</option> <replaceable>&lt;expr&gt;</replaceable>,
<option>-S</option> <replaceable>&lt;expr&gt;</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>specifies that the search results should be presented in an
|
||
order determined by an arithmetic expression, stated in expr. For
|
||
example: <option>--sortexpr</option> <replaceable>"@weight + (
|
||
user_karma + ln(pageviews) )*0.1"</replaceable> (again noting that
|
||
this will have to be quoted to avoid the shell dealing with the
|
||
asterisk). Extended sort mode is discussed in more detail under
|
||
the <option>SPH_SORT_EXTENDED</option> entry under the
|
||
<emphasis>Sorting modes</emphasis> section of the manual.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--sort=date</option></term>
|
||
|
||
<listitem>
|
||
<para>specifies that the results should be sorted by descending
|
||
(i.e. most recent first) date. This requires that there is an
|
||
attribute in the index that is set as a timestamp.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--rsort=date</option></term>
|
||
|
||
<listitem>
|
||
<para>specifies that the results should be sorted by ascending
|
||
(i.e. oldest first) date. This requires that there is an attribute
|
||
in the index that is set as a timestamp.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--sort=ts</option></term>
|
||
|
||
<listitem>
|
||
<para>specifies that the results should be sorted by timestamp in
|
||
groups; it will return all of the documents whose timestamp is
|
||
within the last hour, then sorted within that bracket for
|
||
relevance. After, it would return the documents from the last day,
|
||
sorted by relevance, then the last week and then the last month.
|
||
It is discussed in more detail under the
|
||
<option>SPH_SORT_TIME_SEGMENTS</option> entry under the
|
||
<emphasis>Sorting modes</emphasis> section of the manual.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
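<para>A short sketch of the result-handling options. The configuration
path and the attribute name <emphasis>author_id</emphasis> are
hypothetical. The first command fetches the second page of 20 results;
the second groups results by an attribute and picks the best match in
each group by weight; the third sorts by descending date:</para>

<programlisting>$ CONF=/home/myuser/sphinx.conf
$ search --config "$CONF" --limit 20 --offset 20 word1
$ search --config "$CONF" --group author_id --sortby "@weight DESC" word1
$ search --config "$CONF" --sort=date word1</programlisting>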
|
||
|
||
<para>Other options:</para>
|
||
|
||
<variablelist>
|
||
<varlistentry>
|
||
<term><option>--noinfo</option>, <option>-q</option></term>
|
||
|
||
<listitem>
|
||
<para>instructs <command>search</command> not to look-up data in
|
||
your SQL database. Specifically, for debugging with MySQL and
|
||
<command>search</command>, you can provide it with a query to look
|
||
up the full article based on the returned document ID. It is
|
||
explained in more detail under the <option>sql_query_info</option>
|
||
directive.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Author</title>
|
||
|
||
<para>Andrey Aksenoff (<email>shodan@sphinxsearch.com</email>). This
|
||
manual page is written by Alexey Vinogradov
|
||
(<email>klirichek@sphinxsearch.com</email>). Permission is granted to
|
||
copy, distribute and/or modify this document under the terms of the GNU
|
||
General Public License, Version 2 or any later version published by the
|
||
Free Software Foundation.</para>
|
||
|
||
<para>On Debian systems, the complete text of the GNU General Public
|
||
License can be found in
|
||
<filename>/usr/share/common-licenses/GPL</filename>.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>See also</title>
|
||
|
||
<para><citerefentry>
|
||
<refentrytitle>indexer</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>searchd</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>indextool</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry></para>
|
||
|
||
<para>Sphinx and its programs are documented fully by the <emphasis
|
||
remap="I">Sphinx reference manual</emphasis> available in
|
||
<filename>/usr/share/doc/sphinxsearch</filename>.</para>
|
||
</refsect1>
|
||
</refentry>
|
||
|
||
<refentry id="spelldump">
|
||
<refmeta>
|
||
<refentrytitle>spelldump</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
|
||
<refmiscinfo class="manual">Sphinxsearch</refmiscinfo>
|
||
|
||
<refmiscinfo class="version">2.0.2</refmiscinfo>
|
||
</refmeta>
|
||
|
||
<refnamediv>
|
||
<refname>spelldump</refname>
|
||
|
||
<refpurpose>Sphinxsearch tool for extracting the contents of a dictionary
file.</refpurpose>
|
||
</refnamediv>
|
||
|
||
<refsynopsisdiv>
|
||
<cmdsynopsis>
|
||
<command>spelldump</command>
|
||
|
||
<arg choice="opt">OPTIONS</arg>
|
||
|
||
<arg choice="plain">dictionary</arg>
|
||
|
||
<arg choice="plain">affix</arg>
|
||
|
||
<arg>result</arg>
|
||
|
||
<arg>locale-name</arg>
|
||
</cmdsynopsis>
|
||
</refsynopsisdiv>
|
||
|
||
<refsect1>
|
||
<title>Description</title>
|
||
|
||
<para>Sphinx is a collection of programs that aim to provide high
|
||
quality fulltext search.</para>
|
||
|
||
<para>spelldump is used to extract the contents of a dictionary file
|
||
that uses ispell or MySpell format, which can help build word lists for
|
||
wordforms - all of the possible forms are pre-built for you.</para>
|
||
|
||
<para>The two main parameters are the dictionary's main file and its
|
||
affix file; usually these are named as
|
||
<filename>[language-prefix].dict</filename> and
|
||
<filename>[language-prefix].aff</filename> and will be available with
|
||
most common Linux distributions, as well as various places online.
|
||
<option>[result]</option> specifies where the dictionary data should be
|
||
output to, and <option>[locale-name]</option> additionally specifies the
|
||
locale details you wish to use.</para>
|
||
|
||
<para>Examples of its usage are:</para>
|
||
|
||
<para><programlisting>spelldump en.dict en.aff
spelldump ru.dict ru.aff ru.txt ru_RU.CP1251
spelldump ru.dict ru.aff ru.txt .1251</programlisting></para>
|
||
|
||
<para>The results file will contain a list of all the words in the
|
||
dictionary in alphabetical order, output in the format of a wordforms
|
||
file, which you can use to customise for your specific
|
||
circumstances.</para>
|
||
|
||
<para>An example of the result file:</para>
|
||
|
||
<para><programlisting>zone &gt; zone
zoned &gt; zoned
zoning &gt; zoning</programlisting></para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Options</title>
|
||
|
||
<variablelist remap="IP">
|
||
<varlistentry>
|
||
<term><option>-c</option> <replaceable>[FILE]</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>specifies a file for case conversion details.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Author</title>
|
||
|
||
<para>Andrey Aksenoff (<email>shodan@sphinxsearch.com</email>). This
|
||
manual page is written by Alexey Vinogradov
|
||
(<email>klirichek@sphinxsearch.com</email>). Permission is granted to
|
||
copy, distribute and/or modify this document under the terms of the GNU
|
||
General Public License, Version 2 or any later version published by the
|
||
Free Software Foundation.</para>
|
||
|
||
<para>On Debian systems, the complete text of the GNU General Public
|
||
License can be found in
|
||
<filename>/usr/share/common-licenses/GPL</filename>.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>See also</title>
|
||
|
||
<para><citerefentry>
|
||
<refentrytitle>indexer</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>indextool</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>.</para>
|
||
|
||
<para>Sphinx and its programs are documented fully by the <emphasis
|
||
remap="I">Sphinx reference manual</emphasis> available in
|
||
<filename>/usr/share/doc/sphinxsearch</filename>.</para>
|
||
</refsect1>
|
||
</refentry>
|
||
|
||
<refentry id="indextool">
|
||
<refmeta>
|
||
<refentrytitle>indextool</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
|
||
<refmiscinfo class="manual">Sphinxsearch</refmiscinfo>
|
||
|
||
<refmiscinfo class="version">2.0.2</refmiscinfo>
|
||
</refmeta>
|
||
|
||
<refnamediv>
|
||
<refname>indextool</refname>
|
||
|
||
<refpurpose>Sphinxsearch tool to dump miscellaneous debug information about
the physical index.</refpurpose>
|
||
</refnamediv>
|
||
|
||
<refsynopsisdiv>
|
||
<cmdsynopsis>
|
||
<command>indextool</command>
|
||
|
||
<arg choice="req">command</arg>
|
||
|
||
<arg>options</arg>
|
||
</cmdsynopsis>
|
||
</refsynopsisdiv>
|
||
|
||
<refsect1>
|
||
<title>Description</title>
|
||
|
||
<para>Sphinx is a collection of programs that aim to provide high
|
||
quality fulltext search.</para>
|
||
|
||
<para><command>indextool</command> is one of the helper tools within the
Sphinx package. It is used to dump miscellaneous debug information about
the physical index. Apart from dumping, <command>indextool</command> can
also perform index verification, hence the indextool name rather than just
indexdump.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Commands</title>
|
||
|
||
<para>The commands are as follows:</para>
|
||
|
||
<variablelist>
|
||
<varlistentry>
|
||
<term><option>--dumpheader</option>
|
||
<replaceable>FILENAME.sph</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>quickly dumps the provided index header file without
|
||
touching any other index files or even the configuration file. The
|
||
report provides a breakdown of all the index settings, in
|
||
particular the entire attribute and field list. Prior to
|
||
0.9.9-rc2, this command was present in the CLI <command>search</command> utility.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--dumpconfig</option>
|
||
<replaceable>FILENAME.sph</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>dumps the index definition from the given index header file
|
||
in (almost) compliant <filename>sphinx.conf</filename> file
|
||
format.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--dumpheader</option>
|
||
<replaceable>INDEXNAME</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>dumps the index header by index name, looking up the header
path in the configuration file.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--dumpdocids</option>
|
||
<replaceable>INDEXNAME</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>dumps document IDs by index name. It takes the data from
|
||
attribute (.spa) file and therefore requires
|
||
<option>docinfo=extern</option> to work.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--dumphitlist</option>
|
||
<replaceable>INDEXNAME</replaceable>
|
||
<replaceable>KEYWORD</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>dumps all the hits (occurrences) of a given keyword in a
|
||
given index, with keyword specified as text.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--dumphitlist</option>
|
||
<replaceable>INDEXNAME</replaceable> <option>--wordid</option>
|
||
<replaceable>ID</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>dumps all the hits (occurrences) of a given keyword in a
|
||
given index, with keyword specified as internal numeric ID.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--htmlstrip</option> INDEXNAME</term>
|
||
|
||
<listitem>
|
||
<para>filters stdin using HTML stripper settings for a given
|
||
index, and prints the filtering results to stdout. Note that the
|
||
settings will be taken from <filename>sphinx.conf</filename>, and
|
||
not the index header.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--check</option>
|
||
<replaceable>INDEXNAME</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>checks the index data files for consistency errors that
might be introduced by bugs in <command>indexer</command>,
by hardware faults, or both.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
|
||
<varlistentry>
|
||
<term><option>--strip-path</option></term>
|
||
|
||
<listitem>
|
||
<para>strips the path names from all the file names referenced
|
||
from the index (stopwords, wordforms, exceptions, etc). This is
|
||
useful for checking indexes built on another machine with possibly
|
||
different path layouts.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
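<para>A few illustrative invocations of the commands above. The index
name <emphasis>test1</emphasis> and all paths are hypothetical:</para>

<programlisting>$ indextool --dumpheader /var/data/test1.sph
$ indextool --check test1 --config /home/myuser/sphinx.conf
$ echo "&lt;b&gt;hello&lt;/b&gt; world" | indextool --htmlstrip test1 --config /home/myuser/sphinx.conf</programlisting>

<para>The first form reads only the header file, so it works without a
configuration file; the other two look up <emphasis>test1</emphasis> in
the configuration file.</para>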
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Options</title>
|
||
|
||
<para>The only currently available option applies to all commands and
|
||
lets you specify the configuration file:</para>
|
||
|
||
<variablelist remap="IP">
|
||
<varlistentry>
|
||
<term><option>--config</option> <replaceable>CONFIGFILE</replaceable>,
|
||
<option>-c</option> <replaceable>CONFIGFILE</replaceable></term>
|
||
|
||
<listitem>
|
||
<para>overrides the built-in config file names.</para>
|
||
</listitem>
|
||
</varlistentry>
|
||
</variablelist>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>Author</title>
|
||
|
||
<para>Andrey Aksenoff (<email>shodan@sphinxsearch.com</email>). This
|
||
manual page is written by Alexey Vinogradov
|
||
(<email>klirichek@sphinxsearch.com</email>). Permission is granted to
|
||
copy, distribute and/or modify this document under the terms of the GNU
|
||
General Public License, Version 2 or any later version published by the
|
||
Free Software Foundation.</para>
|
||
|
||
<para>On Debian systems, the complete text of the GNU General Public
|
||
License can be found in
|
||
<filename>/usr/share/common-licenses/GPL</filename>.</para>
|
||
</refsect1>
|
||
|
||
<refsect1>
|
||
<title>See also</title>
|
||
|
||
<para><citerefentry>
|
||
<refentrytitle>indexer</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>searchd</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry>, <citerefentry>
|
||
<refentrytitle>search</refentrytitle>
|
||
|
||
<manvolnum>1</manvolnum>
|
||
</citerefentry></para>
|
||
|
||
<para>Sphinx and its programs are documented fully by the <emphasis
|
||
remap="I">Sphinx reference manual</emphasis> available in
|
||
<filename>/usr/share/doc/sphinxsearch</filename>.</para>
|
||
</refsect1>
|
||
</refentry>
|
||
</appendix>