A hand-written query parser built on modular plug-ins. The default configuration implements a powerful fielded query language similar to Lucene’s.
You can use the plugins argument when creating the object to override the default list of plug-ins, or call add_plugin() and remove_plugin_class() afterwards to change the plug-ins included in the parser.
>>> from whoosh import qparser
>>> parser = qparser.QueryParser("content", schema)
>>> parser.remove_plugin_class(qparser.WildcardPlugin)
>>> parser.add_plugin(qparser.PrefixPlugin())
>>> parser.parse(u"hello there")
And([Term("content", u"hello"), Term("content", u"there")])
Returns a query for multiple texts. This method implements the intention specified in the field’s multitoken_query attribute, which specifies what to do when strings that look like single terms to the parser turn out to yield multiple tokens when analyzed.
Parses the input string and returns a whoosh.query.Query object/tree.
Returns a group of syntax nodes corresponding to the given text, tagged by the plugin Taggers and filtered by the plugin filters.
Removes any plugins of the class of the given plugin and then adds it. This is a convenience method to keep from having to call remove_plugin_class followed by add_plugin each time you want to reconfigure a default plugin.
>>> qp = qparser.QueryParser("content", schema)
>>> qp.replace_plugin(qparser.NotPlugin("(^| )-"))
Returns a group of syntax nodes corresponding to the given text, created by matching the Taggers provided by the parser’s plugins.
The following functions return pre-configured QueryParser objects.
Returns a QueryParser configured to search in multiple fields.
Instead of assigning unfielded clauses to a default field, this parser transforms them into an OR clause that searches a list of fields. For example, if the list of multi-fields is “f1”, “f2” and the query string is “hello there”, the parser will interpret it as “(f1:hello OR f2:hello) (f1:there OR f2:there)”. This is useful when you have two textual fields (e.g. “title” and “content”) that you want to search by default.
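The expansion of unfielded clauses can be modeled without Whoosh. The following is a minimal standalone sketch: `expand_multifield` is a hypothetical helper, not part of whoosh.qparser, and it assumes the query consists only of whitespace-separated bare words and `field:term` clauses.

```python
def expand_multifield(query_string, fieldnames):
    """Rewrite each unfielded word as an OR across fieldnames (toy model)."""
    clauses = []
    for word in query_string.split():
        if ":" in word:
            # Already fielded: leave the clause untouched.
            clauses.append(word)
        else:
            alternatives = " OR ".join(name + ":" + word for name in fieldnames)
            clauses.append("(" + alternatives + ")")
    return " ".join(clauses)


print(expand_multifield("hello there", ["f1", "f2"]))
# prints: (f1:hello OR f2:hello) (f1:there OR f2:there)
```

The real parser works on syntax nodes rather than strings, so this only illustrates the shape of the rewritten query.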
Returns a QueryParser configured to support only +, -, and phrase syntax, and which converts individual terms into DisjunctionMax queries across a set of fields.
Parameters: fieldboosts – a dictionary mapping field names to boosts.
Base class for parser plugins.
Should return a list of (filter_function, priority) tuples to add to the parser. Filters with lower priority numbers run first.
Filter functions will be called with (parser, groupnode) and should return a group node.
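As an illustration of the ordering rule, here is a small standalone sketch. It does not use Whoosh: `run_filters`, `drop_empty`, and `lowercase` are hypothetical names, and the group node is modeled as a plain list of strings.

```python
def run_filters(filters, parser, group):
    """Apply (filter_function, priority) pairs in ascending priority order."""
    for filter_function, _priority in sorted(filters, key=lambda pair: pair[1]):
        # Each filter receives (parser, group) and returns a group.
        group = filter_function(parser, group)
    return group


def drop_empty(parser, group):
    # Remove empty nodes from the group.
    return [node for node in group if node]


def lowercase(parser, group):
    # Normalize the text of every node.
    return [node.lower() for node in group]


# drop_empty has the lower priority number, so it runs before lowercase.
filters = [(lowercase, 200), (drop_empty, 100)]
result = run_filters(filters, parser=None, group=["Hello", "", "There"])
print(result)
```

The priority numbers 100 and 200 are arbitrary values chosen for the example.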
Adds the ability to specify prefix queries by ending a term with an asterisk.
This plugin is useful if you want the user to be able to create prefix but not wildcard queries (for performance reasons). If you are including the wildcard plugin, you should not include this plugin as well.
>>> qp = qparser.QueryParser("content", myschema)
>>> qp.remove_plugin_class(qparser.WildcardPlugin)
>>> qp.add_plugin(qparser.PrefixPlugin())
>>> q = qp.parse("pre*")
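One way to see why prefix queries can be cheaper than general wildcard queries: in a sorted list of terms, every term sharing a prefix sits in one contiguous run, while a pattern such as `*re*` may match anywhere. The sketch below is a generic illustration using only the standard library; it is an assumption about the general technique, not a description of how Whoosh expands these queries, and both helper names are hypothetical.

```python
import bisect
import fnmatch

terms = sorted(["prank", "pre", "prefix", "present", "print", "spree"])


def prefix_matches(sorted_terms, prefix):
    """Jump to the first candidate, then stop at the first non-match."""
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


def wildcard_matches(all_terms, pattern):
    """A leading wildcard forces a check of every term."""
    return [term for term in all_terms if fnmatch.fnmatchcase(term, pattern)]


print(prefix_matches(terms, "pre"))
print(wildcard_matches(terms, "*re*"))
```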
Adds the ability to specify regular expression term queries.
The default syntax for a regular expression term is r"termexpr".
>>> qp = qparser.QueryParser("content", myschema)
>>> qp.add_plugin(qparser.RegexPlugin())
>>> q = qp.parse('foo title:r"bar+"')
Adds the ability to boost clauses of the query using the circumflex.
>>> qp = qparser.QueryParser("content", myschema)
>>> q = qp.parse("hello there^2")
Adds the ability to specify the field of a clause.
By default, adds the AND, OR, ANDNOT, ANDMAYBE, and NOT operators to the parser syntax. This plugin scans the token stream for subclasses of Operator and calls their Operator.make_group() methods to allow them to manipulate the stream.
There are two levels of configuration available.
The first level is to change the regular expressions of the default operators, using the And, Or, AndNot, AndMaybe, and/or Not keyword arguments. The keyword value can be a pattern string or a compiled expression, or None to remove the operator:
qp = qparser.QueryParser("content", schema)
cp = qparser.OperatorsPlugin(And="&", Or=r"\|", AndNot="&!",
                             AndMaybe="&~", Not=None)
qp.replace_plugin(cp)
You can also specify a list of (OpTagger, priority) pairs as the first argument to the initializer to use custom operators. See Creating custom operators for more information on this.
Adds the ability to use + and - in a flat OR query to specify required and prohibited terms.
This is the basis for the parser configuration returned by SimpleParser().
Allows the user to use greater than/less than symbols to create range queries:
a:>100 b:<=z c:>=-1.4 d:<mz
This is the equivalent of:
a:{100 to] b:[to z] c:[-1.4 to] d:[to mz}
The plugin recognizes >, <, >=, <=, =>, and =< after a field specifier. The field specifier is required. You cannot do the following:
>100
This plugin requires the FieldsPlugin and RangePlugin to work.
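The mapping from comparison shortcuts to range syntax can be sketched as a plain string rewrite. This is a standalone model built on the standard `re` module, not the plugin's implementation; `expand_range_shortcuts` is a hypothetical helper, and it assumes field names are word characters and values contain no whitespace.

```python
import re

# Two-character operators are listed first so ">=" is not read as ">".
_SHORTCUT = re.compile(r"(\w+):(>=|=>|<=|=<|>|<)(\S+)")


def expand_range_shortcuts(query_string):
    """Rewrite field:>value style clauses into bracketed range syntax."""
    def replace(match):
        field, op, value = match.groups()
        if op == ">":
            return field + ":{" + value + " to]"   # exclusive lower bound
        if op in (">=", "=>"):
            return field + ":[" + value + " to]"   # inclusive lower bound
        if op == "<":
            return field + ":[to " + value + "}"   # exclusive upper bound
        return field + ":[to " + value + "]"       # inclusive upper bound

    return _SHORTCUT.sub(replace, query_string)


print(expand_range_shortcuts("a:>100 b:<=z c:>=-1.4 d:<mz"))
# prints: a:{100 to] b:[to z] c:[-1.4 to] d:[to mz}
```

Because the pattern requires a field name before the colon, an unfielded clause such as `>100` is left unchanged, mirroring the restriction described above.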
Converts any unfielded terms into OR clauses that search for the term in a specified list of fields.
>>> qp = qparser.QueryParser(None, myschema)
>>> qp.add_plugin(qparser.MultifieldPlugin(["a", "b"]))
>>> qp.parse("alfa c:bravo")
And([Or([Term("a", "alfa"), Term("b", "alfa")]), Term("c", "bravo")])
This plugin is the basis for the MultifieldParser.
Adds the ability to use “aliases” of fields in the query string.
This plugin is useful for allowing users of languages that can’t be represented in ASCII to use field names in their own language, and translate them into the “real” field names, which must be valid Python identifiers.
>>> # Allow users to use 'body' or 'text' to refer to the 'content' field
>>> parser.add_plugin(qparser.FieldAliasPlugin({"content": ["body", "text"]}))
>>> parser.parse("text:hello")
Term("content", "hello")
Looks for basic syntax nodes (terms, prefixes, wildcards, phrases, etc.) occurring in a certain field and replaces each one with a group (by default OR) containing the original node and a copy of the node assigned to a new field.
For example, the query:
hello name:matt
could be automatically converted by CopyFieldPlugin({"name": "author"}) to:
hello (name:matt OR author:matt)
This is useful where one field was indexed with a differently-analyzed copy of another, and you want the query to search both fields.
You can specify a different group type with the group keyword. You can also specify group=None, in which case the copied node is inserted “inline” next to the original, instead of in a new group:
hello name:matt author:matt
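The grouped and inline behaviours can be compared with a toy model. This sketch does not use Whoosh: nodes are represented as hypothetical (fieldname, text) tuples, with None standing for an unfielded node, and `copy_fields` is an illustrative helper rather than the plugin's code.

```python
def copy_fields(nodes, fieldmap, group="OR"):
    """Copy nodes in mapped fields to their target field (toy model)."""
    result = []
    for fieldname, text in nodes:
        original = (fieldname, text)
        if fieldname in fieldmap:
            copy = (fieldmap[fieldname], text)
            if group is None:
                # Inline: the copy sits next to the original.
                result.extend([original, copy])
            else:
                # Grouped: original and copy are wrapped together.
                result.append((group, [original, copy]))
        else:
            result.append(original)
    return result


query = [(None, "hello"), ("name", "matt")]
print(copy_fields(query, {"name": "author"}))
print(copy_fields(query, {"name": "author"}, group=None))
```

The first call corresponds to `hello (name:matt OR author:matt)` and the second to `hello name:matt author:matt`.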
Base class for nodes that make up the abstract syntax tree (AST) of a parsed user query string. The AST is an intermediate step, generated from the query string, then converted into a whoosh.query.Query tree by calling the query() method on the nodes.
Instances have the following required attributes:
Sets the boost associated with this node.
For nodes that don’t have a boost, this is a no-op.
Sets the fieldname associated with this node. If override is False (the default), the fieldname will only be replaced if this node does not already have a fieldname set.
For nodes that don’t have a fieldname, this is a no-op.
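The override rule can be summarized in a few lines. `FieldedNode` below is a hypothetical stand-in written for this example, not a Whoosh class; it assumes a node with no field stores None as its fieldname.

```python
class FieldedNode:
    """Toy node that carries text and an optional fieldname."""

    def __init__(self, text, fieldname=None):
        self.text = text
        self.fieldname = fieldname

    def set_fieldname(self, name, override=False):
        # Only replace an existing fieldname when override is requested.
        if self.fieldname is None or override:
            self.fieldname = name


unfielded = FieldedNode("hello")
unfielded.set_fieldname("content")                # fills in the missing field

fielded = FieldedNode("matt", fieldname="name")
fielded.set_fieldname("content")                  # keeps "name"
kept = fielded.fieldname
fielded.set_fieldname("content", override=True)   # now forced to "content"
```

This is why a parser can push a default field down onto every node without clobbering fields the user typed explicitly.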
Intermediate base class for basic nodes that search for text, such as term queries, wildcards, prefixes, etc.
Instances have the following attributes:
Base class for abstract syntax tree node types that group together sub-nodes.
Instances have the following attributes:
This class implements a number of list methods for operating on the subnodes.
Base class for PrefixOperator, PostfixOperator, and InfixOperator.
Operators work by moving the nodes they apply to (e.g. for a prefix operator, the next node; for a postfix operator, the previous node; for an infix operator, the nodes on either side) into a group node. The group provides the code for what to do with the nodes.
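The infix case can be sketched as follows. This is a standalone toy model, not Whoosh code: the node stream is a plain list of strings, a group is a hypothetical (name, children) tuple, and `apply_infix` is an illustrative helper.

```python
def apply_infix(nodes, symbol, group_name):
    """Move the nodes on either side of each `symbol` into a group."""
    result = []
    i = 0
    while i < len(nodes):
        has_left = bool(result)
        has_right = i + 1 < len(nodes)
        if nodes[i] == symbol and has_left and has_right:
            left = result.pop()       # node before the operator
            right = nodes[i + 1]      # node after the operator
            result.append((group_name, [left, right]))
            i += 2
        else:
            result.append(nodes[i])
            i += 1
    return result


print(apply_infix(["alfa", "OR", "bravo", "charlie"], "OR", "Or"))
```

Here the operator token disappears from the stream and an "Or" group holding `alfa` and `bravo` takes its place, while `charlie` is left untouched.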