filedb.filetables module
This module defines writer and reader classes for a fast, immutable
on-disk key-value database format. The current format is based heavily on
D. J. Bernstein’s CDB format (http://cr.yp.to/cdb.html).
Hash file
-
class whoosh.filedb.filetables.HashWriter(dbfile, magic='HSH3', hashtype=0)
Implements a fast on-disk key-value store. This hash uses a two-level
hashing scheme: a key is hashed, and the low eight bits of the hash value
are used to index into one of 256 hash tables. This is basically the CDB
algorithm, but unlike CDB this object writes all data serially (it doesn’t
seek backwards to overwrite information at the end).
Also unlike CDB, this format uses 64-bit file pointers, so the file length
is essentially unlimited. However, each key and value must be less than
2 GB in length.
Parameters:
- dbfile – a StructFile object to write to.
- magic – the format tag bytes to write at the start of the file.
- hashtype – an integer indicating which hashing algorithm to use.
Possible values are 0 (MD5), 1 (CRC32), or 2 (CDB hash).
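The two-level scheme described above can be sketched in plain Python. This is an illustrative model rather than Whoosh's own code: `cdb_hash` is the published CDB hash function (the algorithm that `hashtype` 2 refers to), while `locate` and its `slots_in_table` parameter are hypothetical names, and using the remaining hash bits to pick a starting slot follows CDB and is assumed here.

```python
from typing import Tuple


def cdb_hash(key: bytes) -> int:
    """D. J. Bernstein's CDB hash, kept to 32 bits."""
    h = 5381
    for byte in key:
        h = (((h << 5) + h) ^ byte) & 0xFFFFFFFF
    return h


def locate(key: bytes, slots_in_table: int) -> Tuple[int, int]:
    """Return (table index, starting slot) for a key.

    slots_in_table is a hypothetical parameter: the number of slots in
    the selected table (must be at least 1).
    """
    h = cdb_hash(key)
    table = h & 0xFF                   # low eight bits pick one of 256 tables
    slot = (h >> 8) % slots_in_table   # assumed: remaining bits pick the slot
    return table, slot


table, slot = locate(b"alpha", slots_in_table=4)
print(table, slot)
```

Because the first level always has exactly 256 entries, a reader only needs a small fixed-size directory to find the right second-level table for any key.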
-
add(key, value)
- Adds a key/value pair to the file. Note that keys DO NOT need to be
unique. You can store multiple values under the same key and retrieve
them using HashReader.all().
-
add_all(items)
- Convenience method to add a sequence of (key, value) pairs. This
is the same as calling HashWriter.add() on each pair in the
sequence.
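The semantics of add() and add_all() can be modelled with a small in-memory stand-in. `ToyHashWriter` is a hypothetical class written for this illustration only. It does not touch the disk, and it puts all() on the same object for brevity, whereas the real method lives on HashReader.

```python
from collections import defaultdict


class ToyHashWriter:
    """Hypothetical in-memory model of the documented add()/add_all() behavior."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, key, value):
        # Keys do not need to be unique: each call appends another value.
        self._values[key].append(value)

    def add_all(self, items):
        # Same as calling add() on each (key, value) pair in turn.
        for key, value in items:
            self.add(key, value)

    def all(self, key):
        # Yield every value stored under the key, in insertion order.
        yield from self._values.get(key, ())


writer = ToyHashWriter()
writer.add(b"color", b"red")
writer.add_all([(b"color", b"blue"), (b"shape", b"round")])
print(list(writer.all(b"color")))
```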
-
class whoosh.filedb.filetables.HashReader(dbfile, length=None, magic='HSH3', startoffset=0)
Reader for the fast on-disk key-value files created by
HashWriter.
Parameters:
- dbfile – a StructFile object to read from.
- length – the length of the file data. This is necessary since the
hashing information is written at the end of the file.
- magic – the format tag bytes to look for at the start of the
file. If the file’s format tag does not match these bytes, the
object raises a FileFormatError exception.
- startoffset – the starting point of the file data.
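The format-tag check described for `magic` and `startoffset` might look roughly like the sketch below. It uses `io.BytesIO` in place of a StructFile, and both `check_magic` and the local `FileFormatError` class are stand-ins defined for this example. They are not the library's own objects.

```python
import io


class FileFormatError(Exception):
    """Local stand-in for the exception named in the documentation."""


def check_magic(dbfile, magic=b"HSH3", startoffset=0):
    """Verify the format tag at startoffset; return the offset just past it."""
    dbfile.seek(startoffset)
    tag = dbfile.read(len(magic))
    if tag != magic:
        raise FileFormatError(f"expected {magic!r}, found {tag!r}")
    return dbfile.tell()


# The hash data starts 4 bytes into this buffer, after unrelated content.
buf = io.BytesIO(b"junkHSH3payload")
print(check_magic(buf, startoffset=4))
```

Checking the tag before reading anything else lets a reader reject a file of the wrong type or version early, instead of misinterpreting its bytes as hash tables.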
-
all(key)
- Yields a sequence of values associated with the given key.
-
classmethod open(storage, name)
- Convenience method to open a hash file given a
whoosh.filedb.filestore.Storage object and a name. This takes
care of opening the file and passing its length to the initializer.
-
ranges_for_key(key)
- Yields a sequence of (datapos, datalength) tuples associated
with the given key.
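A (datapos, datalength) pair identifies a value by its position and size in the file data. The sketch below shows how such ranges could be turned into values by slicing a byte buffer. `values_from_ranges` is a hypothetical helper, and the sample buffer and offsets are invented for illustration. Whether all() is built on ranges_for_key() in this way is an assumption.

```python
def values_from_ranges(data, ranges):
    """Yield the bytes at each (datapos, datalength) range in data."""
    for datapos, datalength in ranges:
        yield data[datapos:datapos + datalength]


# Invented layout: a 4-byte tag followed by two values stored back to back.
data = b"HSH3applebanana"
ranges = [(4, 5), (9, 6)]
print(list(values_from_ranges(data, ranges)))
```

Working with ranges rather than values is useful when a caller wants to defer or skip reading large values.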
Ordered Hash file
-
class whoosh.filedb.filetables.OrderedHashWriter(dbfile)
- Implements an on-disk hash, but requires that keys be added in order.
An OrderedHashReader can then look up “nearest keys” based on
the ordering.
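Requiring keys in order is what makes a "nearest key" lookup cheap: a sorted key list can be binary-searched. The sketch below models that idea with the standard `bisect` module. `ToyOrderedIndex` and `nearest_key_at_or_after` are hypothetical names. Defining "nearest" as the first key at or after the probe, and rejecting only keys that sort before the previous one, are assumptions of this example.

```python
from bisect import bisect_left


class ToyOrderedIndex:
    """Hypothetical in-memory model of ordered insertion and nearest-key lookup."""

    def __init__(self):
        self._keys = []

    def add(self, key):
        # Enforce the ordering requirement at write time.
        if self._keys and key < self._keys[-1]:
            raise ValueError(f"key {key!r} added out of order")
        self._keys.append(key)

    def nearest_key_at_or_after(self, key):
        # Binary search for the first stored key >= the probe key.
        i = bisect_left(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None


index = ToyOrderedIndex()
for k in (b"apple", b"cherry", b"mango"):
    index.add(k)
print(index.nearest_key_at_or_after(b"banana"))
```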
-
class whoosh.filedb.filetables.OrderedHashReader(dbfile, length=None, magic='HSH3', startoffset=0)
Parameters:
- dbfile – a StructFile object to read from.
- length – the length of the file data. This is necessary since the
hashing information is written at the end of the file.
- magic – the format tag bytes to look for at the start of the
file. If the file’s format tag does not match these bytes, the
object raises a FileFormatError exception.
- startoffset – the starting point of the file data.