Package WordUtils :: Module tree :: Class TernarySearchTree
[show private | hide private]
[frames | no frames]

Class TernarySearchTree


Class representing a ternary search tree.
Method Summary
  __init__(self, type, db, wl)
Default constructor.
  add(self, key)
Adds a key (word) to an existing tree.
  compress(self, ratio)
Compresses a tree, optionally returning compression ratio.
  list(self)
Returns a sorted list representation of the tree.
  loadFile(self, db, wl)
Reads a wordlist or database file into a tree.
  loadList(self, list)
Reads a Python list into a tree.
  patternSearch(self, pattern)
Searches for a pattern in a tree.
  remove(self, key)
Removes a key (word) from a tree.
  save(self, db, wl, overwrite)
Writes a tree to disk as a wordlist and/or in a portable "database" format.
  search(self, key)
Searches for a complete string in a tree.
  traverse(self, func)
Traverses a tree and executes func(word)for each word.

Class Variable Summary
NoneType db
NoneType root
NoneType type
NoneType wl

Method Details

__init__(self, type=0, db=None, wl=None)
(Constructor)

Default constructor.

See the loadFile method for details on how the db and wl arguments are used.
Parameters:
type - type of tree to be created
           (type=one of [IN_MEMORY, ON_DISK])
db - database on disk to be used to create tree
           (type=valid filesytem path as a string)
wl - wordlist on disk to be used to create tree
           (type=valid filesystem path as a string)
Raises:
TreeError - under exception circumstances

add(self, key)

Adds a key (word) to an existing tree.

This function inserts into a tree by building a list of the words in the tree, inserting the new key into the list, and then re-building the tree based on the new list. Obviously, this is rather memory intensive and inefficient. The whole point of using a tree in the first place is to avoid having a list of words around. However, I haven't found any other good way to do this.

You may not really need this functionality. Normally, a tree would be filled using the loadFile method. This add method would only be used to add a new word to an existing tree. If you only need to add a few words to an existing tree, consider whether you might be better off keeping around a small Python list or dictionary to hold the extra word, instead of adding to the tree.
Raises:
DawgError - under exception circumstances

compress(self, ratio=0)

Compresses a tree, optionally returning compression ratio.

This method may only be used on IN_MEMORY trees.

This method has not yet been implemented, and exists only as a stub. The method and its regression tests are in place to support future functionality, if I can every come up with a way to make compression work.
Returns:
  • compression relative to original, as a percentage, if ratio is True
  • None if ratio is False
Raises:
TreeError - under exception circumstances

list(self)

Returns a sorted list representation of the tree.
Returns:
sorted list of words in the tree
Raises:
DawgError - under exception circumstances

loadFile(self, db=None, wl=None)

Reads a wordlist or database file into a tree.

Either a database or a wordlist may be used, but not both. If a wordlist is used, it is assumed to be properly sorted on disk (i.e. the words must be in alphabetical order). Neither file will ever be written to.

Calling this function on a tree that has been previously initialized will overwrite the contents of that tree.

You may not choose to load a wordlist into an ON_DISK tree. First, create an IN_MEMORY tree using the wordlist. Then, save the IN_MEMORY tree as a database. Then, you can load use the database on disk as the source for your ON_DISK tree.
Parameters:
db - database on disk to be used to initialize tree
           (type=valid filesytem path as a string)
wl - wordlist on disk to be used to initialize tree
           (type=valid filesystem path as a string)
Raises:
TreeError - under exception circumstances

loadList(self, list)

Reads a Python list into a tree.

Calling this function on a tree that has been previously initialized will overwrite the contents of that tree.

You may only load a list into an IN_MEMORY tree. If you need the list in an on-disk tree, create an IN_MEMORY tree using the list. Then, save the IN_MEMORY tree as a database. Then, you can load the database on disk as the source for your ON_DISK tree.
Parameters:
list - Python list to be used to initialize tree
           (type=list of strings)
Raises:
TreeError - under exception circumstances

patternSearch(self, pattern)

Searches for a pattern in a tree.

A pattern uses the . (period) character to represent a wildcard. So, for instance, the pattern '.ai.' would match the words 'rail', 'hail', 'mail', etc. if those words were in the tree.
Returns:
list of words in the tree which match the pattern
Raises:
TreeError - under exception circumstances

remove(self, key)

Removes a key (word) from a tree.

This function removes a word from a tree by building a list of the words in the tree, removing the key from the list, and then re-building the tree based on the new list. Obviously, this is rather memory intensive and inefficient. The whole point of using a tree in the first place is to avoid having a list of words around. However, I haven't found any other good way to do this.

You may not really need this functionality. If you only need to remove a few words from an existing tree, consider whether you might be better off just keeping around a small Python list or dictionary to hold those words. You could then use the list or dictionary to validate the results returned from the patternSearch method.
Raises:
DawgError - under exception circumstances

save(self, db=None, wl=None, overwrite=0)

Writes a tree to disk as a wordlist and/or in a portable "database" format.

Both the wl and db arguments may be provided. This will cause the method to write both a wordlist and a database to disk.

Note: if the tree was created as ON_DISK, then this method cannot be used to overwrite the in-use database on disk.
Parameters:
db - database to be written to on disk
           (type=valid filesytem path as a string)
wl - wordlist to be written to on disk
           (type=valid filesystem path as a string)
overwrite - indicates whether to overwrite an existing file on disk
           (type=boolean True/False)
Raises:
TreeError - under exception circumstances

search(self, key)

Searches for a complete string in a tree.
Parameters:
key - string to search for
           (type=string, of any length)
Returns:
boolean indicating whether the key was found in the tree.
Raises:
TreeError - under exception circumstances

traverse(self, func=None)

Traverses a tree and executes func(word) for each word.

If no function is provided, then the default function is:
  lambda x: sys.stdout.write("%s\n" % x)
which will print out each word on its own line (the words will be unordered).
Parameters:
func - function to be called when an endpoint is reached
           (type=reference to a function which takes one argument, a word)
Raises:
DawgError - under exception circumstances

Class Variable Details

db

Type:
NoneType
Value:
None                                                                   

root

Type:
NoneType
Value:
None                                                                   

type

Type:
NoneType
Value:
None                                                                   

wl

Type:
NoneType
Value:
None                                                                   

Generated by Epydoc 1.1 on Mon Sep 8 16:37:49 2003 http://epydoc.sf.net