Metadata-Version: 1.0
Name: experimental.catalogqueryplan
Version: 1.1
Summary: Static query optimized with one plan
Home-page: http://www.jarn.com/
Author: Jarn AS
Author-email: info@jarn.com
License: ZPL
Description: Introduction
        ============
        
        While the catalog tool in Zope is immensely useful, we have seen some slowdowns
        in large Plone sites with a combination of additional indexes and lots of
        content.
        
        The catalog implementation uses BTree set operations such as union, multiunion
        and intersection. Those operations are fairly fast, especially when everything
        is in memory. However, the catalog implementation is rather naive, which leads
        to many set operations on rather large sets.
        
        Query plan
        ==========
        
        Search engines and databases use query optimizers to select query plans that
        will minimize the result set as early as possible, because working with large
        amounts of data is time consuming.
        
        What we want to do is to search against the indexes giving the smallest result
        set first. However, for that to be useful, we need to pass that result along
        into the indexes to allow the indexes to limit the result set as soon as
        possible internally. When calculating a path search, there is no need to look
        in all 150000 results if the portal type index has already limited the possible
        result to 10000. If we have already limited the result to 10000 results, all
        set operations are going to be significantly faster.
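        
        As a minimal sketch of this idea, assuming plain Python sets in place of the
        BTree sets and a hypothetical ``apply_index`` helper (neither is the actual
        patched code), the running result can be passed into each index lookup::
        
            # Hypothetical stand-ins for two catalog indexes: value -> set of ids.
            portal_type = {"Document": set(range(0, 10000))}
            path = {"/site/news": set(range(5000, 150000))}
        
            def apply_index(index, value, candidates=None):
                """Return the ids matching value, limited to candidates if given."""
                hits = index.get(value, set())
                if candidates is None:
                    return hits
                # Only ids that survived the earlier indexes can still match.
                return candidates & hits
        
            def search(steps):
                """Apply (index, value) pairs in order, narrowing one result set."""
                result = None
                for index, value in steps:
                    result = apply_index(index, value, result)
                    if not result:
                        break  # nothing left, later indexes cannot add results
                return result or set()
        
            # The most selective index goes first, so the path lookup only has
            # to consider the 10000 ids left over from the portal type lookup.
            found = search([(portal_type, "Document"), (path, "/site/news")])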
        
        We identify different searches by the list of indexes that are searched. If
        there is no query plan for a set of indexes, the query is run as normal
        while storing the number of results for each index. When all indexes have been
        checked, the list is sorted by number of results and stored as a query
        plan. The next time a search on the same indexes comes in, the query plan is
        looked up.
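        
        The bookkeeping described above could look roughly like this. It is a sketch
        under assumptions, not the package's code: ``plans``, ``run_query`` and the
        dictionary-based indexes are hypothetical names, and plain sets stand in for
        BTree sets::
        
            plans = {}  # hypothetical cache: sorted index names -> search order
        
            def run_query(query, indexes):
                """Run query, recording a plan the first time a key is seen."""
                key = tuple(sorted(query))
                plan = plans.get(key)
                order = plan if plan is not None else sorted(query)
                counts = {}
                result = None
                for name in order:
                    hits = indexes[name].get(query[name], set())
                    counts[name] = len(hits)
                    result = hits if result is None else result & hits
                if plan is None:
                    # Smallest result set first on the next run.
                    plans[key] = sorted(counts, key=counts.get)
                return result if result is not None else set()
        
            indexes = {
                "portal_type": {"Document": {1, 2, 3, 4, 5, 6}},
                "review_state": {"draft": {2}, "published": {1, 3, 4, 5, 6}},
            }
            query = {"portal_type": "Document", "review_state": "draft"}
            first = run_query(query, indexes)   # builds and stores the plan
            second = run_query(query, indexes)  # reuses the stored plan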
        
        To get different query plans for similar queries, you can provide additional
        bogus index names. They will be ignored by the catalog, but will become part of
        the key. This means that if you search for Documents in draft state for a
        worklist, you can have a different ordering than when searching for published
        Documents, as there are likely to be very few items in draft state, but many in
        published state.
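        
        To illustrate, assuming the plan key is simply the sorted tuple of index names
        in the query (an assumption made for this sketch), an extra name such as the
        hypothetical ``worklist_hint`` yields a separate key and therefore a separate
        plan::
        
            def plan_key(query):
                """Hypothetical key function: the sorted index names of a query."""
                return tuple(sorted(query))
        
            # "worklist_hint" is a made-up name that matches no real index.
            draft = {"portal_type": "Document", "review_state": "draft",
                     "worklist_hint": True}
            published = {"portal_type": "Document", "review_state": "published"}
        
            draft_key = plan_key(draft)
            published_key = plan_key(published)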
        
        Testing
        =======
        
        To test, import the monkey patch in the tests of another package, such as
        CMFPlone::
        
            import experimental.catalogqueryplan
        
        and run those tests.
        
        Changelog
        =========
        
        1.1 - 2009-01-02
        ----------------
        
        * Made the set monkeypatches temporary to avoid zc.relationship
          trying to persist the set methods.
          [tesdal]
        
        1.0 - 2009-01-02
        ----------------
        
        * Removed redundant intersections, added type checking to difference
          [tesdal]
        
        * Add alternative weightedIntersection, and reuse BTree tests
          [tesdal]
        
        * Don't monkeypatch intersection as zc.relationship will try
          to pickle the function.
          Added new ExtendedPathIndex code.
          [tesdal]
        
        * Optimize UnIndex.apply_index internally, sort sets for AND,
          use multiunion for OR.
          [tesdal]
        
        * Limit the number of if-statements in intersection,
          and added test for fastest way of finding max and min.
          [tesdal]
        
        * Monkeypatch difference to handle big/tiny difference in Python.
          This doesn't belong in queryplan, as it's only a BTree patch,
          and should be refactored out.
          [tesdal]
        
        * Added performance tests.
          [tesdal]
        
        * Fixed a bug with UnIndex return result missing index id
          [tesdal]
        
        * Added tests for intersection, fixed a bug with empty second argument set
          [tesdal]
        
        * Monkeypatch intersect to handle big/tiny intersects in Python
          [tesdal]
        
        * Improved UnIndex query, to avoid redundant intersections
          [tesdal]
        
        * Clarified LanguageIndex support. We are missing fallback support right now
          and now disable the optimization when fallback is enabled.
          [hannosch, mj]
        
        0.9 - 2008-10-18
        ----------------
        
        * Added support for LinguaPlone's LanguageIndex.
          [hannosch]
        
        0.8 - 2008-09-03
        ----------------
        
        * Let each index patch register itself with the ADVANCEDTYPES list.
          This should enable patching of other indexes as well, and remove
          the dependency on ExtendedPathIndex.
          [tesdal]
        
        0.7 - 2008-08-22
        ----------------
        
        * Check whether we're supposed to use daterangeindex
          at all before retrieving cached data.
          [tesdal]
        
        0.6 - 2008-07-03
        ----------------
        
        * Use a volatile instance variable to store the prioritymap.
          [mj]
        
        0.5 - 2008-06-23
        ----------------
        
        * DateRangeIndex shouldn't overwrite the semi-request passed into the
          apply_index method.
          [mj]
        
        0.4 - 2008-06-23
        ----------------
        
        * DateRangeIndex now doesn't assume that REQUEST is available.
          [tesdal]
        
        0.3
        ---
        
        * Handle request being a dictionary.
          [tesdal]
        
        0.3
        ---
        
        * Refactored patches into multiple files.
          [tesdal]
        
        * Dynamic query optimization based on result set analysis
          from queries against the same indexes.
          [tesdal]
        
        * Manual query optimization based on typical usage pattern.
          [tesdal]
        
        0.1
        ---
        
        * Initial release
        
        
Keywords: plone catalog search
Platform: UNKNOWN
Classifier: Framework :: Plone
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development :: Libraries :: Python Modules
