Version 13 (modified by csa, 15 years ago)
--

SEARCH SubSystem

ADEI has a modular search subsystem. Search capabilities are provided by search engines, each of which provides one or more search modules. Besides the search term, the search string may specify which search modules to run and a number of limits used to filter the results. Module parameters can be specified along with the modules.

The search subsystem is implemented around three main classes:

  • SEARCHEngine - Search engine providing one-or-more search modules, defined in classes/searchengine.php
  • SEARCHFilter - Search filter providing an interface to restrict search results (like Google's site:kernel.org), defined in classes/searchfilter.php
  • SEARCHResults - Provides the results of search

The user supplies to the search subsystem:

  • A list of search modules to run, along with their parameters
  • Global search options
  • Search string
  • A set of limits to restrict search results

For each specified module, the search subsystem identifies a search engine providing this module and executes the Search function of the engine. The function returns a SEARCHResults with results or false if nothing is found. Finally, the search subsystem merges the results from individual modules and returns the merged SEARCHResults object or false if no module provided results.

The SEARCHResults object is able to store results of two different types (simultaneously):

  • Standard results: each result item is described by an associative array with the following members:
    • title - short title describing the result item
    • description - the longer description of the result item, an HTML content is allowed
    • props - the associative array with standard ADEI properties describing the item
    • certain - indicates that the search module is completely certain that this record is what the user is actually looking for
    • Arbitrary number of other properties which are used by the search engine internally (for record matching, for example)

For example, an associative array describing a found time interval:

   array(
     'title' => 'January 2005',
     'props' => array(
         'window' => "1104537600-1107216000"
     ),
     'description' => false,
     'certain' => true
   )
  • Custom results: The search engine is mainly used to provide results in the ADEI web display. In some cases the results are provided by third-party applications which do not follow the ADEI way of structuring results, but provide just an HTML page with all results in ready-to-display form. In order to support such third-party applications, an ADEI search module may return a SEARCHResults object with custom results. In this case, instead of per-item associative arrays, the SEARCHResults object stores the XHTML content representing all results provided by the search module.

The merged results of multiple search modules may contain both per-item associative arrays for the results of some modules and XHTML results for others.
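The merging behaviour described above can be illustrated with a minimal sketch. The DemoResults class and the merge_results function are hypothetical names introduced only for illustration; they do not reproduce the real SEARCHResults interface.

```php
<?php
// Hypothetical container illustrating the two result types.
// This is NOT the real SEARCHResults class.
class DemoResults
{
    public $items = [];   // standard results: per-item associative arrays
    public $custom = [];  // custom results: XHTML fragments, one per module

    public function merge(DemoResults $other): void
    {
        $this->items = array_merge($this->items, $other->items);
        $this->custom = array_merge($this->custom, $other->custom);
    }
}

// Merges per-module results. A false entry means the module found nothing.
// Returns false if no module provided results.
function merge_results(array $perModule)
{
    $merged = false;
    foreach ($perModule as $res) {
        if ($res === false) {
            continue;
        }
        if ($merged === false) {
            $merged = new DemoResults();
        }
        $merged->merge($res);
    }
    return $merged;
}
```

In this sketch a merged object can hold item arrays from one module and XHTML content from another at the same time.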

Default Implementation

The search modules are implemented using SEARCHEngines. Each SEARCHEngine can provide one or more search modules. The SEARCHEngines are placed in the classes/search folder of the ADEI source tree. They should implement a Search function which accepts four parameters (module, search string, search filter, global options) and returns a SEARCHResults object with results, or false if nothing is found. However, standard modules can reuse the default Search function implemented in the base class in classes/searchengine.php. The following procedure is executed in this case:

  • The Search function of the search engine is executed with four parameters: module, search string, search filter, global options.
  • The GetList function is called to get the complete associative list of elements. In this list, the key is the element identifier and the value is an associative array with the terms to check against the search terms. The array contains the following members:
    • uid - record unique identificator if any (used for matching)
    • name - record short name (used for matching)
    • title - title to use to present this record in the results
    • description - longer description (html content is allowed)
    • props - an associative array containing standard ADEI properties fully describing this record. For example, for a found data item the props array will contain the db_server, db_name, db_group, and db_mask properties. For a found interval, it would contain just the window property.
  • The CheckString function is called on each element of the list. The elements for which a non-zero rating is returned are checked against the filters and added to the search results.
  • To prevent duplicate results, the SEARCHResults::Accept function is used. The results are compared using GetCmpFunction.
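The procedure above can be sketched roughly as follows. This is a simplified stand-in, not the ADEI code: default_search is a hypothetical name, the rating is reduced to a case-insensitive substring test on the name member, filters are plain callables, and duplicates are detected by uid only.

```php
<?php
// Hypothetical stand-in for the default Search flow. The names and the
// substring-based rating are assumptions made for illustration.
function default_search(array $list, string $search, array $filters): array
{
    $results = [];
    foreach ($list as $id => $record) {
        // Stand-in for CheckString: 1.0 if the name contains the search string.
        $name = $record['name'] ?? '';
        $rating = (stripos($name, $search) !== false) ? 1.0 : 0.0;
        if ($rating == 0) {
            continue;
        }

        // A filter returns true to reject the record. It receives the record
        // and the rating by reference and may modify both.
        foreach ($filters as $filter) {
            if ($filter($record, $rating)) {
                continue 2;
            }
        }

        // Stand-in for SEARCHResults::Accept: keep the first record per uid.
        $key = $record['uid'] ?? $id;
        if (isset($results[$key])) {
            continue;
        }
        $results[$key] = ['rating' => $rating] + $record;
    }
    return $results;
}
```

The real implementation compares results with the function returned by GetCmpFunction; keying by uid is just the simplest way to show the idea.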

CheckString works as follows:

  • The search string is split into phrases, and the CheckPhrase function is called for each phrase.
  • Depending on the module used, the CheckPhrase function selects a single string value from the associative array describing the record and passes it to the CheckTitlePhrase function.
  • CheckTitlePhrase checks whether the passed string fits the current search phrase and returns the rating. The matching is performed in one of four supported modes, depending on the match modifiers and global options:
    • default - The beginning of any word should match the search phrase. The string sinus cosinusfff matches the phrase sinus cosinus, but xsinus cosinus does not.
    • word match - The words should match completely. The string sinus cosinus fff matches, but sinus cosinusfff does not.
    • fuzzy match - Word boundaries are not important, and even xsinus cosinusx matches the sinus cosinus search phrase.
    • regex match - In this mode the search phrase is treated as a regular expression, which is matched against the passed string.
  • Finally, the ratings computed for all search phrases are reconciled into an overall rating using the rules described in the section above.
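The four matching modes can be approximated with regular expressions. The sketch below is not the actual CheckTitlePhrase implementation: the function name, the mode strings, case-insensitive matching, and the 0/1 rating are all assumptions, and a multi-word phrase is treated as one unit with flexible whitespace.

```php
<?php
// Hypothetical sketch of the four matching modes. Not the ADEI implementation.
// Returns 1.0 on a match and 0.0 otherwise (real ratings may be graded).
function match_title_phrase(string $text, string $phrase, string $mode = 'default'): float
{
    if ($mode === 'regex') {
        // The phrase itself is the regular expression. The escaping of the
        // '~' delimiter is naive and only meant for illustration.
        $pattern = '~' . str_replace('~', '\~', $phrase) . '~i';
    } else {
        // Quote each word and allow any amount of whitespace between words.
        $words = preg_split('/\s+/', trim($phrase));
        $quoted = array_map(function ($w) {
            return preg_quote($w, '~');
        }, $words);
        $core = implode('\s+', $quoted);

        if ($mode === 'word') {
            $pattern = '~\b' . $core . '\b~i';   // whole words only
        } elseif ($mode === 'fuzzy') {
            $pattern = '~' . $core . '~i';       // boundaries ignored
        } else {
            $pattern = '~\b' . $core . '~i';     // must start at a word start
        }
    }
    return preg_match($pattern, $text) === 1 ? 1.0 : 0.0;
}
```

With these patterns the examples from the list behave as described: sinus cosinusfff matches in default mode but not in word mode, and xsinus cosinusx matches only in fuzzy mode.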

Search Filters

Filters are used to reject part of the search results as well as to add or modify information associated with a found record. Filters are specified in the search string as follows:

   interval:June 2005

If such a filter is found, the INTERVALSearchFilter object (from classes/search/intervalfilter.php) is constructed. The object receives the filter value (June 2005) as the single parameter of its constructor. It should implement a single function, FilterResult, which returns true if the current record should be filtered out and false otherwise. FilterResult receives two parameters:

  • associative array with information on current record
  • a number between 0 and 1 with the rating of match

Both of these parameters can be altered by the FilterResult function.

Example. Let's consider a standard item search used in conjunction with the interval filter. The search provides multiple records describing the found items (i.e. the associative array with information contains the standard properties db_server, db_name, db_group, and db_mask). The interval filter is intended to limit the display interval. Therefore, when the FilterResult function is called, it adds the window property to the associative array, limiting the display window to June 2005.

If multiple filters are specified, they are executed sequentially until one of them rejects the current record.
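A filter of this kind might look like the sketch below. DemoIntervalFilter and apply_filters are hypothetical names, not the real INTERVALSearchFilter. The sketch only understands values of the form "Month Year", interprets them in UTC, and writes the window as start-end UNIX timestamps in the same form as the January 2005 example.

```php
<?php
// Hypothetical filter modelled on the description above.
// This is NOT ADEI's INTERVALSearchFilter.
class DemoIntervalFilter
{
    private $window;

    public function __construct(string $value)
    {
        $utc = new DateTimeZone('UTC');
        // '!' resets all unparsed fields, so "June 2005" becomes
        // 2005-06-01 00:00:00 UTC.
        $start = DateTimeImmutable::createFromFormat('!F Y', $value, $utc);
        if ($start === false) {
            throw new InvalidArgumentException("Unsupported interval: $value");
        }
        $end = $start->modify('+1 month');
        $this->window = $start->getTimestamp() . '-' . $end->getTimestamp();
    }

    // Returns true to reject the record and false to keep it.
    // Both parameters are passed by reference and may be modified.
    public function FilterResult(array &$info, float &$rating): bool
    {
        $info['props']['window'] = $this->window;
        return false;
    }
}

// Runs the filters in order and stops at the first one that rejects.
// Returns true if the record survived all filters.
function apply_filters(array $filters, array &$info, float &$rating): bool
{
    foreach ($filters as $filter) {
        if ($filter->FilterResult($info, $rating)) {
            return false;
        }
    }
    return true;
}
```

This filter never rejects a record; it only narrows the display window, which matches the item search example above.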

New Search Engine

  • The search engine should provide a list of supported modules in the modules member of the class. It is an associative array where the key is the module id and the value is the module title.
  • It should either define a special Search function or provide at least the GetList function to be used in conjunction with the approach described above.

The GetList function should return an array containing the records. Each record is represented by an associative array with the following members:

  • title - the title used to describe record in the search results
  • description - the longer description of the record, HTML content is allowed
  • props - the associative array with standard ADEI properties describing the record
  • certain - indicates that the search module is completely certain that this record is what the user is actually looking for
  • Arbitrary properties used by the search engine for record matching

Example:

 array(
   array(
     'title' => 'January 2005',
     'props' => array(
         'window' => "1104537600-1107216000"
     ),
     'description' => false,
     'certain' => true
   )
)

Besides the GetList function, it is highly desirable to provide a CheckPhrase function which checks the record info against the search phrase and returns the match rating, from 0 (not matched) to 1 (fully matched). The CheckPhrase function accepts the following parameters:

  • The associative array with information described above
  • The phrase to match
  • Type of match: SEARCH::WORD_MATCH, SEARCH::FUZZY_MATCH, SEARCH::REGEX_MATCH, false (default)
  • The search module
  • The global options
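Putting these pieces together, a minimal engine could be sketched as follows. The class DemoMonthSearch is hypothetical and standalone so that it can run by itself; a real engine would extend the base class from classes/searchengine.php. The parameterless GetList signature and the local WORD_MATCH constant (a stand-in for SEARCH::WORD_MATCH) are assumptions.

```php
<?php
// Hypothetical engine skeleton, standalone for illustration only.
class DemoMonthSearch
{
    const WORD_MATCH = 1; // stand-in for SEARCH::WORD_MATCH

    // Module id => module title.
    public $modules = ['month' => 'Month search'];

    // The signature is an assumption; the text above does not list the
    // parameters of GetList.
    public function GetList()
    {
        return [
            [
                'title' => 'January 2005',
                'props' => ['window' => '1104537600-1107216000'],
                'description' => false,
                'certain' => true,
            ],
        ];
    }

    // Returns a rating from 0 (not matched) to 1 (fully matched).
    // Only the word and default match types are handled in this sketch.
    public function CheckPhrase($info, $phrase, $match, $module, $opts)
    {
        $quoted = preg_quote($phrase, '/');
        $pattern = ($match === self::WORD_MATCH)
            ? '/\b' . $quoted . '\b/i'
            : '/\b' . $quoted . '/i';
        return preg_match($pattern, $info['title']) ? 1 : 0;
    }
}
```

In this sketch the phrase Jan matches the record in default mode, because it starts a word, but not in word mode.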

Special search engines intended to return custom XHTML content should use the following approach in the Search function:

  $result = new SEARCHResults(NULL, $this, $module, "");
  $result->Append("<XHTML content>");
  return $result;

The <?xml?> declaration should not be included in the content.

INTERVALSearch Engine

Provided Modules:

  • interval - Tries to parse the time interval from the textual representation given in the search string. Only the window property is returned, containing the interval as UNIX timestamps.

Supported Filters:

  • interval - finds the intersection of two intervals
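The intersection of two windows can be computed as in the sketch below. The helper intersect_windows is hypothetical; it assumes both windows use the start-end form shown earlier with non-negative timestamps, and that a false return stands for an empty intersection.

```php
<?php
// Hypothetical helper: intersects two "start-end" windows of UNIX timestamps.
// Returns the overlapping window, or false if the intervals do not overlap.
// Assumes non-negative timestamps, since '-' is used as the separator.
function intersect_windows(string $a, string $b)
{
    [$a0, $a1] = array_map('intval', explode('-', $a, 2));
    [$b0, $b1] = array_map('intval', explode('-', $b, 2));
    $start = max($a0, $b0);
    $end = min($a1, $b1);
    return ($start < $end) ? "$start-$end" : false;
}
```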

ITEMSearch Engine

Provided Modules:

  • channel - Searches items by uid only
  • item - Searches items by uid and name
  • group - Searches groups by name
  • mask - Searches masks by name
  • control - Searches controls by uid only
  • control_item - Searches controls by uid and name
  • control_group - Searches control groups by name

Supported Filters:

  • interval - adds window property to the items specification

PROXYSearch Engine

Provided Modules:

  • proxy - downloads an XML document from the specified location and applies an XSLT stylesheet to convert it into XHTML. Accepts several parameters:
    • xml - the service to obtain the XML document from (mandatory)
    • xslt - the stylesheet to apply to the XML; can be omitted if the service returns XHTML directly
    • noprops - instructs ADEI not to add the current properties when calling the service; otherwise the passed db_server, db_name, and other properties are appended to the end of the service request.

Supported Filters:

  • interval - adds window property to the XML service request

Example usage:

proxy(xml=katrin.php?target=runs;xslt=katrinsearch;noprops)} interval:1218431322-1253472677
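The parameter list inside the parentheses could be split as shown in the sketch below. The function parse_module_params is hypothetical, and the rules it applies (parameters separated by ;, name and value separated by the first =, valueless parameters treated as flags) are inferred from the example above; the real ADEI parsing may differ.

```php
<?php
// Hypothetical parser for a module parameter string such as
// "xml=katrin.php?target=runs;xslt=katrinsearch;noprops".
function parse_module_params(string $params): array
{
    $result = [];
    foreach (explode(';', $params) as $part) {
        $part = trim($part);
        if ($part === '') {
            continue;
        }
        // Split on the first '=' only, so values may contain '=' themselves.
        $pair = explode('=', $part, 2);
        // Parameters without a value (e.g. noprops) act as boolean flags.
        $result[$pair[0]] = $pair[1] ?? true;
    }
    return $result;
}
```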

String Analysis

At the moment, string analysis is performed by the DetectModule function defined in classes/search.php. It should be extended by search engines claiming the search string.

adeiSEARCH/String