== SEARCH !SubSystem ==
ADEI has a modular search subsystem. The search capabilities are provided by search engines, each of which provides one or more search modules. Besides the search term, the search string may specify the search modules used to perform the search and a number of limits to filter the results. Module parameters can be specified along with the modules.

=== Format of search string ===
The search string consists of four components:
 * The first component defines the type of the search. Examples are ''item search'', ''channel value search'', and ''datetime search''.
 * The second component provides options, for example whether an exact or a fuzzy match is demanded.
 * The third and fourth components are type-dependent and contain the search string and additional limits.

'''I''': The format is as follows:
{{{
[type/module specification] [global flags] <search string> [limits]
}}}
Everything besides the search string is optional. If the type is not specified, the search string is analyzed. The analysis routine guesses the type of search and executes a default set of modules for this type. The default behavior is to search for channel and group names. See the [wiki:adeiSEARCH#StringAnalysis String Analysis] section for details.

The search type is specified in curly brackets at the beginning of the search string. A search module available in ''classes/search'' should be indicated by the name of its class. Options for the class constructor can be given as well. If multiple modules are specified, the searches are performed sequentially. The following format is expected:
{{{
{module_name(opt1=value1,opt=value2), another_module(...)}
}}}

'''II''': The global options follow the search type and are specified in square brackets. These options are then passed to the search modules together with the search string and are handled by the module code.
The following options are supported:
 * ''='' - Exact match: the search string is matched as a whole, without being split into phrases
 * ''w'' - Word match, if not overridden by match modifiers (see below)
 * ''~'' - Fuzzy match, if not overridden by match modifiers (see below)

'''III''': Then the search string follows. If the ''Exact match'' flag is not specified, it consists of phrases. A phrase is one of:
 * a word consisting of alphanumeric symbols, dashes and underscores (''-'', ''_'')
 * multiple words enclosed in single or double quotes (' or ")
 * a regular expression enclosed in ''/'' on both ends

This is an example of a search string consisting of 4 components: two words, one phrase, and a regular expression:
{{{
word1 word2 "phrase 3" /regexp/
}}}

Before each phrase, a match modifier can be specified. The following match modifiers are supported:
 * if no modifier is specified, words starting with the search term are matched
 * ''='' - full match, only whole words are matched
 * ''~'' - fuzzy match, any part of a word can be matched

Consider the following example to understand the meaning of the match modifiers. By default, if a search for '''sin''' is performed, the words '''sin''' and '''sinus''' are matched, but '''cosinus''' is not. However, if a fuzzy search is given ('''~sin'''), '''cosinus''' is matched as well. On the other hand, if a full match is required ('''=sin'''), only '''sin''' is matched; both '''sinus''' and '''cosinus''' are rejected.

The match of each phrase against data records produces a rating from ''0'' to ''1'' indicating match quality. The value ''0'' means that the record is not matched and the value ''1'' indicates a full match. If several phrases are listed in the search string, the ratings of the individual phrase matches are multiplied to produce the overall rating.
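The match modifiers and the product rule described above can be sketched as follows. This is an illustrative Python sketch, not ADEI's PHP code: the function names are hypothetical, only single-word phrases are handled, and ratings are simplified to ''0'' or ''1'' because the text does not say how partial ratings are computed.

```python
import re

def match_word(term, word, modifier=""):
    """Rate one word against a term: 1.0 on match, 0.0 otherwise (simplified)."""
    if modifier == "=":      # full match: the whole word must equal the term
        ok = word == term
    elif modifier == "~":    # fuzzy match: the term may occur anywhere in the word
        ok = term in word
    else:                    # default: the word must start with the term
        ok = word.startswith(term)
    return 1.0 if ok else 0.0

def rate_phrase(phrase, text):
    """Best rating of a single-word phrase (with optional modifier) over all words of text."""
    modifier = phrase[0] if phrase[:1] in ("=", "~") else ""
    term = phrase[len(modifier):]
    words = re.findall(r"[\w-]+", text)
    return max((match_word(term, w, modifier) for w in words), default=0.0)

def overall_rating(phrases, text):
    """Multiply the per-phrase ratings into one overall rating."""
    rating = 1.0
    for phrase in phrases:
        rating *= rate_phrase(phrase, text)
    return rating

print(rate_phrase("sin", "cosinus"))   # default modifier: no word starts with "sin"
print(rate_phrase("~sin", "cosinus"))  # fuzzy modifier: "sin" occurs inside the word
```

When several phrases are given, the hypothetical ''overall_rating'' helper multiplies the per-phrase ratings, following the rule stated above.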
For example, if phrase1 is matched with rating ''0.70'', phrase2 is matched with rating ''0.30'' and word3 is fully matched, the overall rating would be: 0.21 = 0.70 * 0.30 * 1.

Rating computation can be altered using unary and binary operations. Let us assume that ''[word]'' is the rating of ''word''; then the ratings of these operations are computed as follows:
 * ''! word'' - The resulting rating is 1 - [word]
 * ''+ word'' - Any rating below 1 is cut to 0
 * ''- word'' - All non-zero ratings are cut to zero, and a zero rating is replaced with 1
 * ''(word1|word2)'' - The maximal rating among [word1] and [word2]

A few examples of complex search strings:
{{{
=sinus | cos1
}}}
{{{
!"a b c" ~d -e +('f g' !(!i (k))) "m n"
}}}

'''IV''': One or more limits can be set in the last part of the search string. The following format is expected:
{{{
limit_name:limit_value another_limit:another_limit_value
}}}
The handling of limits is completely module-specific. Example:
{{{
+sinus | cos1 interval:2006
}}}

== Default Implementation ==
The search modules are implemented using [wiki:adeiClassSEARCHEngine SEARCHEngines]. Each SEARCHEngine can provide one or more search modules. The !SEARCHEngines are placed in the ''classes/search'' folder of the ADEI source tree. They should implement a ''Search'' function which accepts four parameters (module, search string, search filter, global options) and returns a [wiki:adeiClassSEARCHResults SEARCHResults] object with the results, or ''false'' if nothing is found.

However, standard modules can reuse the default ''Search'' function implemented in the base class ''classes/searchengine.php''. The following procedure is executed in this case:
 * The ''Search'' function of the ''Search Engine'' is executed with four parameters: module, search string, search filter, global options.
 * The ''GetList'' function is called to get the complete associative list of elements.
   In this list, the key is the element identifier and the value is an associative array with the terms to check against the search terms. The array may contain the following members:
   * ''uid'' - unique identifier of the record, if any (used for matching)
   * ''name'' - short name of the record (used for matching)
   * ''title'' - title used to present this record in the results
   * ''description'' - longer description (HTML content is allowed)
   * ''props'' - an associative array containing [wiki:adeiDG standard ADEI properties] fully describing this record. For example, for a found data item the props array will contain the ''db_server'', ''db_name'', ''db_group'', and ''db_mask'' properties. For a found interval, it would be just the ''window'' property.
 * The ''CheckString'' function is called on each element of the list; the elements for which a non-zero rating is returned are checked against the filters and added to the search results.
 * To prevent duplicate results, the ''SEARCHResults::Accept'' function is used. The results are compared using '''GetCmpFunction'''.

''CheckString'' works in the following way:
 * The search string is split into phrases, and the ''CheckPhrase'' function is called for each phrase.
 * Depending on the module used, the ''CheckPhrase'' function selects a single string value from the associative array describing the record and passes it to the ''CheckTitlePhrase'' function.
 * ''CheckTitlePhrase'' checks whether the passed string fits the current search phrase and returns the rating. The matching is performed in one of 4 supported modes, depending on the match modifiers and global options:
   * ''default'' - The beginning of any word should match the search phrase. The '''word sinus cosinusfff''' matches the phrase '''sinus cosinus''', but '''xsinus cosinus''' does not.
   * ''word match'' - The words should match completely. The '''word sinus cosinus fff''' matches, and '''sinus cosinusfff''' does not.
   * ''fuzzy match'' - Word boundaries are not important, and even '''xsinus cosinusx''' matches the '''sinus cosinus''' search phrase.
   * ''regex match'' - In this mode the search phrase is considered a regular expression, and this regular expression is matched against the passed string.
 * Finally, the ratings computed for all search phrases are combined into the overall rating using the rules described in the section above.

== Search Filters ==
Filters are used to reject part of the search results as well as to add or modify information associated with a found record. Filters are specified in the search string as follows:
{{{
interval:June 2005
}}}
If such a filter is found, the ''INTERVALSearchFilter'' object (from ''classes/search/intervalfilter.php'') is constructed. This object receives the filter value (''June 2005'') as the single parameter of its constructor. It should implement a single function, ''FilterResult'', which should return ''true'' if the current record should be filtered out and ''false'' otherwise. ''FilterResult'' receives two parameters:
 * an associative array with information on the current record
 * a number between 0 and 1 with the rating of the match
Both of these parameters can be altered by the ''FilterResult'' function.

'''Example'''. Let us consider the standard ''item'' search used in conjunction with the ''interval'' filter. The search will provide multiple records describing the found items (i.e. the associative array with information will contain the standard properties ''db_server'', ''db_name'', ''db_group'', and ''db_mask''). The ''interval'' filter is intended to limit the display interval. Therefore, when the ''FilterResult'' function is called, it adds the ''window'' property to the associative array, limiting the display window to ''June 2005''.

If multiple filters are specified, they are executed sequentially until one of them rejects the current record.
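The filter chain described above can be sketched as follows. This is an illustrative Python sketch rather than ADEI's PHP implementation. The class and function names are hypothetical, ''MinRatingFilter'' is a made-up filter that exists only to show rejection, and because Python cannot alter a number passed as an argument, the sketch returns the possibly changed rating together with the rejection flag.

```python
class IntervalFilter:
    """Hypothetical stand-in for INTERVALSearchFilter."""

    def __init__(self, value):
        self.value = value  # filter value from the search string, e.g. "June 2005"

    def filter_result(self, info, rating):
        # A real filter would resolve the text to a timestamp window;
        # this sketch stores the raw text to stay short.
        info.setdefault("props", {})["window"] = self.value
        return False, rating  # False: the record is kept


class MinRatingFilter:
    """Made-up filter (not part of ADEI) rejecting weak matches."""

    def __init__(self, value):
        self.threshold = float(value)

    def filter_result(self, info, rating):
        return rating < self.threshold, rating


def apply_filters(filters, info, rating):
    """Run the filters in order and stop at the first one that rejects the record."""
    for search_filter in filters:
        rejected, rating = search_filter.filter_result(info, rating)
        if rejected:
            return None
    return info, rating


filters = [IntervalFilter("June 2005"), MinRatingFilter("0.5")]
record = {"props": {"db_server": "srv", "db_name": "db"}}
print(apply_filters(filters, record, 0.8))
```

A record that passes every filter comes back with the ''window'' property added, while a rejected record yields ''None'' and the remaining filters are never called.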
== New Search Engine ==
 * The search engine should provide a list of supported modules in the ''modules'' member of the class. It is an associative array where the key is the module id and the value is the module title.
 * It should either define a special ''Search'' function or provide at least the ''GetList'' function to be used in conjunction with the approach described above.

The ''GetList'' function should return an array containing the records. Each record is represented by an associative array with the following members:
 * ''title'' - the title used to describe the record in the search results
 * ''description'' - a longer description of the record (HTML content is allowed)
 * ''props'' - an associative array with the standard ADEI properties describing the record
 * ''certain'' - indicates that the search module is completely certain that this record is what the user is actually looking for
 * Arbitrary properties used by the search engine for record matching

Example:
{{{
array(
    array(
        'title' => 'January 2005',
        'props' => array(
            'window' => "1104537600-1107216000"
        ),
        'description' => false,
        'certain' => true
    )
)
}}}

Besides the ''GetList'' function, it is highly desirable to provide a ''CheckPhrase'' function which checks the record info against the search phrase and returns the match rating, from ''0'' (not matched) to ''1'' (fully matched). The ''CheckPhrase'' function accepts the following parameters:
 * The associative array with the information described above
 * The phrase to match
 * The type of match: ''SEARCH::WORD_MATCH'', ''SEARCH::FUZZY_MATCH'', ''SEARCH::REGEX_MATCH'', or ''false'' (default)
 * The search module
 * The global options

== INTERVALSearch Engine ==
'''Provided Modules''':
 * ''interval'' - Tries to parse a time interval from the textual representation given in the search string. Only the ''window'' property is returned, containing an interval of UNIX timestamps.
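To illustrate what the ''interval'' module produces, here is a minimal Python sketch (not the ADEI code) that turns a "Month YYYY" string into a ''window'' of UNIX timestamps. The function name ''month_window'' is hypothetical. The sketch assumes UTC and a window ending at the first second of the following month, which reproduces the ''January 2005'' window shown in the ''GetList'' example; the real module presumably accepts many more formats.

```python
from datetime import datetime, timezone

MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]

def month_window(text):
    """Return 'start-end' UNIX timestamps (UTC) for 'Month YYYY', or None."""
    parts = text.split()
    if len(parts) != 2 or parts[0].lower() not in MONTHS or not parts[1].isdigit():
        return None
    month = MONTHS.index(parts[0].lower()) + 1
    year = int(parts[1])
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # First day of the following month; December rolls over into the next year.
    end = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
    return f"{int(start.timestamp())}-{int(end.timestamp())}"

# A record shaped like the GetList example above.
record = {
    "title": "January 2005",
    "props": {"window": month_window("January 2005")},
    "description": False,
    "certain": True,
}
print(record["props"]["window"])  # 1104537600-1107216000
```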
'''Supported Filters''':
 * ''interval'' - finds the intersection of two intervals

== ITEMSearch Engine ==
'''Provided Modules''':
 * ''channel'' - Searches items by uid only
 * ''item'' - Searches items by uid and name
 * ''group'' - Searches groups by name
 * ''mask'' - Searches masks by name
 * ''control'' - Searches controls by uid only
 * ''control_item'' - Searches controls by uid and name
 * ''control_group'' - Searches control groups by name

'''Supported Filters''':
 * ''interval'' - adds the ''window'' property to the item specification

== PROXYSearch Engine ==
'''Provided Modules''':
 * ''proxy'' -

'''Supported Filters''':
 * ''interval''

== String Analysis ==