Changes between Version 7 and Version 8 of adeiSEARCH

Author:
csa (IP: 217.112.40.22)
Timestamp:
09/11/09 18:17:11 (15 years ago)
  • adeiSEARCH

    v7 v8  
    99 * Third and fourth components are type-dependent and contain the search string and additional limits 
    1010 
    11 The format is as follows: 
     11'''I''': The format is as follows: 
    1212  {{{ [type/module specification] [global flags] <search string> [limits] }}} 
    1313Everything except the search string is optional. If the type is not specified, the search string is analyzed: the analysis routine guesses the type of search and executes a default set of modules for that type. The default behavior is to search for channel and group names. See the [wiki:adeiSEARCH#StringAnalysis String Analysis] section for details. 
    1616  {{{ {module_name(opt1=value1,opt=value2), another_module(...)} }}} 
    1717 
    18 The global options follow the search type and are specified in square brackets. These options are then passed to the search modules together with the search string and are handled by the module code. The following options are supported: 
     18'''II''': The global options follow the search type and are specified in square brackets. These options are then passed to the search modules together with the search string and are handled by the module code. The following options are supported: 
    1919 * ''='' - Exact match; for most modules this means that the search string is matched as a whole, without being split into phrases 
    2020 * ''w'' - Word match 
    2121 * ''~'' - Fuzzy match 
    2222 
    23 The search string follows. If the ''Exact match'' flag is not specified, it consists of phrases.  
     23'''III''': The search string follows. If the ''Exact match'' flag is not specified, it consists of phrases. A phrase is one of the following: 
     24 * a word consisting of alphanumeric characters, dashes, and underscores (''-'', ''_'') 
     25 * multiple words enclosed in single or double quotes (' or ") 
     26 * a regular expression enclosed in ''/'' at both ends 
    2427 
    25 Filters are implemented as classes 
     28This is an example of a search string consisting of 4 components: two words, one phrase, and a regular expression: 
     29{{{word1 word2 "phrase 3" /regexp/}}} 
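
The splitting of such a string into phrases could be sketched as follows. This is a minimal illustration, not the ADEI parser: the function name `split_phrases` and the regular expression are assumptions derived only from the three phrase kinds listed above. Match modifiers, rating operations, and escaped delimiters are deliberately ignored.

```python
import re

# Hypothetical pattern covering the three phrase kinds described above.
# It is an assumption for illustration, not the pattern used by ADEI.
PHRASE_RE = re.compile(
    r'"(?P<dq>[^"]*)"'            # multiple words in double quotes
    r"|'(?P<sq>[^']*)'"           # multiple words in single quotes
    r"|/(?P<regex>[^/]+)/"        # regular expression between slashes
    r"|(?P<word>[A-Za-z0-9_-]+)"  # plain word: alphanumerics, '-' and '_'
)


def split_phrases(search_string):
    """Return (kind, text) pairs in the order they appear in the string."""
    phrases = []
    for m in PHRASE_RE.finditer(search_string):
        if m.group("word") is not None:
            phrases.append(("word", m.group("word")))
        elif m.group("regex") is not None:
            phrases.append(("regex", m.group("regex")))
        else:
            # Exactly one of the two quote groups matched.
            quoted = m.group("dq") if m.group("dq") is not None else m.group("sq")
            phrases.append(("phrase", quoted))
    return phrases


if __name__ == "__main__":
    print(split_phrases('word1 word2 "phrase 3" /regexp/'))
```

Applied to the example string, the sketch yields two words, one quoted phrase, and one regular expression, in that order.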
    2630 
    27 == Implementation == 
     31A match modifier can be specified before each phrase. The following match modifiers are supported: 
     32 * if no modifier is specified, words starting with the search term are matched 
     33 * ''='' - full match; only whole words are matched 
     34 * ''~'' - fuzzy match; any part of a word can be matched 
     35 
     36Consider the following example to understand the meaning of the match modifiers. By default, a search for '''sin''' matches the words '''sin''' and '''sinus''', but not '''cosinus'''. With a fuzzy search ('''~sin'''), '''cosinus''' is matched as well. On the other hand, if a full match is required ('''=sin'''), only '''sin''' is matched; both '''sinus''' and '''cosinus''' are rejected. 
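
The three modifier behaviors from this example can be reproduced with a few lines of Python. The helper `word_matches` is hypothetical and returns a plain boolean, whereas the search modules work with ratings. Case-sensitive comparison is also an assumption.

```python
def word_matches(term, word):
    """Check a single word against a search term with an optional modifier.

    Hypothetical helper for illustration only; not part of ADEI.
    """
    if term.startswith("="):
        # Full match: the whole word must equal the term.
        return word == term[1:]
    if term.startswith("~"):
        # Fuzzy match: the term may appear anywhere inside the word.
        return term[1:] in word
    # Default: the word must start with the term.
    return word.startswith(term)


if __name__ == "__main__":
    for term in ("sin", "~sin", "=sin"):
        hits = [w for w in ("sin", "sinus", "cosinus") if word_matches(term, w)]
        print(term, hits)
```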
     37 
     38Matching each phrase against the data records produces a rating from ''0'' to ''1'' indicating match quality. The value ''0'' means that the record is not matched, and the value ''1'' indicates a full match. If several phrases are listed in the search string, the ratings of the individual phrase matches are multiplied to produce the overall rating. For example, if phrase1 is matched with rating ''0.70'', phrase2 is matched with rating ''0.30'', and word3 is fully matched, the overall rating is 0.21 = 0.70 * 0.30 * 1. 
     39 
     40The rating computation can be altered using unary and binary operations. Let ''[word]'' denote the rating of ''word''; the ratings of these operations are then computed as follows: 
     41 * ''! word'' - the resulting rating is 1 - [word] 
     42 * ''+ word'' - ratings below 1 are cut to 0 
     43 * ''- word'' - all non-zero ratings are cut to 0, and a zero rating is replaced with 1 
     44 * ''(word1|word2)'' - the maximum of [word1] and [word2] 
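
The four operations can be written as small functions on ratings. This is a sketch under the rules listed above; the function names are hypothetical and the code only mirrors the described arithmetic, not the module code.

```python
def op_not(rating):
    """'! word': invert the rating."""
    return 1.0 - rating


def op_require(rating):
    """'+ word': keep only full matches; any rating below 1 becomes 0."""
    return 1.0 if rating >= 1.0 else 0.0


def op_exclude(rating):
    """'- word': any non-zero rating becomes 0, a zero rating becomes 1."""
    return 1.0 if rating == 0.0 else 0.0


def op_or(rating1, rating2):
    """'(word1|word2)': take the larger of the two ratings."""
    return max(rating1, rating2)


if __name__ == "__main__":
    print(op_not(0.25), op_require(0.7), op_exclude(0.0), op_or(0.3, 0.7))
```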
     45 
     46'''IV''': One or more limits can be set in the last part of the search string. The following format is expected: 
     47{{{ 
     48 limit_name:limit_value another_limit:another_limit_value 
     49}}} 
     50 
     51Limit handling is entirely module-specific. 
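
Before a module interprets the limits, the ''name:value'' pairs have to be separated. A possible sketch is shown below. The function `parse_limits` is hypothetical, and it assumes that pairs are separated by whitespace and that values contain no spaces; the text above does not state either point.

```python
def parse_limits(limits_part):
    """Split 'name:value name:value' into a dictionary of limits.

    Hypothetical helper; the meaning of each limit is left to the module.
    """
    limits = {}
    for token in limits_part.split():
        # Split on the first colon only, so values may contain colons.
        name, separator, value = token.partition(":")
        if not separator or not name:
            raise ValueError("expected limit_name:limit_value, got %r" % token)
        limits[name] = value
    return limits


if __name__ == "__main__":
    print(parse_limits("limit_name:limit_value another_limit:another_limit_value"))
```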
     52 
     53== Default Implementation == 
    2854The following procedure is executed for each module: 
    2955 * ''Search'' function of ''Search Engine'' is executed with four parameters: module, search string, search filter, global options. 
    3662 
    3763 *  
     64Filters are implemented as classes 
    3865== Providing New Search Engine == 
    3966