Class FuzzyLikeThisQuery

java.lang.Object
org.apache.lucene.search.Query
org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery

public class FuzzyLikeThisQuery extends Query
Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms. In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis, with special consideration of fuzzy scoring factors. It generally produces good results when users supply details across a number of fields, have no knowledge of boolean query syntax, and want both a degree of fuzzy matching and a fast query.

For each source term, the fuzzy variants are held in a BooleanQuery with no coord factor (because we are not looking for matches on multiple variants in any one doc). Additionally, a specialized TermQuery is used for the variants. It does not use the variant term's own IDF, because that would favour rarer terms, e.g. misspellings. Instead, all variants share the same IDF ranking (the one for the source query term), and this is factored into each variant's boost. If the source query term does not exist in the index, the average IDF of the variants is used.
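
The shared-IDF rule described above can be illustrated with a small standalone sketch. It is not Lucene's implementation: the class SharedIdfSketch, its method names, and the IDF formula are assumptions made for illustration, and the fallback follows the wording above literally (an average of the variants' IDF values).

```java
import java.util.List;

/** Illustrative sketch only; the names and the IDF formula are assumptions. */
final class SharedIdfSketch {

    /** Assumed classic TF-IDF style formula; the similarity in use may differ. */
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((numDocs + 1.0) / (docFreq + 1.0));
    }

    /**
     * IDF applied to every variant of one source term: the source term's own
     * IDF, or the average IDF of the variants when the source term does not
     * occur in the index (docFreq of zero).
     */
    static double sharedIdf(long sourceDocFreq, List<Long> variantDocFreqs, long numDocs) {
        if (sourceDocFreq > 0) {
            return idf(sourceDocFreq, numDocs);
        }
        return variantDocFreqs.stream()
                .mapToDouble(df -> idf(df, numDocs))
                .average()
                .orElse(0.0); // no variants at all: nothing to score
    }
}
```

Because every variant receives the same value, a rare misspelling cannot outrank the correctly spelled term purely on rarity.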

  • Constructor Details

    • FuzzyLikeThisQuery

      public FuzzyLikeThisQuery(int maxNumTerms, Analyzer analyzer)
      Parameters:
      maxNumTerms - The total number of term clauses that will appear once rewritten as a BooleanQuery
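
      One plausible way to keep only the best maxNumTerms candidates is a bounded min-heap, sketched below. The class TopTermsSketch and its ScoredTerm type are hypothetical, and the sketch makes no claim about how the class scores or queues terms internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative sketch only; all names here are hypothetical. */
final class TopTermsSketch {

    /** A candidate term together with an already computed score. */
    static final class ScoredTerm {
        final String text;
        final double score;

        ScoredTerm(String text, double score) {
            this.text = text;
            this.score = score;
        }
    }

    /** Returns at most maxNumTerms candidates, highest score first. */
    static List<ScoredTerm> best(List<ScoredTerm> candidates, int maxNumTerms) {
        Comparator<ScoredTerm> byScore = Comparator.comparingDouble(t -> t.score);
        PriorityQueue<ScoredTerm> heap = new PriorityQueue<>(byScore);
        for (ScoredTerm candidate : candidates) {
            heap.offer(candidate);
            if (heap.size() > maxNumTerms) {
                heap.poll(); // evict the currently weakest term
            }
        }
        List<ScoredTerm> result = new ArrayList<>(heap);
        result.sort(byScore.reversed());
        return result;
    }
}
```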
  • Method Details

    • hashCode

      public int hashCode()
      Description copied from class: Query
      Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
      Specified by:
      hashCode in class Query
    • equals

      public boolean equals(Object other)
      Description copied from class: Query
      Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.

      Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
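
      The contract can be sketched with a standalone class. SketchQuery and its fields are hypothetical and do not extend the real Query; in an actual subclass, the utility methods mentioned above (to my knowledge sameClassAs and classHash on Query) would likely replace the explicit class checks.

```java
import java.util.Objects;

/** Illustrative sketch only; a stand-in that does not extend Lucene's Query. */
final class SketchQuery {
    private final String field;
    private final int maxNumTerms;
    private final boolean ignoreTF;

    SketchQuery(String field, int maxNumTerms, boolean ignoreTF) {
        this.field = Objects.requireNonNull(field);
        this.maxNumTerms = maxNumTerms;
        this.ignoreTF = ignoreTF;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        // Same class first, then every property that affects matching.
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        SketchQuery that = (SketchQuery) other;
        return maxNumTerms == that.maxNumTerms
                && ignoreTF == that.ignoreTF
                && field.equals(that.field);
    }

    @Override
    public int hashCode() {
        // Must hash exactly the state that equals compares.
        return Objects.hash(getClass(), field, maxNumTerms, ignoreTF);
    }
}
```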

      Specified by:
      equals in class Query
    • equalsTo

      private boolean equalsTo(FuzzyLikeThisQuery other)
    • addTerms

      public void addTerms(String queryString, String fieldName, float minSimilarity, int prefixLength)
      Adds user input for "fuzzification".
      Parameters:
      queryString - The string which will be parsed by the analyzer and for which fuzzy variants will be generated
      minSimilarity - The minimum similarity of the term variants; must be 0, 1 or 2 (see FuzzyTermsEnum)
      prefixLength - Length of required common prefix on variant terms (see FuzzyTermsEnum)
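
      To make the two numeric parameters concrete, the sketch below treats minSimilarity as a maximum edit distance (which the 0, 1 or 2 constraint suggests) and prefixLength as a prefix that a variant must share with the source term. FuzzyParamsSketch is hypothetical: it uses a naive Levenshtein table over chars, without transpositions, whereas FuzzyTermsEnum enumerates index terms in its own way.

```java
/** Illustrative sketch only; not how FuzzyTermsEnum is implemented. */
final class FuzzyParamsSketch {

    /** Plain Levenshtein distance (insert, delete, substitute) over chars. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if candidate shares the required prefix and is within maxEdits of source. */
    static boolean accepts(String source, String candidate, int maxEdits, int prefixLength) {
        int p = Math.min(prefixLength, source.length());
        if (!candidate.startsWith(source.substring(0, p))) {
            return false;
        }
        return editDistance(source, candidate) <= maxEdits;
    }
}
```

      Under these assumptions, accepts("lucene", "lucent", 1, 3) holds, while "mucene" is rejected as soon as prefixLength is at least 1.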
    • addTerms

      Throws:
      IOException
    • newTermQuery

      private Query newTermQuery(IndexReader reader, Term term) throws IOException
      Throws:
      IOException
    • visit

      public void visit(QueryVisitor visitor)
      Description copied from class: Query
      Recurse through the query tree, visiting any child queries.
      Specified by:
      visit in class Query
      Parameters:
      visitor - a QueryVisitor to be called by each query in the tree
    • rewrite

      public Query rewrite(IndexSearcher indexSearcher) throws IOException
      Description copied from class: Query
      Expert: called to re-write queries into primitive queries. For example, a PrefixQuery will be rewritten into a BooleanQuery that consists of TermQuerys.

      Callers are expected to call rewrite multiple times if necessary, until the rewritten query is the same as the original query.

      The rewrite process may be able to make use of IndexSearcher's executor and be executed in parallel if the executor is provided.
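
      The "call rewrite until nothing changes" convention amounts to a fixed-point loop. The generic helper below is a hypothetical sketch: it takes the single rewrite step as a function so that it runs without Lucene, and it assumes that a step signals "no change" by returning the very same instance.

```java
import java.util.function.UnaryOperator;

/** Illustrative sketch only; RewriteLoopSketch is not part of Lucene. */
final class RewriteLoopSketch {

    /** Applies rewriteOnce repeatedly until it returns its input unchanged. */
    static <Q> Q rewriteFully(Q query, UnaryOperator<Q> rewriteOnce) {
        Q current = query;
        while (true) {
            Q next = rewriteOnce.apply(current);
            if (next == current) {
                return current; // fixed point reached: nothing left to rewrite
            }
            current = next;
        }
    }
}
```

      With a real searcher, the step would be something like q -> q.rewrite(indexSearcher), wrapped to handle the checked IOException.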

      Overrides:
      rewrite in class Query
      Throws:
      IOException
    • toString

      public String toString(String field)
      Description copied from class: Query
      Prints a query to a string, with field assumed to be the default field and omitted.
      Specified by:
      toString in class Query
    • isIgnoreTF

      public boolean isIgnoreTF()
    • setIgnoreTF

      public void setIgnoreTF(boolean ignoreTF)