Search

In WAF there are two primary ways to query for data. The first is AQL queries. AQL queries use the database engine and are therefore suited to queries on specific field values. Database engines are not efficient at free text searches across multiple properties. For this reason WAF comes with a built-in free text index engine, optimized for fast text searches across all properties; index queries against this engine are the second way. The engine is based on the open source project “Lucene”.

Index queries

You can think of the Lucene index as one large table. The table contains fields for some of the key content properties, such as “Name”, NodeId and Class Type, plus two larger text fields: one referred to as “public” and one referred to as “protected”. Each text field contains a combined string of the values of all properties marked as being part of the public or the protected index, respectively.

The public index is designed to be used on the open visit

The key object is called “IndexQuery<>”. This object defines the query parameters. Here is an example:

// Search all content for the term "microsoft"
var q = new IndexQuery<ContentBase>();
q.BodySearch = "microsoft";
var results = WAFContext.Session.Search<ContentBase>(q);

// Write the name of each hit, one per line, as HTML
StringBuilder html = new StringBuilder();
foreach (var r in results) {
   html.Append(r.Name);
   html.Append("<br />");
}
Response.Write(html.ToString());

This query searches all objects inheriting from “ContentBase” (which is everything) and returns the contents that contain the word “microsoft”. Lucene supports wildcard operators, “fuzzy” searches, boolean expressions and more. Here are a few examples:

q.BodySearch = "microsoft"; // the whole term "microsoft"
q.BodySearch = "microsoft*"; // terms that start with "microsoft"
q.BodySearch = "microsoft~"; // terms that are similar to "microsoft"
q.BodySearch = "microsoft apple"; // "microsoft" OR "apple"
q.BodySearch = "+microsoft +apple"; // "microsoft" AND "apple"

For more information on the query syntax, see this page, or search on Google:

http://lucene.apache.org/java/2_3_2/queryparsersyntax.html
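
Because characters such as “+”, “*” and “:” have a special meaning in this syntax, text typed by a visitor may need escaping before it is assigned to “BodySearch”. The following is a minimal sketch, assuming that “BodySearch” is handed to the Lucene query parser unchanged (the examples above suggest this, but it is not confirmed here). The class “LuceneQueryText” and its method “Escape” are hypothetical helper names, not part of WAF or Lucene:

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of WAF or Lucene.
public static class LuceneQueryText
{
    // Characters that the Lucene query parser syntax treats as operators.
    private const string SpecialChars = "+-&|!(){}[]^\"~*?:\\";

    // Returns the input with a backslash in front of every special
    // character, so that the text is matched literally.
    public static string Escape(string userInput)
    {
        if (string.IsNullOrEmpty(userInput)) return string.Empty;

        var sb = new StringBuilder(userInput.Length * 2);
        foreach (char c in userInput)
        {
            if (SpecialChars.IndexOf(c) >= 0) sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Under that assumption it could be used as q.BodySearch = LuceneQueryText.Escape(userInput);. Note that escaping also turns off wildcards and boosts in the visitor's text, so it only fits search boxes where operators are not meant to be available.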


The search result is sorted by relevance by default. The relevance is based on a number of factors. Matches in the content name are weighted higher than matches in other content properties. Matches in shorter texts are weighted higher than matches in longer texts. The weighting of each term can be set individually using search expressions such as:

q.BodySearch = "apple^4 microsoft";

This weights the term “apple” four times higher than “microsoft”.
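
When the terms and their weights come from code rather than from a hand-written string, the expression can be assembled programmatically. A minimal sketch follows; “BoostedQuery.Build” is a hypothetical helper, not a WAF or Lucene API, and it only produces the “term^boost” text shown above:

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not part of WAF or Lucene.
public static class BoostedQuery
{
    // Joins the terms with spaces and appends "^boost" to every term
    // whose boost differs from the neutral value 1.
    public static string Build(IEnumerable<KeyValuePair<string, float>> terms)
    {
        var parts = new List<string>();
        foreach (var term in terms)
        {
            if (term.Value == 1f)
            {
                parts.Add(term.Key);
            }
            else
            {
                // Invariant culture keeps "." as the decimal separator.
                parts.Add(term.Key + "^" + term.Value.ToString(CultureInfo.InvariantCulture));
            }
        }
        return string.Join(" ", parts.ToArray());
    }
}
```

The returned string can then be assigned to “BodySearch” like any other search expression.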

The result of “.Search()” is a list of objects of the type “SearchResult<T>”. Each “SearchResult<T>” object is a reference to the row in the search index that represents one content object. The object has properties that give direct access to basic values such as its ids, name, indexed text and type. The object also has a property called “.Content” that retrieves the content the result represents. Through it you can access all of the properties of the content object. Going via the “.Content” property to access content properties is a little slower than accessing the properties directly on the “SearchResult<T>” object. That is because the direct properties are retrieved straight from the Lucene index, while the others require the system to first retrieve the content object. In most cases this is not a major concern, but if the property you want is available directly on the result object, use that instead. Here is an example of a property that is directly available on the “SearchResult<T>” object:

var q = new IndexQuery<ContentBase>();
q.BodySearch = "microsoft";
var results = WAFContext.Session.Search<ContentBase>(q);

StringBuilder html = new StringBuilder();
foreach (var r in results) {
   // r.Name is read directly from the search index; r.Content is not touched
   html.Append(r.Name);
   html.Append("<br />");
}
Response.Write(html.ToString());
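
The difference in cost between the direct properties and “.Content” can be pictured with a small stand-in model. This is only a sketch: “DemoResult<T>” is a hypothetical class written for illustration, not the WAF “SearchResult<T>” implementation, and the caching of the loaded content is an assumption made to keep the example short:

```csharp
using System;

// Illustrative stand-in only; NOT the WAF SearchResult<T> class.
public class DemoResult<T> where T : class
{
    private readonly Func<T> loadContent; // simulates the extra content lookup
    private T content;

    public DemoResult(string name, Func<T> loadContent)
    {
        Name = name;
        this.loadContent = loadContent;
    }

    // Modeled as a value copied from the index row: no lookup needed.
    public string Name { get; private set; }

    // Counts how many times the simulated content lookup has run.
    public int ContentLoads { get; private set; }

    // Runs the simulated lookup on first access, then reuses the object.
    public T Content
    {
        get
        {
            if (content == null)
            {
                content = loadContent();
                ContentLoads++;
            }
            return content;
        }
    }
}
```

In this model, reading “Name” never runs the lookup, while the first read of “Content” does. A loop that only needs names therefore avoids one lookup per hit.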

The search uses C# generics both to make it easy to filter the result down to the type you want and to provide correct typing on the “.Content” property. The type you specify when instantiating the index query is also used in the call to “WAFSession.Search<T>()” and in the returned list of “SearchResult<T>” objects.