RavenDB's hidden features
This article, originally published some time ago in a developers' magazine in Norway, explores two of RavenDB's lesser-known features and shows how they can improve your application and your business.
Suggesting alternate spellings for search terms
Imagine the following scenario: You go on a website to look for some info on an acquaintance. You search once, twice, but can’t find it, so you give up. A few days later you discover that the name you typed as Danny is actually spelled Danni, but it is already too late. The website you gave up on lost you – a potential user, or in some cases a paying customer.
Sound familiar? This happens time and again on far too many websites and applications. Many developers consider it overkill to guess what the user was actually looking for and to offer meaningful alternatives. As a result, they don’t realize the full potential of their application, and they lose both customers and business.
A very popular approach among search engines is to offer term suggestions when they detect that the search results may not be satisfactory. You are probably familiar with Google’s “did you mean?”, which appears when you make an accidental typo. It’s not that Google is trying to mock you; it was simply able to find a higher-scoring term within a certain edit distance of the term you actually typed.
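To make "edit distance" concrete, here is a minimal, self-contained sketch of one common measure, the Levenshtein distance, plus a naive way to pick candidate terms by it. The names SpellingSketch, EditDistance and Suggest are hypothetical and not part of RavenDB. Whether a given search engine uses exactly this measure depends on its configuration, so treat this as an illustration of the idea rather than a description of any product's internals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the RavenDB API.
public static class SpellingSketch
{
    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Building the first j characters of b from an empty string takes j insertions.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // reducing i characters of a to an empty string takes i deletions
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.Min(substitution, Math.Min(deletion, insertion));
            }

            // The row just computed becomes the "previous" row for the next character.
            var swap = previous;
            previous = current;
            current = swap;
        }

        return previous[b.Length];
    }

    // Returns the known terms within maxDistance edits of the typed term, closest first.
    public static List<string> Suggest(string typed, IEnumerable<string> knownTerms, int maxDistance)
    {
        return knownTerms
            .Select(term => new { Term = term, Distance = EditDistance(typed, term) })
            .Where(x => x.Distance > 0 && x.Distance <= maxDistance)
            .OrderBy(x => x.Distance)
            .ThenBy(x => x.Term, StringComparer.Ordinal)
            .Select(x => x.Term)
            .ToList();
    }
}
```

With this sketch, EditDistance("brwon", "brown") is 2, because plain Levenshtein counts two swapped neighbouring letters as two substitutions, while EditDistance("danny", "danni") is 1. A real engine would also weigh how often each candidate term occurs in the index, which this sketch ignores.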
RavenDB provides an easy and intuitive way of offering alternative terms for queries that returned few or no results, just like Google’s “Did You Mean?”. When a full-text query returns zero results, or when you have some other indication that a query returned poor results, you can ask RavenDB to provide suggestions for the term or terms used in that query:
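// Build the full-text query once so it can be reused for suggestions
var query = session.Query<Book>("BooksIndex").Search(x => x.Author, "brwon");
var results = query.ToList();
if (results.Count == 0)
{
    // Nothing matched: ask RavenDB for alternative terms for the same query
    var suggestions = query.Suggest();
    foreach (string suggestion in suggestions.Suggestions)
    {
        Console.WriteLine(suggestion);
    }
}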
In the code above, we create a query, execute it to get results, and keep it around so we can reuse it for suggestions if necessary. If no results are found for “brwon” in our data set, we ask RavenDB for suggestions. The call returns a list of terms, each of which can be shown to the user or used to re-issue the query, depending on what fits your application.
Finding related documents using MoreLikeThis
Many big online stores, Amazon for instance, maximize their profits by showing users “related products” on product pages and during check-out. Similarly, websites like CNN can get users to spend more time on their website by showing links to related content at the bottom of an article. Contrary to what you might think, this doesn’t require a full editorial staff to go through all your content. It can be done fairly easily by comparing the data in one content entity to the rest of the content, and showing the highest-ranking matches to the user. So the question remains – how can you do this easily and efficiently?
RavenDB exposes Lucene’s MoreLikeThis functionality, which creates a full-text search query from a document and uses it to find similar documents. The result is a set of documents that resemble the original, based on the terms in the query document and their frequency. To use it, give the RavenDB server a document ID and tell it which index to use for the comparison:
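var list = session.Advanced.MoreLikeThis<Book>("BooksIndex",
    new MoreLikeThisQuery
    {
        // The document to find look-alikes for
        DocumentId = "books/2",
        // The indexed fields to compare
        Fields = new[] {"Title", "Author", "Description"},
        // Ignore words shorter than two characters
        MinimumWordLength = 2,
    }
);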
The result of calling this method is an array of book objects that RavenDB deems similar to the book used as the query. To get the most out of this feature, perform the lookup on text properties such as title and description, and make sure they are indexed as Analyzed. Doing so uses RavenDB’s full-text search capability behind the scenes and maximizes the relevance of the documents returned.
It is also possible to fine-tune the comparison, for example by hand-picking which properties to use and by setting the minimum or maximum word length. All of these are set as properties on the MoreLikeThisQuery object passed to the method, as shown above.
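To give a feel for what "similar based on terms and their frequency" means, here is a toy, self-contained sketch that ranks documents by the cosine similarity of their word counts. It is a deliberate simplification and not how Lucene's MoreLikeThis is implemented. The names SimilaritySketch, TermFrequencies, Similarity and MostSimilar are hypothetical, and the minimumWordLength parameter only mimics the role of the setting described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of RavenDB or Lucene.
public static class SimilaritySketch
{
    private static readonly char[] Separators =
        { ' ', ',', '.', ';', ':', '!', '?', '\t', '\n', '\r' };

    // Counts lowercase words in the text, skipping words shorter than minimumWordLength.
    public static Dictionary<string, int> TermFrequencies(string text, int minimumWordLength)
    {
        var counts = new Dictionary<string, int>();
        var words = text.ToLowerInvariant()
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries);

        foreach (var word in words)
        {
            if (word.Length < minimumWordLength) continue;
            int seen;
            counts.TryGetValue(word, out seen); // seen stays 0 for a new word
            counts[word] = seen + 1;
        }

        return counts;
    }

    // Cosine similarity between two word-count vectors: 0 means no shared terms.
    public static double Similarity(Dictionary<string, int> a, Dictionary<string, int> b)
    {
        double dot = 0;
        foreach (var pair in a)
        {
            int other;
            if (b.TryGetValue(pair.Key, out other)) dot += pair.Value * other;
        }

        double normA = Math.Sqrt(a.Values.Sum(v => (double)v * v));
        double normB = Math.Sqrt(b.Values.Sum(v => (double)v * v));
        if (normA == 0 || normB == 0) return 0;

        return dot / (normA * normB);
    }

    // Returns the ids of the documents most similar to sourceId, best match first.
    public static List<string> MostSimilar(
        string sourceId, IDictionary<string, string> documents, int minimumWordLength, int take)
    {
        var source = TermFrequencies(documents[sourceId], minimumWordLength);

        return documents
            .Where(d => d.Key != sourceId)
            .Select(d => new
            {
                Id = d.Key,
                Score = Similarity(source, TermFrequencies(d.Value, minimumWordLength))
            })
            .Where(x => x.Score > 0) // drop documents that share no terms at all
            .OrderByDescending(x => x.Score)
            .ThenBy(x => x.Id, StringComparer.Ordinal)
            .Take(take)
            .Select(x => x.Id)
            .ToList();
    }
}
```

Given a dictionary that maps ids such as "books/1" to their descriptions, MostSimilar("books/2", documents, 2, 5) returns up to five ids whose text shares terms with books/2. A production engine additionally discounts very common words and works from the analyzed index rather than raw strings, which is why indexing the fields as Analyzed matters.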