Thursday, 7 December, 2017 UTC


Summary

Being able to execute Full Text Search queries in Couchbase without the need for additional tooling such as Elastic is huge for NoSQL.
About a year ago, I wrote about using Full Text Search (FTS) in Couchbase Server with the Node.js SDK. This was back when FTS was in developer preview. While that post is still valid, it doesn’t capture the true power of what you can do with Full Text Search. Take facets for example. Facets are aggregate information collected on a result set and are useful when it comes to categorization of result data.
Think of an Amazon search. Let’s say we searched for Pokemon. The categories on the left, such as Books or Movies & TV, can be considered search facets.
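To make the idea concrete, here is a minimal sketch of what a term facet amounts to: a count of how many results share each value of a field. This is only a conceptual illustration. Couchbase computes facets on the server, and the countByField function and the sample hits are hypothetical names made up for this example.

```javascript
// Conceptual sketch only: Couchbase computes facets on the server.
// countByField and conceptHits are hypothetical names used for illustration.
function countByField(hits, field) {
    const counts = new Map();
    for (const hit of hits) {
        const value = hit[field];
        if (value === undefined) continue; // hits without the field don't contribute
        counts.set(value, (counts.get(value) || 0) + 1);
    }
    // Return { term, count } pairs, largest count first.
    return Array.from(counts, ([term, count]) => ({ term, count }))
        .sort((a, b) => b.count - a.count);
}

const conceptHits = [
    { id: "a", category: "Books" },
    { id: "b", category: "Movies & TV" },
    { id: "c", category: "Books" }
];

console.log(countByField(conceptHits, "category"));
```

Two of the three sample hits are in Books, so Books comes first with a count of two, followed by Movies & TV with a count of one.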
We’re going to see how to leverage this faceted search functionality in a Node.js application.
Before going further, you should already have Node.js as well as Couchbase Server 5.0+ installed and configured. We’re going to focus on the code and index creation to get faceted FTS working in our application.
Preparing a Sample Bucket with Sample Data
Instead of creating our own data to work with, we’re going to leverage optional sample data made available to anyone using Couchbase. We’re going to leverage the beer-sample bucket.
If you’re not sure how to install this bucket, from the Couchbase administrative dashboard, choose Settings and then choose Sample Buckets. The sample bucket will give us around 8,000 documents to work with.
Creating an Index for Full Text Search with Couchbase NoSQL
Before any searching can happen against the database, a special FTS index must be created. The purpose of this index is to pick out the two properties of each document that we want to search against. One property will hold the text we wish to search, and the other will supply our facets as well as being something we can search against.
Within the administrative dashboard, choose Search and then choose Add Index.
This is where things can get a little strange if this is your first time playing around with Full Text Search in Couchbase.
When designing the index, you’ll want to give it a name. I’m using the name beer-search, but you can use whatever you want. Just make sure the index is created against the beer-sample bucket. Leave the Type Identifier as the default and jump into the Type Mappings section.
We plan to search against documents that have a type property that matches beer, so create a new type mapping named beer. It is important to index only the specified fields that follow, not the entire document. Under the new mapping, we need to create child fields. These child fields represent what we can facet and what we can search.
The description field will use the defaults, except that we are also selecting the store option, which allows us to access the field in the result. The category field will use the keyword analyzer and the store option. Because our categories contain multiple words per category, the analyzer needs to know how to handle the space-delimited text. The keyword analyzer keeps each category value intact as a single term, so a value such as German Lager can be faceted and matched as a whole.
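The difference between the two analyzers is easier to see with a rough simulation. The following sketch is not how Couchbase implements them, and the real default analyzer does more than this. Both function names are hypothetical and exist only to show why the category field needs the keyword analyzer.

```javascript
// Simplified stand-ins for the two analyzers; the real ones run inside Couchbase.
// Both function names are hypothetical.
function standardLikeTokens(text) {
    // Roughly: lowercase, then split on anything that isn't a letter or digit.
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function keywordTokens(text) {
    // The whole field value is kept as one unmodified token.
    return [text];
}

console.log(standardLikeTokens("German Lager"));
console.log(keywordTokens("German Lager"));
```

With word-level tokens, a facet on category would count fragments such as german and lager separately. Keeping the value whole is what lets the facet report German Lager as a single category.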
Finally, save the index. All documents should be indexed after a short time.
Executing a Full Text Search Query with Facets in Node.js
We’re going to query this newly created beer-search index in two parts to mimic how it’d be done on a site like Amazon. First we’re going to execute a query based on a term and show the results as well as the facets. These facets will set us up for the second part.
Assuming you have a properly configured Node.js project available, add the following JavaScript code:
const Couchbase = require("couchbase");

const SearchQuery = Couchbase.SearchQuery;
const SearchFacet = Couchbase.SearchFacet;

const cluster = new Couchbase.Cluster("couchbase://localhost");
cluster.authenticate("demo", "123456");
const bucket = cluster.openBucket("beer-sample");

var tq1 = SearchQuery.term("coffee").field("description");

var query1 = SearchQuery.new("beer-search", tq1);
query1.addFacet("categories", SearchFacet.term("category", 5));
query1.limit(3);
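bucket.query(query1, (error, result, meta) => {
    if(error) {
        // Bail out on failure instead of reading hits from a failed query
        return console.error("SEARCH FAILED: ", error);
    }
    for(var i = 0; i < result.length; i++) {
        console.log("HIT: ", result[i].id);
        console.log("FACETS: ", meta.facets["categories"].terms);
    }
});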
Most of the above code is related to establishing a connection to the cluster and preparing the searching that follows. What we’re most interested in is the following:
var tq1 = SearchQuery.term("coffee").field("description");

var query1 = SearchQuery.new("beer-search", tq1);
query1.addFacet("categories", SearchFacet.term("category", 5));
query1.limit(3);
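bucket.query(query1, (error, result, meta) => {
    if(error) {
        // Bail out on failure instead of reading hits from a failed query
        return console.error("SEARCH FAILED: ", error);
    }
    for(var i = 0; i < result.length; i++) {
        console.log("HIT: ", result[i].id);
        console.log("FACETS: ", meta.facets["categories"].terms);
    }
});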
In the above code we’re defining a term query called tq1 that searches the description property for coffee. When we create our search query, we name the index that we had previously created and pass in the term query.
We are adding a facet called categories, which is a name we just made up. The field that categories maps to is the category property within the document. We are also saying that we don’t want more than five facet terms to appear in our results.
When we execute the code, we should get something that looks like the following:
HIT:  lagunitas_brewing_company-cappuccino_stout
FACETS:  [ { term: 'North American Ale', count: 50 },
  { term: 'Irish Ale', count: 19 },
  { term: 'British Ale', count: 11 },
  { term: 'German Lager', count: 4 },
  { term: 'Belgian and French Ale', count: 3 } ]
HIT:  terrapin_beer_company-terrapin_coffee_oatmeal_imperial_stout
FACETS:  [ { term: 'North American Ale', count: 50 },
  { term: 'Irish Ale', count: 19 },
  { term: 'British Ale', count: 11 },
  { term: 'German Lager', count: 4 },
  { term: 'Belgian and French Ale', count: 3 } ]
HIT:  humboldt_brewing-black_xantus
FACETS:  [ { term: 'North American Ale', count: 50 },
  { term: 'Irish Ale', count: 19 },
  { term: 'British Ale', count: 11 },
  { term: 'German Lager', count: 4 },
  { term: 'Belgian and French Ale', count: 3 } ]
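Notice that the document key is printed as well as any facets that come up in the search. Each facet term also includes a count of how many matching documents fall under it. The facets are identical under every hit because our loop prints them each time; they describe the result set as a whole, not an individual hit. If we wanted to, we could have included other fields in the result, but the document key and facets are fine for this example.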
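Since the facets describe the whole result set, it can be tidier to print them once after the hits. Here is one possible sketch, assuming result and meta have the same shape as in the callback above. The formatSearchOutput helper is a hypothetical name, and the sample data is copied from the output shown earlier so the snippet runs without a cluster.

```javascript
// formatSearchOutput is a hypothetical helper; result and meta are assumed to
// have the shape seen in the callback above.
function formatSearchOutput(result, meta, facetName) {
    const lines = result.map((hit) => "HIT: " + hit.id);
    const facet = meta && meta.facets && meta.facets[facetName];
    const terms = facet && facet.terms ? facet.terms : [];
    // Facets belong to the result set, so they are listed once, after the hits.
    for (const { term, count } of terms) {
        lines.push("FACET: " + term + " (" + count + ")");
    }
    return lines;
}

// Sample data copied from the output shown earlier.
const printedResult = [
    { id: "lagunitas_brewing_company-cappuccino_stout" },
    { id: "terrapin_beer_company-terrapin_coffee_oatmeal_imperial_stout" },
    { id: "humboldt_brewing-black_xantus" }
];
const printedMeta = {
    facets: {
        categories: {
            terms: [
                { term: "North American Ale", count: 50 },
                { term: "Irish Ale", count: 19 },
                { term: "British Ale", count: 11 },
                { term: "German Lager", count: 4 },
                { term: "Belgian and French Ale", count: 3 }
            ]
        }
    }
};

console.log(formatSearchOutput(printedResult, printedMeta, "categories").join("\n"));
```

In a real front-end, the hit lines would feed the result list and the facet lines would feed the sidebar that the user clicks to narrow the search.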
Now that we know our results, let’s narrow down our query.
Performing a Conjunctive Query with Multiple Search Terms in Node.js
Let’s assume that our user has searched for coffee, but has also chosen to narrow down the beer selection to German Lager. This means on some front-end, the user has selected one of the facets after searching.
Let’s take a look at the following JavaScript code:
const Couchbase = require("couchbase");

const SearchQuery = Couchbase.SearchQuery;
const SearchFacet = Couchbase.SearchFacet;

const cluster = new Couchbase.Cluster("couchbase://localhost");
cluster.authenticate("demo", "123456");
const bucket = cluster.openBucket("beer-sample");

var tq1 = SearchQuery.term("coffee").field("description");
var tq2 = SearchQuery.term("German Lager").field("category");
var conjunction = SearchQuery.conjuncts(tq1, tq2);

var query2 = SearchQuery.new("beer-search", conjunction);
query2.addFacet("categories", SearchFacet.term("category", 5));
query2.limit(3);
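bucket.query(query2, (error, result, meta) => {
    if(error) {
        // Bail out on failure instead of reading hits from a failed query
        return console.error("SEARCH FAILED: ", error);
    }
    for(var i = 0; i < result.length; i++) {
        console.log("HIT: ", result[i].id);
        console.log("FACETS: ", meta.facets["categories"].terms);
    }
});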
We’ve made some changes.
Instead of having a single search term, we now have two search terms. One will search against the description, as before, but the new term will search against the category property. To search using two terms we have to perform what’s called a conjunctive query, which only matches documents that satisfy every one of its sub-queries.
Executing the above code would yield a result that looks like the following:
HIT:  sprecher_brewing-black_bavarian_lager
FACETS:  [ { term: 'German Lager', count: 4 } ]
HIT:  red_oak_brewery-battlefield_bock
FACETS:  [ { term: 'German Lager', count: 4 } ]
HIT:  four_peaks_brewing-black_betty_schwartzbier
FACETS:  [ { term: 'German Lager', count: 4 } ]
Notice that our results only contain German Lagers in comparison to the variety we saw previously. This is because we were able to use the facet information to narrow down our results with a secondary query.
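In an application, whether to use a plain term query or a conjunctive one depends on whether the user has picked a facet yet. The following is a minimal sketch of that decision. The buildBeerQuery helper is a hypothetical name, and SearchQuery is passed in as an argument so the same function works with require("couchbase").SearchQuery or, as here, with a small stub that lets the snippet run without the SDK.

```javascript
// buildBeerQuery is a hypothetical helper. SearchQuery is passed in so the
// function works with the SDK's SearchQuery or with a stub.
function buildBeerQuery(SearchQuery, text, selectedCategory) {
    const textQuery = SearchQuery.term(text).field("description");
    if (!selectedCategory) {
        return textQuery; // no facet chosen yet: search on the text alone
    }
    const categoryQuery = SearchQuery.term(selectedCategory).field("category");
    // Both sub-queries must match for a document to be a hit.
    return SearchQuery.conjuncts(textQuery, categoryQuery);
}

// A tiny stub standing in for the SDK so the sketch runs without a cluster.
const StubSearchQuery = {
    term: (value) => ({
        kind: "term",
        value: value,
        field(name) { this.fieldName = name; return this; }
    }),
    conjuncts: (...queries) => ({ kind: "conjuncts", queries: queries })
};

console.log(buildBeerQuery(StubSearchQuery, "coffee"));
console.log(buildBeerQuery(StubSearchQuery, "coffee", "German Lager"));
```

The returned query would then be handed to SearchQuery.new along with the index name, exactly as in the code shown earlier.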
The Full JavaScript Code of the Example
Put together, everything looks like the following:
const Couchbase = require("couchbase");

const SearchQuery = Couchbase.SearchQuery;
const SearchFacet = Couchbase.SearchFacet;

const cluster = new Couchbase.Cluster("couchbase://localhost");
cluster.authenticate("demo", "123456");
const bucket = cluster.openBucket("beer-sample");

var tq1 = SearchQuery.term("coffee").field("description");

var query1 = SearchQuery.new("beer-search", tq1);
query1.addFacet("categories", SearchFacet.term("category", 5));
query1.limit(3);
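bucket.query(query1, (error, result, meta) => {
    if(error) {
        // Bail out on failure instead of reading hits from a failed query
        return console.error("SEARCH FAILED: ", error);
    }
    for(var i = 0; i < result.length; i++) {
        console.log("HIT: ", result[i].id);
        console.log("FACETS: ", meta.facets["categories"].terms);
    }
});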

var tq2 = SearchQuery.term("German Lager").field("category");
var conjunction = SearchQuery.conjuncts(tq1, tq2);

var query2 = SearchQuery.new("beer-search", conjunction);
query2.addFacet("categories", SearchFacet.term("category", 5));
query2.limit(3);
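bucket.query(query2, (error, result, meta) => {
    if(error) {
        // Bail out on failure instead of reading hits from a failed query
        return console.error("SEARCH FAILED: ", error);
    }
    for(var i = 0; i < result.length; i++) {
        console.log("HIT: ", result[i].id);
        console.log("FACETS: ", meta.facets["categories"].terms);
    }
});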
Take note that the above code is not realistic. For one, both queries are asynchronous, so they are sent back to back and their output is not guaranteed to arrive in order. In reality, the flow would be controlled by user interaction. The user performs one search, selects a facet, and then performs a secondary search as previously described.
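If you did want the two searches to run strictly one after the other in a script, one option is to start the second query inside the callback of the first. Below is a minimal sketch of that idea. The searchThenRefine helper is a hypothetical name, and it assumes bucket.query accepts a query and a callback of (error, result, meta) as in the code above. The stub bucket calls back synchronously to keep the example simple, whereas the real SDK calls back later.

```javascript
// searchThenRefine is a hypothetical helper. It assumes bucket.query takes a
// query and a callback of (error, result, meta), as in the code above.
function searchThenRefine(bucket, firstQuery, buildSecondQuery, done) {
    bucket.query(firstQuery, (error, result, meta) => {
        if (error) return done(error);
        // The second query is only built and sent once the first has answered.
        const secondQuery = buildSecondQuery(result, meta);
        bucket.query(secondQuery, (secondError, refined) => {
            if (secondError) return done(secondError);
            done(null, { first: result, refined: refined });
        });
    });
}

// Stub bucket that records the order in which queries are sent.
// Plain strings stand in for real search queries.
const sentQueries = [];
const stubBucket = {
    query(query, callback) {
        sentQueries.push(query);
        callback(null, [{ id: query + "-hit" }], { facets: {} });
    }
};

let finalOutput = null;
searchThenRefine(stubBucket, "first", () => "second", (error, output) => {
    finalOutput = output;
    console.log(sentQueries, output);
});
```

The buildSecondQuery callback receives the first result and its metadata, which is where a real application would read the facet the user selected.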
Conclusion
You just saw how to search with facets in Couchbase using Full Text Search (FTS) and the Node.js SDK. FTS is a way to query natural-language text and is very different from N1QL. FTS is very powerful, and facets are only one of the things it can do.
For more information on using Full Text Search with Couchbase, check out the Couchbase Developer Portal.
The post Using Facets in a Couchbase NoSQL Full Text Search Query appeared first on The Couchbase Blog.