Tuesday, 18 July, 2017 UTC


Summary

Querying and indexing document arrays is one of the most powerful features of Couchbase.  Finding array entries within a specific date range is a common requirement.  Consider the following use case.
User Story: “I want to index an embedded account history array within documents in my database so that I can perform a range query for array entries within a specific date range.”
Consider the following json structure:
{ 
 "name":"", 
 "username":"", 
 "email":"", 
 "address":{ 
 "streetA":"", 
 "streetC":"", 
 "streetD":"", 
 "city":"", 
 "state":"", 
 "country":"", 
 "zipcode":"" 
 "geo": {
 "lat":"", 
 "lng":"" 
 }
 },
 "phone":"", 
 "website":"", 
 "company": {
 "name":"", 
 "catchPhrase":"", 
 "bs":"" 
 },
 "posts": [
 {
 "words":"", 
 "sentence":"", 
 "sentences":"", 
 "paragraph":"" 
 },...
 ],
 "accountHistory": [
 {
 "amount":"", 
 "date":"", 
 "business":"", 
 "name":"", 
 "type":"", 
 "account":"" 
 },...
 ]
}
I have an array called accountHistory that could include 0-N number of entries.   If I want to query a certain date range, I can easily do this by defining a secondary index on the date field.
CREATE INDEX date_range ON
 default(DISTINCT ARRAY v.date FOR v IN accountHistory END)
I’ve created a unique index entry by using DISTINCT for each date field in each accountHistory entry.   This means every document in the cluster that has an array of accountHistory entries will be included in my index as long as the accountHistory entry has a field called “date“.  It’s possible I could have multiple types of items in the accountHistory array and I only want items that have a date field to be included.  If there’s no date field, the indexer will not include that entry in the index.  Now I can include date ranges in my query predicate:
SELECT default.email, v.account, v.type, v.amount, v.name, v.email
 FROM default
 UNNEST accountHistory v
WHERE v.date BETWEEN '2016-02-01T00:00:00.000Z' AND '2017-06-01T00:00:00.000Z'
I’ve used one of my other favorite features of N1QL in the query: UNNEST.   It allows me to shape the JSON and include root document level fields back in my results.   In my query each entry returned back will include the account history information that is within the date range of my predicate.  I also want the email address from the document it came from to also be included, and I don’t want to have to write additional JSON parsing logic to peel that out in my application.   That is the power of using unnest.  The query returns the following:
"results": [
 {
 "account": "68475391",
 "amount": "416.37",
 "date": "2016-02-01T00:00:00.000Z",
 "email": "[email protected]",
 "name": "Credit Card Account 0008",
 "type": "invoice"
 },...
Try It Out: Docker is my favorite way to spin up a development environment.   An easy to use repo for the above examples is on github: n1ql-query-nodejs .  It uses docker-compose to build two services:
  1. A single node Couchbase cluster service.
  2. A nodejs service to provision the Couchbase cluster with 250,000 user profiles and indexes for several examples, including date range queries for document arrays.
The post Querying Date Ranges within Embedded JSON Document Arrays. A Simple Example Using N1QL, Nodejs and Docker appeared first on The Couchbase Blog.