Integrating MongoDB and Amazon Kinesis for Intelligent, Durable Streams


This article was originally published on MongoDB. Thank you for supporting the partners who make SitePoint possible.

You can build your online, operational workloads atop MongoDB and still respond to events in real time by kicking off Amazon Kinesis stream processing actions, using MongoDB Stitch Triggers.

Let’s look at an example scenario in which a stream of data is being generated as a result of actions users take on a website. We’ll durably store the data and simultaneously feed a Kinesis process to do streaming analytics on something like cart abandonment, product recommendations, or even credit card fraud detection.

We’ll do this by setting up a Stitch Trigger. When relevant data updates are made in MongoDB, the trigger will use a Stitch Function to call out to AWS Kinesis, as shown in the architecture diagram.

What you’ll need to follow along

  1. An Atlas instance
    If you don’t already have an application running on Atlas, you can follow our Getting Started with Atlas guide. In this example, we’ll be using a database called streamdata, with a collection called clickdata where we’re writing data from our web-based e-commerce application.
  2. An AWS account and a Kinesis stream
    In this example, we’ll use a Kinesis stream to send data downstream to additional applications such as Kinesis Analytics. This is the stream we want to feed our updates into.
  3. A Stitch application
    If you don’t already have a Stitch application, log in to Atlas, click Stitch Apps in the navigation on the left, then click Create New Application.

Create a Collection

The first step is to create a database and collection from the Stitch application console. Click Rules from the left navigation menu and click the Add Collection button. Type streamdata for the database and clickdata for the collection name. Select the template labeled Users can only read and write their own data and provide a field name where we’ll specify the user id.

Figure 2. Create a collection

Configuring Stitch to Talk to AWS

Stitch lets you configure Services to interact with external services such as AWS Kinesis. Choose Services from the navigation on the left and click the Add a Service button. Select the AWS service, name it aws (the function later in this article looks the service up by that name), and set the AWS Access Key ID and Secret Access Key.

Figure 3. Service Configuration in Stitch

Services use Rules to specify what aspect of a service Stitch can use, and how. Add a rule which will enable that service to communicate with Kinesis by clicking the button labeled NEW RULE. Name the rule “kinesis” as we’ll be using this specific rule to enable communication with AWS Kinesis. In the section marked Action, select the API labeled Kinesis and select All Actions.

Figure 4. Add a rule to enable integration with Kinesis

Write a Function that Streams Documents into Kinesis

Now that we have a working AWS service, we can use it to put records into a Kinesis stream. The way we do that in Stitch is with Functions. Let’s set up a putKinesisRecord function.

Select Functions from the left-hand menu, and click Create New Function. Name the function putKinesisRecord and paste the following into the body of the function.

Figure 5. Example Function - putKinesisRecord

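exports = function(event) {
  // 'aws' must match the name of the AWS service configured in Stitch
  const awsService = context.services.get('aws');
  // Return the promise so Stitch waits for the PutRecord call to finish
  return awsService.kinesis().PutRecord({
    Data: JSON.stringify(event.fullDocument),
    StreamName: "stitchStream",
    PartitionKey: "1"
  }).then(function(response) {
    return response;
  }).catch(function(error) {
    // The call is asynchronous, so failures arrive here rather than in a try/catch
    console.log("PutRecord failed: " + error);
  });
};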
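The function above sends every record with the partition key "1". Kinesis uses the partition key to decide which shard receives a record, so a constant key sends everything to a single shard. The following is a minimal sketch of deriving the key from the event instead. The helper name partitionKeyFor and the userId field are assumptions made for illustration and are not part of the original example.

```javascript
// Hypothetical helper: choose a Kinesis partition key for a change event.
// Assumes documents may carry a `userId` field (not defined in the article).
function partitionKeyFor(event) {
  const doc = event.fullDocument || {};
  if (doc.userId !== undefined && doc.userId !== null) {
    return String(doc.userId);
  }
  // Fall back to the document _id carried in the change event's documentKey.
  if (event.documentKey && event.documentKey._id !== undefined) {
    return String(event.documentKey._id);
  }
  // Last resort: the constant key used in the article's function.
  return "1";
}

console.log(partitionKeyFor({ fullDocument: { userId: "u42" } }));
```

Inside the Stitch Function, the result could be passed as the PartitionKey value in the PutRecord call.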

Test Out the Function

Let’s make sure everything is working by calling that function manually. From the Function Editor, click Console to view the interactive JavaScript console for Stitch.

Functions called from Triggers require an event. To test our function, we’ll need to pass it a dummy event. Creating variables from the console in Stitch is simple: set the value of the variable to a JSON document. For our example, use the following:

event = {
   "operationType": "replace",
   "fullDocument": {
       "color": "black",
       "inventory": {
           "$numberInt": "1"
       },
       "overview": "test document",
       "price": {
           "$numberDecimal": "123"
       },
       "type": "backpack"
   },
   "ns": {
       "db": "streamdata",
       "coll": "clickdata"
   }
}
exports(event);

Paste the above into the console and click the button labeled Run Function As. Select a user and the function will execute.
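If you’d like to check the function’s logic before pasting it into Stitch, one option is to run it locally in Node.js against a stand-in for the Stitch context object. The sketch below is an assumption-heavy illustration: the fake context, the calls array, and the stubbed response are all hypothetical, and nothing is sent to AWS.

```javascript
// Record of every PutRecord call made against the stand-in service.
const calls = [];

// Stand-in for the Stitch `context` global (an assumption for local testing).
const context = {
  services: {
    get: function(serviceName) {
      return {
        kinesis: function() {
          return {
            PutRecord: function(params) {
              calls.push({ serviceName: serviceName, params: params });
              // Stub response; a real call returns details from Kinesis.
              return Promise.resolve({ stubbed: true });
            }
          };
        }
      };
    }
  }
};

// A version of putKinesisRecord that returns the PutRecord promise,
// written as a named function so it can be called locally.
function putKinesisRecord(event) {
  const awsService = context.services.get('aws');
  return awsService.kinesis().PutRecord({
    Data: JSON.stringify(event.fullDocument),
    StreamName: "stitchStream",
    PartitionKey: "1"
  });
}

// Dummy event shaped like the one used in the Stitch console.
const testEvent = {
  operationType: "replace",
  fullDocument: { color: "black", overview: "test document", type: "backpack" },
  ns: { db: "streamdata", coll: "clickdata" }
};

putKinesisRecord(testEvent).then(function(response) {
  console.log("PutRecord calls:", calls.length, "response:", JSON.stringify(response));
});
```

Inspecting calls afterwards shows exactly which parameters would have been handed to Kinesis.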

Ta-da!

Putting It Together with Stitch Triggers

We’ve got our MongoDB collection living in Atlas, receiving events from our web app. We’ve got our Kinesis stream ready for data. We’ve got a Stitch Function that can put data into a Kinesis stream.

Configuring Stitch Triggers is so simple it’s almost anticlimactic. Click Triggers from the left navigation, name your trigger, provide the database and collection context, and select the database events Stitch will react to with execution of a function.

For the database and collection, use streamdata and clickdata, the names from the Create a Collection step. Now we’ll set the operations we want to watch with our trigger. (Some triggers need to watch all of them – inserts, updates, deletes, and replacements – while others can be more efficient by watching only the operations that matter to them.) In our case, we’re going to watch for insert, update, and replace operations.

Now we specify our putKinesisRecord function as the linked function, and we’re done.

Figure 6. Trigger Configuration in Stitch

As part of trigger execution, Stitch will forward details associated with the trigger event, including the full document involved in the event (i.e., the newly inserted, updated, or replaced document from the collection). For update operations, this relies on the trigger’s Full Document option being enabled. This is where we can evaluate some condition or attribute of the incoming document and decide whether or not to put the record onto a stream.
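As a rough sketch of that kind of check, the predicate below decides whether a change event should be forwarded. The function name shouldStream, the allowedTypes parameter, and the rule based on the document’s type field are assumptions chosen for illustration. The type field itself comes from the sample event used earlier.

```javascript
// Operation types the trigger in this article watches.
const WATCHED_OPERATIONS = ["insert", "update", "replace"];

// Hypothetical predicate: should this change event be put onto the stream?
// The `type` check is an assumed business rule, used only for illustration.
function shouldStream(event, allowedTypes) {
  if (WATCHED_OPERATIONS.indexOf(event.operationType) === -1) {
    return false;
  }
  const doc = event.fullDocument;
  if (!doc) {
    // Without a full document there is nothing to forward.
    return false;
  }
  return allowedTypes.indexOf(doc.type) !== -1;
}

console.log(shouldStream(
  { operationType: "replace", fullDocument: { type: "backpack" } },
  ["backpack"]
));
```

In the Stitch Function, a call such as shouldStream(event, ["backpack"]) could run first, with the function returning early when the result is false.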

Test the Trigger!

Amazon provides a dashboard which will enable you to view details associated with the data coming into your stream.

Figure 7. Kinesis Stream Monitoring

As you execute the function from within Stitch, you’ll begin to see the data entering the Kinesis stream.

Building More Functionality

So far our trigger is pretty basic – it watches a collection and, when any inserts, updates, or replacements happen, it feeds the entire document to our Kinesis stream. From here we can build out some more intelligent functionality. To wrap up this post, let’s look at what we can do with the data once it’s been durably stored in MongoDB and placed into a stream.

Once the record is in the Kinesis Stream you can configure additional services downstream to act on the data. A common use case incorporates Amazon Kinesis Data Analytics to perform analytics on the streaming data. Amazon Kinesis Data Analytics offers pre-configured templates to accomplish things like anomaly detection, simple alerts, aggregations, and more.

For example, our stream of data will contain orders resulting from purchases. These orders may originate from point-of-sale systems, as well as from our web-based e-commerce application. Kinesis Data Analytics can be used to create applications that process the incoming stream of data. For our example, we could build a machine learning algorithm to detect anomalies in the data, or create a product performance leaderboard from a sliding or tumbling window of data from our stream.
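To make the windowing idea concrete, here is a small sketch in plain JavaScript. It is not Kinesis Data Analytics code and only illustrates the concept. A tumbling window splits the stream into fixed-size, non-overlapping time buckets, whereas a sliding window lets the buckets overlap. The record fields timestamp (in milliseconds) and product are assumptions made for this example.

```javascript
// Group records into tumbling (fixed, non-overlapping) windows and count
// purchases per product within each window.
// Assumes each record has `timestamp` (ms) and `product` (hypothetical fields).
function tumblingLeaderboard(records, windowMs) {
  const windows = new Map();
  for (const record of records) {
    // Every timestamp maps to exactly one window start.
    const start = Math.floor(record.timestamp / windowMs) * windowMs;
    if (!windows.has(start)) {
      windows.set(start, new Map());
    }
    const counts = windows.get(start);
    counts.set(record.product, (counts.get(record.product) || 0) + 1);
  }
  return Array.from(windows.entries())
    .sort(function(a, b) { return a[0] - b[0]; })
    .map(function(entry) {
      return {
        windowStart: entry[0],
        // Highest count first; ties are broken alphabetically by product name.
        ranking: Array.from(entry[1].entries()).sort(function(a, b) {
          return b[1] - a[1] || a[0].localeCompare(b[0]);
        })
      };
    });
}

const sampleOrders = [
  { timestamp: 0, product: "backpack" },
  { timestamp: 500, product: "backpack" },
  { timestamp: 999, product: "tent" },
  { timestamp: 1000, product: "tent" }
];
console.log(JSON.stringify(tumblingLeaderboard(sampleOrders, 1000)));
```

With a one-second window, the first three sample orders fall into the window starting at 0 and the last one into the window starting at 1000.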

Figure 8. Amazon Kinesis Data Analytics - Anomaly Detection Example

Wrapping Up

Now you can connect MongoDB to Kinesis. From here, you’re able to leverage any one of the many services offered from Amazon Web Services to build on your application. In our next article in the series, we’ll focus on getting the data back from Kinesis into MongoDB. In the meantime, let us know what you’re building with Atlas, Stitch, and Kinesis!

Resources

MongoDB Atlas

MongoDB Stitch

Amazon Kinesis

Frequently Asked Questions on Integrating MongoDB and Amazon Kinesis

What are the benefits of integrating MongoDB with Amazon Kinesis?

Integrating MongoDB with Amazon Kinesis offers several benefits. First, it allows changes to be streamed as they happen, which matters for applications that need immediate insights from data. Second, the data is stored durably in MongoDB while a copy flows into a stream that can scale to large volumes. The integration also lets you decide, per event, what to forward for downstream processing such as analytics or anomaly detection. Finally, because MongoDB Atlas and Amazon Kinesis are managed services, there is less infrastructure to operate yourself.

How does MongoDB work with Amazon Kinesis?

In the approach described in this article, a Stitch Trigger watches a MongoDB collection and calls a Stitch Function whenever a matching change occurs. The function puts the changed document onto a Kinesis stream, where downstream services can process and analyze it. This data can then be used for purposes such as real-time analytics, dashboarding, and decision-making.

What is the cost of integrating MongoDB with Amazon Kinesis?

The cost of integrating MongoDB with Amazon Kinesis can vary depending on several factors, including the volume of data being processed, the number of read and write operations, and the region in which your data is stored. It’s recommended to check the pricing details on the official MongoDB and Amazon Kinesis websites for the most accurate information.

Is it difficult to integrate MongoDB with Amazon Kinesis?

The integration process between MongoDB and Amazon Kinesis can be complex, especially for those without prior experience. However, with the right guidance and resources, it can be accomplished effectively. It’s recommended to follow a step-by-step guide or tutorial to ensure a successful integration.

Can I use MongoDB with Amazon Kinesis for my small business?

Yes, MongoDB and Amazon Kinesis can be used for businesses of all sizes. These platforms are scalable, meaning they can handle data volumes of any size. This makes them suitable for small businesses that may have lower data volumes but still require effective data management solutions.

What are the alternatives to Amazon Kinesis for MongoDB integration?

There are several alternatives to Amazon Kinesis for MongoDB integration, including Apache Kafka, Google Cloud Pub/Sub, and Azure Event Hubs. These platforms offer similar features to Amazon Kinesis, such as real-time data streaming and scalable solutions.

How secure is the integration of MongoDB with Amazon Kinesis?

The integration of MongoDB with Amazon Kinesis is designed to be secure. Both platforms offer robust security features, including encryption, access control, and auditing capabilities. However, it’s important to follow best practices for data security to ensure your data is protected.

Can I integrate MongoDB with Amazon Kinesis without coding knowledge?

The approach in this article requires a small amount of JavaScript: the Stitch Function that puts records onto the Kinesis stream. The rest of the setup, including the collection, the AWS service, its rule, and the trigger, is configured through the Stitch and AWS consoles. There are resources and tutorials available that can guide you through the process.

What kind of data can I process with MongoDB and Amazon Kinesis?

MongoDB and Amazon Kinesis can process a wide range of data types, including structured and unstructured data. This includes text, images, audio, video, and more. This makes them suitable for a wide range of applications, from analytics to machine learning.

How can I troubleshoot issues with my MongoDB and Amazon Kinesis integration?

If you encounter issues with your MongoDB and Amazon Kinesis integration, there are several steps you can take. Firstly, check the error logs for any clues about what might be causing the issue. You can also consult the official documentation or reach out to the support teams for both platforms.

Michael Lynn

Michael Lynn is the Worldwide Director of Developer Advocacy at MongoDB. Previously, Michael worked as a Senior Solutions Architect at MongoDB, helping users optimize MongoDB for scale and performance.
