JSFeeds: blog.getstream.io - The Engine That Powers Winds 🚂

Monday, 27 August, 2018 UTC

Summary

For those of you who don’t know, Winds (the popular open-source RSS and Podcast application) is powered by Stream – a SaaS offering that is specifically dedicated to powering news and activity feeds. If you’d like to jumpstart your knowledge of feeds, we have an awesome 5-minute tutorial that outlines how to use Stream. If you’re already familiar with Stream and/or the wonders of feed technology, read on!

Stream allows Winds to have capabilities like:

The ability to follow RSS feeds or podcasts.
Real-time notifications about feed changes, enabling Winds to immediately surface new content, whenever it is available.
Recommendations for new RSS feeds and podcasts to follow.

Building activity feeds that are both scalable and relevant is difficult. Traditionally, companies have relied on Cassandra or Redis to build their feeds. Building feeds in-house is time-consuming, expensive and hard to maintain. Stream makes it extremely easy and cost-effective to build a scalable, relevant feed. Feeds typically load in 11ms and, with 300+ million end users, Stream has been battle-tested and has withstood some of the roughest conditions.

In this short post, we’ll dive into how we’re using Stream to power follow relationships, real-time functionality, and content discovery in Winds, as well as how it allows our team to confidently scale the application seamlessly and offer up fresh, relevant content to our users in the blink of an eye. Enjoy!

What’s a Feed?

A “feed” describes the structure that you see on many popular social media apps today; feeds allow users to scroll through and interact with content as they view it. Technically speaking, feeds are described by the official Activity Streams spec (Activity Streams are also known as “Activity Feeds”). The official documentation can be found here.

At a high level, the spec outlines how to properly send JSON in a representation that is suitable for building an activity feed. At Stream, we follow the spec closely and provide the required parameters that should be sent, but also offer the ability to send custom data.
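As a rough illustration of that shape, here’s a minimal sketch that builds an activity with the core actor/verb/object fields plus a custom field. The helper name buildActivity, the ID formats, and the title field are hypothetical and exist only for this example.

```javascript
// Hypothetical helper: builds an activity in the actor/verb/object shape,
// then layers any custom fields on top.
function buildActivity(actor, verb, object, custom = {}) {
  for (const [name, value] of Object.entries({ actor, verb, object })) {
    if (typeof value !== 'string' || value.length === 0) {
      throw new Error(`"${name}" must be a non-empty string`);
    }
  }
  // Core fields are spread last so custom data can never overwrite them.
  return { ...custom, actor, verb, object };
}

const activity = buildActivity('rss:123', 'rss_article', 'article:456', {
  title: 'An example article', // custom field, not one of the core fields
});
console.log(JSON.stringify(activity, null, 2));
```

Spreading the custom data first is a small design choice: it keeps the core fields authoritative even if a caller passes a custom key with the same name.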

Stream provides the following “Feed Types”:

Flat – The most common style of feed. This feed type allows you to write to a specific feed (e.g. timeline) and display the contents in chronological order. The flat feed can also be followed by other feeds, as well as surface real-time notifications (this is done using a websocket connection baked right into our JavaScript SDK).
Aggregated – A more advanced feed type that allows activities to be grouped and displayed using an “Aggregation Format”.
Notification – Think of this feed type as an “aggregated feed” with extended functionality. It can be modified so that items within the feed can be marked as seen or read (think Facebook’s notification feed).
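To make the difference between flat and aggregated feeds a little more concrete, here’s a toy sketch of grouping in plain JavaScript. It is not Stream’s implementation; the grouping key (verb plus UTC calendar day) and the function name aggregate are assumptions picked for illustration.

```javascript
// Groups activities by verb and UTC calendar day, roughly what an
// aggregated feed does. The grouping key here is an assumption.
function aggregate(activities) {
  const groups = new Map();
  for (const activity of activities) {
    const day = new Date(activity.time).toISOString().slice(0, 10);
    const key = `${activity.verb}_${day}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(activity);
  }
  return groups;
}

const grouped = aggregate([
  { verb: 'rss_article', object: 'a1', time: '2018-08-27T08:00:00Z' },
  { verb: 'rss_article', object: 'a2', time: '2018-08-27T09:30:00Z' },
  { verb: 'podcast_episode', object: 'e1', time: '2018-08-27T10:00:00Z' },
]);
for (const [key, items] of grouped) {
  console.log(key, items.length);
}
```

A flat feed would return these three activities one by one; the aggregated view collapses them into two groups.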

As a real-world example, Facebook’s entire application is almost one giant news feed that uses custom algorithms to surface content that you are most likely to interact with (Stream can do that too with in-house, customized personalization).

The notification feed (the drop-down with updates) keeps users engaged with the content and automatically marks items as seen or read based on their interaction with the content. Facebook is one example; Twitter and Pinterest are other popular apps that use feed technology.

Adding Activities to a Stream Feed in Winds

Winds has a rather robust backend in place to power all of the functionality that you see surfaced on the client side of the application. For example, aside from the frontend code, we have an API and a set of several workers churning through content every so often.

Every time our workers parse RSS or Podcast content, they create a record in MongoDB. Once the callback from the MongoDB insert returns, the workers start churning through all articles (for RSS feeds) or episodes (for podcast feeds), storing those in MongoDB, and finally storing them in Stream. For each stored article or episode, the database returns a unique _id value that we use as the “foreign_id” in Stream.

Fortunately, Stream makes adding activities to our feeds extremely straightforward. It can be done with the REST API or via any of the SDKs that Stream provides. With the JavaScript SDK, adding an article activity to our RSS feed looks something like this:

View the code on Gist.
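For readers without the Gist at hand, a minimal sketch of such an activity might look like the following. This is not the Winds source: the article record, its field names, the rss_article verb, the foreign_id format, and the toArticleActivity helper are all assumptions made for illustration. The actual call to the getstream package is left as a comment so the block runs without credentials.

```javascript
// Hypothetical article record, shaped like a document returned by MongoDB.
const article = {
  _id: '5b8400000000000000000001',
  rss: '5b8400000000000000000002', // _id of the parent RSS feed (assumed field)
  publicationDate: new Date('2018-08-27T12:00:00Z'),
};

// Maps the article to the five fields discussed in this section.
function toArticleActivity(doc) {
  return {
    actor: doc.rss,
    verb: 'rss_article', // assumed verb name
    object: doc._id,
    time: doc.publicationDate.toISOString(),
    foreign_id: `articles:${doc._id}`, // assumed format
  };
}

const articleActivity = toArticleActivity(article);
console.log(articleActivity);

// With the getstream package and real credentials, sending it would be
// roughly (not executed here):
//   const stream = require('getstream');
//   const client = stream.connect(API_KEY, API_SECRET);
//   await client.feed('rss', article.rss).addActivity(articleActivity);
```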

Let’s break the example down:

actor is the user (or system) performing the activity
verb is the action taken by the user (or system)
object is a reference to the object of the activity (in our case, the unique ID)
time is a required value and is the time of the activity (when it was created) – this value ensures uniqueness and provides the ability to later modify the activity if necessary
foreign_id is the unique identifier from the application’s database for the activity (and is used for lookups if you need to make a change at a later date)

Note: The term “feed” is used a lot in this section, and the two meanings can look interchangeable. Please don’t get them mixed up – an RSS or Podcast feed is the content available at a given feed URL (e.g. https://somewebsite.com/rss.xml), whereas an activity feed is specific to Stream.

If you’re interested in reading a more thorough breakdown, we have a full list with descriptions on the Stream site here.

Following Feeds

Follow relationships are a fundamental part, if not the most important part, of social networks and many other applications that utilize feeds. A follow relationship allows one feed to link to another feed, causing activities to be visible in all feeds that are bound by the follow relationship. For example, when an activity is added to a feed, it is automatically added to any other feeds that follow the parent feed.
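If it helps to see that behavior spelled out, here’s a toy in-memory model of follow fan-out. It only mimics the observable behavior described above and says nothing about how Stream works internally; the ToyFeeds class and the feed IDs are made up for this example.

```javascript
// Toy in-memory model of follow fan-out; not how Stream is implemented.
class ToyFeeds {
  constructor() {
    this.activities = new Map(); // feed id -> array of activities
    this.followers = new Map(); // target feed id -> Set of follower feed ids
  }

  follow(source, target) {
    if (source === target) throw new Error('a feed cannot follow itself');
    if (!this.followers.has(target)) this.followers.set(target, new Set());
    this.followers.get(target).add(source);
  }

  add(feed, activity) {
    // Write to the feed itself, then copy to every direct follower.
    const targets = [feed, ...(this.followers.get(feed) || [])];
    for (const id of targets) {
      if (!this.activities.has(id)) this.activities.set(id, []);
      this.activities.get(id).push(activity);
    }
  }

  read(feed) {
    return this.activities.get(feed) || [];
  }
}

const feeds = new ToyFeeds();
feeds.follow('user_article:alice', 'rss:123');
feeds.add('rss:123', { verb: 'rss_article', object: 'article:456' });
console.log(feeds.read('user_article:alice'));
```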

In Winds, we have many relationships. The most important and easy to understand relationships are between users and the RSS and Podcast “feeds” they follow.

Note: Feeds in this sense is referencing feed groups in Stream and NOT a feed URL.

The follow relationships are held within the feed groups “user_article” and “user_episode”. Below is a simple example script that shows you how to create follow relationships to a parent RSS feed via the “followMany” method in the JavaScript SDK:

View the code on Gist.
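As a hedged stand-in for readers who can’t open the Gist, the sketch below builds the list of source/target pairs that followMany takes. The buildFollows helper and the IDs are hypothetical; the SDK call itself is shown only as a comment since it needs a server-side client with real credentials.

```javascript
// Builds the array of { source, target } pairs for a batch follow.
// Each pair reads as "source feed follows target feed".
function buildFollows(userId, rssIds) {
  return rssIds.map((rssId) => ({
    source: `user_article:${userId}`,
    target: `rss:${rssId}`,
  }));
}

const follows = buildFollows('42', ['a1', 'b2']);
console.log(follows);

// With a server-side getstream client (not executed here):
//   await client.followMany(follows);
```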

Note: Only “flat” feed types may be followed. Additionally, a feed cannot follow itself.

One of the cool features that Winds has to offer is “.OPML” file import. OPML is the standard for importing and exporting lists of RSS feeds. It’s written in XML, so we parse the file, create the feeds as described above, and then do what’s called a bulk follow.

If you’re interested in a bulk follow, have a look at the following snippet for an example:

View the code on Gist.
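The first step of that import, pulling feed URLs out of the OPML file, can be sketched as below. This is a deliberately simplified illustration and not the parser Winds uses: it relies on a regular expression instead of a real XML parser, only handles double-quoted attributes, and does not decode XML entities. The function name extractFeedUrls is hypothetical.

```javascript
// Simplified sketch: collects the xmlUrl attribute of every <outline> element.
// A real import should use a proper XML parser instead of a regex.
function extractFeedUrls(opml) {
  const urls = [];
  const pattern = /<outline\b[^>]*\bxmlUrl="([^"]*)"/gi;
  for (const match of opml.matchAll(pattern)) {
    urls.push(match[1]);
  }
  return urls;
}

const sampleOpml = `
<opml version="2.0">
  <body>
    <outline text="Tech">
      <outline type="rss" text="Site A" xmlUrl="https://a.example/rss.xml" />
      <outline type="rss" text="Site B" htmlUrl="https://b.example" xmlUrl="https://b.example/feed" />
    </outline>
  </body>
</opml>`;
console.log(extractFeedUrls(sampleOpml));
```

Once each URL has been matched to (or created as) a feed in the database, the resulting IDs can be turned into a single batch of follows.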

For additional information on following feeds, have a look at our documentation.

Stream Feed Structure for Winds

If you’ve had a chance to inspect the Winds codebase or use the application, you know that it’s a complex one. In order to facilitate the various functionality within Winds, we rely heavily on Stream to handle our feeds. You can think of a single feed group as a table in your database – where each row is an activity.

Here’s a quick rundown on the Winds feed group structure:

podcast (flat)
user_article (flat)
user_episode (flat)
rss (flat)

As suggested, each feed holds onto associated data. For example, when we parse an RSS feed or a podcast feed, we insert a new activity into the corresponding feeds. All followed articles and episodes are stored in either user_article or user_episode – both of which are connected to the RSS and podcast feeds via “follow relationships” in Stream. Follow relations are what allow us to make an association between a user and the content that they have opted in to consume.

Now that we have follow relationships in place, it’s as easy as issuing a GET request to the Stream API to receive the follows. Once we receive the follows, we issue an API call to our database for Winds (MongoDB) and use the response to “enrich” the data. Once it has been enriched, we can display all of the news feed data to the user inside of Winds.

Note: Enrichment is the process of taking a small subset of data (e.g. an object with an ID for our podcast) and expanding it with data from our database. This is an important process because it allows us to keep our payload to a manageable size, thus reducing the network I/O and increasing transfer speed. Also, know that it’s important to never store personally identifiable information (PII) inside of Stream.
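Here’s a minimal sketch of what enrichment can look like. A Map stands in for the database lookup (in practice this would be a query against MongoDB), and the enrich function, the record fields, and the choice to drop activities without a matching record are all assumptions for this example.

```javascript
// Replaces each activity's object ID with the full record it refers to.
// Activities whose record is missing are dropped (an assumption).
function enrich(activities, recordsById) {
  return activities
    .filter((activity) => recordsById.has(activity.object))
    .map((activity) => ({
      ...activity,
      object: recordsById.get(activity.object),
    }));
}

// Stand-in for the result of a database lookup keyed by _id.
const records = new Map([
  ['ep1', { _id: 'ep1', title: 'Episode One', duration: 1800 }],
]);

const enriched = enrich(
  [
    { verb: 'podcast_episode', object: 'ep1' },
    { verb: 'podcast_episode', object: 'missing' },
  ],
  records
);
console.log(enriched);
```

Only the small ID-based payload travels through the feed; the heavier fields are attached at the last moment, right before rendering.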

How We Use Feeds in Winds

Feeds are heavily used within Winds. In fact, we use feeds to display almost everything within Winds. In the screen below, we use the following feeds:

RSS (recent articles)
Podcast (recent episodes)
Discover is a combination of RSS & Podcast feeds powered by Stream Personalization.

One interesting thing to point out is that the “Discover” section shows a mashup of recommended RSS and Podcast feeds; these recommendations are powered by two personalization endpoints provided by Stream. By using personalization, we are able to surface content with which the user is most likely to interact, based on their previous clicks, reads, listens, and overall activity throughout the Winds application.

Real-time & Web Sockets

Best of all, when an update comes through from one of our scraping workers, we receive a real-time notification from Stream and let the user know that they should refresh the application for updated content.
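A small sketch of the client-side half of that flow is below. It assumes the real-time payload carries the newly added activities in an array named new; the describeUpdate function and the message wording are hypothetical. The subscription itself is shown as a comment because it needs a live feed object from the SDK.

```javascript
// Turns a real-time payload into a message for the user, or null when
// nothing new arrived. The payload shape is an assumption.
function describeUpdate(payload) {
  const added = Array.isArray(payload.new) ? payload.new.length : 0;
  if (added === 0) return null;
  return added === 1
    ? '1 new item available. Refresh to see it.'
    : `${added} new items available. Refresh to see them.`;
}

console.log(describeUpdate({ new: [{ object: 'article:1' }], deleted: [] }));

// With a feed object from the JavaScript SDK (not executed here):
//   feed.subscribe((payload) => {
//     const message = describeUpdate(payload);
//     if (message) showBanner(message); // showBanner is hypothetical
//   });
```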

Retrieving Recommendations from Stream

Stream makes it easy to add personalized feeds to your application. As your users interact with your application, Stream starts to understand what they are interested in. With insights, the possibilities are endless. Here are just a few use-cases that we’ve seen in the wild:

Personalize Feeds
Create Follow Suggestions
Optimize Emails
Product Recommendations
Content Recommendations

Note: Stream’s Personalization functionality is specific to each app. For that reason, before using this feature you need to reach out to our data science and sales team so that we can better understand your application’s needs and tailor our Personalization functionality to your application.

Personalization plays a strong role within Winds. It powers our content discovery in its entirety.

The discovery section is based on your interests set during the creation of your account, clicks, reads, and listens. With all of this data, Stream goes to work and churns through several complex algorithms to recommend content.

To load the content into the view, we hit a unique endpoint that is proxied through our API, then merge the data client side. Do note that this also goes through an “enrichment” process, where we take the recommended unique identifier of the feed, and perform a lookup against our database.

Thank You!

Thank you for taking the time to read this. I hope that this example walkthrough gave you a better understanding of how feeds work.

If you’re curious about Stream and want to give our API a try, we have a 5-minute tutorial that will walk you through all of the steps for the various feed types outlined above. I strongly recommend giving it a shot. Also, “How Stream uses RocksDB, Raft and Go to power the feeds for over 300 million users” is a great read if you’re interested in the architecture powering Stream.

If you haven’t downloaded or signed up for Winds yet, it’s available for the web, macOS, Linux, and Windows – you can get started with Winds here.

As always, if you have any questions or comments, please drop them in the comments below!

The post The Engine That Powers Winds 🚂 appeared first on The Stream Blog.
