To show off the power of Cloud ML Engine we built two versions of the model independently—one in Scikit-learn and one in TensorFlow—and built a web app to easily generate predictions from both versions. Because these models were built with entirely different frameworks and have different dependencies, it previously required a lot of code to build even a simple app that queried both models. Cloud ML Engine provides a centralized place for us to host multiple types of models, and streamlines the process of querying them.
And before we get into the details, you may be wondering why you’d need multiple versions of the same model. If you’ve got data scientists or ML engineers on your team, they may want to experiment independently with different model inputs and frameworks. Or, maybe they’ve built an initial prototype of a model and will then obtain additional training data and train a new version. A web app like the one we’ve built provides an easy way to compare output, or even load test across multiple versions.
For the frontend, we needed a way to make predictions directly from our web app. Because we wanted the demo to focus on Cloud ML Engine serving, and not on boilerplate details like authenticating our Cloud ML Engine API request, Cloud Functions was a great fit. The frontend consists of a single HTML page hosted on Cloud Storage. When a user enters a movie description in the web app and clicks “Get Prediction,” it invokes a cloud function using an HTTP trigger. The function sends the text to ML Engine, and parses the genres returned from the model to display them in the web UI.
Here’s an architecture diagram of how it all fits together: