Production-ready chatbot in GCP for less than a dollar
We have all been there: a nice idea for a hackathon, hobby or side project, and a burning desire to start coding as soon as possible. And there are plenty of options (Heroku, Glitch and others) to bootstrap your app and deploy it immediately.
Finding the balance between overcomplicated architectures and oversimplified solutions with no security can be a challenging task. I think the modern Google Cloud Platform ecosystem provides a nice toolset for solving this problem with almost no vendor lock-in (of course we'll use some GCP-specific services, but they are easy to migrate away from).
As an example I will use a chatbot application. It's open source and hosted on GitHub.
What's better than a live demo? A live demo that is available on demand! Go to the Facebook page for the chatbot and send it a few messages. Notice how the first response takes some time, while the responses after that are smooth and fast. (Of course, if somebody used the app in the past 15 minutes, there will be no warm-up time.)
What just happened here? The Facebook Messenger Platform delivered your message to the app's page and hit the webhook of the chatbot backend. Cloud Run, the service responsible for the backend of our app, spun up a new instance of the Docker container with our app and processed the request.
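Before Messenger delivers any messages, it verifies the webhook with a GET request carrying `hub.mode`, `hub.verify_token` and `hub.challenge` query parameters, and expects the challenge echoed back. A minimal, framework-agnostic sketch of that handshake (the `VERIFY_TOKEN` value here is an assumption; it is whatever you configured in the Facebook app settings):

```python
# Sketch of the Messenger webhook verification handshake.
# Facebook sends hub.mode, hub.verify_token and hub.challenge as query
# parameters; we must echo hub.challenge back if the token matches.
VERIFY_TOKEN = "my-secret-token"  # assumption: set in the Facebook app settings

def verify_webhook(args: dict) -> tuple:
    """Return (body, status) for the verification GET request."""
    if (args.get("hub.mode") == "subscribe"
            and args.get("hub.verify_token") == VERIFY_TOKEN):
        return args.get("hub.challenge", ""), 200
    return "Verification token mismatch", 403
```

In the real app this logic lives inside the Flask route that also handles incoming POST messages.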
Let's take a look at the overall app architecture in the GCP and then go through the deployment pipeline.
The bot itself uses the Python framework Flask for handling requests. For storage, the NoSQL Firestore database is used, as it's a good fit for a small number of concurrent requests with no complicated relational queries.
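For reference, reading and writing a document in Firestore looks roughly like this (the `users` collection and its fields are hypothetical, and the snippet assumes the `google-cloud-firestore` client library is installed and GCP credentials are available):

```python
from google.cloud import firestore

# Assumes GOOGLE_APPLICATION_CREDENTIALS (or Cloud Run's built-in
# service account) grants access to the project's Firestore database.
db = firestore.Client()

# Hypothetical "users" collection, keyed by the Messenger sender id.
doc = db.collection("users").document("1234567890")
doc.set({"name": "Alice", "last_seen": firestore.SERVER_TIMESTAMP})

snapshot = doc.get()
if snapshot.exists:
    print(snapshot.to_dict())
```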
The code is packaged into a Docker image, which is built by the Cloud Build service and pushed to the GCP Container Registry. Cloud Run is essentially a lightweight, managed alternative to running your own Kubernetes cluster. It manages exposing the service to the world, autoscaling, versioning and SSL certificate rotation. Each service deployed to Cloud Run gets a service domain name with HTTPS enabled, but you can also use a custom domain name and keep SSL encryption enabled as well.
The Cloud Run service is responsible for handling incoming requests. The app inside the Docker container can be anything you need (in terms of languages, libraries or other internal dependencies). The one thing you must remember is that apps inside Cloud Run should be stateless. If you need attached volumes, you would have to use full Kubernetes or other services; storing data is instead conveniently done with managed SQL or NoSQL services.
The deployment is triggered automatically by a push to the master branch. Cloud Build then picks up the source code, builds a new image and stores it in the GCR. Next, Cloud Build creates a new revision of the service in Cloud Run and switches traffic to it. You could also configure a partial traffic switch here to implement a gradual rollout and detect anomalies before the service goes live for every user.
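As a sketch, such a partial switch can be done with the gcloud CLI (the service and revision names here are hypothetical):

```shell
# Send 10% of traffic to the new revision, keep 90% on the old one.
gcloud run services update-traffic chatbot-service \
  --region europe-west1 \
  --to-revisions chatbot-service-00002-xyz=10,chatbot-service-00001-abc=90
```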
How to use it?
In this article we'll go through the process of setting up a similar solution by forking the chatbot app. We'll fork it and clean it up so you can have a quick start, but if you need a more complex example, feel free to return to the original repository for reference.
- Fork the repository with the chatbot application as it contains all the necessary files.
- Create a new project in GCP and enable the Firestore, Cloud Build and Cloud Run APIs.
- Go to https://console.cloud.google.com/cloud-build
- Connect a forked repository.
- Create a new trigger for this repository and specify the following substitution variables:
  _SERVICE_NAME (the name of the service in Cloud Run)
  _REGION (the region to deploy the service to)
  _IMAGE_NAME (the image name to store in GCR)
- Clone the forked repository
- Go inside the repository folder and delete the files that are specific to the chatbot:
rm config.py fb.py gcp.py logging_handler.py skills.json
- Replace the content of the app.py file with the following snippet:
```python
import os

from flask import Flask

app = Flask(__name__)


@app.route("/", methods=["GET"])
def hello_there():
    return "Hello Cloud Run", 200


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```
Our repository is now prepared for continuous delivery of our application. When you commit the changes and push them to the forked repository, the Cloud Build service will build and deploy a new version of the application.
- Commit all the changes and push them to the remote forked repository.
- Go to https://console.cloud.google.com/run and retrieve the URL of the service. Check that it's working and returns the "Hello Cloud Run" message.
- Go to https://console.cloud.google.com/cloud-build to see the information about the build or any problems that occurred.
Now let's take a look at the most interesting files here:
- cloudbuild.yaml contains all the build and deployment steps. We specify the Docker image used for each build step and provide arguments to it. The last step creates a new revision with the fresh Docker image and finishes the deployment.
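A minimal sketch of what such a cloudbuild.yaml can look like (the actual file in the repository may differ; the substitution variables match the ones configured in the trigger earlier):

```yaml
steps:
  # Build the Docker image from the repository root.
  - name: "gcr.io/cloud-builders/docker"
    args: ["build", "-t", "gcr.io/$PROJECT_ID/${_IMAGE_NAME}", "."]
  # Push the image to the GCP Container Registry.
  - name: "gcr.io/cloud-builders/docker"
    args: ["push", "gcr.io/$PROJECT_ID/${_IMAGE_NAME}"]
  # Deploy a new Cloud Run revision with the fresh image.
  - name: "gcr.io/cloud-builders/gcloud"
    args:
      - "run"
      - "deploy"
      - "${_SERVICE_NAME}"
      - "--image=gcr.io/$PROJECT_ID/${_IMAGE_NAME}"
      - "--region=${_REGION}"
      - "--platform=managed"
images:
  - "gcr.io/$PROJECT_ID/${_IMAGE_NAME}"
```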
- Dockerfile describes how the container image is built and what it runs. In this example we are using the gunicorn web server to serve our Flask app.
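A minimal Dockerfile for this setup could look roughly like this (the Python version and gunicorn settings are assumptions; check the repository for the real file):

```dockerfile
FROM python:3.8-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Cloud Run provides the port to listen on via the PORT env variable.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app
```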
Billing and costs
Each service in this demo except the Secret Manager has a free tier. Cloud Run does not run 24/7; it only spins up a container instance when there is demand. This saves a lot of resources, gives us more control over the application (it's not a serverless function or similar solution) and grants autoscaling capabilities.
The total cost of running this project under low load (it's a simple chatbot that doesn't interact with users all the time) is the cost of 2 secrets in the Secret Manager service, which is $0.12 per month. Everything else is covered by the free tier. Of course, this only applies as long as the app is not experiencing a surge in users, but even in that case it could automatically scale up and down without intervention on our side.
Building small applications for fun, as a side project or as a hackathon entry is always a tricky task. We could have no setup at all, and the app would never even have a chance to communicate with users. We could have an extremely expensive setup living on trial credits from GCP or another provider, which would be shut down as soon as the credits run out. We could also deploy it as a serverless app or use a PaaS provider, but that would mean losing major control over the app, and a future migration would take time.
The system architecture in this article is an attempt to keep small projects available for longer periods of time while keeping them a little more secure and ready for potential growth.