A DeepDream Web Service for $5 a Month
Google’s DeepDream neural net image processing library is a stunning application of advanced technology. If you haven’t heard of it, DeepDream uses an image recognition system in reverse - instead of trying to identify which objects are in a photo, it accentuates what it sees, producing extremely trippy visuals.
While DeepDream is cool, it’s also notoriously difficult to set up, as it was built by researchers with exceedingly complex software tools. Shortly after its launch, Matthew Ogle and I decided to put together a web interface, http://deepdre.am, to make the process simpler.
The site itself is pretty trivial - one page, with three options and one upload button. The fun part wasn’t the visual design or the user experience, but rather the scalable backend services that adapt the system to varying amounts of load without costing much more than a fancy coffee each month.
Really Tiny Microservices
“Microservice” is a buzzword that’s used exceedingly often on today’s Hacker News. The concept is simple - break the disparate functions of your application into small services that can be updated, scaled and managed independently. In theory, this reduces the “blast radius” of a single system failure (be it a hardware failure, network instability, logical error, or any other problem) to at most a single service. (In practice, microservices often become tightly intertwined with one another, preventing this goal from being achieved.)
When building deepdre.am, I decided to naïvely try out this concept and split out each logical component of the application into its own service. This resulted in 7 services, 5 of which are front-facing web services:
- upload, which accepts image uploads, validates their format and adds them to the queue of images to be processed
- progress, which provides progress updates (via long polling)
- abuse, which allows users to report images that violate the TOS
- monitor, which provides an administrative status dashboard
- a worker service, which:
  - removes images and metadata from a queue
  - runs the DeepDream algorithm on each image
  - emails the uploader to notify them that their image is done
- scale, which observes the queue and spins up cheap Amazon EC2 spot instances as necessary to process images
All of these microservices are implemented in Go, Google’s light, simple, and highly concurrent programming language. Using Go ensures some modicum of type safety, allowing me to catch trivial errors at compile time. Go is also trivial to deploy (binaries are generally statically linked and dependency-free) and extremely lightweight.
Each of the services listed above consumes around 4MB of memory when serving HTTP requests - less than half the memory used by Ruby just to load the interpreter, not to mention loading Rails or Sinatra. When running on a bare-bones web server to keep costs down, this tiny memory footprint makes an extremely noticeable difference in website performance. On average, both static assets and API requests are served within 30 milliseconds - unheard of when using a large framework like Rails.
While microservices are generally supposed to be fairly isolated, each with their own data stores and infrastructure, this project is small enough that I opted to share data stores. In this case, Redis is used for ephemeral data (queueing tasks to be processed[1], progress updates, and notifications between processes) while MySQL[2] is used for more permanent data. This approach allowed me to keep many of the advantages of microservices - including independent scalability, quick development, and tiny codebases - without using too many resources by spinning up multiple databases.
Hey Amazon, can you spot me $5?
As with all of my side projects, my primary goal when building deepdre.am was not just to create a service, but to do so at absolutely minimal cost. My target is to spend under $10 each month on everything - instance hosting, amortized domain costs, S3 usage, and bandwidth. (I’ve found DigitalOcean[3] to be powerful enough to support tens of thousands of monthly active users for $5/month, but there are countless hosts at similar price points.)
Hosting Redis, MySQL, and a handful of Go-based microservices on a $5 cloud host is trivial and speedy enough to support thousands of hits per minute. DeepDream, however, is a computationally taxing algorithm that requires a lot of processing power, and - ideally - a GPU to execute on.
This presented me with a hard problem. How do you provide quick response times without paying $468/month for a g2.2xlarge instance on Amazon EC2?
The answer turned out to be simple: use a combination of a task queue and EC2 spot instances. When load on the system is low, the $5 VPS can slowly process images, saving money. When load on the system grows, however, spot instances are used to speed things up:
- A g2.2xlarge spot instance can process approximately one image per second, and costs in the ballpark of $0.10/hour. However, as spot instances can be terminated at any time, applications must be termination-aware. (Amazon has a new “Spot Instance Termination Notice” feature that can come in handy here, but processes can also simply respond quickly to SIGTERM signals to clean up before an instance is terminated[4].)
- As EC2 instances are billed rounding up to the hour, committing to spawning an instance will cost at least $0.10[5] and process at most 3600 images per hour.
- To maximize value, an instance should be spawned when the number of images waiting in the queue to be processed approaches 3600[6].
- If an instance is spawned but is no longer necessary (due to the queue being emptied quickly) then it should remain running until its age reaches 59 minutes, as Amazon bills for instances by the hour and rounds up.[5]
To spawn spot instances, I used Mitchell Hashimoto’s goamz package, which is a thin Go wrapper around Amazon’s AWS APIs. A combination of a custom private AMI (that includes all of the required software) and cloud-init user data (to force a code update from source control) allows an instance to boot, connect to its data stores, and begin processing tasks in less than 120 seconds.
Practically, this combination of queueing and spot instances keeps expenses for deepdre.am extremely low - somewhere between $5 and $10 per month, depending on system load. (Amazon’s Billing Alerts also allow me to keep an eye on my usage, to avoid unexpected spending - and to respond to spikes in traffic as necessary.)
Don’t forget about taxes
The title of this post is mostly true - when load on the system is low, costs are around $5 each month. However, small additional costs do add up:
- The domain, deepdre.am, costs $73/year, or $6/month. (Special thanks to Matthew Ogle for buying an expensive Armenian domain name on a whim in response to a tweet.)
- I lazily used Amazon S3 to store and serve images, which costs $0.0007 per image in combined storage and transfer fees. (That’s approximately 1,428 images per dollar, a cost that goes away once the S3 dependency is removed altogether.)
- Assuming that the site waxes and wanes in popularity in a given month, it’s reasonable to expect about 20 hours of g2.2xlarge spot instance usage, which costs approximately $2.
Give it a try!
While the code’s not (yet) open source, the site is currently up and running - give it a try at deepdre.am and transform your images, if for no other reason than to stress test the system!
Special thanks to Malcolm Ocean for reviewing this post.
[1] As one does, I built my own Redis-backed queueing library with Go bindings that came in very handy here. Many better alternatives exist - I would recommend using something more well supported like GitHub’s Resque or Salvatore Sanfilippo’s disque.
[2] Yup, MySQL. I had a Puppet manifest lying around for a well-configured MySQL instance, and saved a grand total of 30 minutes by bolting together existing components rather than switching to Postgres. Such is the nature of quick hack projects.
[4] Terminating an instance seems to result in the normal ACPI shutdown process, which sends SIGTERM to all processes, allowing processes to finish their tasks or put them back into queues if necessary. Relying on this is a terrible practice, as instances could suffer non-graceful failures at any time, and should not be relied upon to put their tasks back into queues - but for an application as frivolous and simple as deepdre.am, the 2-second delay between receiving SIGTERM and losing power to an instance seems to allow for enough cleanup.
[5] Instances that are terminated by Amazon before their first hour has elapsed are free, so it’s also possible that an instance could cost $0.00. This means that it’s advantageous to wait until the 59-minute mark before terminating any spot instance, to increase the likelihood that Amazon will terminate the instance for you, making the entire hour free.
[6] This is a knob that can be tweaked - the two extremes are “spend lots of money and have images process very quickly” and “spend very little money and use a spot instance only when the queue becomes huge.”