Posts Tagged Infrastructure

Data science on demand: spinning up a Wallaroo cluster

Data science on demand: spinning up a Wallaroo cluster

This guest post is from Simon Zelazny of Wallaroo Labs, and previously appeared on the Wallaroo Labs blog. Find out how Wallaroo powered their cluster provisioning with Pulumi, for data science on demand.

Oh no, more data!

Last month, we took a long-running pandas classifier and made it run faster by leveraging Wallaroo’s parallelization capabilities. This time around, we’d like to kick it up a notch and see if we can keep scaling out to meet higher demand. We’d also like to be as economical as possible: provision infrastructure as needed and de-provision it when we’re done processing.

If you don’t feel like reading the post linked above, here’s a short summary of the situation: there’s a batch job that you’re running every hour, on the hour. This job receives a CSV file and classifies each row of the file, using a Pandas-based algorithm. The run-time of the job is starting to near the one-hour mark, and there’s concern that the pipeline will break down once the input data grows past a particular point.

In the blog post, we show how to split up the input data into smaller dataframes, and distribute them among workers in an ad-hoc Wallaroo cluster, running on one physical machine. Parallelizing the work in this manner buys us a lot of time, and the batch job can continue processing increasing amounts of data.

Read more →

Simple, Reproducible Kubernetes Deployments

Simple, Reproducible Kubernetes Deployments

Kubernetes is a powerful container orchestrator for cloud native applications that can run on any cloud – AWS, Azure, GCP – in addition to hybrid and on-premises environments. Its CLI, kubectl, offers basic built-in support for performing deployments, but intentionally stops short here. In particular, it doesn’t offer diffs and previews, the ability to know when a deployment has succeeded or failed, and why, and/or sophisticated deployment orchestration.

In this post, we’ll see how Pulumi, an open source cloud native development platform, can not only let you express Kubernetes programs in real programming languages, like TypeScript, instead of endless YAML templates, but also how Pulumi delivers simple and reproducible, yet powerful, Kubernetes deployment workflows.

Read more →

Creating and Reusing Cloud Components using Package Managers

Creating and Reusing Cloud Components using Package Managers

Hello! A few weeks back I wrote a post on serving static websites on AWS with Pulumi detailing how to host a static website on AWS. Pulumi allowed me to wire four different AWS products together in only 200 lines of code. It would be a shame, however if I needed to copy and paste that code every time I wanted to to stand up a new website. Instead, we can package up, share, and reuse our code just like any other Node.js library. It just so happens that this one can be used to create cloud infrastructure.

Read more →

Program the Cloud with 12 Pulumi Pearls

In this post, we’ll look at 12 “pearls” – bite-sized code snippets – that demonstrate some fun ways you can program the cloud using Pulumi. In my introductory post, I mentioned a few of my “favorite things”. Now let’s dive into a few specifics, from multi-cloud to cloud-specific, spanning containers, serverless, and infrastructure, and generally highlighting why using real languages is so empowering for cloud scenarios. Since Pulumi lets you do infrastructure-as-code from the lowest-level to the highest, we will cover a lot of interesting ground in short order.

Read more →

Serving a Static Website on AWS with Pulumi

Hello! This post covers using Pulumi to create the infrastructure for serving a static website on AWS. The full source code for this example is available on GitHub.

Setting up the infrastructure to serve a static website doesn’t sound like it would be all that difficult, but when you consider HTTPS certificates, content distribution networks, and attaching it to a custom domain, integrating all the components can be quite daunting.

Fortunately this is a task where Pulumi really shines. Pulumi’s code-centric approach not only makes configuring cloud resources easier to do and maintain, but it also eliminates the pain of integrating multiple products together.

This isn’t a hypothetical benefit of using the Pulumi programming model. We use a setup similar to the one described in this post for powering our own static websites, like www.pulumi.com and get.pulumi.com.

Read more →

How we use Pulumi to build Pulumi

How we use Pulumi to build Pulumi

Here at Pulumi we are (perhaps unsurprisingly!) huge fans of using Pulumi to manage our cloud infrastructure and services. We author our infrastructure in strongly-typed programming languages, which allows us to to benefit from rich tooling - documenting and factoring our infrastructure using the same software engineering practices we apply to our application code. This also allows us to create reusable abstractions which accelerate our ability to deliver new features and services, and our ability to standardize and refactor infrastructure patterns across our services with relative ease.

Like other users, we use Pulumi at a variety of levels of abstraction. We use Pulumi for raw infrastructure provisioning, defining the core networking layer for our AWS-based backend infrastructure. And we use Pulumi to define how our application services are deployed into ECS using just a few lines of code. Pulumi hosts and manages static content for www.pulumi.com and get.pulumi.com. We use Pulumi to define the CloudWatch dashboards connected to our infrastructure. And for monitoring, Pulumi defines metrics and notifications/alarms in PagerDuty and Slack.

Best of all, we’ve been able to take things we’ve learned from these use cases, and others we’ve worked with beta users on over the last few months (thank you!), and factor common patterns out into reusable libraries like @pulumi/aws-infra and @pulumi/cloud for ourselves and others to build upon.

In this post, we’ll do a deeper dive into each of these use cases, highlighting unique aspects of how we use Pulumi itself, and some of our engineering processes around how we integrate Pulumi into the rest of our toolchain.

Read more →

Build a Video Thumbnailer on AWS

Build a Video Thumbnailer on AWS

Pulumi makes it easy to build cloud applications that use a combination of containers, lambdas, and connected data services and infrastructure: Colada apps.

An example of a Colada app is extracting a thumbnail from a video. A serverless function can only run for 5 minutes, so we’ll run a container in AWS Fargate to do the video processing.

In this app, a Lambda function is triggered whenever a new video is uploaded to S3. This function launches a task in Fargate that uses FFmpeg to extract a video thumbnail. A second Lambda function is triggered when a new thumbnail has been created.

Read more →