I want to build a Serverless SaaS product that can be deployed into a customer’s own AWS account. But what exactly does it mean to be Serverless? AWS (other cloud providers are available) has over 200 different services. How many of those are serverless?

Like most things, there’s a spectrum. There are varying opinions about where the dividing line is. It’s definitely not serverless if you have to install and patch operating systems on individual servers.

What about a managed service like Amazon MQ? The servers are managed for you. There’s nothing to install or configure. However, you do have to decide how many servers you want and which instance types to use.

At the other end of the scale you have services like S3. There are no decisions to make about numbers or types of instance. The API and console expose nothing related to servers at all. However, we all know there’s a massive fleet of servers somewhere behind the scenes.

For me, the best lens to look at this through is cost. Cost is the reason we have multi-tenant architectures in the first place. To make self-deployable SaaS work, you need true pay-as-you-go costs, with no fixed costs when the infrastructure is sitting idle. The cost model exposes how good the cloud provider’s multi-tenancy system is. If there are dedicated per-customer servers somewhere behind the scenes, that will be reflected in the cost.

Serverless Cost Model

A truly serverless service will have zero cost when under zero load. If there are dedicated servers, you need a minimum number running at all times waiting for incoming requests. If you care about availability and fault tolerance, the minimum will be more than one.

A truly serverless service will have costs that scale linearly with increasing load. If there are dedicated servers, there will be step changes in cost as the number of instances is scaled up to meet demand.
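The two cost shapes described above can be sketched in a few lines of Python. This is only an illustration: the prices, the per-instance capacity and the minimum instance count are all hypothetical values, not figures for any AWS service.

```python
import math

# All prices and capacities below are hypothetical, chosen only to show the
# shape of the two cost curves.
PRICE_PER_REQUEST = 0.00001    # dollars per request (serverless model)
INSTANCE_CAPACITY = 1_000_000  # requests one instance can serve per month
PRICE_PER_INSTANCE = 15.0      # dollars per instance per month
MIN_INSTANCES = 2              # more than one, for availability


def serverless_cost(requests: int) -> float:
    """Zero at zero load, then linear in the number of requests."""
    return requests * PRICE_PER_REQUEST


def instance_cost(requests: int) -> float:
    """A fixed floor, then a step each time another instance is needed."""
    needed = math.ceil(requests / INSTANCE_CAPACITY)
    return max(MIN_INSTANCES, needed) * PRICE_PER_INSTANCE


for load in (0, 500_000, 2_000_000, 2_000_001, 5_000_000):
    print(f"{load:>9} requests: serverless ${serverless_cost(load):6.2f}, "
          f"instances ${instance_cost(load):6.2f}")
```

With these made-up numbers the instance-based model costs the same at zero load as it does at two million requests, then jumps by a whole instance for one extra request. The serverless model starts at zero and rises smoothly.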

Reduced complexity is a side benefit driven by the serverless cost model. If costs scale linearly, the implementation is likely to be highly elastic, with fewer messy edge cases to deal with. If there are no dedicated servers, then there are no servers for you to manage and no need to be concerned with instance types and numbers of servers. You have a simpler, higher-level abstraction to deal with.

Let’s play a game of Serverless or Not.

Red Flags

Sometimes the cost model for a service can be a little opaque. It may look like there are zero fixed costs but when you look at practical implementations you find that you can’t do without what you thought were “optional” extra cost items. A good example (and my first red flag) is VPC. VPCs are free to create. However, almost all practical implementations will need a NAT gateway or VPC endpoints. Both are extra cost items with a fixed cost component.

Needing a VPC is a red flag that a service is not serverless. If you have to care about IP address ranges, you’re working at too low a level, stuck in the realm of physical servers.

The most obvious red flag is any service where you have to choose an instance type. That immediately rules out Amazon MQ, as a quick look at the pricing page confirms. Most of these managed services have transparent pricing: you pay directly for the underlying instances.

Compute

| Service | VPC | Min Monthly Cost | Mem | vCPUs | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|---|---|---|
| EC2 | Y | $18 [1] | 0.5-1024 GB | 0.1-128 | Per instance per hour | 60s [2] | Not |
| ECS | Y | $18 [1] | 0.5-1024 GB | 0.1-128 | Per instance per hour | 60s [2] | Not |
| EKS | Y | $90 [3] | 0.5-1024 GB | 0.1-128 | Per cluster and instance per hour | 60s [2] | Not |
| ECS Fargate | Y | $21 [4] | 0.5-120 GB | 0.25-16 | Per vCPU and GB per hour | 60s [2] | Not |
| EKS Fargate | Y | $93 [5] | 0.5-120 GB | 0.25-16 | Per cluster, vCPU and GB per hour | 60s [2] | Not |
| App Runner | N | $15 [6] | 2-4 GB | 1-2 | Per vCPU hour for active instances, per GB per hour for active and provisioned instances | 60s [7] | Not |
| Lambda | N | $0 | 128-10240 MB | 0.072-5.79 [8] | $0.0000133 per GB-second and $0.2 per million requests | 1ms | Serverless |
| Amplify Web App Server Side Rendering | N | $0 | Managed | Managed | $0.0000556 per GB-second and $0.3 per million requests | 1ms | Serverless |
| S3 Object Lambda | N | $0 | 128-10240 MB | 0.072-5.79 [8] | $0.0000133 per GB-second, $1 [9] per million requests, $0.005 per GB returned | 1ms | Serverless |
| S3 Select | N | $0 | NA | NA | $0.4 per million requests, $0.002 per GB scanned, $0.0007 per GB returned | NA | Serverless |

Fargate often appears in AWS presentations about serverless architecture. However, the cost model is not serverless. You need a minimum number of containers running at all times in case a request comes in. You can’t spin up a container on demand the first time a request arrives.

App Runner is built on Fargate and manages scaling and deployments for you. It has an interesting cost model where it can scale CPU down to zero by CPU throttling the minimum set of provisioned containers. In this state you pay only for the memory used by the containers. CPU can be throttled back up in response to incoming requests.

It can be hard to compare Lambda and instance based pricing. The closest configurations are a c6gd.medium (1 vCPU, 2 GB) at $0.0384 per hour and a 1769 MB Lambda (1 vCPU, 1769 MB) at $0.0829 per hour. That’s a little more than double the cost for Lambda. However, in practice, teams struggle to achieve anywhere close to 50% utilization when managing their own instances.
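To see how utilization changes this comparison, here is a rough break-even calculation using the two hourly prices quoted above. It makes simplifying assumptions: Lambda is billed only while it is doing work, the per-request charge is ignored, and the small difference in memory between the two configurations is ignored.

```python
# Prices quoted in the text above (US East, dollars per hour).
INSTANCE_PER_HOUR = 0.0384  # c6gd.medium: 1 vCPU, 2 GB
LAMBDA_PER_HOUR = 0.0829    # 1769 MB Lambda running continuously


def effective_instance_cost(utilization: float) -> float:
    """Cost per hour of useful work on an instance that is only partly busy."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return INSTANCE_PER_HOUR / utilization


# Utilization at which an hour of useful work costs the same on both.
break_even = INSTANCE_PER_HOUR / LAMBDA_PER_HOUR

print(f"Lambda premium at 100% utilization: {LAMBDA_PER_HOUR / INSTANCE_PER_HOUR:.2f}x")
print(f"Break-even utilization: {break_even:.1%}")
print(f"Instance cost per useful hour at 30% utilization: ${effective_instance_cost(0.3):.4f}")
```

Under these assumptions the break-even point is a little over 46% utilization. An instance that is busy less often than that costs more per unit of useful work than the equivalent Lambda.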

AWS Batch is a job management service that runs jobs on your choice of EC2 instances, Fargate or Lambda. There is no additional cost over that of the underlying compute.

File Storage

| Service | Min Monthly Cost | AZs | Durability | Max File Size | Max Capacity | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|---|---|---|---|
| EBS | $1.60 [10] | 1 | 5 9s | 64 TB | 64 TB | $0.08 per GB-month and $0.005 per IOPS provisioned | 60s | Not |
| EFS | $0.3 [11] | 3 | 11 9s | 48 TB | Unlimited | $0.30 per GB-month, $0.03 per GB reads and $0.06 per GB writes | 1 Hour | Serverless |
| S3 | $0 | 3 | 11 9s | 5 TB | Unlimited | $0.023 per GB-month, $0.09 per GB transferred out to internet, $0.4 per million read requests, $5 per million write requests | 1 Hour | Serverless |
| S3 Express One Zone | $0 | 1 | 11 9s | 5 TB | Unlimited | $0.16 per GB-month, $0.09 per GB transferred out to internet, $0.2 per million read requests, $2.5 per million write requests | 1 Hour | Serverless |

EBS is not serverless because the pricing model is based on provisioned capacity. Effectively you have to decide in advance how big you want your disk drive to be and how much IO you will be doing. The durability and availability model means you’ll also need to implement some form of RAID on top of your bare EBS volumes if storing customer data.

EFS has a non-zero monthly cost, but it is low enough for me to count it as serverless.

Database

| Service | VPC | Min Monthly Cost | Mem | vCPUs | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|---|---|---|
| RDS | Y | $27.64 [12] | 1-1024 GB | 0.2-128 | Per instance per hour, GB provisioned per month | 10m | Not |
| Aurora | Y | $106.12 [13] | 4-1024 GB | 0.4-128 | Per instance per hour, GB per month, per million IOPs | 10m | Not |
| Aurora Serverless | Y | $87.40 [14] | 1-256 GB | 0.125-32 | Per ACU per hour, GB per month, per million IOPs | 10m | Not |
| DocumentDB | Y | $109.95 [15] | 4-768 GB | 0.4-96 | Per instance per hour, GB per month, per million IOPs | 10m | Not |
| Neptune | Y | $134.92 [16] | 4-768 GB | 0.4-96 | Per instance per hour, GB per month, per million IOPs | 10m | Not |
| Neptune Serverless | Y | $579.88 [17] | 5-256 GB | 0.625-32 | Per NCU per hour, GB per month, per million IOPs | 10m | Not |
| OpenSearch Serverless | Y | $691.44 [18] | 24-240 GB | 6-60 | Per OCU per hour, GB per month | 10m | Not |
| DynamoDB | N | $0 | NA | NA | $1.25 per million write requests, $0.25 per million read requests, $0.25 per GB-month | 1 hour | Serverless |
| Timestream | N | $0 | NA | NA | $0.50 per million write requests, $0.036 per GB-hour in memory, $0.03 per GB-month stored, $0.01 per GB scanned | 1 hour | Serverless |

The big surprise here is that all the “Serverless” branded databases are not actually serverless. They would be better described as “Auto-vertical scaling of instance types” but I guess that’s not catchy enough.
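The fixed cost of the “Serverless” branded databases can be reproduced from the minimum configurations listed in the footnotes. The sketch below assumes a 720-hour (30-day) month, which is the figure that makes the footnote prices add up to the table values. The function name and structure are my own.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month


def min_monthly_cost(units: float, price_per_unit_hour: float,
                     storage_gb: float, price_per_gb_month: float) -> float:
    """Cost of keeping the minimum configuration running for a month, idle."""
    compute = units * price_per_unit_hour * HOURS_PER_MONTH
    storage = storage_gb * price_per_gb_month
    return round(compute + storage, 2)


# Minimum Multi-AZ configurations taken from the footnotes.
rds = min_monthly_cost(2, 0.016, 20, 0.23)                 # 2 x db.t4g.micro
aurora_serverless = min_monthly_cost(2 * 0.5, 0.12, 10, 0.1)     # 2 x 0.5 ACU
neptune_serverless = min_monthly_cost(2 * 2.5, 0.1608, 10, 0.1)  # 2 x 2.5 NCU

print(f"RDS:                ${rds:.2f}")
print(f"Aurora Serverless:  ${aurora_serverless:.2f}")
print(f"Neptune Serverless: ${neptune_serverless:.2f}")
```

None of these costs depend on load. They are what you pay for a database that is doing nothing, which is exactly what the serverless cost model rules out.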

Queues and Eventing

| Service | VPC | Min Monthly Cost | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|---|
| Amazon MQ | Y | $40.94 [19] | Per instance per hour, GB per month, GB transferred between instances [20] | 60s | Not |
| Kinesis | N | $28.80 [21] | $0.04 per stream per hour, $0.08 per GB ingested, $0.04 per GB retrieved | 1 hour | Not |
| SQS | N | $0 | $0.40 per million 64KB requests | NA | Serverless |
| SNS | N | $0 | $0.50 per million 64KB requests, $0.09 per GB transferred out to SQS or Lambda [22] | NA | Serverless |
| EventBridge | N | $0 | $1 per million 64KB events published | NA | Serverless |

Functionally, Kinesis looks like it should be serverless. Again, the cost model reveals the existence of dedicated per-stream infrastructure.

Orchestration

| Service | VPC | Min Monthly Cost | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|---|
| SWF | Y | $0 | $100 per million workflow executions, $25 per million tasks | NA | Not |
| Step Functions | N | $0 | $25 per million state transitions | NA | Serverless |
| Step Functions Express | N | $0 | $0.00001667 per GB-second and $1 per million requests | 100ms | Serverless |

In isolation, SWF looks serverless. However, SWF is useless without decision and task workers. SWF requires those decision and task workers to execute on instances which use long polling to communicate with SWF. That in turn makes any system that uses SWF not serverless.

Standard and Express Step Functions look very similar. Both implement orchestration logic based on a Step Functions state machine definition. The cost models reveal that the implementations are completely different. Standard Step Functions provide exactly-once semantics and have a cost model that suggests they’re implemented on top of SWF or something with an equivalent architecture. Express Step Functions have at-most-once or at-least-once semantics. Their cost model suggests they’re implemented using an SQS queue of workflow instances with a Lambda that reads an instance and then executes the entire workflow.
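The difference between the two cost models is easiest to see with numbers. The sketch below uses the prices from the table above. The workload itself (one million executions, ten state transitions each, one second and 64 MB per execution) is a hypothetical example, and the calculation ignores any rounding of duration or memory that the real billing applies.

```python
STANDARD_PER_MILLION_TRANSITIONS = 25.0
EXPRESS_PER_MILLION_REQUESTS = 1.0
EXPRESS_PER_GB_SECOND = 0.00001667


def standard_cost(executions: int, transitions_per_execution: int) -> float:
    """Standard workflows are charged per state transition."""
    transitions = executions * transitions_per_execution
    return transitions * STANDARD_PER_MILLION_TRANSITIONS / 1_000_000


def express_cost(executions: int, seconds: float, memory_gb: float) -> float:
    """Express workflows are charged per request plus duration and memory."""
    requests = executions * EXPRESS_PER_MILLION_REQUESTS / 1_000_000
    duration = executions * seconds * memory_gb * EXPRESS_PER_GB_SECOND
    return requests + duration


# Hypothetical workload: 1M executions, 10 transitions, 1 second, 64 MB each.
print(f"Standard: ${standard_cost(1_000_000, 10):.2f}")
print(f"Express:  ${express_cost(1_000_000, 1.0, 64 / 1024):.2f}")
```

Standard pricing depends only on how many steps the workflow takes. Express pricing looks just like Lambda pricing, depending on how long the workflow runs and how much memory it uses.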

Cache

| Service | VPC | Min Monthly Cost | Mem | vCPUs | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|---|---|---|
| MemoryDB for Redis | Y | $69.12 [23] | 1.37-419.09 GB | 0.4-64 | Per instance per hour, $0.20 per GB written | 60s | Not |
| ElastiCache | Y | $23.04 [24] | 0.5-635.61 GB | 0.2-96 | Per instance per hour | 60s | Not |
| ElastiCache Serverless | Y | $90 [25] | 1GB-5TB | Unknown | Per ECPU, GB per hour | 1 hour | Not |
| DAX (DynamoDB accelerator) | Y | $57.60 [26] | 2-768 GB | 0.4-96 | Per instance per hour | 60s | Not |
| API Gateway Caching | N | $14.40 [27] | 0.5-237 GB | Unknown | Per hour | 60s | Not |

Surprisingly, there is no serverless application cache available from AWS. Even the DynamoDB-specific DAX and API Gateway integrated caching are instance based.

The only configuration for the API Gateway cache is capacity. However, the documentation makes it clear that there is a dedicated instance behind the scenes. It even suggests that you run a load test to ensure that the instance you implicitly selected (based on cache capacity) will cope with your traffic.

Quoted memory sizes for MemoryDB, ElastiCache and API Gateway are memory available for caching. DAX quotes the overall memory on the instance, not all of which will be available for caching.

Gateway

| Service | Min Monthly Cost | Cost Model | Min Bill Period | Serverless? |
|---|---|---|---|---|
| Load Balancer | $16.20 | $0.0225 per hour, $0.008 per LCU hour | 1 hour | Not |
| CloudFront | $0 | $1 per million https requests, $0.085 per GB transferred out to internet | NA | Serverless |
| Amplify Web App Hosting | $0 | $0.15 per GB served | NA | Serverless |
| API Gateway (REST API) | $0 | $3.5 per million API calls received, $0.09 per GB transferred out to internet | NA | Serverless |
| API Gateway (HTTP API) | $0 | $1 per million 512KB API calls received, $0.09 per GB transferred out to internet | NA | Serverless |
| API Gateway (WebSocket API) | $0 | $1 per million 32KB messages sent or received by client, $0.25 per million connection minutes | NA | Serverless |
| AppSync (queries and mutations) | $0 | $4 per million requests received, $0.09 per GB transferred out to internet | NA | Serverless |
| AppSync (subscriptions) | $0 | $2 per million messages received by client, $0.08 per million connection minutes | NA | Serverless |
| Lambda Function URLs | $0 | No additional charge above the cost of invoking the lambda | NA | Serverless |
| IoT Core | $0 | $0.30 per million 5KB messages ingested, $1 per million 5KB messages received by client, $0.08 per million connection minutes | NA | Serverless |
| Cognito User Pools | $0 | $0.0055 per MAU (first 50k free indefinitely) | NA | Serverless |

I’m as surprised as you are that load balancers are not serverless. Functionally they look serverless: no configuration of instance types, smooth and elastic scaling under load. However, the cost model makes it clear there must be some dedicated per-customer infrastructure behind the scenes.

Despite the name, IoT Core is a general-purpose asynchronous messaging gateway. Messages from clients can be ingested at scale and routed to S3, SNS, SQS, Lambda, DynamoDB, Step Functions and many more.

Whatever you use to implement your gateway, you’ll need some form of user authentication. Cognito User Pools is the AWS solution. It has an unusual, high-level cost model, priced per monthly active user, which is aligned with how SaaS vendors typically think about their cost and revenue.

Revisions

  • 2023-12-21 Added ElastiCache Serverless, S3 Express One Zone
  • 2022-12-16 Added API Gateway REST API
  • 2022-12-13 Added S3 Object Lambda, S3 Select
  • 2022-12-12 Added App Runner, Amplify Web App Server Side Rendering, Neptune Serverless, OpenSearch Serverless, Amplify Web App Hosting, AppSync, Lambda Function URLs, Cognito User Pools

Footnotes

All costs correct at time of writing based on AWS US East region.

  1. Min config is 3 x t4g.nano burstable instances (2 vCPU at 5% utilization, 0.5GB) at $0.0042 each per hour (cheapest instance available) with 30GB EBS volumes (base size for AWS Linux) at $3 per month

  2. You are charged for the time it takes for the OS and language stack to boot up, and scale up is far from instant

  3. Min config is 1 cluster at $0.1 per hour and 3 x t4g.nano with 30GB EBS volumes 

  4. Min config is 3 x (0.25vCPU, 0.5GB, Linux/Arm) at $0.0099 each per hour

  5. Min config is 1 cluster at $0.1 per hour and 3 x (0.25vCPU, 0.5GB, Linux/Arm) at $0.0099 each per hour

  6. Min config is 3 x (1vCPU,2GB) provisioned at $0.007 per GB-hour when idle 

  7. Responds instantly to incoming requests from provisioned capacity, scales back down to zero active instances after 60s idle 

  8. Lambdas have access to 2-6 vCPUs but are throttled based on memory size

  9. $0.4 for the incoming S3 request, $0.2 for invoking the lambda, $0.4 for the S3 request from the lambda 

  10. Minimum size is 20GB 

  11. An empty file system occupies some space so will be charged for at least 1 GB 

  12. Min Multi-AZ config is 2 x db.t4g.micro burstable instances (2 vCPU at 10% utilization, 1GB) at $0.016 each per hour with 20GB of storage at $0.23 per GB-month 

  13. Min Multi-AZ config is 2 x db.t4g.medium burstable instances (2 vCPU at 20% utilization, 4GB) at $0.073 each per hour with 10GB of storage at $0.1 per GB-month 

  14. Min Multi-AZ config is 2 x 0.5 ACU (0.125 vCPU, 1 GB) at $0.12 per ACU hour with 10GB of storage at $0.1 per GB-month 

  15. Min Multi-AZ config is 2 x db.t4g.medium burstable instances (2 vCPU at 20% utilization, 4GB) at $0.07566 each per hour with 10GB of storage at $0.1 per GB-month 

  16. Min Multi-AZ config is 2 x db.t4g.medium burstable instances (2 vCPU at 20% utilization, 4GB) at $0.093 each per hour with 10GB of storage at $0.1 per GB-month 

  17. Min Multi-AZ config is 2 x 2.5 NCU (0.625 vCPU, 5 GB) at $0.1608 per NCU hour with 10GB of storage at $0.1 per GB-month 

  18. Min Multi-AZ config is 4 OCU (6 vCPU, 24 GB) at $0.24 per OCU hour with 10GB of storage at $0.024 per GB-month 

  19. Min Multi-AZ config is 2 x mq.t3.micro burstable instances (2 vCPU at 10% utilization, 1GB) at $0.02704 each per hour with 20GB of storage at $0.1 per GB-month 

  20. $0.10 per GB for EBS storage, $0.30 per GB for EFS. $0.01 per GB transferred between brokers in a Multi-AZ setup

  21. One stream at $0.04 per hour 

  22. $0.09 per GB transferred is equivalent to roughly $5 per million 64KB events or $0.09 per million 1KB events

  23. Min Multi-AZ config is 2 x db.t4g.small burstable instances (2 vCPU at 20% utilization, 1.37GB) at $0.048 each per hour 

  24. Min Multi-AZ config is 2 x cache.t4g.micro burstable instances (2 vCPU at 10% utilization, 0.5GB) at $0.016 each per hour 

  25. Minimum charge is for 1 GB of storage at $0.125 per GB-hour 

  26. Min Multi-AZ config is 2 x dax.t3.small burstable instances (2 vCPU at 20% utilization, 2GB) at $0.04 each per hour 

  27. Min config is 0.5GB cache capacity at $0.02 per hour