Amazon SageMaker pricing
Pricing overview
Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML. SageMaker supports the leading ML frameworks, toolkits, and programming languages.
With SageMaker, you pay only for what you use. You have two choices for payment: an On-Demand Pricing that offers no minimum fees and no upfront commitments, and the SageMaker Savings Plans that offer a flexible, usage-based pricing model in exchange for a commitment to a consistent amount of usage.
Amazon SageMaker Free Tier
Amazon SageMaker is free to try. As part of the AWS Free Tier, you can get started with Amazon SageMaker for free. Your free tier starts from the first month when you create your first SageMaker resource. The details of the free tier for Amazon SageMaker are in the table below.
Amazon SageMaker capability | Free Tier usage per month for the first 2 months |
Studio notebooks, and notebook instances | 250 hours of ml.t3.medium instance on Studio notebooks OR 250 hours of ml.t2.medium instance or ml.t3.medium instance on notebook instances |
RStudio on SageMaker | 250 hours of ml.t3.medium instance on RSession app AND free ml.t3.medium instance for RStudioServerPro app |
Data Wrangler | 25 hours of ml.m5.4xlarge instance |
Feature Store | 10 million write units, 10 million read units, 25 GB storage (standard online store) |
Training | 50 hours of m4.xlarge or m5.xlarge instances |
Amazon SageMaker with TensorBoard | 300 hours of ml.r5.large instance |
Real-Time Inference | 125 hours of m4.xlarge or m5.xlarge instances |
Serverless Inference | 150,000 seconds of on-demand inference duration |
Canvas | 160 hours/month for session time |
HyperPod | 50 hours of m5.xlarge instance |
On-Demand Pricing
Amazon SageMaker Studio Classic
Studio Classic offers one-step Jupyter notebooks in our legacy IDE experience. The underlying compute resources are fully elastic and the notebooks can be easily shared with others, allowing seamless collaboration. You are charged for the instance type you choose, based on the duration of use.
Amazon SageMaker JupyterLab
Launch fully managed JupyterLab in seconds. Use the latest web-based interactive development environment for notebooks, code, and data. You are charged for the instance type you choose, based on the duration of use.
Amazon SageMaker Code Editor
Code Editor, based on Code-OSS (Visual Studio Code – Open Source), enables you to write, test, debug, and run your analytics and ML code. It is fully integrated with SageMaker Studio and supports IDE extensions available in the Open VSX extension registry.
RStudio
RStudio offers on-demand cloud compute resources to accelerate model development and improve productivity. You are charged for the instance types you choose to run the RStudio Session app and the RStudio Server Pro app.
RStudioServerPro App
Notebook Instances
Notebook instances are compute instances running the Jupyter notebook app. You are charged for the instance type you choose, based on the duration of use.
Amazon SageMaker Processing
Amazon SageMaker Processing lets you easily run your pre-processing, post-processing, and model evaluation workloads on fully managed infrastructure. You are charged for the instance type you choose, based on the duration of use.
Amazon SageMaker with TensorBoard
Amazon SageMaker with TensorBoard provides a hosted TensorBoard experience to visualize and debug model convergence issues for Amazon SageMaker training jobs.
Amazon SageMaker Data Wrangler
Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning from weeks to minutes. You pay for the time used to cleanse, explore, and visualize data. Customers running SageMaker Data Wrangler instances are subject to the pricing below.* Customers running SageMaker Data Wrangler on SageMaker Canvas workspace instances are subject to SageMaker Canvas pricing. See the SageMaker Canvas pricing page for more details.
Amazon SageMaker Data Wrangler Jobs
An Amazon SageMaker Data Wrangler job is created when a data flow is exported from SageMaker Data Wrangler. With SageMaker Data Wrangler jobs, you can automate your data preparation workflows. SageMaker Data Wrangler jobs help you reapply your data preparation workflows on new datasets to help save you time, and are billed by the second.
Amazon SageMaker Feature Store
Amazon SageMaker Feature Store is a central repository to ingest, store, and serve features for machine learning. You are charged for feature group writes, reads, and data storage in SageMaker Feature Store, with different pricing for the standard online store and the in-memory online store.
For the standard online store, data storage is charged per GB per month. For throughput, you can choose between on-demand and provisioned capacity modes. For on-demand, writes are charged as write request units per KB and reads are charged as read request units per 4 KB. For provisioned capacity mode, you specify the read and write capacity that you expect your application to require. SageMaker Feature Store charges one WCU for each write per second (up to 1 KB) and one RCU for each read per second (up to 4 KB). You will be charged for the throughput capacity (reads and writes) you provision for your feature group, even if you do not fully utilize the provisioned capacity.
For the in-memory online store, writes are charged as write request units per KB with a minimum of 1 unit per write, reads are charged as read request units per KB with a minimum of 1 unit per read, and data storage is charged per GB per hour. There is a minimum data storage charge of 5 GiB (5.37 GB) per hour for the in-memory online store.
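As a rough sketch of the standard online store's on-demand request-unit math described above (the 1 KB write and 4 KB read granularities come from this page; the helper names are illustrative, not a SageMaker API):

```python
import math

# Request-unit granularities for the standard online store, as described above:
# writes are metered per 1 KB, reads per 4 KB, rounding up per request.
WRITE_GRANULARITY_KB = 1
READ_GRANULARITY_KB = 4

def write_units(record_kb: float) -> int:
    """Write request units consumed by a single write of `record_kb` KB."""
    return math.ceil(record_kb / WRITE_GRANULARITY_KB)

def read_units(record_kb: float) -> int:
    """Read request units consumed by a single read of `record_kb` KB."""
    return math.ceil(record_kb / READ_GRANULARITY_KB)
```

For example, a 25 KB record consumes 25 write units per write but only 7 read units per read, since fractional units round up.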
Amazon SageMaker Training
Amazon SageMaker makes it easy to train machine learning (ML) models by providing everything you need to train, tune, and debug models. You are charged for usage of the instance type you choose. When you use Amazon SageMaker Debugger to debug issues and monitor resources during training, you can use built-in rules to debug your training jobs or write your own custom rules. There is no charge to use built-in rules to debug your training jobs. For custom rules, you are charged for the instance type you choose, based on the duration of use.
Amazon SageMaker with MLflow
Amazon SageMaker with MLflow allows you to pay only for what you use. You pay for MLflow Tracking Servers based on compute and storage costs: compute is charged based on the size of the Tracking Server and the number of hours it runs, and you also pay for any metadata stored on the MLflow Tracking Server.
Amazon SageMaker Hosting: Real-Time Inference
Amazon SageMaker provides real-time inference for your use cases needing real-time predictions. You are charged for usage of the instance type you choose. When you use Amazon SageMaker Model Monitor to maintain highly accurate models providing real-time inference, you can use built-in rules to monitor your models or write your own custom rules. For built-in rules, you get up to 30 hours of monitoring at no charge. Additional charges will be based on duration of usage. You are charged separately when you use your own custom rules.
Amazon SageMaker Asynchronous Inference
Amazon SageMaker Asynchronous Inference is a near-real-time inference option that queues incoming requests and processes them asynchronously. Use this option when you need to process large payloads as the data arrives or run models that have long inference processing times and do not have sub-second latency requirements. You are charged for the type of instance you choose.
Amazon SageMaker Batch Transform
Using Amazon SageMaker Batch Transform, there is no need to break down your data set into multiple chunks or manage real-time endpoints. SageMaker Batch Transform allows you to run predictions on large or small batch datasets. You are charged for the instance type you choose, based on the duration of use.
Amazon SageMaker Serverless Inference
Amazon SageMaker Serverless Inference enables you to deploy machine learning models for inference without configuring or managing any of the underlying infrastructure. You can either use on-demand Serverless Inference or add Provisioned Concurrency to your endpoint for predictable performance. With on-demand Serverless Inference, you pay only for the compute capacity used to process inference requests, billed by the millisecond, and for the amount of data processed. The compute charge depends on the memory configuration you choose.
Provisioned Concurrency
Optionally, you can also enable Provisioned Concurrency for your serverless endpoints. Provisioned Concurrency allows you to deploy models on serverless endpoints with predictable performance and high scalability by keeping your endpoints warm for a specified number of concurrent requests and a specified time. As with on-demand Serverless Inference, when Provisioned Concurrency is enabled, you pay for the compute capacity used to process inference requests, billed by the millisecond, and for the amount of data processed. You also pay for Provisioned Concurrency usage, based on the memory configured, the duration provisioned, and the amount of concurrency enabled.
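The on-demand billing dimensions above (millisecond-level duration and a memory-dependent compute rate) can be sketched as follows; the per-GB-second rate here is a made-up placeholder for illustration, not a published price:

```python
# Hypothetical per-GB-second rate -- a placeholder only; the real rate
# depends on the Region and the memory configuration you choose.
ASSUMED_RATE_PER_GB_SECOND = 0.0000200

def serverless_compute_cost(memory_gb: float, duration_ms: int, requests: int) -> float:
    """Compute-only charge for `requests` invocations of `duration_ms` each
    on an endpoint configured with `memory_gb` of memory. Data-transfer and
    Provisioned Concurrency charges are not modeled."""
    gb_seconds = memory_gb * (duration_ms / 1000.0) * requests
    return gb_seconds * ASSUMED_RATE_PER_GB_SECOND
```

At the placeholder rate, one million 100 ms requests on a 2 GB endpoint would cost about $4 in compute.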
Amazon SageMaker JumpStart
Amazon SageMaker JumpStart helps you quickly and easily get started with machine learning through one-click access to popular model collections (also known as “model zoos”). JumpStart also offers end-to-end solutions that solve common ML use cases and can be customized for your needs. There is no additional charge for using JumpStart models or solutions. You will be charged for the underlying training and inference instance hours used, the same as if you had created them manually.
Amazon SageMaker Profiler
Amazon SageMaker Profiler collects system-level data for visualization of high-resolution CPU and GPU trace plots. This tool is designed to help data scientists and engineers identify hardware-related performance bottlenecks in their deep learning models, saving end-to-end training time and cost. Currently, SageMaker Profiler supports profiling only for training jobs that use the ml.g4dn.12xlarge, ml.p3dn.24xlarge, and ml.p4d.24xlarge training compute instance types.
Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Europe (Ireland), and Israel (Tel Aviv). Amazon SageMaker Profiler is currently in preview and available at no cost to customers in supported Regions.
Amazon SageMaker HyperPod
Amazon SageMaker HyperPod is purpose-built to accelerate foundation model (FM) development. To make FM training more resilient, it continuously monitors cluster health, repairs and replaces faulty nodes on the fly, and saves frequent checkpoints to automatically resume training without losing progress. SageMaker HyperPod is preconfigured with SageMaker distributed training libraries that enable you to improve FM training performance while fully utilizing the cluster’s compute and network infrastructure.
Note: SageMaker HyperPod pricing does not cover the charges for services connected to HyperPod clusters, such as Amazon EKS, Amazon FSx for Lustre, and Amazon Simple Storage Service (Amazon S3).
Inference optimization
The inference optimization toolkit makes it easy for you to implement the latest inference optimization techniques and achieve state-of-the-art (SOTA) cost performance on Amazon SageMaker, while saving months of developer time. You can choose from a menu of popular optimization techniques provided by SageMaker, run optimization jobs ahead of time, benchmark the model for performance and accuracy metrics, and then deploy the optimized model to a SageMaker endpoint for inference.
Instance details
Amazon SageMaker P5 instance product details
Instance Size | vCPUs | Instance Memory (TiB) | GPU Model | GPU | Total GPU memory (GB) | Memory per GPU (GB) | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer to Peer | Instance Storage (TB) | EBS Bandwidth (Gbps) |
ml.p5.48xlarge | 192 | 2 | NVIDIA H100 | 8 | 640 HBM3 | 80 | 3200 EFAv2 | Yes | 900 GB/s NVSwitch | 8x3.84 NVMe SSD | 80 |
Amazon SageMaker P4d instance product details
Instance Size | vCPUs | Instance Memory (GiB) | GPU Model | GPUs | Total GPU memory (GB) | Memory per GPU (GB) | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer to Peer | Instance Storage (GB) | EBS Bandwidth (Gbps) |
ml.p4d.24xlarge | 96 | 1152 | NVIDIA A100 | 8 | 320 HBM2 | 40 | 400 ENA and EFA | Yes | 600 GB/s NVSwitch | 8x1000 NVMe SSD | 19 |
ml.p4de.24xlarge | 96 | 1152 | NVIDIA A100 | 8 | 640 HBM2e | 80 | 400 ENA and EFA | Yes | 600 GB/s NVSwitch | 8x1000 NVMe SSD | 19 |
Amazon SageMaker P3 instance product details
Instance Size | vCPUs | Instance Memory (GiB) | GPU Model | GPUs | Total GPU memory (GB) | Memory per GPU (GB) | Network Bandwidth (Gbps) | GPU Peer to Peer | Instance Storage (GB) | EBS Bandwidth (Gbps) |
ml.p3.2xlarge | 8 | 61 | NVIDIA V100 | 1 | 16 | 16 | Up to 10 | N/A | EBS-Only | 1.5 |
ml.p3.8xlarge | 32 | 244 | NVIDIA V100 | 4 | 64 | 16 | 10 | NVLink | EBS-Only | 7 |
ml.p3.16xlarge | 64 | 488 | NVIDIA V100 | 8 | 128 | 16 | 25 | NVLink | EBS-Only | 14 |
ml.p3dn.24xlarge | 96 | 768 | NVIDIA V100 | 8 | 256 | 32 | 100 | NVLink | 2 x 900 NVMeSSD | 19 |
Amazon SageMaker P2 instance product details
Instance Size | vCPUs | Instance Memory (GiB) | GPU Model | GPUs | Total GPU memory (GB) | Memory per GPU (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
ml.p2.xlarge | 4 | 61 | NVIDIA K80 | 1 | 12 | 12 | Up to 10 | High |
ml.p2.8xlarge | 32 | 488 | NVIDIA K80 | 8 | 96 | 12 | 10 | 10 |
ml.p2.16xlarge | 64 | 732 | NVIDIA K80 | 16 | 192 | 12 | 25 | 20 |
Amazon SageMaker G4 instance product details
Instance Size | vCPUs | Instance Memory (GiB) | GPU Model | GPUs | Total GPU memory (GB) | Memory per GPU (GB) | Network Bandwidth (Gbps) | Instance Storage (GB) | EBS Bandwidth (Gbps) |
ml.g4dn.xlarge | 4 | 16 | NVIDIA T4 | 1 | 16 | 16 | Up to 25 | 1 x 125 NVMe SSD | Up to 3.5 |
ml.g4dn.2xlarge | 8 | 32 | NVIDIA T4 | 1 | 16 | 16 | Up to 25 | 1 x 125 NVMe SSD | Up to 3.5 |
ml.g4dn.4xlarge | 16 | 64 | NVIDIA T4 | 1 | 16 | 16 | Up to 25 | 1 x 125 NVMe SSD | 4.75 |
ml.g4dn.8xlarge | 32 | 128 | NVIDIA T4 | 1 | 16 | 16 | 50 | 1 x 900 NVMe SSD | 9.5 |
ml.g4dn.16xlarge | 64 | 256 | NVIDIA T4 | 1 | 16 | 16 | 50 | 1 x 900 NVMe SSD | 9.5 |
ml.g4dn.12xlarge | 48 | 192 | NVIDIA T4 | 4 | 64 | 16 | 50 | 1 x 900 NVMe SSD | 9.5 |
Amazon SageMaker G5 instance product details
Instance Size | vCPUs | Instance Memory (GiB) | GPU Model | GPUs | Total GPU Memory (GB) | Memory per GPU (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) | Instance Storage (GB) |
ml.g5.xlarge | 4 | 16 | NVIDIA A10G | 1 | 24 | 24 | Up to 10 | Up to 3.5 | 1x250 |
ml.g5.2xlarge | 8 | 32 | NVIDIA A10G | 1 | 24 | 24 | Up to 10 | Up to 3.5 | 1x450 |
ml.g5.4xlarge | 16 | 64 | NVIDIA A10G | 1 | 24 | 24 | Up to 25 | 8 | 1x600 |
ml.g5.8xlarge | 32 | 128 | NVIDIA A10G | 1 | 24 | 24 | 25 | 16 | 1x900 |
ml.g5.16xlarge | 64 | 256 | NVIDIA A10G | 1 | 24 | 24 | 25 | 16 | 1x1900 |
ml.g5.12xlarge | 48 | 192 | NVIDIA A10G | 4 | 96 | 24 | 40 | 16 | 1x3800 |
ml.g5.24xlarge | 96 | 384 | NVIDIA A10G | 4 | 96 | 24 | 50 | 19 | 1x3800 |
ml.g5.48xlarge | 192 | 768 | NVIDIA A10G | 8 | 192 | 24 | 100 | 19 | 2x3800 |
Amazon SageMaker Trn1 instance product details
Instance Size | vCPUs | Memory (GiB) | Trainium Accelerators | Total Accelerator Memory (GB) | Memory per Accelerator (GB) | Instance Storage (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
ml.trn1.2xlarge | 8 | 32 | 1 | 32 | 32 | 1 x 500 NVMe SSD | Up to 12.5 | Up to 20 |
ml.trn1.32xlarge | 128 | 512 | 16 | 512 | 32 | 4 x 2000 NVMe SSD | 800 | 80 |
Amazon SageMaker Inf1 instance product details
Instance Size | vCPUs | Memory (GiB) | Inferentia Accelerators | Total Accelerator Memory (GB) | Memory per Accelerator (GB) | Instance Storage | Inter-accelerator Interconnect | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
ml.inf1.xlarge | 4 | 8 | 1 | 8 | 8 | EBS only | N/A | Up to 25 | Up to 4.75 |
ml.inf1.2xlarge | 8 | 16 | 1 | 8 | 8 | EBS only | N/A | Up to 25 | Up to 4.75 |
ml.inf1.6xlarge | 24 | 48 | 4 | 32 | 8 | EBS only | Yes | 25 | 4.75 |
ml.inf1.24xlarge | 96 | 192 | 16 | 128 | 8 | EBS only | Yes | 100 | 19 |
Amazon SageMaker Inf2 instance product details
Instance Size | vCPUs | Memory (GiB) | Inferentia Accelerators | Total Accelerator Memory (GB) | Memory per Accelerator (GB) | Instance Storage | Inter-accelerator Interconnect | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
ml.inf2.xlarge | 4 | 16 | 1 | 32 | 32 | EBS only | N/A | Up to 25 | Up to 10 |
ml.inf2.8xlarge | 32 | 128 | 1 | 32 | 32 | EBS only | N/A | Up to 25 | 10 |
ml.inf2.24xlarge | 96 | 384 | 6 | 192 | 32 | EBS only | Yes | 50 | 30 |
ml.inf2.48xlarge | 192 | 768 | 12 | 384 | 32 | EBS only | Yes | 100 | 60 |
Amazon SageMaker Studio
Amazon SageMaker Studio is a single web-based interface for complete ML development, offering a choice of fully managed integrated development environments (IDEs) and purpose-built tools. You can access SageMaker Studio free of charge. You are only charged for the underlying compute and storage that you use for different IDEs and ML tools within SageMaker Studio.
You can use many services from SageMaker Studio, AWS SDK for Python (Boto3), or AWS Command Line Interface (AWS CLI), including the following:
- IDEs on SageMaker Studio to perform complete ML development with a broad set of fully managed IDEs, including JupyterLab, Code Editor based on Code-OSS (Visual Studio Code – Open Source), and RStudio
- SageMaker Pipelines to automate and manage ML workflows
- SageMaker Autopilot to automatically create ML models with full visibility
- SageMaker Experiments to organize and track your training jobs and versions
- SageMaker Debugger to debug anomalies during training
- SageMaker Model Monitor to maintain high-quality models
- SageMaker Clarify to better explain your ML models and detect bias
- SageMaker JumpStart to easily deploy ML solutions for many use cases. You may incur charges from other AWS services used in the solution for the underlying API calls made by Amazon SageMaker on your behalf.
- SageMaker Inference Recommender to get recommendations for the right endpoint configuration
You pay only for the underlying compute and storage resources within SageMaker or other AWS services, based on your usage.
To use the Amazon Q Developer Free Tier on JupyterLab and Code Editor, follow the instructions here. To use Amazon Q Developer Pro on JupyterLab, you must subscribe to Amazon Q Developer. Amazon Q Developer pricing is available here.
Foundation model evaluations
SageMaker Clarify supports foundation model evaluations with both automatic and human-based evaluation methods. Each of these has different pricing. If you are evaluating a foundation model from Amazon SageMaker JumpStart that is not yet deployed to your account, SageMaker will temporarily deploy the JumpStart model on a SageMaker instance for the duration of the inference. The specific instance will conform to the instance recommendation provided by JumpStart for that model.
Automatic evaluation:
Foundation model evaluations run as SageMaker Processing jobs. The evaluation job invokes SageMaker Inference, and you are charged for both the inference and the evaluation job, only for the duration of the evaluation job. The cost of the evaluation job is the sum of the hourly cost of the evaluation instance and the hourly cost of the hosting instance, for the duration of the job.
Human-based evaluation:
When you use the human-based evaluation feature where you bring your own workforce, you are charged for three items: 1) SageMaker instance used for inference, 2) the instance used to run the SageMaker Processing Job that hosts the human evaluation, and 3) a charge of $0.21 per completed human evaluation task. A human task is defined as an occurrence of a human worker submitting an evaluation of a single prompt and its associated inference responses in the human evaluation user interface. The price is the same whether you have 1 or 2 models in your evaluation job or you bring your own inference and also the same regardless of how many evaluation dimensions and rating methods you include. The $0.21 per task pricing is the same for all AWS regions. There is no separate charge for the workforce, as the workforce is supplied by you.
AWS-managed evaluation:
For an AWS-managed expert evaluation, pricing is customized for your evaluation needs in a private engagement while working with the AWS expert evaluations team.
Amazon SageMaker Studio Lab
You can build and train ML models using Amazon SageMaker Studio Lab for free. SageMaker Studio Lab offers developers, academics, and data scientists a no-configuration development environment to learn and experiment with ML at no additional charge.
Amazon SageMaker Canvas
Amazon SageMaker Canvas expands ML access by providing business analysts the ability to generate accurate ML predictions using a visual point-and-click interface—no coding or ML experience required.
Amazon SageMaker Data Labeling
Amazon SageMaker Data Labeling provides two data labeling offerings, Amazon SageMaker Ground Truth Plus and Amazon SageMaker Ground Truth. You can learn more about Amazon SageMaker Data Labeling, a fully managed data labeling service that makes it easy to build highly accurate training datasets for ML.
Amazon SageMaker shadow testing
SageMaker helps you run shadow tests to evaluate a new ML model before production release by testing its performance against the currently deployed model. There is no additional charge for SageMaker shadow testing other than usage charges for the ML instances and ML storage provisioned to host the shadow model. The pricing for ML instances and ML storage dimensions is the same as the real-time inference option specified in the preceding pricing table. There is no additional charge for data processed in and out of shadow deployments.
Amazon SageMaker Edge
Learn more about pricing for Amazon SageMaker Edge to optimize, run, and monitor ML models on fleets of edge devices.
Amazon SageMaker Savings Plans
Amazon SageMaker Savings Plans help to reduce your costs by up to 64%. The plans automatically apply to eligible SageMaker ML instance usage, including SageMaker Studio notebooks, SageMaker notebook instances, SageMaker Processing, SageMaker Data Wrangler, SageMaker Training, SageMaker Real-Time Inference, and SageMaker Batch Transform, regardless of instance family, size, or Region. For example, you can change usage from an ml.c5.xlarge CPU instance running in US East (Ohio) to an ml.inf1 instance in US West (Oregon) for inference workloads at any time and automatically continue to pay the Savings Plans price.
Total cost of ownership (TCO) with Amazon SageMaker
Amazon SageMaker offers at least 54% lower total cost of ownership (TCO) over a three-year period compared to other cloud-based self-managed solutions. Learn more with the complete TCO analysis for Amazon SageMaker.
Pricing examples
Pricing example #1: JupyterLab
As a data scientist, you spend 20 days using JupyterLab for quick experimentation on notebooks, code, and data for 6 hours per day on an ml.g4dn.xlarge instance. You create and then run a JupyterLab space to access the JupyterLab IDE. Compute is charged only for the instance used while the JupyterLab space is running. Storage charges for a JupyterLab space accrue until it is deleted.
Compute
Instance | Duration | Days | Total duration | Cost per hour | Total |
ml.g4dn.xlarge | 6 hours | 20 | 6 * 20 = 120 hours | $0.7364 | $88.368 |
Storage
You will be using General Purpose SSD storage for 480 hours (24 hours * 20 days). In a Region that charges $0.112 per GB-month:
$0.112 per GB-month * 5 GB * 480 hours / (24 hours/day * 30-day month) = $0.373
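The arithmetic in this example can be checked with a short script; the rates are copied from the example above and are illustrative, not current list prices:

```python
def compute_charge(hours_per_day: float, days: int, rate_per_hour: float) -> float:
    """Instance charge: billed only while the space is running."""
    return hours_per_day * days * rate_per_hour

def storage_charge(gb: float, hours: float, rate_per_gb_month: float) -> float:
    """Storage charge: accrues around the clock until the space is deleted."""
    return rate_per_gb_month * gb * hours / (24 * 30)  # 720-hour month

compute = compute_charge(6, 20, 0.7364)      # 120 hours of ml.g4dn.xlarge
storage = storage_charge(5, 24 * 20, 0.112)  # 5 GB of gp SSD for 480 hours
```

The same two helpers reproduce the Code Editor example below, since it uses identical usage and rates.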
Pricing example #2: Code Editor
As an ML engineer, you spend 20 days using Code Editor for ML production code editing, execution, and debugging for 6 hours per day on an ml.g4dn.xlarge instance. You create and then run a Code Editor space to access the Code Editor IDE. Compute is charged only for the instance used while the Code Editor space is running. Storage charges for a Code Editor space accrue until it is deleted.
Compute
Instance | Duration | Days | Total duration | Cost per hour | Total |
ml.g4dn.xlarge | 6 hours | 20 | 6 * 20 = 120 hours | $0.7364 | $88.368 |
Storage
You will be using General Purpose SSD storage for 480 hours (24 hours * 20 days). In a Region that charges $0.112 per GB-month:
$0.112 per GB-month * 5 GB * 480 hours / (24 hours/day * 30-day month) = $0.373
Pricing example #3: Studio Classic
A data scientist goes through the following sequence of actions while using notebooks in Amazon SageMaker Studio Classic.
- Opens notebook 1 in a TensorFlow kernel on an ml.c5.xlarge instance and then works on this notebook for 1 hour.
- Opens notebook 2 on an ml.c5.xlarge instance. It will automatically open in the same ml.c5.xlarge instance that is running notebook 1.
- Works on notebook 1 and notebook 2 simultaneously for 1 hour.
- The data scientist will be billed for a total of 2 hours of ml.c5.xlarge usage. For the overlapped hour where she worked on notebook 1 and notebook 2 simultaneously, each kernel application will be metered for 0.5 hours and she will be billed for 1 hour.
Kernel application | Notebook instance | Hours | Cost per hour | Total |
TensorFlow | ml.c5.xlarge | 1 | $0.204 | $0.204 |
TensorFlow | ml.c5.xlarge | 0.5 | $0.204 | $0.102 |
Data Science | ml.c5.xlarge | 0.5 | $0.204 | $0.102 |
Total | | | | $0.408 |
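The metering rule in this example — concurrent kernel applications sharing an instance split each wall-clock hour between them — can be sketched as follows (the rate is copied from the example):

```python
def metered_hours(wall_clock_hours: float, concurrent_apps: int) -> float:
    """Hours metered to EACH kernel application while `concurrent_apps`
    share the same instance; in total the instance is never billed for
    more than wall-clock time."""
    return wall_clock_hours / concurrent_apps

RATE = 0.204  # ml.c5.xlarge hourly rate quoted in the example

solo = metered_hours(1, 1)    # notebook 1 running alone for 1 hour
shared = metered_hours(1, 2)  # notebooks 1 and 2 overlapping for 1 hour
total = (solo + 2 * shared) * RATE
```

The same split applies to the RStudio example below, where two RSessions share one instance.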
Pricing example #4: RStudio
A data scientist goes through the following sequence of actions while using RStudio:
- Launches RSession 1 on an ml.c5.xlarge instance, then works on this notebook for 1 hour.
- Launches RSession 2 on an ml.c5.xlarge instance. It will automatically open in the same ml.c5.xlarge instance that is running RSession 1.
- Works on RSession 1 and RSession 2 simultaneously for 1 hour.
- The data scientist will be billed for a total of two (2) hours of ml.c5.xlarge usage. For the overlapped hour where she worked on RSession 1 and RSession 2 simultaneously, each RSession application will be metered for 0.5 hour and she will be billed for 1 hour.
Meanwhile, the RServer runs 24/7 whether or not any RSessions are running. If the admin chooses “Small” (ml.t3.medium), it is free of charge. If the admin chooses “Medium” (ml.c5.4xlarge) or “Large” (ml.c5.9xlarge), it is charged hourly as long as RStudio is enabled for the SageMaker Domain.
RSession app | RSession instance | Hours | Cost per hour | Total |
Base R | ml.c5.xlarge | 1 | $0.204 | $0.204 |
Base R | ml.c5.xlarge | 0.5 | $0.204 | $0.102 |
Base R | ml.c5.xlarge | 0.5 | $0.204 | $0.102 |
Total | | | | $0.408 |
Pricing example #5: Processing
Amazon SageMaker Processing only charges you for the instances used while your jobs are running. When you provide the input data for processing in Amazon S3, Amazon SageMaker downloads the data from Amazon S3 to local file storage at the start of a processing job.
The data analyst runs a processing job to preprocess and validate data on two ml.m5.4xlarge instances for a job duration of 10 minutes. She uploads a dataset of 100 GB in S3 as input for the processing job, and the output data (which is roughly the same size) is stored back in S3.
Hours | Processing instances | Cost per hour | Total |
1 * 2 * 0.167 = 0.334 | ml.m5.4xlarge | $0.922 | $0.308 |

General purpose (SSD) storage (GB) | Cost per hour | Total |
100 GB * 2 = 200 | $0.14 | $0.0032 |

The subtotal for the Amazon SageMaker Processing job = $0.308.
The subtotal for 200 GB of general purpose SSD storage = $0.0032.
The total price for this example would be $0.3112.
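A quick check of the compute charge in this example — instances are billed only while the job runs, and the rate is copied from the example:

```python
def processing_compute_charge(instances: int, job_hours: float, rate: float) -> float:
    """Instance count times per-instance job duration times hourly rate."""
    return instances * job_hours * rate

# Two ml.m5.4xlarge instances for a 10-minute (~0.167 h) job
cost = processing_compute_charge(2, 0.167, 0.922)
```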
Pricing example #6: Data Wrangler
As a data scientist, you spend three days using Amazon SageMaker Data Wrangler to cleanse, explore, and visualize your data for 6 hours per day, for a total of 18 hours. Additionally, you create a SageMaker Data Wrangler job to prepare updated data on a weekly basis. Each job lasts 40 minutes, and the job runs weekly for one month.
The table below summarizes your total usage for the month and the associated charges for using Amazon SageMaker Data Wrangler.
Application | SageMaker Studio instance | Days | Duration | Total duration | Cost per hour | Cost subtotal |
SageMaker Data Wrangler | ml.m5.4xlarge | 3 | 6 hours | 18 hours | $0.922 | $16.596 |
SageMaker Data Wrangler job | ml.m5.4xlarge | - | 40 minutes | 2.67 hours | $0.922 | $2.461 |
Total monthly charges for using Data Wrangler = $16.596 + $2.461 = $19.057
Pricing example #7: Feature Store
You have a web application that issues reads and writes of 25 KB each to the Amazon SageMaker Feature Store. For the first 10 days of a month, you receive little traffic to your application, resulting in 10,000 writes and 10,000 reads each day to the SageMaker Feature Store. On day 11 of the month, your application gains attention on social media and application traffic spikes to 200,000 writes and 200,000 reads that day. Your application then settles into a more regular traffic pattern, averaging 80,000 writes and 80,000 reads each day through the end of the month.
The table below summarizes your total usage for the month and the associated charges for using Amazon SageMaker Feature Store.
Day of the month | Total writes | Total write units | Total reads | Total read units |
Days 1 to 10 | 100,000 writes (10,000 writes * 10 days) | 2,500,000 (100,000 * 25 KB) | 100,000 reads (10,000 reads * 10 days) | 700,000++ (100,000 * 25/4 KB) |
Day 11 | 200,000 writes | 5,000,000 (200,000 * 25 KB) | 200,000 reads | 1,400,000++ (200,000 * 25/4 KB) |
Days 12 to 30 | 1,520,000 writes (80,000 writes * 19 days) | 38,000,000 (1,520,000 * 25 KB) | 1,520,000 reads (80,000 reads * 19 days) | 10,640,000++ (1,520,000 * 25/4 KB) |
Total chargeable units | | 45,500,000 write units | | 12,740,000 read units |
Monthly charges for writes = $56.875 (45.5 million write units * $1.25 per million write units)
Monthly charges for reads = $3.185 (12.74 million read units * $0.25 per million read units)
++ All fractional read units are rounded up to the next whole number
Data storage
Total data stored = 31.5 GB
Monthly charges for data storage = 31.5 GB * $0.45 = $14.175
Total monthly charges for Amazon SageMaker Feature Store = $56.875 + $3.185 + $14.175 = $74.235
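The monthly totals in this example can be reproduced with a short script (25 KB records; write units metered per 1 KB and read units per 4 KB, rounded up; prices copied from the example):

```python
import math

RECORD_KB = 25
WRITE_UNITS_PER_OP = math.ceil(RECORD_KB / 1)  # 25 units per write
READ_UNITS_PER_OP = math.ceil(RECORD_KB / 4)   # 7 units per read (rounded up)

# (operations per day, number of days) -- identical for writes and reads here
traffic = [(10_000, 10), (200_000, 1), (80_000, 19)]
ops = sum(per_day * days for per_day, days in traffic)

write_units = ops * WRITE_UNITS_PER_OP
read_units = ops * READ_UNITS_PER_OP
write_cost = write_units / 1_000_000 * 1.25  # $1.25 per million write units
read_cost = read_units / 1_000_000 * 0.25    # $0.25 per million read units
storage_cost = 31.5 * 0.45                   # 31.5 GB at $0.45 per GB-month
total = write_cost + read_cost + storage_cost
```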
Pricing example #8: Training
A data scientist has spent a week working on a model for a new idea. She trains the model 4 times on an ml.m4.4xlarge for 30 minutes per training run with Amazon SageMaker Debugger enabled, using 2 built-in rules and 1 custom rule that she wrote. For the custom rule, she specified an ml.m5.xlarge instance. She trains using 3 GB of training data in Amazon S3 and pushes 1 GB of model output into Amazon S3. SageMaker creates general-purpose SSD (gp2) volumes for each training instance and for each rule specified; in this example, a total of 4 general-purpose SSD (gp2) volumes are created. SageMaker Debugger emits 1 GB of debug data to the customer’s Amazon S3 bucket.
Hours | Training instance | Debug instance | Cost per hour | Subtotal |
4 * 0.5 = 2 | ml.m4.4xlarge | n/a | $0.96 | $1.92 |
4 * 0.5 * 2 = 4 | n/a | No additional charges for built-in rule instances | $0 | $0 |
4 * 0.5 = 2 | n/a | ml.m5.xlarge | $0.23 | $0.46 |
General purpose (SSD) storage for training (GB) | General purpose (SSD) storage for debugger built-in rules (GB) | General purpose (SSD) storage for debugger custom rules (GB) | Cost per GB-month | Subtotal |
3 | 2 (no additional charges for built-in rule storage volumes) | 1 | $0.10 | $0 |
The total charges for training and debugging in this example are $2.38. The compute instances and general purpose storage volumes used by Amazon SageMaker Debugger built-in rules do not incur additional charges.
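The $2.38 total can be reproduced as follows (rates copied from the example; built-in Debugger rules incur no compute charge):

```python
RUNS, HOURS_PER_RUN = 4, 0.5  # four 30-minute training runs

training = RUNS * HOURS_PER_RUN * 0.96          # ml.m4.4xlarge training time
builtin_rules = RUNS * HOURS_PER_RUN * 2 * 0.0  # 2 built-in rules: no charge
custom_rule = RUNS * HOURS_PER_RUN * 0.23       # ml.m5.xlarge for the custom rule
total = training + builtin_rules + custom_rule
```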
-
Pricing example #9: MLflow
You have two teams of data scientists: one with 10 data scientists and the other with 40. To accommodate the two teams, you enable two MLflow Tracking Servers: one Small and one Medium. Each team conducts machine learning (ML) experiments and needs to record the metrics, parameters, and artifacts produced by its training runs, using its MLflow Tracking Server for 160 hours per month. Assuming each team stores 1 GB of metadata to track runs in experiments, the bill at the end of the month would be calculated as follows:
Compute charges for Small Instance: 160 * $0.60 = $96
Compute charges for Medium Instance: 160 * $1.04 = $166.40
Storage charges for two teams: 2 * 1 GB * $0.10 = $0.20
Total = $96 + $166.40 + $0.20 = $262.60
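The same bill can be sketched in Python. The Medium rate of $1.04/hour is the value implied by the $166.40 subtotal and the $262.60 total in this example; check current pricing for authoritative rates.

```python
# Pricing example #9: two MLflow Tracking Servers (rates implied by this example).
SMALL_RATE = 0.60    # $/hour, Small tracking server
MEDIUM_RATE = 1.04   # $/hour, Medium tracking server (implied by the $166.40 subtotal)
STORAGE_RATE = 0.10  # $/GB-month of tracking metadata

hours = 160
small = hours * SMALL_RATE       # $96.00
medium = hours * MEDIUM_RATE     # $166.40
storage = 2 * 1 * STORAGE_RATE   # two teams x 1 GB metadata -> $0.20

total = round(small + medium + storage, 2)
print(total)  # 262.6
```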
-
Pricing example #10: Real-time inference
The subtotal for hosting and monitoring = $305.86. The subtotal for 3,100 MB of data processed In and 310 MB of data processed Out for hosting per month = $0.05. The total charges for this example would be approximately $305.91 per month.
Note: for built-in rules with an ml.m5.xlarge instance, you get up to 30 hours of monitoring, aggregated across all endpoints, each month at no charge.
Hours per month | Hosting instances | Model Monitor instances | Cost per hour | Total |
24 * 31 * 2 = 1,488 | ml.c5.xlarge | n/a | $0.204 | $303.55 |
31 * 0.08 = 2.5 | n/a | ml.m5.4xlarge | $0.922 | $2.31 |

Data In per month (hosting) | Data Out per month (hosting) | Cost per GB In or Out | Total |
100 MB * 31 = 3,100 MB | n/a | $0.016 | $0.0496 |
n/a | 10 MB * 31 = 310 MB | $0.016 | $0.00496 |

The model in example #5 is then deployed to production on two (2) ml.c5.xlarge instances for reliable multi-AZ hosting. Amazon SageMaker Model Monitor is enabled with one (1) ml.m5.4xlarge instance, and monitoring jobs are scheduled once per day. Each monitoring job takes 5 minutes to complete. The model receives 100 MB of data per day, and inferences are 1/10 the size of the input data.
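A quick sanity check of the arithmetic, using the hourly and per-GB rates listed in this example (the table's rounded 2.5 monitoring hours are used here as well):

```python
# Pricing example #10: real-time hosting plus Model Monitor (rates from this example).
HOSTING_RATE = 0.204  # $/hour, ml.c5.xlarge
MONITOR_RATE = 0.922  # $/hour, ml.m5.4xlarge
DATA_RATE = 0.016     # $/GB in or out

days = 31
hosting = 24 * days * 2 * HOSTING_RATE  # two instances, full month -> ~$303.55
monitoring = 2.5 * MONITOR_RATE         # ~2.5 monitoring hours -> ~$2.31
data = (3.1 + 0.31) * DATA_RATE         # 3,100 MB in + 310 MB out -> ~$0.05

total = round(hosting + monitoring + data, 2)
print(total)  # 305.91
```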
-
Pricing example #11: Asynchronous Inference
The subtotal for SageMaker Asynchronous Inference = $15.81 + $0.56 + 2 * $0.0048 = $16.38. The total Asynchronous Inference charges for this example would be $16.38 per month.
Hours per month | Hosting instances | Cost per hour | Total |
2.5 * 31 * 1 = 77.5 | ml.c5.xlarge | $0.204 | $15.81 |

General-purpose (SSD) storage (GB) | Cost per GB-month | Total |
4 | $0.14 | $0.56 |

Data In per month | Data Out per month | Cost per GB In or Out | Total |
10 KB * 1,024 * 31 = 310 MB | n/a | $0.016 | $0.0048 |
n/a | 10 KB * 1,024 * 31 = 310 MB | $0.016 | $0.0048 |

Amazon SageMaker Asynchronous Inference charges you for the instances used by your endpoint. When the endpoint is not actively processing requests, you can configure auto-scaling to scale the instance count to zero to save on costs. For input payloads in Amazon S3, there is no cost for reading input data from Amazon S3 or writing output data to S3 in the same Region.
The model in example #5 is used to run a SageMaker Asynchronous Inference endpoint. The endpoint is configured to run on 1 ml.c5.xlarge instance and to scale the instance count down to zero when not actively processing requests. The ml.c5.xlarge instance in the endpoint has 4 GB of general-purpose (SSD) storage attached to it. In this example, the endpoint maintains an instance count of 1 for 2 hours per day and has a cooldown period of 30 minutes, after which it scales down to an instance count of zero for the rest of the day. Therefore, you are charged for 2.5 hours of usage per day.
The endpoint processes 1,024 requests per day. The size of each invocation request/response body is 10 KB, and each inference request payload in Amazon S3 is 100 MB. Inference outputs are 1/10 the size of the input data, which are stored back in Amazon S3 in the same Region. In this example, the data processing charges apply to the request and response body, but not to the data transferred to/from Amazon S3.
-
Pricing example #12: Batch Transform
The total charges for inference in this example would be $2.88.
Hours | Hosting instances | Cost per hour | Total |
3 * 0.25 * 4 = 3 hours | ml.m4.4xlarge | $0.96 | $2.88 |

The model in example #5 is used to run SageMaker Batch Transform. The data scientist runs four separate SageMaker Batch Transform jobs on 3 ml.m4.4xlarge instances for 15 minutes per job run. She uploads an evaluation dataset of 1 GB to S3 for each run; inferences are 1/10 the size of the input data and are stored back in S3.
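In code, the batch transform bill is just instance-hours times the hourly rate (rate as quoted in this example):

```python
# Pricing example #12: batch transform (rates from this example).
RATE_M4_4XLARGE = 0.96  # $/hour

jobs = 4
instances_per_job = 3
hours_per_job = 0.25  # 15-minute runs

instance_hours = jobs * instances_per_job * hours_per_job  # 3.0 instance-hours
total = round(instance_hours * RATE_M4_4XLARGE, 2)
print(total)  # 2.88
```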
-
Pricing example #13: On-demand Serverless Inference
Monthly data processing charges
Data processing (GB) | Cost per GB In or Out | Monthly data processing charge |
10 GB | $0.016 | $0.16 |

The subtotal for the on-demand SageMaker Serverless Inference duration charge = $40. The subtotal for the 10 GB data processing charge = $0.16. The total charges for this example would be $40.16.
Monthly compute charges
Number of requests | Duration of each request | Total inference duration (sec) | Cost per sec | Monthly inference duration charge |
10M | 100 ms | 1M | $0.00004 | $40 |
With on-demand Serverless Inference, you only pay for the compute capacity used to process inference requests, billed by the millisecond, and the amount of data processed. The compute charge depends on the memory configuration you choose.
If you allocated 2 GB of memory to your endpoint, executed it 10 million times in one month and it ran for 100 ms each time, and processed 10 GB of Data-In/Out total, your charges would be calculated as follows:
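That calculation can be reproduced directly (rates as quoted in this example for the 2 GB memory configuration):

```python
# Pricing example #13: on-demand serverless inference (rates from this example).
COMPUTE_RATE = 0.00004  # $/second at the 2 GB memory tier
DATA_RATE = 0.016       # $/GB in or out

requests = 10_000_000
seconds_per_request = 0.1  # 100 ms per invocation

duration = requests * seconds_per_request * COMPUTE_RATE  # 1M sec -> $40
data = 10 * DATA_RATE                                     # 10 GB -> $0.16

total = round(duration + data, 2)
print(total)  # 40.16
```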
-
Pricing example #14: Provisioned Concurrency on Serverless Inference
Let’s assume you are running a chatbot service for a payroll processing company. You expect a spike in customer inquiries at the end of March, before the tax filing deadline, while traffic for the rest of the month is expected to be low. You therefore deploy a serverless endpoint with 2 GB of memory and add a Provisioned Concurrency of 100 for the last 5 days of the month, from 9am to 5pm (8 hours), during which your endpoint processes 10M requests and 10 GB of data in/out in total. For the rest of the month, the chatbot runs on on-demand Serverless Inference and processes 3M requests and 3 GB of data in/out. Let’s assume the duration of each request is 100 ms.
Provisioned Concurrency (PC) charges

PC price is $0.000010/sec
PC usage duration (sec) = 5 days * 100 PC * 8 hrs * 3,600 sec = 14,400,000 sec
PC usage charge = 14,400,000 sec * $0.000010/sec = $144

Inference duration charges for traffic served by Provisioned Concurrency

Inference duration price is $0.000023/sec
Total inference duration for PC (sec) = 10M * 100 ms / 1,000 = 1M sec
Inference duration charges for PC = 1,000,000 sec * $0.000023/sec = $23

On-demand inference duration charges
The monthly compute price is $0.00004/sec, and the free tier provides 150k sec.
Total compute (sec) = 3M * 100 ms / 1,000 = 0.3M sec
Total compute - free tier compute = monthly billable compute in sec
0.3M sec - 150k sec = 150k sec
Monthly compute charges = 150k sec * $0.00004/sec = $6

Data processing charges
Cost per GB of data processed In/Out = $0.016
Total GB processed = 10 + 3 = 13
Total cost = $0.016 * 13 = $0.208
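Putting the four line items together, a minimal Python sketch of the March bill (rates as quoted in this example; the exact sum is $173.208):

```python
# Pricing example #14: Provisioned Concurrency month (rates from this example).
PC_RATE = 0.000010            # $/second of Provisioned Concurrency
PC_INFERENCE_RATE = 0.000023  # $/second of inference served by PC
ON_DEMAND_RATE = 0.00004      # $/second of on-demand inference
DATA_RATE = 0.016             # $/GB in or out
FREE_TIER_SECONDS = 150_000   # free-tier on-demand inference duration

pc_seconds = 5 * 100 * 8 * 3600                 # 5 days x 100 PC x 8 h = 14,400,000 sec
pc_usage = pc_seconds * PC_RATE                 # $144
pc_inference_seconds = 10_000_000 * 100 / 1000  # 10M requests x 100 ms = 1M sec
pc_inference = pc_inference_seconds * PC_INFERENCE_RATE        # $23
od_seconds = 3_000_000 * 100 / 1000 - FREE_TIER_SECONDS        # 150k billable sec
on_demand = od_seconds * ON_DEMAND_RATE                        # $6
data = (10 + 3) * DATA_RATE                                    # 13 GB -> $0.208

total = round(pc_usage + pc_inference + on_demand + data, 2)
print(total)  # 173.21
```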
Total charges for March

Total charges = Provisioned Concurrency charges + inference duration charges for Provisioned Concurrency + on-demand inference duration charges + data processing charges
= $144 + $23 + $6 + $0.208 = $173.21
-
Pricing example #15: JumpStart
A customer uses JumpStart to deploy a pre-trained BERT Base Uncased model to classify customer review sentiment as positive or negative.
The customer deploys the model to two (2) ml.c5.xlarge instances for reliable multi-AZ hosting. The model receives 100 MB of data per day, and inferences are 1/10 the size of the input data.
Hours per month | Hosting instances | Cost per hour | Total |
24 * 31 * 2 = 1,488 | ml.c5.xlarge | $0.204 | $303.55 |

Data In per month (hosting) | Data Out per month (hosting) | Cost per GB In or Out | Total |
100 MB * 31 = 3,100 MB | n/a | $0.02 | $0.06 |
n/a | 10 MB * 31 = 310 MB | $0.02 | $0.01 |

The subtotal for hosting = $303.55. The subtotal for 3,100 MB of data processed In and 310 MB of data processed Out for hosting per month = $0.07. The total charges for this example would be approximately $303.62 per month.
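A sketch of the JumpStart hosting bill, using the rates listed in this example (the $0.02/GB data rate is the one quoted here; other examples on this page quote $0.016/GB):

```python
# Pricing example #15: JumpStart hosting (rates from this example).
HOSTING_RATE = 0.204  # $/hour, ml.c5.xlarge
DATA_RATE = 0.02      # $/GB in or out, as listed in this example

hosting = 24 * 31 * 2 * HOSTING_RATE  # two instances, full month -> ~$303.55
data = (3.1 + 0.31) * DATA_RATE       # 3,100 MB in + 310 MB out -> ~$0.07

total = round(hosting + data, 2)
print(total)  # 303.62
```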
-
Pricing example #16: HyperPod Cluster
Let's say you want to provision a cluster of 4 ml.g5.24xlarge instances for 1 month (30 days), with an additional 100 GB of storage per instance to support model development. The total charges for the cluster and additional storage in this example are $29,374.40.

Compute

Instance | Duration | Instances | Cost per hour | Subtotal |
ml.g5.24xlarge | 30 days * 24 hours = 720 hours | 4 | $10.18 | $29,318.40 |

Storage

General purpose (SSD) storage | Duration | Instances | Cost per GB-month | Subtotal |
100 GB | 1 month (30 days) | 4 | $0.14 | $56.00 |
-
Pricing example #17: Foundation model evaluations (automatic evaluation)
Foundation model evaluations with SageMaker Clarify charge you only for the instances used while your automatic evaluation jobs are running. When you select an automatic evaluation task and dataset, SageMaker loads the prompt dataset from Amazon S3 onto a SageMaker evaluation instance.
In the following example, an ML engineer runs an evaluation of the Llama 2 7B model in US East (N. Virginia) for summarization task accuracy. The recommended instance type for Llama 2 7B inference is ml.g5.2xlarge, and the recommended minimum instance for an evaluation is ml.m5.2xlarge. In this example, the job runs for 45 minutes (the duration depends on the size of the dataset), and the cost would be $1.48 for the evaluation job and detailed results.
Processing job hours | Region | Instance type | Instance | Cost per hour | Cost |
0.75 (45 minutes) | us-east-1 | LLM hosting | ml.g5.2xlarge | $1.52 | $1.14 |
0.75 (45 minutes) | us-east-1 | Evaluation | ml.m5.2xlarge | $0.46 | $0.35 |
Total | | | | | $1.48 |
In the next example, the same engineer runs another evaluation job for summarization task accuracy, but uses a customized version of Llama 2 7B that is already deployed to their account and running. In this case, because the model is already deployed, the only incremental cost is the evaluation instance.
Processing job hours | Region | Instance type | Instance | Cost per hour | Cost |
0.75 (45 minutes) | us-east-1 | Evaluation | ml.m5.2xlarge | $0.46 | $0.35 |
Total | | | | | $0.35 |
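Both jobs can be recomputed from the listed hourly rates (45 minutes = 0.75 hours). The exact figures are $1.485 and $0.345, which the tables above round to $1.48 and $0.35.

```python
# Pricing example #17: automatic evaluation jobs (rates from this example).
LLM_HOSTING_RATE = 1.52  # $/hour, ml.g5.2xlarge serving the model under evaluation
EVALUATION_RATE = 0.46   # $/hour, ml.m5.2xlarge evaluation instance

job_hours = 0.75  # a 45-minute evaluation job

# Fresh deployment: pay for model hosting plus the evaluation instance.
with_hosting = job_hours * (LLM_HOSTING_RATE + EVALUATION_RATE)
# Model already deployed in the account: only the evaluation instance is incremental.
existing_endpoint = job_hours * EVALUATION_RATE

print(with_hosting, existing_endpoint)
```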
-
Pricing example #18: Foundation model evaluations (human-based evaluation)
In the following example, a machine learning engineer in US East (N. Virginia) runs a human-based evaluation of Llama-2-7B for summarization task accuracy and uses their own private workforce to perform the evaluation. The recommended instance type for Llama-2-7B is ml.g5.2xlarge; the recommended minimum instance for a human-based evaluation processing job is ml.t3.medium. Inference on Llama-2-7B runs for 45 minutes (the duration depends on the size of the dataset). The dataset contains 50 prompts, and the engineer requires 2 workers to rate each prompt-response set (configurable in evaluation job creation as the “workers per prompt” parameter). There are therefore 100 tasks in this evaluation job (1 task per prompt-response pair per worker: 2 workers x 50 prompt-response sets = 100 human tasks). The human workforce takes one day (24 hours) to complete all 100 human evaluation tasks (this depends on the number and skill level of workers and the length and complexity of prompts and inference responses).
Compute hours | Human tasks | Region | Instance type | Instance | Cost per hour | Cost per human task | Total cost |
0.75 (45 minutes) | n/a | US East (N. Virginia) | LLM hosting | ml.g5.2xlarge | $1.52 | n/a | $1.14 |
24 | n/a | US East (N. Virginia) | Processing job | ml.t3.medium | $0.05 | n/a | $1.20 |
n/a | 100 | Any | n/a | n/a | n/a | $0.21 | $21.00 |
Total | | | | | | | $23.34 |
In the next example, the same engineer in US East (N. Virginia) runs the same evaluation job but uses a Llama-2-7B model that is already deployed to their account and running. In this case, the only incremental costs are the evaluation processing job and the human tasks.
Compute hours | Human tasks | Region | Instance type | Instance | Cost per hour | Cost per human task | Total cost |
24 | n/a | US East (N. Virginia) | Processing job | ml.t3.medium | $0.05 | n/a | $1.20 |
n/a | 100 | Any | n/a | n/a | n/a | $0.21 | $21.00 |
Total | | | | | | | $22.20 |
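The two human-based evaluation bills above can be sketched as follows (rates as quoted in this example):

```python
# Pricing example #18: human-based evaluation (rates from this example).
LLM_HOSTING_RATE = 1.52  # $/hour, ml.g5.2xlarge hosting the model
PROCESSING_RATE = 0.05   # $/hour, ml.t3.medium processing job
HUMAN_TASK_RATE = 0.21   # $ per human task

prompts = 50
workers_per_prompt = 2
tasks = prompts * workers_per_prompt  # 100 human tasks

hosting = 0.75 * LLM_HOSTING_RATE     # 45 minutes of inference -> $1.14
processing = 24 * PROCESSING_RATE     # 24-hour workforce window -> $1.20
human_tasks = tasks * HUMAN_TASK_RATE  # $21.00

total_with_hosting = round(hosting + processing + human_tasks, 2)  # model spun up
total_existing = round(processing + human_tasks, 2)                # model already deployed
print(total_with_hosting, total_existing)  # 23.34 22.2
```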