AWS Cloud Operations Blog

Automating metrics collection on Amazon EKS with Amazon Managed Service for Prometheus managed scrapers

Managing and operating monitoring systems for containerized applications, including metrics collection, can be a significant operational burden for customers. As container environments scale, customers have to split metric collection across multiple collectors, right-size the collectors to handle peak loads, and continuously manage, patch, secure, and operationalize these collectors. This overhead can detract from an […]

Ingesting administrative logs from Microsoft Azure to AWS CloudTrail Lake

In January 2023, AWS announced support for ingesting activity events from non-AWS sources using CloudTrail Lake, making CloudTrail Lake a single location of immutable user and API activity events for auditing and security investigations. AWS CloudTrail Lake is a managed data lake for capturing, storing, accessing, and analyzing user and API activity on […]

Enable cloud operations workflows with generative AI using Agents for Amazon Bedrock and Amazon CloudWatch Logs

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible […]

Serverless Governance of Software Deployed with AWS Service Catalog

AWS Service Catalog (Service Catalog) is a powerful tool that empowers organizations to manage and govern approved services and resources. It significantly benefits platform engineering by standardizing environments, accelerating service delivery, and enhancing security. With its automated provisioning and resource management, Service Catalog supports infrastructure as code, enabling scalable, reliable deployments. Platform engineering teams are […]

Getting insights from Amazon Managed Service for Prometheus using natural language powered by Amazon Bedrock

As applications scale, customers need more automated practices to maintain application availability and reduce the time and effort spent detecting, debugging, and resolving operational issues. Organizations allocate money and developer time to deploy and manage various monitoring tools, while also dedicating considerable effort to training teams on their usage. When issues arise, operators navigate through […]

Gain visibility of AWS backup activities using Amazon Managed Grafana

AWS Backup is a comprehensive service that simplifies the process of centralizing and automating data protection across various AWS services, both in the cloud and on-premises, all managed seamlessly. Organizations have different requirements and want to track their backup, copy and restore activities across AWS cloud resources. Currently, in order to view status of resource […]

How Amazon CloudWatch Logs Data Protection can help detect and protect sensitive log data

Customer applications running on Amazon Web Services (AWS) often handle sensitive data such as personally identifiable information (PII) or protected health information (PHI). As a result, sensitive data can be intentionally or unintentionally logged as part of an application’s observability data. While comprehensive logging is important for application troubleshooting, monitoring, and forensics, any […]

Visualize AWS Systems Manager Patch Manager information using Amazon QuickSight

In this blog post, learn how to build an Amazon QuickSight dashboard that visualizes critical patch and inventory information to help reduce MTTR. You can also use filters to search for a specific AWS account, AWS Region, or Amazon Elastic Compute Cloud (Amazon EC2) name, or to check installed or missing packages. You want to visualize system patching […]

Leveraging AWS CloudTrail Insights for Proactive API Monitoring and Cost Optimization

AWS CloudTrail Insights is a powerful feature within AWS CloudTrail that helps organizations identify and respond to unusual operational activity in their AWS accounts. This includes identifying spikes in resource provisioning, bursts of IAM actions, or gaps in periodic maintenance activity. CloudTrail Insights continuously analyzes CloudTrail management events from trails and event data stores, establishing […]