AWS - S3 Storage

Scalable Data Storage for Modern Cloud Applications

What is Amazon S3?

Amazon Simple Storage Service (S3) is an Object Storage service that offers industry-leading scalability, data availability, security, and performance. Unlike Block Storage (EBS), which exposes fixed-size blocks on a volume attached to a server, S3 stores data as "Objects" within "Buckets."

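An Object consists of the file data, a unique Key (its name), and Metadata (information about the file). A Bucket is the container for these objects. It resembles a top-level folder, but each bucket lives in a single AWS Region, its name belongs to a global namespace, and it can hold a virtually unlimited number of objects.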
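
To make the three parts of an object concrete, here is a minimal sketch that assembles the arguments for an upload. The bucket name, key, and metadata values are hypothetical, and the helper `build_put_object_request` is illustrative only; the resulting dictionary matches the keyword arguments that boto3's `put_object` call accepts.

```python
from typing import Any, Dict


def build_put_object_request(bucket: str, key: str, data: bytes,
                             metadata: Dict[str, str]) -> Dict[str, Any]:
    """Assemble the keyword arguments for an S3 PutObject call."""
    return {
        "Bucket": bucket,      # container the object lives in
        "Key": key,            # unique name of the object within the bucket
        "Body": data,          # the file data itself
        "Metadata": metadata,  # user-defined key/value pairs stored with the object
    }


request = build_put_object_request(
    bucket="example-bucket",         # hypothetical bucket name
    key="reports/2024/summary.csv",  # "/" is simply part of the key
    data=b"id,total\n1,42\n",
    metadata={"department": "finance"},
)

# With boto3 installed and credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**request)
```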

1. Managing Data in S3

S3 has no volumes to create. Instead, you create Buckets, and every object in S3 must reside in a bucket.

How to "Provision" Storage (Create Bucket):
  1. Global Uniqueness: Every Bucket Name must be unique across all AWS accounts and Regions (within an AWS partition), not just within your own account.
  2. Region Selection: Choose a specific AWS Region to store your data (e.g., us-east-1). This helps with latency and compliance.
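  3. Access Control: By default, new buckets and their objects are private, and Block Public Access is enabled. You must explicitly grant permissions (for example, with IAM policies or a bucket policy) to make them public or accessible to other accounts and services.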
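
The steps above can be sketched as a small request builder. This is an illustration under stated assumptions, not a complete validator: the name check is a simplified version of the bucket naming rules, and `build_create_bucket_request` is a hypothetical helper. The returned dictionary matches the keyword arguments of boto3's `create_bucket`.

```python
import re
from typing import Any, Dict

# Simplified naming check: 3-63 characters, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def build_create_bucket_request(name: str, region: str) -> Dict[str, Any]:
    """Assemble the keyword arguments for an S3 CreateBucket call."""
    if not _BUCKET_NAME.match(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    request: Dict[str, Any] = {"Bucket": name}
    # us-east-1 is the default location and must not be sent as a
    # LocationConstraint; every other Region must be named explicitly.
    if region != "us-east-1":
        request["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return request


# With boto3 installed and credentials configured:
#   import boto3
#   request = build_create_bucket_request("example-bucket", "eu-west-1")
#   boto3.client("s3", region_name="eu-west-1").create_bucket(**request)
```

The call fails if another account already owns the name, which is the global-uniqueness rule in practice. No access settings are passed here, so the new bucket keeps the private defaults.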

2. Backup Mechanisms (The "Snapshot" Equivalent)

In S3, you don't take "snapshots." Instead, you protect data using Versioning and Cross-Region Replication (CRR).

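  • Versioning: When enabled, S3 keeps multiple versions of an object in the same bucket. If you overwrite a file, the old version is preserved; if you delete it, S3 adds a delete marker instead of erasing the data.
  • Cross-Region Replication: S3 can automatically copy new objects uploaded to a bucket in one region to a destination bucket in a different region. Versioning must be enabled on both buckets, and objects that existed before the rule was created are not copied automatically.

Steps to "Protect" Data: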
  1. Enable Versioning: Go to Bucket Properties and enable "Bucket Versioning."
  2. Configure Replication: In the Management tab, create a replication rule to copy data to another bucket for disaster recovery.
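
The same two steps can be done programmatically. The sketch below assumes `s3` is a boto3 S3 client (for example `boto3.client("s3")`), that both buckets already exist, and that `role_arn` names an IAM role S3 is allowed to assume for replication. The bucket names, rule ID, and the helper functions themselves are hypothetical.

```python
from typing import Any, Dict


def versioning_request(bucket: str) -> Dict[str, Any]:
    """Arguments for put_bucket_versioning that turn versioning on."""
    return {"Bucket": bucket, "VersioningConfiguration": {"Status": "Enabled"}}


def replication_request(source_bucket: str, destination_bucket: str,
                        role_arn: str) -> Dict[str, Any]:
    """Arguments for put_bucket_replication with one replicate-everything rule."""
    return {
        "Bucket": source_bucket,
        "ReplicationConfiguration": {
            "Role": role_arn,
            "Rules": [{
                "ID": "replicate-all-for-dr",   # hypothetical rule name
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {"Prefix": ""},        # empty prefix matches every key
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }],
        },
    }


def protect_bucket(s3: Any, source_bucket: str, destination_bucket: str,
                   role_arn: str) -> None:
    """Enable versioning on both buckets, then add the replication rule."""
    # Replication requires versioning on the source and the destination.
    s3.put_bucket_versioning(**versioning_request(source_bucket))
    s3.put_bucket_versioning(**versioning_request(destination_bucket))
    s3.put_bucket_replication(
        **replication_request(source_bucket, destination_bucket, role_arn)
    )
```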

3. Restoring Data

Restoring data in S3 means retrieving a specific version of an object or moving data back from archival storage.

Steps to Restore a Version:
  1. List Versions: In the S3 console, toggle the "Show versions" switch in your bucket.
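  2. Download/Copy: Find the version that was current at the point in time you need, then download it or note its Version ID.
  3. Make Current: Copy the old version onto the same key so that it becomes the "current" version, which undoes an accidental overwrite. To undo an accidental deletion, delete the delete marker instead; the previous version then becomes current again.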
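
A minimal sketch of the same restore flow in code, assuming `s3` is a boto3 S3 client and the bucket has versioning enabled. The helpers `version_as_of` and `restore_version` are hypothetical. The sketch reads only the first page of results from `list_object_versions` and does not handle delete markers.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def version_as_of(versions: List[Dict[str, Any]], key: str,
                  point_in_time: datetime) -> Optional[Dict[str, Any]]:
    """Return the newest version of `key` written at or before `point_in_time`."""
    candidates = [
        v for v in versions
        if v["Key"] == key and v["LastModified"] <= point_in_time
    ]
    return max(candidates, key=lambda v: v["LastModified"], default=None)


def restore_version(s3: Any, bucket: str, key: str,
                    point_in_time: datetime) -> Optional[str]:
    """Copy the version that was current at `point_in_time` back onto `key`.

    Returns the restored VersionId, or None if no version is old enough.
    `point_in_time` must be timezone-aware, because S3 timestamps are.
    """
    listing = s3.list_object_versions(Bucket=bucket, Prefix=key)
    chosen = version_as_of(listing.get("Versions", []), key, point_in_time)
    if chosen is None:
        return None
    # Copying an old version onto the same key creates a new current version;
    # the existing history is left untouched.
    s3.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key, "VersionId": chosen["VersionId"]},
    )
    return chosen["VersionId"]


# Example call, with a hypothetical bucket and key:
#   restore_version(s3, "example-bucket", "reports/summary.csv",
#                   datetime(2024, 1, 15, tzinfo=timezone.utc))
```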

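Note: Objects archived in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes cannot be downloaded directly. You must first initiate a "Restore Request," which creates a temporary readable copy for the number of days you specify; the object itself stays in its Glacier storage class. Retrieval can take from minutes to many hours, depending on the storage class and retrieval tier.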
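
A restore request can be sketched as follows. The helper `build_restore_request`, the default of 7 days, and the example bucket and key are assumptions for illustration; the dictionary matches the keyword arguments of boto3's `restore_object`.

```python
from typing import Any, Dict

_TIERS = {"Expedited", "Standard", "Bulk"}


def build_restore_request(bucket: str, key: str, days: int = 7,
                          tier: str = "Standard") -> Dict[str, Any]:
    """Assemble the keyword arguments for an S3 RestoreObject call."""
    if tier not in _TIERS:
        raise ValueError(f"unknown retrieval tier: {tier!r}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Bucket": bucket,
        "Key": key,
        "RestoreRequest": {
            "Days": days,  # how long the temporary copy stays available
            "GlacierJobParameters": {"Tier": tier},  # speed/cost trade-off
        },
    }


# With boto3 installed and credentials configured:
#   import boto3
#   s3 = boto3.client("s3")
#   s3.restore_object(**build_restore_request("example-bucket", "archive/2019.tar"))
#   # The "Restore" field of head_object reports whether the job is still running.
#   status = s3.head_object(Bucket="example-bucket", Key="archive/2019.tar").get("Restore")
```

The Expedited tier is not available for S3 Glacier Deep Archive, so Standard or Bulk are the safer choices for that class.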

Common S3 Use Cases

Static Website Hosting

S3 can host entire websites consisting of HTML, CSS, and JavaScript files without needing a web server (like Apache or Nginx).
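
As a rough sketch, website hosting is switched on with a small configuration on the bucket. The helper `build_website_request` and the document names are assumptions; the dictionary matches the keyword arguments of boto3's `put_bucket_website`.

```python
from typing import Any, Dict


def build_website_request(bucket: str, index_document: str = "index.html",
                          error_document: str = "error.html") -> Dict[str, Any]:
    """Assemble the keyword arguments for an S3 PutBucketWebsite call."""
    return {
        "Bucket": bucket,
        "WebsiteConfiguration": {
            # Served when a request targets the site root or a "directory" path.
            "IndexDocument": {"Suffix": index_document},
            # Served when the requested key does not exist or is not readable.
            "ErrorDocument": {"Key": error_document},
        },
    }


# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_website(**build_website_request("example-site-bucket"))
```

This only enables the website endpoint. Visitors still need read access to the files, which means either a bucket policy that allows public reads or a CDN placed in front of the bucket.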

Data Lakes & Analytics

S3 is a common storage layer for Big Data and data lakes. Services like Amazon Athena can run SQL queries directly on data stored in S3 buckets, without loading it into a database first.
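
A minimal sketch of submitting such a query. The database name, table name, and result location are hypothetical, and `build_athena_query` is an illustrative helper; the dictionary matches the keyword arguments of `start_query_execution` on boto3's Athena client.

```python
from typing import Any, Dict


def build_athena_query(sql: str, database: str,
                       output_location: str) -> Dict[str, Any]:
    """Assemble the keyword arguments for an Athena StartQueryExecution call."""
    if not output_location.startswith("s3://"):
        raise ValueError("Athena writes query results to an s3:// location")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Results are written back to S3 as well.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


query = build_athena_query(
    sql="SELECT region, SUM(total) AS revenue FROM orders GROUP BY region",
    database="sales_lake",                                   # hypothetical database
    output_location="s3://example-athena-results/queries/",  # hypothetical bucket
)

# With boto3 installed and credentials configured:
#   import boto3
#   execution = boto3.client("athena").start_query_execution(**query)
#   query_id = execution["QueryExecutionId"]
```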

Backup and Archiving

S3 Glacier Deep Archive can replace physical tape libraries for long-term data retention at a fraction of the cost.
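
One common way to move data into Deep Archive is a lifecycle rule that transitions objects after a set age. The sketch below is one possible setup, not the only one: the 90-day threshold, the rule ID, and the helper `build_archive_lifecycle_request` are assumptions. The dictionary matches the keyword arguments of boto3's `put_bucket_lifecycle_configuration`.

```python
from typing import Any, Dict


def build_archive_lifecycle_request(bucket: str, after_days: int = 90,
                                    prefix: str = "") -> Dict[str, Any]:
    """Arguments for a lifecycle rule that moves objects to Deep Archive."""
    if after_days < 0:
        raise ValueError("after_days must not be negative")
    return {
        "Bucket": bucket,
        "LifecycleConfiguration": {
            "Rules": [{
                "ID": "archive-to-deep-archive",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},     # empty prefix matches every key
                "Transitions": [
                    {"Days": after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }],
        },
    }


# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       **build_archive_lifecycle_request("example-backup-bucket", after_days=90)
#   )
```

Note that this call replaces the bucket's entire lifecycle configuration, so existing rules need to be included in the same request.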

Content Distribution

S3 can serve as the "Origin" for Amazon CloudFront, which distributes images, videos, and software updates to users globally with low latency.

Summary Table

Feature        | EBS (Block)                                         | S3 (Object)
Storage Unit   | Volume                                              | Bucket / Object
Backup Method  | Snapshot                                            | Versioning / Replication
Accessibility  | Attached to an EC2 instance (same Availability Zone) | From anywhere via HTTP/HTTPS (subject to permissions)