Storage Gateway
AWS Storage Gateway is a hybrid storage service. “Hybrid” means it connects two worlds: the servers running in your own data center (on-premises) and storage in the AWS cloud. You install a small piece of software (a virtual appliance) in your data center, and it presents normal-looking file shares or disks to your existing applications. Behind the scenes, the data is stored in Amazon S3, EBS snapshots, or virtual tapes — so your old apps keep working unchanged while their data lives in the cloud. This page explains the three gateway types, when each fits, and the one mistake people make most: treating it as a fast local disk.
Why Storage Gateway exists
Many companies have applications that were written years ago to read and write to a local file server or a SAN (Storage Area Network — a pool of block disks shared over the network). You cannot easily rewrite those apps to talk to the S3 API. Storage Gateway solves this by giving those apps a familiar protocol — NFS, SMB, or iSCSI — while quietly moving the data to AWS. It also keeps a copy of your most recently used data on a local cache disk, so frequent reads stay fast.
The three protocols, briefly: NFS (Network File System, the Linux/Unix way to mount a remote folder), SMB (Server Message Block, the Windows file-sharing protocol), and iSCSI (Internet Small Computer Systems Interface, a way to present a remote disk as if it were a local block device).
The three gateway types
| Gateway type | Protocol your apps see | Where data lands in AWS | Best for |
|---|---|---|---|
| File Gateway | NFS or SMB file shares | Objects in Amazon S3 | Storing files/documents in S3 while keeping a file-share interface |
| Volume Gateway | iSCSI block volumes | Volume data in S3 (cached mode); backups as EBS snapshots | Block storage and backup of on-prem SAN volumes |
| Tape Gateway | iSCSI VTL (Virtual Tape Library) | S3 and S3 Glacier | Replacing physical backup tapes |
File Gateway
File Gateway exposes an NFS or SMB share. When your app writes a file, the gateway stores it as a single object in an S3 bucket: one file becomes one S3 object, and the object key is the file's path within the share. This is the most common type because the result is plain S3 data that you can also read with the S3 API, query with analytics tools, or manage with lifecycle rules.
When to use it: you want files in S3 but your apps only speak NFS/SMB. When NOT to: you need a true shared POSIX file system across many EC2 instances — use Amazon EFS instead.
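To make the one-file-one-object mapping concrete, here is a minimal sketch that turns a path on a mounted share into the S3 URI where you would expect the object to appear. The helper name `s3_uri_for` is hypothetical, and the sketch assumes the share root maps to the bucket root (no key prefix configured on the share).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a file on the mounted share to its expected S3 URI.
# Assumption: the share root corresponds to the bucket root (no key prefix).
s3_uri_for() {
  local mount_point="${1%/}"   # e.g. /mnt/files (a trailing slash is removed)
  local bucket="$2"            # e.g. my-app-files
  local file_path="$3"         # absolute path of a file under the mount point

  case "$file_path" in
    "$mount_point"/*)
      # Strip the mount point; the remainder of the path is the object key.
      printf 's3://%s/%s\n' "$bucket" "${file_path#"$mount_point"/}"
      ;;
    *)
      echo "error: $file_path is not under $mount_point" >&2
      return 1
      ;;
  esac
}

s3_uri_for /mnt/files my-app-files /mnt/files/reports/2024/q1.pdf
# prints: s3://my-app-files/reports/2024/q1.pdf
```

Because the gateway uploads asynchronously, a freshly written file may take a moment to show up when you list that URI with `aws s3 ls`.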
Volume Gateway
Volume Gateway presents iSCSI block volumes. It has two modes:
- Cached volumes — the primary data lives in S3; a local cache holds hot blocks. Good when on-prem disk is limited.
- Stored volumes — the primary data lives on-prem; AWS holds asynchronous backups as EBS snapshots. Good when you want fully local low latency plus offsite backup.
When to use it: backing up SAN volumes or doing disaster recovery for block workloads. When NOT to: as primary storage for a latency-sensitive database — local SSD or EBS on EC2 is the right tool.
Tape Gateway
Tape Gateway pretends to be a physical tape library (a VTL) so your existing backup software (Veeam, Commvault, etc.) keeps working. The “tapes” are stored in S3 and can be moved to S3 Glacier for cheap long-term archival.
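When to use it: retiring physical tape infrastructure for backups and compliance archives. When NOT to: for active, frequently read data. Retrieving an archived tape from S3 Glacier typically takes hours (roughly 3 to 5 hours from S3 Glacier Flexible Retrieval, and up to 12 hours from S3 Glacier Deep Archive).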
The big gotcha: it is an on-ramp, not a fast local disk
This is the point most tutorials skip. Storage Gateway is a hybrid and migration tool, not a way to “mount S3 as a fast local drive.” Performance depends largely on two things:
- Your local cache — reads of recently used data are fast because they come from the cache disk. Reads of cold data must be pulled from S3 over your internet/Direct Connect link, which is far slower.
- Your network link — writes are uploaded asynchronously, so a slow or saturated link causes a growing backlog.
Treat Storage Gateway as a bridge into the cloud, not as primary low-latency storage. If you need consistent millisecond access from EC2, use EBS or EFS directly. If your link is slow and you read mostly cold data, expect cold-read latency, not local-disk latency.
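A back-of-the-envelope calculation shows why the cache hit rate dominates read performance. The sketch below computes a weighted average read latency from a hit percentage and two latencies. The function name `avg_read_latency_ms` and the latency figures are illustrative assumptions, not measurements of any real gateway.

```shell
#!/usr/bin/env bash
# Expected average read latency for a given cache hit percentage.
# All inputs are assumptions you supply; nothing here is measured.
avg_read_latency_ms() {
  local hit_pct="$1"    # percentage of reads served from the local cache
  local cache_ms="$2"   # assumed latency of a cache hit, in milliseconds
  local s3_ms="$3"      # assumed latency of a cold read from S3, in milliseconds
  awk -v h="$hit_pct" -v c="$cache_ms" -v s="$s3_ms" \
    'BEGIN { printf "%.2f\n", (h * c + (100 - h) * s) / 100 }'
}

avg_read_latency_ms 99 1 100   # prints 1.99
avg_read_latency_ms 90 1 100   # prints 10.90
```

With these assumed numbers, dropping from a 99% to a 90% hit rate makes the average read more than five times slower, which is why an undersized cache feels so different from a local disk.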
Creating a File Gateway
Console steps
- Open the AWS Storage Gateway console and choose Create gateway.
- Pick gateway type Amazon S3 File Gateway.
- Choose a platform (VMware ESXi, Microsoft Hyper-V, Linux KVM, Amazon EC2, or a hardware appliance) and download/deploy the appliance image.
- Power on the appliance, note its IP address, then back in the console choose Connect and enter that IP.
- Configure time zone and gateway name, then Activate gateway.
- Assign at least one local disk as cache.
- Choose Create file share, select your S3 bucket, the protocol (NFS or SMB), and the IAM role the gateway uses to access S3.
AWS CLI equivalent
After the appliance is activated and you have its gateway ARN, create the share:
aws storagegateway create-nfs-file-share \
    --client-token a1b2c3d4 \
    --gateway-arn arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-0a1b2c3d \
    --location-arn arn:aws:s3:::my-app-files \
    --role arn:aws:iam::111122223333:role/StorageGatewayS3Role \
    --client-list 10.0.0.0/24
Output:
{
    "FileShareARN": "arn:aws:storagegateway:us-east-1:111122223333:share/share-0a1b2c3d"
}
On a Linux client you then mount the share like any NFS export (replace the IP with your gateway’s address):
sudo mount -t nfs -o nolock,hard 10.0.0.50:/my-app-files /mnt/files
Sizing the cache and watching the link
The cache disk should be large enough to hold your working set (the data accessed day to day). The gateway publishes a CacheHitPercent metric to CloudWatch; if it drops, more reads are being served from S3 and will feel slow. To check the upload backlog, query the CachePercentDirty metric with the CLI:
aws cloudwatch get-metric-statistics \
    --namespace AWS/StorageGateway \
    --metric-name CachePercentDirty \
    --dimensions Name=GatewayId,Value=sgw-0a1b2c3d \
    --start-time 2026-06-15T00:00:00Z \
    --end-time 2026-06-15T01:00:00Z \
    --period 300 --statistics Average
Output:
{
    "Datapoints": [
        { "Timestamp": "2026-06-15T00:30:00Z", "Average": 4.2, "Unit": "Percent" }
    ],
    "Label": "CachePercentDirty"
}
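A high CachePercentDirty value means writes are queued and not yet uploaded, which is usually a sign that your network link is the bottleneck. If the call returns no datapoints, run aws cloudwatch list-metrics --namespace AWS/StorageGateway to see which dimensions the metric is published with; CloudWatch only returns data when the dimension set matches exactly, so you may need to pass GatewayName alongside GatewayId.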
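To turn a CachePercentDirty reading into something actionable, you can estimate how much data is waiting to upload and how long the link would need to drain it. This is a rough sketch: the function name `dirty_backlog`, the 500 GB cache size, and the 100 Mbit/s link speed are assumptions, and it ignores protocol overhead and any new writes arriving during the drain.

```shell
#!/usr/bin/env bash
# Estimate pending upload volume and drain time from CachePercentDirty.
# Assumes the whole link is available for uploads (an optimistic simplification).
dirty_backlog() {
  local cache_gb="$1"    # total cache disk size in GB
  local dirty_pct="$2"   # CachePercentDirty average, e.g. 4.2
  local link_mbps="$3"   # usable uplink in Mbit/s
  awk -v c="$cache_gb" -v d="$dirty_pct" -v m="$link_mbps" 'BEGIN {
    if (m <= 0) exit 1
    pending = c * d / 100                 # GB not yet uploaded
    minutes = pending * 8000 / m / 60     # 1 GB = 8000 megabits (decimal units)
    printf "pending_gb=%.1f drain_minutes=%.1f\n", pending, minutes
  }' || { echo "error: link speed must be positive" >&2; return 1; }
}

dirty_backlog 500 4.2 100
# prints: pending_gb=21.0 drain_minutes=28.0
```

If the estimated drain time keeps growing between readings, the gateway is receiving writes faster than the link can upload them.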
Cost note
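Storage Gateway pricing is usage-based: you pay the normal storage and request costs for the data that lands in S3, EBS snapshots, or Glacier, plus a per-GB charge for data the gateway writes to AWS. There is no flat hourly fee for the gateway software itself, but rates change, so check the current Storage Gateway pricing page. Data transferred out of AWS to your data center incurs standard egress charges, so a workload that constantly re-reads cold data from S3 can cost more than you expect. The local cache disk is your own on-prem hardware and is not billed by AWS; if you run the gateway on EC2 instead, the instance and its EBS cache volumes are billed as usual.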
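The egress point is easy to quantify with a quick estimate. In the sketch below, the function name `monthly_egress_cost` is hypothetical, and the per-GB rate is an input you supply from the current AWS pricing page; the 0.09 in the example is only an illustrative value, not a quoted price.

```shell
#!/usr/bin/env bash
# Rough monthly egress estimate for cold data pulled back on-prem.
# rate_per_gb is a caller-supplied assumption, not an authoritative AWS price.
monthly_egress_cost() {
  local cold_gb_per_day="$1"   # GB of cold reads fetched from S3 per day
  local rate_per_gb="$2"       # assumed egress price in USD per GB
  local days="${3:-30}"        # days in the billing period (default 30)
  awk -v g="$cold_gb_per_day" -v r="$rate_per_gb" -v d="$days" \
    'BEGIN { printf "%.2f\n", g * d * r }'
}

monthly_egress_cost 200 0.09
# prints: 540.00
```

Reads served from the local cache never leave AWS a second time, so a better cache hit rate lowers this figure directly.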
Best practices
- Size the local cache for your hot working set so most reads are cache hits, not S3 round-trips.
- Use a reliable link with enough bandwidth for your write rate (AWS Direct Connect for production) to avoid upload backlogs.
- Pick File Gateway for file workloads, Volume Gateway for block/backup, and Tape Gateway only for backup-tape replacement.
- Apply S3 lifecycle policies to the destination bucket to tier older data to cheaper classes automatically.
- Monitor CacheHitPercent and CachePercentDirty in CloudWatch and alarm on degradation.
- Treat the gateway as a migration on-ramp; once data is in S3, prefer native AWS services for cloud-resident workloads.
- Encrypt the destination bucket and use a least-privilege IAM role for the gateway.
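As one way to act on the monitoring advice, the sketch below builds a CloudWatch alarm on CachePercentDirty with `aws cloudwatch put-metric-alarm`. The gateway name, SNS topic ARN, and the threshold of 60 are placeholders chosen for illustration. It also assumes the metric is published with both GatewayId and GatewayName dimensions, which you should confirm with `aws cloudwatch list-metrics` for your account. By default the script only prints the command; set DRY_RUN=0 to actually run it.

```shell
#!/usr/bin/env bash
# Sketch: alarm when the share of un-uploaded (dirty) cache stays high.
# All values below are placeholders; adjust them to your environment.
GATEWAY_ID="sgw-0a1b2c3d"
GATEWAY_NAME="my-file-gateway"   # hypothetical gateway name
ALARM_TOPIC_ARN="arn:aws:sns:us-east-1:111122223333:storage-alerts"   # hypothetical SNS topic

# Print the command unless DRY_RUN=0 is set, so the sketch is safe to try.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s ' "$@"
    printf '\n'
  else
    "$@"
  fi
}

create_dirty_cache_alarm() {
  # Fires after three consecutive 5-minute periods above the threshold.
  run aws cloudwatch put-metric-alarm \
    --alarm-name "${GATEWAY_ID}-cache-dirty-high" \
    --namespace AWS/StorageGateway \
    --metric-name CachePercentDirty \
    --dimensions "Name=GatewayId,Value=${GATEWAY_ID}" "Name=GatewayName,Value=${GATEWAY_NAME}" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 3 \
    --threshold 60 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$ALARM_TOPIC_ARN"
}

create_dirty_cache_alarm
```

A similar alarm on CacheHitPercent, using LessThanThreshold as the comparison operator, would cover the read side.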