Cloud Storage

Key features

  • Object storage suitable for any type of data
  • Scalable to Exabytes (1000 Petabytes)
  • Very high availability across all storage classes
  • Single API
  • Data stored in buckets
  • Supports customer-supplied encryption keys (CSEK)

Storage Classes

There are a number of storage classes, each with its own characteristics and storage costs. Data is also stored in one of three location types:

  • Multi-region storage is suitable for data that must be accessed globally, e.g. websites, streaming content
  • Dual-region storage is optimised for resources located in one of the two regions and offers improved availability
  • Region storage offers the best performance for data stored alongside GKE and Compute Engine instances

Storage classes in descending order of storage cost

Note: as storage costs decrease, access costs (price for reading the data) increase

Standard
  • Most expensive option
  • No minimum storage duration
  • No retrieval cost
  • Best for frequently accessed data aka hot data
Nearline
  • Low cost option for data at rest
  • Minimum storage = 30 days
  • Low data access costs
  • Slightly lower availability compared with standard storage
  • Lower at-rest cost compared with standard
  • Suitable for data read or modified monthly
Coldline
  • Very low cost for data accessed infrequently
  • Minimum storage duration = 90 days
  • Higher data access costs
  • Suitable for data that is read or modified quarterly
Archive
  • Lowest cost
  • Minimum storage duration = 365 days
  • Highest costs for data access
  • Suitable for online backup and DR
  • Data is still available within milliseconds, not hours or days
  • Suitable for data that is read or modified annually
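
The access patterns above can be turned into a rough rule of thumb. The sketch below is only an illustration: the helper name suggest_storage_class is hypothetical, and the thresholds simply reuse the 30, 90 and 365 day minimum storage durations from the list. Real decisions should also weigh retrieval costs.

```python
def suggest_storage_class(days_between_accesses: int) -> str:
    """Suggest a storage class from the typical gap between accesses.

    Hypothetical helper: thresholds mirror the minimum storage
    durations in the notes (30, 90 and 365 days).
    """
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    if days_between_accesses < 30:
        return "STANDARD"   # hot data, accessed frequently
    if days_between_accesses < 90:
        return "NEARLINE"   # read or modified roughly monthly
    if days_between_accesses < 365:
        return "COLDLINE"   # read or modified roughly quarterly
    return "ARCHIVE"        # read or modified about once a year or less


for days in (1, 45, 120, 400):
    print(days, suggest_storage_class(days))
```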

Storage SLAs

All storage classes have high annual durability of 99.999999999% (11 9s), meaning that even if the data is temporarily not accessible, it is still safe and secure.

Location type   Standard   Nearline   Coldline   Archive
multi-region    99.95      99.9       99.9       99.95
dual-region     99.95      99.9       99.9       99.95
region          99.9       99.0       99.0       99.9
Monthly Uptime Percentage
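
To make the uptime figures more tangible, the arithmetic below converts a monthly uptime percentage into the downtime it allows. It is a sketch that assumes a 30-day month; the function name is made up for illustration.

```python
def max_monthly_downtime_minutes(uptime_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime allowed in a month at a given uptime percentage."""
    total_minutes = days_in_month * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (100.0 - uptime_percent) / 100.0


for pct in (99.95, 99.9, 99.0):
    print(f"{pct}% -> {max_monthly_downtime_minutes(pct):.1f} minutes")
```

Under that assumption, 99.95% allows roughly 21.6 minutes of downtime per month, 99.9% roughly 43.2 minutes and 99.0% roughly 432 minutes (just over 7 hours).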

Storage Buckets

  • Basic containers that hold data in a flat namespace, i.e. no hierarchy
  • Data is accessed through the console, gsutil or the JSON and XML APIs
  • Buckets must have a globally unique name
  • Each bucket has a default storage class
  • Object Lifecycle Management automatically applies rules to the objects in a bucket e.g.
    • Downgrade storage class
    • Delete objects of a specified age
    • Retain a certain number of revisions
    • Rules are applied in batches and can take up to 24 hours to come into effect
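
The three example rules above could be expressed as a lifecycle configuration document along the following lines. This is a sketch, not a recommended policy: the ages (30 and 365 days) and the number of retained versions (3) are arbitrary illustrative values.

```python
import json

# Lifecycle configuration in the JSON shape accepted by `gsutil lifecycle set`.
# All thresholds below are illustrative assumptions.
lifecycle_config = {
    "rule": [
        {
            # Downgrade storage class: Standard objects older than 30 days
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]},
        },
        {
            # Delete objects of a specified age
            "action": {"type": "Delete"},
            "condition": {"age": 365},
        },
        {
            # Retain a certain number of revisions: delete noncurrent
            # versions once 3 newer versions exist
            "action": {"type": "Delete"},
            "condition": {"isLive": False, "numNewerVersions": 3},
        },
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```

Saved to a file, a document of this shape can be applied with `gsutil lifecycle set <file> gs://<bucket>`.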

Storage Objects

  • Data is stored in objects
  • Object names must be unique within a bucket
  • No limit on the number of objects in a bucket
  • Objects have data and metadata
  • Object metadata is a collection of name-value pairs describing its properties
  • Cloud console and gsutil allow a virtual directory structure to be created e.g. /data/web/xml, /data/web/html but in reality data is stored in a flat structure
  • Objects inherit the default storage class of the bucket, but this can be overridden during or after creation
  • A bucket may contain a mix of storage class objects
  • An object's location type (region) cannot be altered
  • Object versioning allows different generations of an object to be stored
    • Objects are immutable: cannot be changed but can be replaced (overwritten) or deleted
    • Older versions can be retained through versioning
    • Enabled at a bucket level
    • Each revision of an object is assigned a generation number

Access Control

  • Permissions applied either as:
    • Uniform access (recommended) with Cloud IAM policies
    • Fine-grained control with legacy Access Control Lists (ACL) designed for interoperability with Amazon S3
    • Signed URLs

Cloud IAM

  • Cloud Storage permissions are grouped into IAM roles
  • Roles can be applied at a project or bucket level
  • Some roles can be applied at both project and bucket level whilst others can be applied at one level only
  • IAM Policies are inherited from project -> bucket -> object
  • Policies can be applied at bucket level as well as at project level
    • Policies applied at a bucket level are not shown in IAM project level policies
    • Inherited project level permissions are shown in cloud console when viewing a bucket level policy but not when viewed through gsutil or APIs
    • Use Cloud Console to view both project and bucket level policies to determine full level of access
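
Because bucket-level and project-level policies are stored separately, the full level of access is the union of both. The sketch below models that with plain dictionaries shaped like IAM policy bindings; the members and the particular role assignments are made up for illustration.

```python
def roles_for(member, policy):
    """Roles granted to `member` by a single policy."""
    return {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }


def effective_roles(member, project_policy, bucket_policy):
    """Union of inherited project-level roles and bucket-level roles."""
    return roles_for(member, project_policy) | roles_for(member, bucket_policy)


# Hypothetical policies for illustration only
project_policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer", "members": ["user:alice@example.com"]},
    ]
}
bucket_policy = {
    "bindings": [
        {"role": "roles/storage.objectCreator", "members": ["user:alice@example.com"]},
        {"role": "roles/storage.objectAdmin", "members": ["user:bob@example.com"]},
    ]
}

print(sorted(effective_roles("user:alice@example.com", project_policy, bucket_policy)))
```

Looking at the bucket policy alone would miss the viewer role that alice inherits from the project.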

Access Control Lists

  • Used alongside IAM to provide fine-grained access to storage objects: permission from one OR the other is sufficient to provide access
  • ACLs can be applied at bucket or object level whereas IAM policies control access at a bucket level
  • IAM and ACL permissions are different with a few exceptions
  • ACLs do not appear in IAM Policies so must be checked in addition to a policy to determine access rights
  • An ACL consists of up to 100 entries per bucket or object, each containing:
    • Permissions (what)
    • Scope (who)
  • Permissions are concentric (cumulative) meaning that a user with WRITER permissions also has READER etc
    • READER (Bucket and Object level)
    • WRITER (Bucket level only)
    • OWNER (Bucket and Object level)
    • Default (Bucket and Object level)
  • Scopes are associated with permissions and may consist of the following:
    • Google account email address
    • Google Group email address
    • Convenience values (discouraged)
    • Google Workspace (G Suite) domain user
    • Cloud Identity domain user
    • allAuthenticatedUsers: anyone authenticated with a Google account
    • allUsers: literally anyone (does not require authentication)
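
The concentric permissions and the "IAM OR ACL" rule can be sketched as a small check. This is a toy model rather than the service's actual evaluation logic: the function names and the scope strings are illustrative assumptions.

```python
# Concentric permissions: a higher rank includes everything below it
RANK = {"READER": 1, "WRITER": 2, "OWNER": 3}


def acl_allows(acl, caller_scopes, required):
    """True if any ACL entry matching the caller grants at least `required`."""
    if len(acl) > 100:
        raise ValueError("an ACL holds at most 100 entries")
    needed = RANK[required]
    return any(
        entry["scope"] in caller_scopes and RANK[entry["permission"]] >= needed
        for entry in acl
    )


def access_allowed(iam_allows, acl, caller_scopes, required):
    """Access is granted if IAM OR the ACL allows it."""
    return iam_allows or acl_allows(acl, caller_scopes, required)


# Hypothetical ACL: each entry pairs a scope (who) with a permission (what)
acl = [
    {"scope": "user-alice@example.com", "permission": "WRITER"},
    {"scope": "allUsers", "permission": "READER"},
]

alice = {"user-alice@example.com", "allUsers"}
anonymous = {"allUsers"}

print(acl_allows(acl, alice, "READER"))      # WRITER includes READER
print(acl_allows(acl, anonymous, "WRITER"))  # allUsers only has READER
```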

Signed URLs

  • Provide time limited access to a resource to anyone with the URL
  • Access is limited to specific XML API endpoints
  • Each URL is signed on behalf of a user or service account that has the required permissions on the target object
  • Commonly used to provide upload and download access to a resource
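
The idea behind a signed URL (an expiry time plus a signature that anyone holding the URL can present) can be shown with a simplified model. Note that this is NOT the Cloud Storage signing algorithm: it swaps in a plain HMAC over the method, path and expiry, and the secret, host name and function names are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # hypothetical key, stands in for the signer's credentials


def _signature(method, path, expires):
    """HMAC-SHA256 over the request method, object path and expiry time."""
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def make_signed_url(method, path, now, ttl_seconds):
    """Build a URL that is valid for `ttl_seconds` after `now` (epoch seconds)."""
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(method, path, expires)})
    return f"https://storage.example.invalid{path}?{query}"


def verify_signed_url(method, url, now):
    """Accept the request only if the URL is unexpired and untampered."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False  # time limit has passed
    return hmac.compare_digest(signature, _signature(method, parts.path, expires))


url = make_signed_url("GET", "/demo-bucket/report.pdf", now=1000, ttl_seconds=60)
print(verify_signed_url("GET", url, now=1030))  # within the time limit
print(verify_signed_url("GET", url, now=1061))  # expired
print(verify_signed_url("PUT", url, now=1030))  # signed for a different method
```

The holder of the URL needs no account of their own; the signature proves that someone with the required permissions authorised this method on this object until the expiry time.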