Colossus

Colossus is Google’s next-generation distributed file system. It underpins data storage for Google Cloud services such as Bigtable, Spanner, and Google Cloud Storage, and is designed for extreme scale, reliability, and performance.

Distributed file system

Colossus stores and manages exabytes of data across thousands of servers worldwide, and it serves very large numbers of concurrent readers and writers wherever those services run.

High availability

Data is replicated automatically, ensuring durability and rapid recovery if hardware fails. No single point of failure exists.
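
The replication and recovery behaviour described here can be illustrated with a toy model. This is a minimal sketch, not Colossus's actual placement logic: the replication factor of 3, the server names, and the helpers `place_replicas` and `repair` are all assumptions made for illustration.

```python
import random

REPLICATION_FACTOR = 3  # assumed value, chosen only for illustration


def place_replicas(chunk_id, servers, k=REPLICATION_FACTOR):
    """Pick k distinct servers for a chunk, reproducibly per chunk id."""
    rng = random.Random(chunk_id)  # seeding makes the placement repeatable
    return set(rng.sample(sorted(servers), k))


def repair(placement, live_servers, k=REPLICATION_FACTOR):
    """Drop replicas on failed servers and copy onto healthy ones."""
    repaired = {}
    for chunk_id, replicas in placement.items():
        survivors = replicas & live_servers
        if not survivors:
            raise RuntimeError(f"chunk {chunk_id!r} has no surviving replica")
        spare = sorted(live_servers - survivors)
        repaired[chunk_id] = survivors | set(spare[: k - len(survivors)])
    return repaired


# Hypothetical five-server cluster in which server s2 fails.
servers = {"s1", "s2", "s3", "s4", "s5"}
placement = {c: place_replicas(c, servers) for c in ("chunk-a", "chunk-b")}
healed = repair(placement, servers - {"s2"})
```

After `repair` runs, every chunk is back to three replicas and none of them lives on the failed server, which is the property that lets a replicated store tolerate the loss of any single machine.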

Performance

Colossus is optimized for low-latency reads and writes and for high throughput, which Google services from YouTube to Gmail depend on to operate smoothly.

Scalability

Colossus scales to billions of objects and users. Data is stored in chunks that are managed across globally distributed Colossus clusters.
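
To make the idea of chunked storage concrete, the sketch below splits a byte string into fixed-size chunks and spreads them across clusters. It is an illustration under stated assumptions: the 4-byte chunk size, the round-robin policy, and the names `split_into_chunks` and `assign_chunks` are hypothetical and say nothing about how Colossus really sizes or places chunks.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny, real chunk sizes are far larger


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut data into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def assign_chunks(chunks, clusters):
    """Map each chunk index to a cluster using simple round-robin."""
    return {i: clusters[i % len(clusters)] for i in range(len(chunks))}


chunks = split_into_chunks(b"abcdefghij")
layout = assign_chunks(chunks, ["cluster-a", "cluster-b"])  # hypothetical names
```

Because each chunk is addressed independently, capacity grows by adding clusters rather than by enlarging any single machine.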

Data integrity

Colossus validates checksums on every read and write to detect corruption caused by hardware and network errors, and it recovers automatically when a check fails.
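
Checksum validation with fallback to another copy might look roughly like the following. This is a sketch under assumptions: CRC-32 from Python's `zlib` module stands in for whatever checksum Colossus actually uses, plain dictionaries stand in for storage servers, and `write_chunk`, `read_chunk`, and `read_with_recovery` are hypothetical names.

```python
import zlib


class ChecksumError(Exception):
    """Raised when stored bytes no longer match their recorded checksum."""


def write_chunk(store, chunk_id, data):
    """Store the data together with a checksum computed at write time."""
    store[chunk_id] = (data, zlib.crc32(data))


def read_chunk(store, chunk_id):
    """Return the data only if it still matches its recorded checksum."""
    data, expected = store[chunk_id]
    if zlib.crc32(data) != expected:
        raise ChecksumError(f"checksum mismatch for {chunk_id!r}")
    return data


def read_with_recovery(replicas, chunk_id):
    """Try each replica in turn and return the first copy that verifies."""
    for store in replicas:
        try:
            return read_chunk(store, chunk_id)
        except (ChecksumError, KeyError):
            continue  # corrupt or missing here, so try the next replica
    raise ChecksumError(f"no valid replica for {chunk_id!r}")
```

A reader that hits a corrupted copy never sees the bad bytes. It moves on to the next replica, and a real system would also schedule the damaged copy for repair.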

Serverless integration

Cloud services interact with Colossus through APIs. Developers use Colossus-backed storage indirectly, through products such as Bigtable and Google Cloud Storage.

Colossus architecture

Colossus removes the dependence on central metadata servers by distributing metadata and chunk management, which improves availability and reliability.
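
One common way to distribute metadata is to partition it by a hash of the file path, so that no single server holds the whole namespace. The sketch below shows that general technique only. It is not a description of Colossus internals, and `shard_for` and `ShardedMetadata` are hypothetical names.

```python
import hashlib


def shard_for(path, num_shards):
    """Map a file path to a shard index using a stable hash."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedMetadata:
    """Toy metadata service: each shard is an independent dictionary."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, path, chunk_ids):
        """Record which chunks make up the file at path."""
        self.shards[shard_for(path, len(self.shards))][path] = list(chunk_ids)

    def get(self, path):
        """Look up the chunk list, touching only the owning shard."""
        return self.shards[shard_for(path, len(self.shards))][path]


meta = ShardedMetadata(num_shards=4)
meta.put("/logs/a", ["c1", "c2"])  # hypothetical path and chunk ids
```

Each lookup touches exactly one shard, so losing a shard affects only the paths that hash to it, not the entire file system.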

Regions in Colossus

Colossus operates globally, with data replication across multiple regions and zones for resilience, locality, and regulatory compliance.

Security in Colossus

Data is encrypted both at rest and in transit, and this covers chunks as well as metadata. Fine-grained access controls restrict who can read or modify it.

Colossus vs. GFS (Google File System)

Colossus is the successor to GFS—built for modern scale, capable of supporting cloud-native, multi-tenant, and exabyte-scale workloads with vastly increased reliability and performance.

Colossus-backed services

Bigtable, Spanner, Google Cloud Storage, and other core Google platforms rely on Colossus for durable, high-speed data access across regions.

Monitoring Colossus

Google continuously monitors Colossus's health, network latency, replication status, chunk errors, and capacity, and automated recovery processes respond when problems are detected.
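
As a rough illustration of this kind of monitoring, the sketch below scans per-cluster statistics and flags capacity, chunk-error, and replication problems. Everything in it is assumed for the example: the `ClusterStats` fields, the 85% utilization threshold, and the `find_alerts` helper do not reflect Google's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    name: str
    used_bytes: int
    capacity_bytes: int
    chunk_errors: int
    under_replicated: int  # chunks with fewer replicas than intended


def find_alerts(stats, max_utilization=0.85):
    """Return (cluster, reason) pairs for every threshold that is exceeded."""
    alerts = []
    for s in stats:
        if s.used_bytes / s.capacity_bytes > max_utilization:
            alerts.append((s.name, "capacity"))
        if s.chunk_errors > 0:
            alerts.append((s.name, "chunk_errors"))
        if s.under_replicated > 0:
            alerts.append((s.name, "replication"))
    return alerts


stats = [
    ClusterStats("cluster-a", 90, 100, 0, 2),  # hypothetical readings
    ClusterStats("cluster-b", 10, 100, 0, 0),
]
alerts = find_alerts(stats)
```

In a real deployment each alert would trigger an automated action, such as re-replicating chunks or rebalancing data, rather than just being collected in a list.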