Describe Azure storage services

Each time you want to spin up a type of storage in Azure, it's bundled under a storage account. Think of it like a namespace in C#. Storage accounts have different types and redundancy options.

You can choose from:

  • Blob storage (files & unstructured data)
  • File storage (NAS but in Azure)
  • Disk storage (HDDs/SSDs for Azure services)
  • Table storage (NoSQL for key-value storage)
  • Queue storage (message queues don't feel like storage but I guess they are?)

Each type has different redundancy options, which will be covered in a later section.

Each storage account needs a name that is unique across all of Azure. Names must be between 3 and 24 characters long and may contain only lowercase letters and numbers.
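As a quick sanity check, the naming rule can be sketched in Python. This only checks the length and character rules (3 to 24 characters, lowercase letters and digits). It can't tell you whether the name is already taken, and the function name is mine, not anything from an Azure SDK.

```python
import re

# Assumed rule: 3-24 characters, lowercase ASCII letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` passes the length and character rules.

    Global uniqueness is NOT checked; only Azure can answer that.
    """
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["mystorage01", "My-Storage", "ab"]:
        print(candidate, is_valid_storage_account_name(candidate))
```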

Describe Azure storage redundancy

This is where the fun begins. Azure storage accounts store multiple copies of your data so it's protected from planned events (tests, hardware replacement) and unplanned events (natural disasters, hardware failures).

More redundancy = more peace of mind = more cost. A business must do a risk assessment of its data and decide on storage types for each subset of its data. Microsoft recommends using three factors to make decisions on storage:

  • How the data is replicated in the primary region
  • Whether the data is replicated to a second region that is geographically distant
  • Whether the application that accesses the data requires read access to the replicated data in the secondary region if the primary region becomes unavailable

For the first bullet point, Azure storage replicates data three times in the primary region. Two options are offered for how that data will be replicated in the primary region: Locally Redundant Storage (LRS) and Zone Redundant Storage (ZRS).

LRS replicates data three times within a single data center in the primary region. Microsoft uses the selling point "11 nines of durability" (at least 99.999999999% of objects retained over a given year), which is a design target rather than a promise that nothing can go wrong. It's the lowest-cost option, but if that data center experiences a natural disaster such as a fire or flood, every replica may be lost or unrecoverable.

ZRS is available in Availability Zone-enabled regions. Remember that availability zones are physically separate groups of data centers within a region, so the idea for ZRS is that your data is replicated synchronously across three availability zones in the primary region. With ZRS, the data is both read and write accessible even if a single zone becomes unavailable. ZRS is meant for workloads that need high availability, or that must keep replicated data within a single country or region to meet data governance requirements.

If you're with a company that offers a service with high availability and/or durability requirements, it's recommended to have redundancy in a second region as well (bullet point 2). The secondary region sell is pretty typical: your data is copied over to another data center hundreds of miles away from your primary region.

Secondary regions offer Geo-Redundant Storage (GRS) and Geo-Zone-Redundant Storage (GZRS).

GRS replicates your data three times in the primary region using LRS, then copies the data asynchronously to a single physical location in the secondary region (your primary region's region pair), where LRS again keeps three copies.

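GZRS combines the two approaches: your data is replicated across three availability zones in the primary region using ZRS, then copied asynchronously to a single physical location in the secondary region, where it's kept using LRS. This is the most costly option, but it gives the highest level of protection for the data.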

Something to note with GRS (and in turn GZRS) is that when your data is replicated to the secondary region, it's not available for reading or writing like a normal copy. If the primary region becomes unavailable, a failover to the secondary region has to happen before that data can be used. You can initiate that failover yourself; in extreme cases where a region is lost to a catastrophe, Microsoft may initiate it. If your application needs to read from the secondary region without waiting for a failover (bullet point 3), Azure offers read-access variants of both options (RA-GRS and RA-GZRS).
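To tie the three decision factors together, here's a minimal sketch of how they map onto a redundancy option. The function and parameter names are mine. RA-GRS and RA-GZRS are Azure's read-access variants of the geo-redundant options. Real decisions also involve cost and region support, which this ignores.

```python
def choose_redundancy(
    zone_redundant: bool,
    geo_redundant: bool,
    read_access_secondary: bool = False,
) -> str:
    """Map the three decision factors onto a redundancy option name.

    zone_redundant: replicate across availability zones in the primary region
    geo_redundant: also replicate to a distant secondary region
    read_access_secondary: allow reads from the secondary without a failover
    """
    if read_access_secondary and not geo_redundant:
        raise ValueError("read access to a secondary region requires geo-replication")

    if not geo_redundant:
        return "ZRS" if zone_redundant else "LRS"

    name = "GZRS" if zone_redundant else "GRS"
    return f"RA-{name}" if read_access_secondary else name


if __name__ == "__main__":
    print(choose_redundancy(zone_redundant=False, geo_redundant=False))
    print(choose_redundancy(zone_redundant=True, geo_redundant=True))
```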

There's an [! Important] block that says backup data might not be up to date because of RPO, which goes undefined. RPO stands for Recovery Point Objective, which is the maximum acceptable duration of data loss in case of emergency or disaster. Because replication to the secondary region is asynchronous, the most recent writes may not have reached it when disaster strikes. RPO is measured in time (e.g., 30 minutes of data).
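The RPO idea is really just a time comparison, which can be sketched like this. The function names and timestamps are made up for illustration; how you'd actually find out when the secondary was last in sync is outside what this shows.

```python
from datetime import datetime, timedelta


def data_loss_window(last_replicated: datetime, failure_time: datetime) -> timedelta:
    """How much recent data would be missing from the secondary copy."""
    if failure_time < last_replicated:
        raise ValueError("failure_time must not be earlier than last_replicated")
    return failure_time - last_replicated


def meets_rpo(last_replicated: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """True if the data lost in a failure stays within the RPO."""
    return data_loss_window(last_replicated, failure_time) <= rpo


if __name__ == "__main__":
    synced = datetime(2024, 1, 1, 12, 0)   # hypothetical last successful sync
    failed = datetime(2024, 1, 1, 12, 20)  # hypothetical moment of failure
    print(data_loss_window(synced, failed))
    print(meets_rpo(synced, failed, rpo=timedelta(minutes=30)))
```

With a 30-minute RPO, losing the last 20 minutes of writes is within tolerance; losing 45 minutes would not be.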

Describe Azure storage services

These are the nitty-gritty services that a developer will interact with during daily work versus the IT-focused storage accounts. There are 5 services available, repeated from the storage account section but copied from the learning path page:

  • Azure Blobs: A massively scalable object store for text and binary data. Also includes support for big data analytics through Data Lake Storage Gen2.
  • Azure Files: Managed file shares for cloud or on-premises deployments.
  • Azure Queues: A messaging store for reliable messaging between application components.
  • Azure Disks: Block-level storage volumes for Azure VMs.
  • Azure Tables: NoSQL table option for structured, non-relational data.

Blob storage offers different access tiers that allow you to optimize usage and cost.

  • Hot: optimized for storing data that is accessed frequently
  • Cool: optimized for data that is infrequently accessed, stored for at least 30 days
  • Cold: same idea as cool, but for data stored for at least 90 days; cheaper to store, higher cost to access
  • Archive: offline tier for rarely accessed data stored for at least 180 days; cheapest to store, highest cost and latency to access because data must be rehydrated first
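The tier list above can be read as a rough decision rule. Here's a minimal sketch of it, using only the minimum storage durations from the list (30, 90, and 180 days). The function name and the `can_wait_for_rehydration` flag are my own, and a real decision would also weigh actual pricing.

```python
from enum import Enum


class AccessTier(Enum):
    HOT = "Hot"
    COOL = "Cool"
    COLD = "Cold"
    ARCHIVE = "Archive"


def suggest_tier(
    retention_days: int,
    accessed_frequently: bool,
    can_wait_for_rehydration: bool = False,
) -> AccessTier:
    """Suggest a tier from expected retention and access pattern (simplified)."""
    if accessed_frequently:
        return AccessTier.HOT
    # Archive only makes sense if slow, costly retrieval is acceptable.
    if retention_days >= 180 and can_wait_for_rehydration:
        return AccessTier.ARCHIVE
    if retention_days >= 90:
        return AccessTier.COLD
    if retention_days >= 30:
        return AccessTier.COOL
    # Too short-lived for the cooler tiers' minimum durations.
    return AccessTier.HOT


if __name__ == "__main__":
    print(suggest_tier(retention_days=365, accessed_frequently=False,
                       can_wait_for_rehydration=True))
```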

Key points for the services:

Blob
  • Unstructured high-capacity storage that accepts basically anything
  • Can be accessed from anywhere with an internet connection: supports HTTP(S) and wrappers for various languages (.NET, Java, Python, etc.)
  • Azure handles the physical storage adjustments (i.e., space) so devs can just throw data at it
Files
  • NAS but in the cloud
  • Can map an Azure File service like mapping a folder/drive in Windows Explorer
  • Can be interacted with via scripting (PowerShell, etc.)
Queues
  • Azure's take on a message queue (same general idea as Kafka or RabbitMQ)
  • Message size caps out at 64 KB
  • Globally accessible like blob storage is
Disks
  • In principle they're virtualized storage disks where you get better resilience and availability through redundancy; in practice they're treated like normal hard drives attached to a VM
Tables
  • NoSQL datastore that can be accessed from within and outside the Azure cloud
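The 64 KB queue message cap mentioned above is easy to trip over, so here's a small sketch of a pre-flight size check. It assumes "64 KB" means 65,536 bytes and measures the UTF-8 encoded text. Any extra encoding applied by a client library (Base64, for example) would shrink the usable payload further, and that isn't modeled here.

```python
# Assumption: the 64 KB cap is 64 * 1024 bytes of encoded message text.
MAX_QUEUE_MESSAGE_BYTES = 64 * 1024


def fits_in_queue_message(text: str) -> bool:
    """Return True if `text`, encoded as UTF-8, is within the size cap."""
    return len(text.encode("utf-8")) <= MAX_QUEUE_MESSAGE_BYTES


if __name__ == "__main__":
    print(fits_in_queue_message("order-12345 ready for pickup"))
    print(fits_in_queue_message("x" * 100_000))
```

Note that the check counts bytes, not characters, so text with non-ASCII characters hits the cap sooner.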

Identify Azure data migration options

If you're moving to the cloud, your data has to move from wherever it lives today into Azure. Azure supports real-time migration of infrastructure, applications, and data with Azure Migrate, and asynchronous (offline) migration of data with Azure Data Box.

Azure Migrate does what it says on the tin: it's a hub of tools for assessing what's physically in your workplace and moving it to the cloud. It can migrate VMs and physical servers, move SQL Server (MSSQL) databases to their Azure counterparts such as Azure SQL Database, and so on.

Azure Data Box is a giant portable hard drive (up to 80 TB usable) made for corporations with more than 40 terabytes of data or limited network bandwidth. Microsoft ships it to the corporation, the corporation fills it and ships it back, and Microsoft uploads the data to Azure and then wipes the device.
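To get a feel for why shipping a box can beat the network, here's a back-of-the-envelope transfer time estimate. It assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a link that stays at a steady fraction of its rated speed. The function name and the `utilization` parameter are mine.

```python
def transfer_days(data_terabytes: float, bandwidth_mbps: float, utilization: float = 1.0) -> float:
    """Estimate days needed to upload `data_terabytes` over a network link.

    utilization: fraction of the link actually available (0 < utilization <= 1).
    """
    if data_terabytes < 0:
        raise ValueError("data_terabytes must not be negative")
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    bits = data_terabytes * 1e12 * 8                      # TB -> bits
    seconds = bits / (bandwidth_mbps * 1e6 * utilization)  # bits / (bits per second)
    return seconds / 86_400                                # seconds -> days


if __name__ == "__main__":
    print(f"{transfer_days(40, 100):.1f} days at 100 Mbps")
    print(f"{transfer_days(40, 1000):.1f} days at 1 Gbps")
```

Under those assumptions, 40 TB over a fully dedicated 100 Mbps link takes roughly 37 days, which is why the offline route exists.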

Identify Azure file movement options

Sometimes you don't need to move your whole operation, but just pieces. Other times you did some POC locally and want to push it to Azure. Still other times you need to move small bits of stuff within Azure. Microsoft has tools for those kinds of migrations as well.

AzCopy is a command-line utility that copies blobs or files to or from a storage account: you can upload, download, copy between storage accounts, and synchronize files (one direction per run). If your company uses multiple cloud vendors because your teams are siloed, AzCopy can also be configured to work with other cloud providers, for example copying objects from AWS S3 into Azure Blob storage.
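For scripting, an AzCopy call can be assembled from Python. This sketch only builds the argument list for `azcopy copy` with the optional `--recursive` flag and doesn't run anything. The account names, container names, and `<SAS>` tokens are placeholders I made up, and the helper function is not part of AzCopy.

```python
from typing import List


def build_azcopy_copy(source: str, destination: str, recursive: bool = False) -> List[str]:
    """Build the argument list for an `azcopy copy` invocation."""
    cmd = ["azcopy", "copy", source, destination]
    if recursive:
        cmd.append("--recursive")  # copy everything under the source path
    return cmd


if __name__ == "__main__":
    # Placeholder URLs: substitute real account/container names and SAS tokens.
    src = "https://sourceaccount.blob.core.windows.net/photos?<SAS>"
    dst = "https://destaccount.blob.core.windows.net/photos-copy?<SAS>"
    print(" ".join(build_azcopy_copy(src, dst, recursive=True)))
    # To actually run it (requires azcopy on PATH):
    #   import subprocess; subprocess.run(build_azcopy_copy(src, dst, True), check=True)
```

Passing the command as a list rather than a single string avoids shell-quoting problems with the `?` and `&` characters that show up in SAS URLs.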

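Azure Storage Explorer (ASE) is a standalone GUI file explorer for your storage accounts that runs on Windows, macOS, and Linux. It uses AzCopy on the back end, so you get the same capabilities without the command line: uploading to Azure, downloading from Azure, and moving data between storage accounts.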

Azure File Sync keeps an Azure Files share in sync with a locally hosted Windows Server, in both directions, so the local server acts like a fast cache of the cloud share. Handy if you're doing dev/POC work and want to work with live or prod data: just sync it down.