Design governance
Governance: the general process of establishing rules and policies and subsequently enforcing them
Individuals and startups can get away with having average or even poor governance, but as a business/service grows, it needs to be controlled. A good governance strategy helps maintain control over apps & resources managed in Azure, which keeps them compliant with industry standards (e.g., security) and corporate standards (also security).
The module will use the fictitious company Tailwind Traders and we'll be its fictitious CTO. Tailwind Traders is fully on-board with Azure, but they need governance.
Design for governance
Before we think about coming up with governance strategies and solutions, our Azure architecture needs to be equipped to support governance design. The best way to do this is to organize your management groups, subscriptions, resource groups, and resources in a hierarchical manner.
- Management groups help manage access, policy, and compliance for multiple subscriptions
- Subscriptions are logical containers that serve as units of management and scale, and they also act as billing boundaries
- Resource groups are logical containers into which Azure resources are deployed and managed
- Resources are instances of services we create
If this sounds familiar, it's a summary of the beginning of the Design Prerequisites module, where we talk about Azure accounts. Now that we're in the weeds of "how do we establish rules and policies" we'll get to talk about each one in depth as we design governance for Tailwind Traders.
The subsections after this one will have links to the subsections in the module as they contain detailed images and additional recommendations as lists of bullet points.
Design for management groups
Management groups are containers that help manage access, policy, and compliance across multiple subscriptions, i.e., multiple subscriptions can sit within a management group. The nitty-gritty is in the docs here: https://learn.microsoft.com/en-us/azure/governance/management-groups/overview
Management groups can limit the regions where virtual machines can be created across the subscriptions within the management group. They also provide user access to the subscriptions by creating one role assignment that's inherited by the subscriptions. Finally, management groups can monitor and audit policy assignments across their subscriptions.
Management groups have some characteristics to be aware of when planning the hierarchy:
- Management groups can be used to aggregate policy and initiative assignments via Azure Policy
- A management group tree can support up to six levels of depth, which doesn't include the tenant root level or the subscription level
- Azure role-based access control authorization for management group operations isn't enabled by default
- All new subscriptions are placed under the root management group by default
Microsoft cooked up a hierarchy for the management groups of Tailwind Traders, giving the following details about the company's organization:
- They have three departments: sales, corporate, and IT
- Sales manages offices in the East and West
- Corporate offices include HR and Legal
- IT handles research, development, and production
- They currently have two applications hosted in Azure
We'll always have a tenant root, but our first management group at the top of the hierarchy is "Tailwind", representing the entire company. From there we spin up three management groups that represent the different departments in Tailwind.
Microsoft creates a "production" group under the IT group, providing the following rationale: A production management group for Tailwind Traders can provide product-specific policies for corporate applications.
The bullet points are good reading, covering recommendations on keeping hierarchies as flat as possible to minimize complexity, creating a management group that's sandboxed so developers can use it for experimentation, and using management groups to isolate sensitive information.
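To make the hierarchy and the six-level limit concrete, here's a minimal sketch that models the Tailwind management groups as a plain dictionary and measures how deep the tree goes. The dictionary layout and the `depth` and `within_limit` helpers are made up for these notes; they are not an Azure API.

```python
# Hypothetical model of the Tailwind Traders management group tree.
# Keys are management groups; values are their child management groups.
HIERARCHY = {
    "Tailwind": ["Sales", "Corporate", "IT"],
    "Sales": [],
    "Corporate": [],
    "IT": ["Production"],
    "Production": [],
}

# Levels of management groups allowed, not counting the tenant root
# or the subscription level.
MAX_DEPTH = 6


def depth(group: str, tree: dict) -> int:
    """Count management group levels from `group` down to its deepest descendant."""
    children = tree.get(group, [])
    if not children:
        return 1
    return 1 + max(depth(child, tree) for child in children)


def within_limit(root: str, tree: dict) -> bool:
    """Check that the tree under `root` stays within the depth limit."""
    return depth(root, tree) <= MAX_DEPTH


print(depth("Tailwind", HIERARCHY))
print(within_limit("Tailwind", HIERARCHY))
```

The Tailwind tree is three levels deep (Tailwind, IT, Production), so it sits comfortably under the limit and stays fairly flat.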
Design for subscriptions
Azure subscriptions are logical containers that serve as units of management and scale, and as billing boundaries. You can apply limits and quotas to a subscription, and organizations can use subscriptions to manage costs & resources by group.
Each company typically has its own Azure account, and it buys subscriptions for its needs. Remember that we group subscriptions within management groups.
Microsoft urges us to consider the following when working with subscriptions:
- Subscriptions can provide separate billing environments, such as development, test, and production
- Policies for individual subscriptions can help satisfy different compliance standards
- We can organize specialized workloads to scale beyond the limits of a given subscription
- We can manage and track costs for the organization with subscriptions
Microsoft places "West" and "East" subscriptions under the Sales management group, "HR" and "Legal" subscriptions under the Corporate management group, an "R&D" subscription directly underneath the IT management group, and "App1", "App2", and "Shared" subscriptions under the Production group that's a child group of IT.
As part of governance, your coworkers should be made aware of their roles and responsibilities as subscription members.
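The subscription placement Microsoft describes can be sketched as two lookup tables: one for each management group's parent and one for the group each subscription sits under. Walking up from a subscription gives the chain of management groups whose settings it would inherit. The table names and the `ancestry` helper are invented for illustration.

```python
# Parent of each management group (None marks the top-level group).
GROUP_PARENT = {
    "Tailwind": None,
    "Sales": "Tailwind",
    "Corporate": "Tailwind",
    "IT": "Tailwind",
    "Production": "IT",
}

# Management group that each subscription sits under.
SUBSCRIPTION_GROUP = {
    "West": "Sales",
    "East": "Sales",
    "HR": "Corporate",
    "Legal": "Corporate",
    "R&D": "IT",
    "App1": "Production",
    "App2": "Production",
    "Shared": "Production",
}


def ancestry(subscription: str) -> list:
    """List the management groups above a subscription, nearest first."""
    chain = []
    group = SUBSCRIPTION_GROUP[subscription]
    while group is not None:
        chain.append(group)
        group = GROUP_PARENT[group]
    return chain


print(ancestry("App1"))
print(ancestry("R&D"))
```

App1 sits under Production, then IT, then Tailwind, while R&D skips Production and sits directly under IT.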
Design for resource groups
Resource groups are logical containers into which Azure resources are deployed and managed. Resources include things like apps, databases, storage, AppInsights, etc. Resource groups give you the following control features:
- Organization of resources, for example:
  - grouping resources with similar usage together
  - grouping resources with similar lifecycles together
- Applying role permissions to a group of resources, e.g., giving a group access to administer resources
- Resource locks protect individual resources in a group from deletion or change
Like other Azure management features, resource groups have their own quirks and characteristics:
- Resource groups have their own region (location) assigned to them, and the region they're in is where their metadata is stored
- If a resource group's region isn't available (e.g., West US is down), the resources within it cannot be updated as the metadata isn't available
- Resources in the resource group can be in different regions from the resource group
- Resources in a resource group can connect to other resources in other resource groups
- Resources can be moved between resource groups (with caveats and exceptions)
- Resource groups cannot be nested
- Each resource must be in one and only one resource group, e.g., a single database cannot be in two different resource groups
- Resource groups cannot be renamed after creation
In relation to subscriptions, think back to our IT group's App1 subscription. It'll have a webapp, which consists of databases, storage, web services, etc. Microsoft offers two schools of thought for resource group creation: group by type and group by app.
Group by type will have all databases necessary for App1 in one resource group and all web services necessary in another, all under the App1 subscription.
Group by app will have all the services needed for the web application in one resource group. The App1 subscription will have one resource group with everything necessary to run the app, and similarly for the App2 subscription.
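The two schools of thought are easier to compare side by side. This is a rough sketch under made-up assumptions: the resource names, the `kind` labels, and the `-rg` naming scheme are all hypothetical, and the functions only build dictionaries rather than touching Azure.

```python
from collections import defaultdict

# Hypothetical resources in the App1 subscription: (name, kind).
APP1_RESOURCES = [
    ("app1-sql", "database"),
    ("app1-cache", "database"),
    ("app1-web", "web"),
    ("app1-api", "web"),
    ("app1-blobs", "storage"),
]


def group_by_type(app: str, resources: list) -> dict:
    """One resource group per resource kind, all within the app's subscription."""
    groups = defaultdict(list)
    for name, kind in resources:
        groups[f"{app}-{kind}-rg"].append(name)
    return dict(groups)


def group_by_app(app: str, resources: list) -> dict:
    """A single resource group holding everything the app needs."""
    return {f"{app}-rg": [name for name, _ in resources]}


print(group_by_type("app1", APP1_RESOURCES))
print(group_by_app("app1", APP1_RESOURCES))
```

Either way, each resource lands in exactly one resource group, which matches the "one and only one" rule from the list above.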
Design for resource tags
So what's the point of tags? Automation.
When you tag Azure objects, you provide metadata that signals what roles or properties an Azure object might have. You can have tags that specify region, tags for classification (public, internal, classified), tags for product type, and so on.
Microsoft recommends analysis of your business needs to see if you would benefit from either IT-aligned tags, business-aligned tags, or a combination of both. Depending on the size/scale of your business/client, you may only need IT tags (startup, very low headcount SaaS company) or only need business tags (law firm with majority business employees, minority tech employees).
Business-aligned tags help track things like accounting, ownership, and business criticality. Business tags are great for identifying business departments that typically experience high demand of Azure resources, or labeling which Azure objects represent critical/core parts of your business.
Microsoft provides 5 tag types to leverage when designing a tagging system for your hierarchy:
- Functional
- Categorize Azure objects according to their purpose. Typically these will be for the deployed environment of the resource, but can describe functionality and operational details as well
- Examples: "app = catalogsearch1", "env = staging", "webserver = apache"
- Classification
- Identify how resources are used
- Examples: "confidentiality = private", "sla = 24hours"
- Accounting
- Associate an Azure object within specific groups for billing purposes
- Examples: "department = finance", "region = northamerica", "program = business-initiative"
- Partnership
- Describe which people and/or departments are associated with an Azure object
- Examples: "owner = jsmith", "stakeholders = user1;user2;user3"
- Purpose
- Associate Azure objects with business functions
- Examples: "businessprocess = support", "revenueimpact = high"
Tagging can get out of hand very quickly, so Microsoft makes the following suggestions:
- Start with as few tags as possible and add tags as use cases & needs crop up
- Use Azure Policy to apply tags automatically
- Enforce tagging rules and conventions
- Not all resources require tagging; it may be prudent to only tag your critical Azure objects
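Since the point of tags is automation, here's a small sketch of what that could look like: tags as plain key/value dictionaries, a check for missing required tags, and a filter that selects resources by tag value. The resource names, the `REQUIRED_TAGS` convention, and both helpers are assumptions for these notes; in Azure itself, enforcement would go through Azure Policy.

```python
# Hypothetical tagging convention: every tagged resource needs these keys.
REQUIRED_TAGS = {"env", "owner"}

# Hypothetical resources and their tags, reusing the example values above.
RESOURCES = {
    "catalog-web": {"app": "catalogsearch1", "env": "staging", "owner": "jsmith"},
    "catalog-db": {"app": "catalogsearch1", "env": "staging"},
    "finance-report": {"department": "finance", "env": "production", "owner": "jsmith"},
}


def missing_tags(tags: dict) -> list:
    """Return the required tag keys that a resource lacks, sorted by name."""
    return sorted(REQUIRED_TAGS - set(tags))


def select(resources: dict, **wanted: str) -> list:
    """Return names of resources whose tags match every wanted key/value pair."""
    return sorted(
        name
        for name, tags in resources.items()
        if all(tags.get(key) == value for key, value in wanted.items())
    )


print(missing_tags(RESOURCES["catalog-db"]))
print(select(RESOURCES, env="staging"))
```

A script like this could feed a cost report (select by `department`) or flag resources that break the tagging convention.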
Design for Azure Policy
Azure Policy allows you to create JSON documents that specify rules in order to control your Azure objects. They're very granular, dealing with one object and one outcome. Azure objects can have multiple policies applied to them.
Azure Policies can be grouped in containers called initiatives. Microsoft provides some generic policies out of the box, but you can create your own via the Azure CLI, REST API, or Azure portal.
Azure Policies are inherited down the organizational hierarchy and can be scoped to different levels of the hierarchy.
Azure Policies evaluate all resources in Azure and highlight Azure objects that aren't compliant with policies.
Azure Policies can provide limitations for certain resource types, enforce tags, restrict usage based on user or region, and so on.
Azure Policies are one of the hard parts of the organizational hierarchy because there are so many ways you can control the resources at your disposal and how they're distributed. Azure Policies can do some of the heavy lifting on their own, or you can handle noncompliant resources manually.
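To get a feel for the "one object, one outcome" idea, here's a toy sketch of a rule with an `if` condition and a `then` effect, loosely modeled on the shape of a policy rule that restricts locations. It's an assumption-heavy simplification: the allowed regions are inlined instead of being parameters, the `matches` and `evaluate` functions are made up for these notes, and the real evaluation happens inside Azure and supports far more conditions and effects.

```python
# Loosely modeled on the if/then shape of a policy rule; the region list is
# inlined here, whereas a real definition would usually take it as a parameter.
ALLOWED_LOCATIONS_RULE = {
    "if": {"not": {"field": "location", "in": ["eastus", "westus"]}},
    "then": {"effect": "deny"},
}


def matches(condition: dict, resource: dict) -> bool:
    """Toy evaluator: supports only `not` and `field` combined with `in`."""
    if "not" in condition:
        return not matches(condition["not"], resource)
    if "field" in condition and "in" in condition:
        return resource.get(condition["field"]) in condition["in"]
    raise ValueError(f"unsupported condition: {condition}")


def evaluate(rule: dict, resource: dict) -> str:
    """Return the rule's effect if the resource matches `if`, else 'compliant'."""
    if matches(rule["if"], resource):
        return rule["then"]["effect"]
    return "compliant"


print(evaluate(ALLOWED_LOCATIONS_RULE, {"name": "vm-1", "location": "westeurope"}))
print(evaluate(ALLOWED_LOCATIONS_RULE, {"name": "vm-2", "location": "eastus"}))
```

Notice that the rule never asks who created the resource; it only looks at the resource's state.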
One mistake that developers initially make is conflating Azure Policy with Azure Role Based Access Control (RBAC). Azure Policy ensures that the state of any given resource is compliant with the business rules; it doesn't matter who made any given change or who has permission to make changes. RBAC focuses on user interactions at different scopes: who can access resources, what can be done with said resources, and what areas a given user can access within Azure.
RBAC controls actions, while Policy enforces state. The next section talks more about using RBAC.
Design for role-based access control (RBAC)
While Azure Policy enforces resource state to be compliant with business rules, RBAC is what allows Azure administrators to grant or deny access to resources. The video on the page draws 3 circles that group various aspects of RBAC under 3 different terms: "what", "who" and "where".
"Who" encompasses identity: users, groups, managed identities, and even applications. These are the things we can assign RBAC to.
"What" encompasses roles. Roles have two subgroups: built-in and custom. Roles are the grouping of permissions.
"Where" encompasses scopes, the levels in your hierarchy of Azure tenants, management groups, subscriptions, resource groups, etc.
RBAC: What can your Who do and Where?
Example RBAC scenarios:
- Allow a single user to manage VMs in a subscription, then allow another single user to manage virtual networks
- Allow members of a database admin group to manage SQL databases in a subscription
- Allow a given user to manage all resources in a resource group, such as VMs, websites, and subnets
- Allow an application to access all resources in a resource group
RBAC considerations:
Your first step is accurately defining each role definition in your Azure hierarchy and its permissions. Next, assign those roles to specific users, groups, and service principals. Lastly, scope those roles to management groups, subscriptions, etc. Assign each role at the highest scope level that meets the requirements.
As you plan your RBAC implementation, consider the access needs for your existing users (devs, apps, etc.). It's best practice to grant users the least privilege they need to get their work done. The philosophy of least privilege makes it easier to separate team member responsibilities and limit risk if something is compromised.
It might be worth your time, depending on organization size, to assign roles to groups rather than to individual users. Adding users to specific permission groups helps minimize the number of permission assignments for each entity (user, app, etc.).
While planning RBAC implementation, you may run into the need for custom roles. That is a separate aspect of RBAC that's not covered by this course specifically.
Additionally while planning, keep in mind that you might create overlapping role assignments, and that you'll have to resolve them when they come up. RBAC is an additive model, so a user's effective permissions are the sum of their role assignments. If a user has a "Contributor" role at the subscription scope and "Reader" on a resource group, the "sum" of Contributor and Reader is effectively Contributor, meaning assigning the Reader role is redundant: as a Contributor, the user can read and work with the resource.
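The additive model can be sketched in a few lines: collect every role assignment that applies to a user at a scope (its own scope or anything above it) and take the union of the permissions. Everything here is a simplified assumption: the permission sets are stand-ins for the real role definitions, scopes are made-up slash-separated paths, and the sketch ignores deny assignments entirely.

```python
from dataclasses import dataclass

# Simplified, hypothetical permission sets; real built-in roles are far more detailed.
ROLE_PERMISSIONS = {
    "Reader": {"read"},
    "Contributor": {"read", "write", "delete"},
}


@dataclass(frozen=True)
class RoleAssignment:
    who: str    # user, group, managed identity, or application
    what: str   # role name
    where: str  # hypothetical scope path, e.g. "App1" or "App1/rg-web"


def applies(assignment_scope: str, target_scope: str) -> bool:
    """An assignment applies at its own scope and at every scope beneath it."""
    return target_scope == assignment_scope or target_scope.startswith(
        assignment_scope + "/"
    )


def effective_permissions(who: str, target_scope: str, assignments: list) -> set:
    """Union the permissions of every assignment that applies to `who` at the scope."""
    permissions = set()
    for assignment in assignments:
        if assignment.who == who and applies(assignment.where, target_scope):
            permissions |= ROLE_PERMISSIONS[assignment.what]
    return permissions


ASSIGNMENTS = [
    RoleAssignment("alice", "Contributor", "App1"),
    RoleAssignment("alice", "Reader", "App1/rg-web"),
    RoleAssignment("bob", "Reader", "App1/rg-web"),
]

print(sorted(effective_permissions("alice", "App1/rg-web", ASSIGNMENTS)))
print(sorted(effective_permissions("bob", "App1/rg-web", ASSIGNMENTS)))
```

Alice ends up with the Contributor permissions on the resource group whether or not the Reader assignment exists, which is exactly why that assignment is redundant. Bob only has Reader on the resource group and nothing at the subscription.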
Design for Azure landing zones
A landing zone is either a management group or a subscription that is designed to scale according to business needs and priorities. Basically, the stuff in a landing zone is everything Azure related (RBAC, resources, tags, even sub-management groups and resource groups) that your app needs to run when it lands in Azure.
Landing zones are a concept and not a tangible product offering: you can use the Azure CLI, REST API, etc. to pre-provision resources. However, Microsoft offers a service called the "Landing Zone Accelerator" that sets up resources as a landing zone for you.
There's a lot involved in designing a landing zone, to the point where the concept has its own section in the Cloud Adoption Framework: https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/