Azure Active Directory authorization strategy utilizing nested groups

In this article we’ll review how to build an Azure Active Directory authorization strategy that uses nested groups with assigned role-based access control (RBAC) and access control lists (ACL). We’ll cover the difference between the two access types and how to grant and fine-tune access to resources, with examples for Data Lake and Databricks.

Role-Based Access Control (RBAC) and Access Control Lists (ACL) for Azure Active Directory Groups (AAD groups)

In simpler terms, RBAC is the way to assign access to Azure resources, while an ACL grants access to a specific location only (e.g. a Data Lake directory or a Databricks Workspace directory). Microsoft already provides built-in roles (1) that we can use, and on top of that we can create custom roles. An access control list (ACL) contains access control entries (ACE). Each entry refers to an Azure object (for instance an AAD group) and carries permissions: read, write and execute for Data Lake directory ACLs (2), or read, run, edit and manage for Databricks Workspace ACLs (3). Different resources may have different permission types, but the idea is the same.

An Azure Active Directory group (AAD group) is a group that can hold users and other AD groups, and that can be granted access. In the case below we’re using three different kinds of AD groups: one to hold members, one for RBAC and one for ACLs.

That way, when a new member joins, they only need to be added to their Company AD group (Team), which delivers all the access they need. The Company AD group (Team) holds the members and the other AD groups. For RBAC and ACL we use separate AD groups: the RBAC AD groups carry the role assignments and the ACL AD groups carry the ACEs.
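To make the nesting concrete, here is a minimal sketch that resolves every group a user ends up in once nesting is followed. It is a plain in-memory model, not a call to Azure AD. The group names, the user and the assumption that the Team group is itself a member of the RBAC and ACL groups are all illustrative.

```python
from collections import deque

# Hypothetical data: each group mapped to the groups it is a direct member of.
GROUP_MEMBERSHIPS = {
    "Azure_Team_MyCompany_CoolProject": [
        "Azure_RBAC_CoolProject_D_Contributor",
        "Azure_ACL_CoolProject_D_Reader",
    ],
    "Azure_RBAC_CoolProject_D_Contributor": [],
    "Azure_ACL_CoolProject_D_Reader": [],
}

# Hypothetical data: each user mapped to the groups they are a direct member of.
USER_MEMBERSHIPS = {
    "alice@example.com": ["Azure_Team_MyCompany_CoolProject"],
}


def effective_groups(user: str) -> set[str]:
    """Return every group the user belongs to, directly or through nesting."""
    seen: set[str] = set()
    queue = deque(USER_MEMBERSHIPS.get(user, []))
    while queue:
        group = queue.popleft()
        if group in seen:
            continue  # guards against cycles and duplicate paths
        seen.add(group)
        queue.extend(GROUP_MEMBERSHIPS.get(group, []))
    return seen
```

Adding a user to the single Team group is enough for them to show up in the RBAC and ACL groups as well, which is the whole point of the strategy.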

For the naming convention we can use something like this:

  • Azure_Team_{MyCompany}_{MyProject} is the Azure Team AAD group that holds members and other AD groups (RBAC & ACL), presuming that separation by company and project is useful.
  • Azure_RBAC_{MyProject}_{Environment}_{Role} holds the RBAC assignment for a specific project, environment and role. The most-used environments are development, pre-production (quality) and production. The role can be any built-in role from the Microsoft list or a custom role we have defined. The most-used roles are “Contributor” (full access but can’t grant access) and “Reader” (can only read).
  • Azure_ACL_{MyProject/Location}_{Environment}_{Role} holds ACLs. The idea of MyProject/Location is to grant access to a specific directory. The rest can be the same as above.

Instead of {MyProject}, for a more monolithic architecture we can use “COMMON” as the project name when many projects share the same resources.
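The convention above can be captured in a few small helpers so that names are always built the same way. This is only a sketch: the function names are made up for illustration, and the example values follow the “Azure_RBAC_CoolProject_D_Contributor” group used later in this article.

```python
def team_group(company: str, project: str) -> str:
    """Team group that holds members and links to the RBAC and ACL groups."""
    return f"Azure_Team_{company}_{project}"


def rbac_group(project: str, environment: str, role: str) -> str:
    """Group that receives a role assignment for one project and environment."""
    return f"Azure_RBAC_{project}_{environment}_{role}"


def acl_group(location: str, environment: str, role: str) -> str:
    """Group that is used as an access control entry on one directory."""
    return f"Azure_ACL_{location}_{environment}_{role}"


# "COMMON" stands in for the project when many projects share the same resources.
shared_reader = rbac_group("COMMON", "P", "Reader")
```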

Set up AAD groups

To create a new AAD group, search for “Groups” and then click “New Group”.

Then name the group. Optionally, we can add a description, owners and members/objects right away, or we can do this after creation.

To make a group a member of another group, open the desired group and go to Group memberships > Add membership > Select. Adding members is done in the same fashion.

Assign RBAC

RBAC can be assigned at every level (subscription / resource group / resource), and assignments are inherited by the levels below. If necessary, for Data Lake the assignment can also be made within the resource, per container.
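Inheritance follows the resource ID hierarchy: an assignment made at a scope applies to everything whose ID sits underneath that scope. The sketch below models this with a simple prefix check. The subscription ID and storage account name are placeholders, and the check is an illustration of the idea rather than how Azure evaluates access internally.

```python
SUBSCRIPTION = "/subscriptions/00000000-0000-0000-0000-000000000000"  # placeholder ID
RESOURCE_GROUP = f"{SUBSCRIPTION}/resourceGroups/itt-coolproject-dev-rg"
# Hypothetical storage account inside the resource group.
DATA_LAKE = f"{RESOURCE_GROUP}/providers/Microsoft.Storage/storageAccounts/coolprojectdevdls"


def applies_to(assignment_scope: str, resource_id: str) -> bool:
    """True when a role assigned at assignment_scope is inherited by resource_id."""
    scope = assignment_scope.rstrip("/").lower()
    target = resource_id.rstrip("/").lower()
    # Same scope, or the target lives somewhere below the scope.
    return target == scope or target.startswith(scope + "/")
```

An assignment on the resource group reaches the Data Lake inside it, while an assignment on the Data Lake does not reach back up to the resource group.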

To assign a role on a resource group, go to the resource group’s Access control (IAM) > Add > Add role assignment, then choose the role / access / object and click Save. In the example below we’re adding the Contributor role to the AD group “Azure_RBAC_CoolProject_D_Contributor” for the “itt-coolproject-dev-rg” resource group.

If required, a custom role can be created by selecting “Create custom role”. Here’s the list of all actions (4).
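A custom role is ultimately a JSON document that lists allowed actions and the scopes where the role may be assigned. The sketch below builds such a document in the shape that, to my knowledge, the Azure CLI and PowerShell accept for role definitions. The role name, description and subscription IDs are hypothetical, and the actions are only examples picked from the list in (4).

```python
import json


def custom_role(name, description, actions, data_actions, subscription_ids):
    """Build a custom role definition as a plain dictionary."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": description,
        "Actions": list(actions),            # management-plane operations
        "NotActions": [],
        "DataActions": list(data_actions),   # data-plane operations
        "NotDataActions": [],
        # One entry per subscription where the role can be assigned.
        "AssignableScopes": [f"/subscriptions/{s}" for s in subscription_ids],
    }


# Hypothetical role that can see storage accounts and read blobs.
role = custom_role(
    name="CoolProject Storage Reader",
    description="Read-only access to storage accounts and blob data.",
    actions=["Microsoft.Storage/storageAccounts/read"],
    data_actions=[
        "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
    ],
    subscription_ids=[
        "00000000-0000-0000-0000-000000000000",
        "11111111-1111-1111-1111-111111111111",
    ],
)
role_json = json.dumps(role, indent=2)
```

Listing several subscriptions in AssignableScopes is what lets one custom role be assigned across multiple subscriptions.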

Assign ACL – Data Lake Gen2

To assign an ACL to Data Lake Gen2 we can use Microsoft Azure Storage Explorer (5). The important thing to note here is that if an ACL is assigned to a subfolder but the folders above it (up to the root) don’t have ACLs, the permissions won’t work properly. For example, if we have a directory structure like the one below and we’ve assigned access to “myproject” but don’t have access to view the “applications” directory, the access would be incomplete, so we’ll need to add Read & Execute access to the “applications” directory.

To assign access, go to the desired directory > Manage Access. There are three types of access (6).
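To avoid forgetting a parent, we can list every directory on the way to the target and the permission each one needs. The sketch below is a plain path calculation and makes no Azure calls. It defaults to Read & Execute on parents as described above; Execute alone ("--x") is enough if the parents only need to be traversed and not listed. The function name and default values are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def required_entries(target: str, parent_perms: str = "r-x", target_perms: str = "r-x"):
    """Return (directory, permissions) pairs from the root down to the target."""
    path = PurePosixPath("/") / target.strip("/")
    # path.parents runs from the nearest parent up to "/", so reverse it.
    ancestors = [str(p) for p in list(path.parents)[::-1]]
    entries = [(directory, parent_perms) for directory in ancestors]
    entries.append((str(path), target_perms))
    return entries
```

For "applications/myproject" this yields entries for "/", "/applications" and "/applications/myproject", in that order.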

Permission | File | Directory
Read (R) | Can read the contents of a file | Requires Read and Execute to list the contents of the directory
Write (W) | Can write or append to a file | Requires Write and Execute to create child items in a directory
Execute (X) | Does not mean anything in the context of Data Lake Storage Gen2 | Required to traverse the child items of a directory

To have all new child items in the directory inherit the permissions, select “Default”, but be mindful that changing the ACL on a parent does not affect permissions on child items that already exist.
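Behind the Storage Explorer dialog, each entry is commonly written in a short POSIX-style form such as "group:<object-id>:r-x", with a "default:" prefix for the entry that new child items inherit. The sketch below builds that string for an AAD group. The helper name and the object ID are made up, and the format is the one I understand the Azure tooling to accept, so treat it as an assumption to verify against (2).

```python
import re


def acl_entries(object_id: str, permissions: str, include_default: bool = True) -> str:
    """Build the access entry, and optionally the default entry, for a group."""
    # Permissions are three characters: r or -, w or -, x or -.
    if not re.fullmatch(r"[r-][w-][x-]", permissions):
        raise ValueError(f"invalid permissions: {permissions!r}")
    entries = [f"group:{object_id}:{permissions}"]
    if include_default:
        # The default entry only affects child items created afterwards.
        entries.append(f"default:group:{object_id}:{permissions}")
    return ",".join(entries)
```

Leaving out the default entry gives access to the directory itself without changing what new child items inherit.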

To add the AD group, select “Add”, search for the AD group and select the permissions.

If child items already exist and we want to assign the parent permissions to them, there’s new functionality to apply ACLs recursively (7): select the directory > Propagate Access Control Lists… Note that this re-applies all access control lists from the parent directory, not a specific access control entry.

Assign ACL – Databricks Workspace

In Databricks Workspace the “Shared” directory gives full permissions to all provisioned users, so there’s no need for ACLs if we’re using this directory. If we want more fine-grained control, we can create another root directory and assign access there. Another important thing to note is that, unlike Data Lake, if we assign access to a specific directory there’s no need to add access to the parent directories.
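The difference from Data Lake can be sketched as a lookup that walks up from a directory and takes the strongest permission granted on it or on any folder above it, so a grant on one directory covers its children and needs nothing on its parents. This is a simplified model for a single group, not the Databricks API. The directory names are hypothetical, and the ordering read < run < edit < manage is assumed from the permission list given earlier.

```python
from pathlib import PurePosixPath

LEVELS = ["read", "run", "edit", "manage"]  # weakest to strongest


def effective_permission(directory: str, grants: dict):
    """Return the strongest level granted on the directory or any ancestor."""
    path = PurePosixPath(directory)
    candidates = [path, *path.parents]
    found = [grants[str(p)] for p in candidates if str(p) in grants]
    return max(found, key=LEVELS.index) if found else None


# Hypothetical grants for one AAD group.
grants = {"/Projects/CoolProject": "edit", "/Projects/CoolProject/reports": "read"}
```

With these grants the group can edit everything under "/Projects/CoolProject" but has no permission on "/Projects" itself.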

To assign access, click on the directory > Permissions.

Select the AD group, the desired permissions and click Done.

Conclusion & important notes

Once the access setup is ready, the only thing left to do when new members join the project is to add them to the Team AAD group, and they’ll automatically have all the required access. This approach works well for monolithic architectures where many teams share the same resources.

Important notes:

  1. Assigning AD groups within Databricks Workspace requires a premium Databricks license.
  2. Renaming already existing provisioned AAD groups within Databricks does not work at the time of writing. Provisioning tips (8).
  3. For automatic provisioning of Databricks AAD groups you can check out my other article (9).
  4. Data Lake Gen2 ACLs require permissions on the parent directories; Databricks Workspace ACLs don’t.
  5. Propagating Data Lake Gen2 ACLs re-applies the whole list, not a specific entry.
  6. Adding AAD groups to your Team group must be done in “Group memberships”, not “Members”.
  7. Custom-built RBAC roles can be assigned on multiple subscriptions. Deny assignments can also be added.

References

  1. https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles
  2. https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control
  3. https://docs.microsoft.com/en-us/azure/databricks/security/access-control/workspace-acl
  4. https://docs.microsoft.com/en-us/azure/role-based-access-control/resource-provider-operations
  5. https://azure.microsoft.com/en-us/features/storage-explorer
  6. https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control#types-of-acls
  7. https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-explorer-acl#apply-acls-recursively
  8. https://docs.microsoft.com/en-us/azure/databricks/administration-guide/users-groups/scim/aad#provisioning-tips
  9. https://ivotalkstech.com/azure/auto-provision-databricks-ad-groups-that-match-specific-regex
