Build a Data Classification MVP for Microsoft 365
Kickstart your governance with data classification users will actually use!
- Resources, whether funding or full-time staff (FTEs), are the primary obstacle to getting a foothold in Microsoft 365 governance.
- Data is segmented and is difficult to analyse when you can’t see it or manage the relationships between sources.
- Organisations expect early results, but building a proper data classification framework can take more than two years, and the business can't wait that long.
Critical Insight
- Data classification is the lynchpin of ANY effective Microsoft 365 governance. Your objective is to work through it easily and effectively and build a robust, secure, and viable governance model.
- Start your journey by identifying what and where your data is and how much data you have. You need to understand what sensitive data you have and where it is stored before you can protect it or govern that data.
- Ensure there is a high-level leader who is the champion of the governance objective.
Impact and Result
- The least complex sensitivity labels are the building blocks of compliance and security in your data management schema; they are your foundational steps.
Questions you need to ask
Four key questions to kick off your MVP
Know Your Data
Protect Your Data
Prevent Data Loss
Govern Your Data
Classification tiers
Build your schema
Microsoft MIP Topology
Microsoft Information Protection (MIP), which is part of Microsoft’s Data Classification Services, is the key to achieving your governance goals. Without an MVP, data classification will be overwhelming; simplifying is the first step in achieving governance.
Discover and classify on-premises files using AIP
Azure Information Protection scanner helps discover, classify, label, and protect sensitive information in on-premises file servers. You can run the scanner and get immediate insight into risks with on-premises data. Discover mode helps you identify and report on files containing sensitive data (Microsoft Inside Track and CIAOPS, 2022). Enforce mode automatically classifies, labels, and protects files with sensitive data.
AIP helps you manage sensitive data prior to migrating to Office 365:
- Use discover mode to identify and report on files containing sensitive data.
- Use enforce mode to automatically classify, label, and protect files with sensitive data.
The scanner can be configured to scan:
- SMB files
- SharePoint Server 2016 and 2013
It can also:
- Map your network and find over-exposed file shares.
- Protect files using MIP encryption.
- Inspect the content in file repositories and discover sensitive information.
- Classify and label files per MIP policy.
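The difference between discover mode and enforce mode can be illustrated with a small sketch. This is not the AIP scanner or its API: the `scan` function, the `FileRecord` type, the Social Security Number pattern, and the "Confidential" label are all hypothetical stand-ins chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: a US Social Security Number written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class FileRecord:
    path: str
    text: str
    label: Optional[str] = None  # no label until one is applied


def scan(files, enforce=False, label="Confidential"):
    """Return the paths of files that contain sensitive data.

    In discover mode (enforce=False) matching files are only reported.
    In enforce mode matching files are reported and labelled.
    """
    findings = []
    for f in files:
        if SSN_PATTERN.search(f.text):
            findings.append(f.path)
            if enforce:
                f.label = label
    return findings


files = [
    FileRecord("hr/offer.txt", "SSN 123-45-6789 on file"),
    FileRecord("pr/release.txt", "New product launch announced"),
]

report = scan(files)                  # discover: report only, nothing changes
labelled = scan(files, enforce=True)  # enforce: report and apply the label
```

Running discover mode first gives a report of what would be affected before any file is changed, which is the reason to start there.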
Understanding Governance
Microsoft Information Governance
Information Governance
- Retention policies for workloads
- Inactive and archive mailboxes
Records Management
- Retention labels for items
- Disposition review
Retention and Deletion
Connectors for Third-Party Data
Information governance manages your content lifecycle using solutions to import, store, and classify business-critical data so you can keep what you need and delete what you do not. Backup should not be used as a retention methodology since information governance is managed as a “living entity” and backup is a stored information block that is “suspended in time.”
Records management uses intelligent classification to automate and simplify the retention schedule for regulatory, legal, and business-critical records in your organisation. Records are the discrete set of content that needs to be immutable.
Retention and Backup Policy Decision
Retention is not backup
Jacana IT Insight
Retention is not backup. Retention means something different: “the content must be available for discovery and legal document production while being able to defend its provenance, chain of custody, and its deletion or destruction” (AvePoint Blog, 2021).
Microsoft Responsibility (Microsoft Protection): Weeks to Months
- Loss of service due to natural disaster or data center outage
- Loss of service due to hardware or infrastructure failure
- Short-term (30 days) user error with recycle bin/version history (including OneDrive “File Restore”)
- Short-term (14 days) administrative error with soft-delete for groups, mailboxes, or service-led rollback
Customer Responsibility (DLP, Backup, Retention Policy): Months to Years
- Loss of data due to departing employees or deactivated accounts
- Loss of data due to malicious insiders or hackers deleting content
- Loss of data due to malware or ransomware
- Recovery from prolonged outages
- Long-term accidental deletion coverage with selective rollback
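The split of responsibility above can be expressed as a simple lookup. The sketch below uses the 30-day and 14-day windows listed for Microsoft's short-term coverage; the function name, the category keys, and the choice to treat the last day of a window as still covered are assumptions made for illustration.

```python
# Short-term recovery windows, in days, as listed above.
NATIVE_WINDOWS = {
    "user_error": 30,   # recycle bin / version history
    "admin_error": 14,  # soft-delete for groups, mailboxes, service-led rollback
}


def covered_natively(event_type: str, days_since_event: int) -> bool:
    """Return True if the event still falls inside a short-term window.

    Anything not in NATIVE_WINDOWS (ransomware, departed employees, and
    so on) is treated as the customer's responsibility.
    """
    window = NATIVE_WINDOWS.get(event_type)
    if window is None:
        return False
    return days_since_event <= window  # assumption: last day is inclusive
```

For example, `covered_natively("user_error", 45)` returns `False`, which is the point at which a customer-managed backup or retention policy has to take over.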
Understand Retention Policy
What are retention policies used for? Why do you need them as part of your MVP?
Do not confuse retention labels and policies with backup.
Remember: “retention [policies are] auto-applied whereas retention label policies are only applied if the content is tagged with the associated retention label” (AvePoint Blog, 2021).
Data retention policy tools enable a business to:
- Decide proactively whether to retain content, delete content, or retain and then delete the content when needed.
- Apply a policy to all content or just content meeting certain conditions, such as items with specific keywords or specific types of sensitive information.
- Apply a single policy to the entire organisation or specific locations or users.
- Maintain discoverability of content for lawyers and auditors, while protecting it from change or access by other users.
“[…] ‘Retention Policies’ are different to ‘Retention Label Policies’ – they do the same thing – but a retention policy is auto-applied, whereas retention label policies are only applied if the content is tagged with the associated retention label.
“It is also important to remember that ‘Retention Label Policies’ do not move a copy of the content to the ‘Preservation Holds’ folder until the content under policy is changed next.” (Source: AvePoint Blog, 2021)
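The three outcomes (retain, delete, or retain and then delete) and the difference between an auto-applied policy and a label-based one can be sketched as follows. This is an illustrative model only, not how Microsoft 365 evaluates retention; the `RetentionRule` type, its field names, and the day counts in the usage example are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    retain_days: Optional[int] = None        # keep content at least this long
    delete_after_days: Optional[int] = None  # remove content once it is this old
    required_label: Optional[str] = None     # None means auto-applied to all content in scope


def action_for(rule, created, today, item_label=None):
    """Return 'retain', 'delete', or 'none' for a single item."""
    # A label-based rule only applies to content tagged with its label.
    if rule.required_label is not None and item_label != rule.required_label:
        return "none"
    age = (today - created).days
    if rule.retain_days is not None and age < rule.retain_days:
        return "retain"
    if rule.delete_after_days is not None and age >= rule.delete_after_days:
        return "delete"
    return "none"


today = date(2024, 1, 1)
retain_then_delete = RetentionRule(retain_days=365, delete_after_days=365)
label_based = RetentionRule(retain_days=365, required_label="Contract")

young = today - timedelta(days=100)
old = today - timedelta(days=400)
```

With these rules, `action_for(retain_then_delete, young, today)` yields `"retain"` and the same rule on `old` yields `"delete"`, while `label_based` does nothing to an item unless it carries the "Contract" label.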
Definitions
Data classification is a focused term used in the fields of cybersecurity and information governance to describe the process of identifying, categorising, and protecting content according to its sensitivity or impact level. In its most basic form, data classification is a means of protecting your data from unauthorised disclosure, alteration, or destruction based on how sensitive or impactful it is.
Once data is classified, you can create policies; sensitive information types, trainable classifiers, and sensitivity labels function as inputs to policies. Policies define behaviors, such as whether there will be a default label, whether labeling is mandatory, which locations the label will be applied to, and under what conditions. A policy is created when you configure Microsoft 365 to publish or automatically apply sensitive information types, trainable classifiers, or labels.
Sensitivity label policies show one or more labels to Office apps (like Outlook and Word), SharePoint sites, and Office 365 groups. Once the labels are published, users can apply them to protect their content.
Data loss prevention (DLP) policies help identify and protect your organisation’s sensitive info (Microsoft Docs, April 2022). For example, you can set up policies to help make sure information in email and documents is not shared with the wrong people. DLP policies can use sensitive information types and retention labels to identify content containing information that might need protection.
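To make the idea of a sensitive information type concrete, the sketch below detects likely credit card numbers by combining a digit pattern with the Luhn checksum. It is a simplified illustration and not the detection logic Microsoft 365 uses; the regular expression and both function names are assumptions.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def contains_card_number(text: str) -> bool:
    """Return True if any candidate in the text passes the Luhn check."""
    return any(luhn_valid(m.group()) for m in CARD_CANDIDATE.finditer(text))
```

The checksum step matters because a pattern match alone would flag any long run of digits, such as an order number, and produce false positives that erode user trust in the policy.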
Retention policies and retention label policies help you keep what you want and get rid of what you do not. They also play a significant role in records management.
Data examples for MVP classification
- Examples of the type of data you consider to be Confidential, Internal, or Public.
- This will help you determine what to classify and where it is.
Internal Personal, Employment, and Job Performance Data
- Social Security Number
- Date of birth
- Marital status
- Job application data
- Mailing address
- Resume
- Background checks
- Interview notes
- Employment contract
- Pay rate
- Bonuses
- Benefits
- Performance reviews
- Disciplinary notes or warnings
Confidential Information
- Business and marketing plans
- Company initiatives
- Customer information and lists
- Information relating to intellectual property
- Invention or patent
- Research data
- Passwords and IT-related information
- Information received from third parties
- Company financial account information
- Social Security Number
- Payroll and personnel records
- Health information
- Self-restricted personal data
- Credit card information
Internal Data
- Sales data
- Website data
- Customer information
- Job application data
- Financial data
- Marketing data
- Resource data
Public Data
- Press releases
- Job descriptions
- Marketing material intended for general public
- Research publications
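Some examples above appear under more than one heading (for instance, customer information is listed as both Confidential and Internal). One way to handle such overlaps is to let the most restrictive tier win. That rule is an assumption, not something the lists above state, and the sketch below uses only a handful of the examples.

```python
# Tiers ordered from least to most restrictive (assumed ordering).
TIER_ORDER = ["Public", "Internal", "Confidential"]

# A few examples taken from the lists above; some appear under two tiers.
EXAMPLES = {
    "Public": {"Press releases", "Job descriptions"},
    "Internal": {"Sales data", "Customer information", "Job application data"},
    "Confidential": {"Customer information", "Credit card information", "Health information"},
}


def resolve_tier(data_type: str):
    """Return the most restrictive tier that lists data_type, or None."""
    matches = [tier for tier in TIER_ORDER if data_type in EXAMPLES[tier]]
    return matches[-1] if matches else None
```

A data type that is not listed returns `None`, which is a useful prompt to decide its tier explicitly instead of leaving it unclassified.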
New Container Sensitivity Labels (MIP)
Privacy
Public
- Membership to group is open; anyone can join
- “Everyone except external guest” ACL on the site; content available in search to all users in the tenant
Private
- Only owner can add members
- No access beyond the group membership until someone shares it or changes permissions
External guest policy
Allowed
- Guests can be added to the group
Not Allowed
- Guests cannot be added to the group
What users will see when they create or label a Team/Group/Site
Why you need sensitivity container labels:
- Manage privacy of Teams Sites and Microsoft 365 Groups
- Manage external user access to SPO sites and teams
- Manage external sharing from SPO sites
- Manage access from unmanaged devices
Data Protection and Security Baselines
Data Protection Baseline
“Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline” (Microsoft Docs, June 2022). This baseline assessment has a set of controls for key regulations and standards for data protection and general data governance. This baseline draws elements primarily from NIST CSF (National Institute of Standards and Technology Cybersecurity Framework) and ISO (International Organization for Standardization) as well as from FedRAMP (Federal Risk and Authorization Management Program) and GDPR (General Data Protection Regulation of the European Union).
Security Baseline
The final stage in Microsoft 365 governance is security. You need to implement a governance policy that clearly defines storage locations for certain types of data and who has permission to access it. You need to record and track who accesses content and how they share it externally. “Part of your process should involve monitoring unusual external sharing to ensure staff only share documents that they are allowed to” (Rencore, 2021).
Prerequisite Baseline
Security
- Banned password list
- BYOD sync with corporate network
Users
- Enable guest users
- External sharing
- Block client forwarding rules
Resources
- OneDrive
- SharePoint
Controls
- Mobile application management policy
Building Baselines
Sensitivity Profiles: Public, Internal, Confidential; Subcategory: Highly Confidential.
Microsoft 365 Collaboration Protection Profiles
Sensitivity: Public
Description: Data that is specifically prepared for public consumption
Label details:
- No content marking
- No encryption
- Public site
- External collaboration allowed
- Unmanaged devices: allow full access
Teams or Site details: Public Team or Site; open discovery, guests are allowed

Sensitivity: External Collaboration
Description: Not approved for public consumption, but OK for external collaboration
Label details:
- No content marking
- No encryption
- Private site
- External collaboration allowed
- Unmanaged devices: allow full access
Teams or Site details: Private Team or Site; members are invited, guests are allowed

Sensitivity: Internal
Description: External collaboration highly discouraged and must be justified
Label details:
- Content marking
- Encryption
- Private site
- External collaboration allowed but monitored
- Unmanaged devices: limited web access
Teams or Site details: Private Team or Site; members are invited, guests are allowed

Sensitivity: Highly Confidential
Description: Data of the highest sensitivity: avoid oversharing, internal collaboration only
Label details:
- Content marking
- Encryption
- Private site
- External collaboration disabled
- Unmanaged devices: block access
Teams or Site details: Private Team or Site; members are invited, guests are not allowed
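The four profiles above can be captured as data so that scripts and reviews work from a single definition. The sketch below is one possible encoding; the `ProtectionProfile` type, its field names, and the string values are hypothetical and do not correspond to any Microsoft 365 API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionProfile:
    content_marking: bool
    encryption: bool
    site_privacy: str            # "Public" or "Private"
    external_collaboration: str  # "allowed", "monitored", or "disabled"
    unmanaged_devices: str       # "full", "limited_web", or "block"
    guests_allowed: bool


# Values transcribed from the profiles described above.
PROFILES = {
    "Public": ProtectionProfile(False, False, "Public", "allowed", "full", True),
    "External Collaboration": ProtectionProfile(False, False, "Private", "allowed", "full", True),
    "Internal": ProtectionProfile(True, True, "Private", "monitored", "limited_web", True),
    "Highly Confidential": ProtectionProfile(True, True, "Private", "disabled", "block", False),
}


def can_invite_guest(sensitivity: str) -> bool:
    """Guests can be invited only if the profile allows them."""
    return PROFILES[sensitivity].guests_allowed
```

Keeping the profiles in one structure makes it easy to spot inconsistencies, such as a profile that disables external collaboration but still allows guests.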
MVP activities
Define Your Governance
The objective of the MVP is reducing barriers to establishing an initial governance position, and then enabling rapid progression of the solution to address a variety of tangible risks, including DLP, data retention, legal holds, and labeling. Decide on your classification labels early.
Data Discovery and Management
AIP (Azure Information Protection) scanner helps discover, classify, label, and protect sensitive information in on-premises file servers. You can run the scanner and get immediate insight into risks with on-premises data.
Primary Activities
Baseline Setup
Building baseline profiles will be a part of your MVP. You will understand what type of information you are addressing and label it accordingly. Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline.
Default Microsoft 365 settings
Microsoft provides a default assessment in Compliance Manager for the Microsoft 365 data protection baseline. This baseline assessment has a set of controls for key regulations and standards for data protection and general data governance.