Identity meets Resilience

November 12, 2024

Identity & Access Management overview

Identity & Access Management (IAM) is a loosely defined collection of functions, also known as account and password management. It comprises Identity Governance and Administration (IGA) and Access Management (authentication and authorization), with the subsets Privileged Access Management (PAM, access for sensitive accounts) and Public Key Infrastructure (PKI, management and use of cryptographic keys).

IGA is a hybrid set of rule-based and request-based provisioning workflows to directories and applications leveraging local account stores, bolstered by periodic attestation flows: did you as the appointed manager approve this? Typically, Identity Governance offers a web portal and has a provisioning engine, executing queues of tasks in creating and changing user accounts, passwords, and authorizations. Conceptually, it covers the process front-end of account security, whereas Access Management, Privileged Access Management and PKI are the technical back-end, where all the real-time operations happen.

Resilience of Identity

As the corporate trust anchor, especially in any modern Zero Trust architecture, Identity is at the top of the food chain in security. Since it is an attacker’s candy store, it should be guarded heavily against compromise and prepared (with procedures and exercises) for early and speedy rebuilding and recovery in case of a major cyber incident. Traditionally, IGA isn’t considered mission critical: it is asynchronous, and onboarding new employees isn’t seen as an essential capability. However, onboarding emergency reinforcements is mission critical, as any drill will show. A good starting point is a pen-test as a kick-off.

Onboarding additional staff

The most likely use case in a cyber crisis in progress is to onboard extra support staff, such as forensic specialists, additional technical support, and various crisis & communication specialists. This shouldn’t be anything out of the ordinary, but it must remain possible during the freeze of an environment under attack. It may interfere with defense, as the attacker is quite likely messing about with accounts, too. However, if you know which changes are known to be good (since they come out of IGA, at least as long as that is secure), you can treat anything else as known to be bad. And as a bonus, if you have IGA Reconciliation, you can clean up automatically.
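The "known good versus everything else" idea can be sketched in a few lines of Python. This is a minimal illustration, not a product feature: the `AccountChange` record, the two example logs and the account names are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountChange:
    """One change to an account (hypothetical record layout)."""
    account: str
    action: str  # e.g. "create" or "add_to_group:Domain Admins"


def classify_changes(observed, iga_approved):
    """Split observed directory changes into known-good and suspect.

    A change is known-good only if IGA issued exactly that change.
    """
    approved = set(iga_approved)
    known_good = [c for c in observed if c in approved]
    suspect = [c for c in observed if c not in approved]
    return known_good, suspect


# Hypothetical data: what IGA provisioned versus what the directory shows.
iga_log = [AccountChange("forensics01", "create")]
directory_log = [
    AccountChange("forensics01", "create"),
    AccountChange("svc_backup2", "create"),
]

known_good, suspect = classify_changes(directory_log, iga_log)
```

Here `suspect` holds the `svc_backup2` account that nobody requested through IGA, which is exactly the kind of entry to investigate or clean up first.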

Reset all credentials

The most important of all use cases during a breach is “Reset all passwords”, in a wider sense including everything equivalent, like static keys (as in MDM, API, SSH and PKI) and additional factors in MFA (such as TOTP). You should also mind factors you can’t reset, such as biometrics and device fingerprints. Refreshing credentials is vital because compromised credentials are very hard to detect and make perfect persistence vectors.

This complex use case sits somewhere between IGA and Access Management, depending on the products in use and the choices made. Resetting all passwords sounds straightforward, which it may be, but it has its challenges. None of them is beyond preparation; a simulation (dry run) will help iron out the biggest obstacles. Considering its impact, it is a last-resort measure, not to be attempted unprepared. Yet if an attacker has had a few weeks and is seen to have used Mimikatz, which is standard for any APT or anything like an orchestrated attack, it is probably unavoidable.

First and foremost, this use case is about knowing where and how. Clearly you should start with provisioning to core and central systems providing Single Sign On, such as Azure Active Directory and the traditional AD. Unless your network is immense and plagued by latency and queuing, executing the great reset is doable. Of course, you’ll have confused users and find orphaned and dormant accounts and the helpdesk will be flooded, but this is all foreseeable.

Reset decentral accounts

But what about non-integrated platforms? Quite often these are also privileged accounts, such as local accounts in the Linux layer under the Java servers, or the local admin accounts on Windows clients. As they are not integrated, they’ll typically be out of scope from the IT perspective. As attackers love these, they can’t be out of scope for Identity management from the security perspective. So, this gives the third Resilience meets Identity Use Case, and a really hard one at that: manage stand-alone accounts.

As a minimum there should be transparency, meaning that the process owner can tell where such accounts are. In times of crisis, knowing what to reset manually (and how), or at least what to monitor, is a must-have.
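What such transparency could look like in practice is sketched below: an inventory that records, per stand-alone account, who owns it and whether a reset procedure is documented. The record layout, system names and procedure references are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StandaloneAccount:
    """Inventory entry for a non-integrated account (hypothetical layout)."""
    system: str
    account: str
    owner: str
    reset_procedure: Optional[str]  # None: nobody has documented how to reset it


def crisis_worklist(inventory):
    """Split the inventory into 'reset manually' and 'monitor at minimum'."""
    manual_reset = [a for a in inventory if a.reset_procedure]
    monitor_only = [a for a in inventory if not a.reset_procedure]
    return manual_reset, monitor_only


# Hypothetical inventory entries.
inventory = [
    StandaloneAccount("java-app-01", "root", "linux-team", "runbook LNX-7"),
    StandaloneAccount("kiosk-17", "Administrator", "workplace-team", None),
]

manual_reset, monitor_only = crisis_worklist(inventory)
```

Every entry that ends up in `monitor_only` is a preparation gap: an account you know about but cannot yet reset under pressure.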

Reset periphery accounts

The hardest part of this Use Case is managing all periphery accounts, including those used by third-party staff. These include accounts in the supply chain (like customer support accounts at IT vendors and voicemail, which have proven to be a successful attack vector) and SaaS accounts (GitHub, ChatGPT, etc.), including social media and those considered Shadow IT. As a final category we should mention the private accounts of administrators and developers, as these are rarely truly separated. Their workstations are commonly a treasure trove of cached credentials and a common starting point for breaches.

Area denial & breach tracking by Identity

Area denial is the containment part of breach handling: moving internal boundaries to seal off data and systems, hopefully blocking compromise. Its effectiveness depends on your understanding of what the attacker is doing and intends to do. If you are in the dark, area denial can still be useful to limit exposure, for instance by offloading sensitive files or pausing access to databases. This can reduce potential additional losses, or just frustrate the attacker. The latter may not be extremely useful, but it may just make the attacker quit, or at least it will make the defensive team feel better.

Prepared Emergency raising of account barriers

Less impactful but possibly effective during a major incident is shoring up defenses by increasing password complexity rules, lowering the lockout threshold for incorrect password entries and/or shortening the password validity period. It will help contain a breach from spreading, at least reducing recovery costs and possibly meaning less downtime.

Enforcing this is probably only feasible in corporate systems and possibly in some supply chains, however. And as these changes can have a big impact, they can only be done reliably if prepared.
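Preparing such a change can be as simple as agreeing on an emergency profile in advance, so that nobody has to invent values during the incident. The sketch below assumes a hypothetical `PasswordPolicy` structure; the numbers are illustrative placeholders, not recommendations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PasswordPolicy:
    min_length: int
    lockout_threshold: int  # failed attempts before the account locks
    max_age_days: int       # password validity period


# Illustrative day-to-day values (assumption, not a recommendation).
NORMAL = PasswordPolicy(min_length=12, lockout_threshold=10, max_age_days=365)


def emergency_profile(policy):
    """Return a stricter copy of the policy; never loosens an existing setting."""
    return replace(
        policy,
        min_length=max(policy.min_length, 16),
        lockout_threshold=min(policy.lockout_threshold, 3),
        max_age_days=min(policy.max_age_days, 30),
    )


EMERGENCY = emergency_profile(NORMAL)
```

Using `max` and `min` means the function only ever tightens: applying it to a policy that is already stricter than the emergency values leaves those values untouched.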

There are more things you can do, things you could probably have done if the change had been approved. To name a few:

Emergency integration for provisioning

Limit rights to unstructured data

This use case basically aims to reduce authorizations and should be part of prevention – yet will help contain breaches in progress. Even though “Least Privilege” has been a mantra in information security since at least the late 1970s, reality is in many networks that shared network drives are still in widespread use and deemed so very indispensable for daily operations that they are moved to the cloud in a Lift and Shift approach.

Way too often the rights on those shares are set to basically everyone in the organization. Such file shares tend to accumulate a wide variety of documents over time, including sensitive stuff. The problem isn’t just file shares, but any data that is not centrally stored and managed, as anything you open via a browser is stored in your download folder and anything you print is cached on the printer’s internal hard drive. It is all the MS Office data, e-mail messages and PDF files, but also just about anything your phone backs up to the cloud. As a regular user you probably have access to more data than you would expect, something that is gaining visibility with AI-driven chatbots such as Copilot, which surface whatever they consider relevant from everything you have access to. So, expect this topic to surface soon.

In real-world security, unstructured data tends to be a major blind spot: there is no owner, no quantifiable risk, and no easy solution in the shape of an omnipotent tool. Identity Management can help, but there are no easy fixes or quick wins, so it often gets a low priority. “Big data experts estimate that unstructured data accounts for 90% of all new enterprise data. This trend reveals that unstructured data is growing 55-65% every year—a rate three times faster than the growth of structured data”.

Attackers, especially those of the ransomware tribe, love unstructured data, as it is a lot easier to encrypt than a database, which will be ‘in use’ and refuse encryption. At any time, the absolute minimum is removing any rights assigned to “Everyone” and reducing them at least to “Authenticated Users”. The Authenticated Users group includes all users whose identities were authenticated when they logged on. The Everyone group adds the built-in Guest account, and built-in security accounts like SERVICE, LOCAL_SERVICE, NETWORK_SERVICE, and others. This way access is limited to identities that have actually authenticated, whereas attackers generally prefer built-in accounts as they tend to stay under the radar.
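Finding those broad grants is mostly a matter of going through an export of share permissions. The sketch below works on a plain dictionary; the share paths, the export format and the "remove"/"review" labels are assumptions for illustration, and a real export from your file servers will look different.

```python
# Principals considered too broad, with the advice from the paragraph above.
BROAD_PRINCIPALS = {
    "everyone": "remove",
    "authenticated users": "review",
}


def find_broad_grants(share_acls):
    """Return (share, principal, advice) for every overly broad ACL entry."""
    findings = []
    for share, principals in share_acls.items():
        for principal in principals:
            advice = BROAD_PRINCIPALS.get(principal.lower())
            if advice is not None:
                findings.append((share, principal, advice))
    return findings


# Hypothetical export: share path -> principals holding rights on it.
example_acls = {
    r"\\files\projects": ["Everyone", "Project Team"],
    r"\\files\hr": ["HR Staff"],
    r"\\files\public": ["Authenticated Users"],
}

findings = find_broad_grants(example_acls)
```

Running this kind of check periodically, and not only during an incident, turns the "absolute minimum" into a routine report.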

And while you’re at it – file shares are used to run applications as well, as a convenient way to run small applications, “portable” versions of regular applications and/or shadow IT. This trait is very commonly abused by malware spreading over your networks and can be easily defeated by removing the executable bit in the file system. Be aware that this stops all executables and is an inherited right, so try before you apply it on any scale.

Much more can – and must – be said on this topic but is beyond the immediate scope of this document.

Correlate accounts (reconcile)

A staple of a mature Identity Management stack is the reconciliation capability. It calculates the delta between the authorizations accounts actually have (IST) and the authorizations they should have (SOLL). This enables you to (temporarily) disable orphaned accounts, removing a common pivoting point in breaches and a common method of persistence.
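At its core, reconciliation is a set difference. The following sketch assumes both sides are available as a mapping from account name to a set of authorizations; the account and right names are invented for the example.

```python
def reconcile(ist, soll):
    """Compare actual (IST) with intended (SOLL) authorizations.

    Returns the accounts that exist but should not (orphaned), and per
    known account the rights it holds beyond what was intended (excess).
    """
    orphaned = sorted(acct for acct in ist if acct not in soll)
    excess = {}
    for acct, rights in ist.items():
        if acct in soll:
            extra = rights - soll[acct]
            if extra:
                excess[acct] = sorted(extra)
    return orphaned, excess


# Hypothetical snapshots of the target system (IST) and of IGA (SOLL).
ist = {
    "alice": {"mail", "crm"},
    "bob": {"mail", "domain_admin"},
    "old_contractor": {"vpn"},
}
soll = {
    "alice": {"mail", "crm"},
    "bob": {"mail"},
}

orphaned, excess = reconcile(ist, soll)
```

In this example `old_contractor` is a candidate for (temporary) disabling, and the unexpected `domain_admin` right on `bob` deserves immediate attention.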

Access Management meets Resilience

AM systems provide real-time access for the accounts managed by IGA. AM is where the passwords are, and thus these systems are the most likely target or stepping stone in any attack. So, the starting point is to ensure you have High Availability and Disaster Recovery in place for AM. As made abundantly clear in the high-profile case of Maersk, ensure you have recoverable backups for every directory, too. Mind that a compromised system may still be working, but no longer for you; you need data recovery of clean data, not merely a system restore.

More factors in MFA

The first use case to build into your AM capabilities for when you are under siege is enforcing stricter access policies: add factors to MFA and, when you have the technology, tighten additional policies if conditional access is in place (platforms, patch levels, etc.). Such capabilities are often the product features not selected during the implementation project, but they could be a nice back-up during an attack. A flexible configuration of authentication could be a useful instrument in your cyber arsenal.
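As a rough sketch of what "flexible configuration" means, the decision below depends on an alert level that can be raised during an incident. The alert levels, the factor counts and the patch-level condition are all assumptions; real AM products express this in their own policy language.

```python
# Hypothetical policy: number of distinct factors required per alert level.
REQUIRED_FACTORS = {"normal": 1, "elevated": 2, "under_attack": 3}


def login_allowed(presented_factors, alert_level, device_patched=True):
    """Decide on a login attempt under the current alert level.

    Outside normal operation, unpatched devices are refused outright,
    mimicking a tightened conditional access rule.
    """
    if alert_level != "normal" and not device_patched:
        return False
    distinct = set(presented_factors)  # presenting one factor twice counts once
    return len(distinct) >= REQUIRED_FACTORS[alert_level]
```

For example, `login_allowed(["password"], "normal")` passes, while the same attempt under `"elevated"` is refused until a second factor such as `"totp"` is added.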

Reset all user sessions & force login

The prime Identity-based resilience use case is where you reset all credentials (passwords, secrets like SSH and PKI keys, etc.) to remove the attacker’s possible re-entry. There is one caveat: open sessions, a.k.a. the validity of trust. In short: how, and how often, do you revalidate a user session? And how do you close an existing session if there is something suspicious?

This is the realm of access management, which checks by means of authentication whether the entity requesting something is who or what it claims to be. The most common means is a username/password challenge, but an increasing number of alternatives are available, like ‘passwordless’, which relies on stored secret keys and a PIN or biometrics. The client receives an access token after successful authentication. Having that token gives access for a defined or an undefined period, depending on deployment and protocol. As long as the token is valid, the client is trusted. If the network session expires, the token is enough for you to get a new session. The same goes for anyone else with a copy of the token: it replaces the real authentication for the duration of the session, which can be forever. This is why protection of the session token is vital. Security best practice prescribes not storing it on disk but keeping it in the memory of the client. Especially when memory addresses are randomized, stealing keys from memory is hard. It also means that when a session survives rebooting (meaning you won’t have to log in again), something must be stored on disk, making it less secure.

Forcing revalidation by limiting how long a client stays validated would increase security and is feasible, but it is not a common feature. That is probably because of its user unfriendliness: the user would have to re-authenticate often, and that breaks the Single Sign On experience. So, we sacrifice some security, which is why stolen tokens are such a big thing.

Now for the great reset use case. You’ll seek to limit the validity of open sessions, to ensure that only legitimate users and systems have continued access. Unfortunately, most session token mechanisms don’t allow for server-side session termination; only the client can log off. For instance, HTTP, the most common protocol, can only revoke a user session when using persistent cookies, by changing the expiry time. Persistent cookies are files stored on disk, a method considered insecure as they give access to the user session when stolen. This explains why changing the password or key of a REST or SOAP API doesn’t do anything: APIs don’t log out.
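To make the missing capability concrete, here is a sketch of a validator that does support both a maximum token age and a "reset all sessions" cutoff. It is a toy model under the assumption that every token carries its issue time and that every service consults this validator; `Token` and `TokenValidator` are hypothetical names, not part of any protocol or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    subject: str
    issued_at: float  # seconds since epoch


class TokenValidator:
    """Server-side check with a maximum age and a global cutoff."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self.not_before = 0.0  # tokens issued before this moment are refused

    def reset_all_sessions(self, now):
        """The 'great reset': every token issued up to now becomes invalid."""
        self.not_before = now

    def is_valid(self, token, now):
        if token.issued_at < self.not_before:
            return False
        return now - token.issued_at <= self.max_age_seconds
```

After `reset_all_sessions`, a stolen token is worthless even though it has not expired, and legitimate users simply authenticate again. The design only works where validation happens centrally or where the cutoff is distributed to every validating service.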

Similar, but different, caveats exist in Kerberos, the other common protocol, as used by on-premises Active Directory. Kerberos tickets received after successful authentication stay valid for the time they are set to be valid; there is no revocation mechanism. Kerberized services validate the received tickets “off-line”, without contacting a domain controller or any other central authority: as long as the ticket decrypts using the service’s key (keytab), it’s deemed good. There is no way for tickets to be centrally revoked once they are issued.

This means that if the user only has a ‘krbtgt’ ticket, it can be “revoked” at the domain controller (KDC) by disabling the user account. This way the KDC refuses to issue further tickets; however, if the user already has tickets for other services, those tickets stay valid and it’s up to each service to validate the account, if it chooses to. This means that disabling a hacked account, changing its password or reducing its authorizations doesn’t lock out the attacker.

For services that do not use any additional authorization server (e.g. APIs or SSH hosts that only require a ticket), you’re out of luck: the ticket will stay valid however long it is valid. Which in many cases is indefinitely. So good luck getting the attacker out.
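The effect of off-line validation can be shown with a deliberately simplified model. This is not Kerberos: there is no cryptography, and `ToyKdc`, `ServiceTicket` and the ten-hour lifetime are assumptions chosen purely to illustrate the behaviour described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTicket:
    principal: str
    end_time: float  # seconds since epoch


class ToyKdc:
    """Issues tickets unless the account is disabled (toy model)."""

    def __init__(self):
        self.disabled = set()

    def issue(self, principal, now, lifetime):
        if principal in self.disabled:
            raise PermissionError(f"{principal} is disabled")
        return ServiceTicket(principal, end_time=now + lifetime)


def service_accepts(ticket, now):
    """Off-line check by the service: expiry only, no question to the KDC."""
    return now <= ticket.end_time


kdc = ToyKdc()
ticket = kdc.issue("mallory", now=0.0, lifetime=36000.0)  # assumed 10 hours
kdc.disabled.add("mallory")  # the incident responder disables the account
still_accepted = service_accepts(ticket, now=600.0)
```

`still_accepted` is `True`: disabling the account stops new tickets, but the service never learns about it and keeps honouring the ticket it was already shown.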

Even if the tickets do become invalid by expiry, this does not cause sessions to drop immediately. Kerberos-authenticated RDP or SSH sessions can remain active indefinitely. RDP and SSH are commonly privileged user sessions, the attacker’s preferred route to do nasty things with your systems and data. Kerberos-authenticated web SSO (HTTP) will also remain active as long as the SSO cookies are valid. Hackers will have learned never to sign out of anything, an insight vital to incident responders. If you ignore this, your chances of cleaning up any mess will be minimal.

This is generally true for any authentication mechanism, not just Kerberos. For example, if you SSH to a server using a pubkey, your session stays open even if that key is later removed from authorized_keys. In the case of switches and routers, if they’re using an additional AAA server such as RADIUS or TACACS for authorization on top of Kerberos authentication, and if they’re set up with per-command authorization via AAA (as opposed to just checking at login time), then they’ll likely close sessions as soon as the next command is issued and the AAA server says: “Account seems locked according to LDAP”. Otherwise, the session stays open, and the attackers can proceed indefinitely.
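The difference between checking at login time and checking per command is easy to model. The class below is a toy stand-in for a device session; `DeviceSession`, the `is_locked` lookup and the account name are hypothetical and do not correspond to any RADIUS or TACACS API.

```python
class DeviceSession:
    """A logged-in session on a switch or router (toy model)."""

    def __init__(self, user, is_locked, per_command_authz):
        self.user = user
        self.is_locked = is_locked              # callable: user -> bool
        self.per_command_authz = per_command_authz
        self.open = True

    def run(self, command):
        if not self.open:
            raise ConnectionError("session closed")
        # Only with per-command authorization is the account re-checked.
        if self.per_command_authz and self.is_locked(self.user):
            self.open = False
            raise PermissionError(f"{self.user} is locked")
        return f"executed: {command}"


locked_accounts = set()  # stands in for the directory behind the AAA server


def is_locked(user):
    return user in locked_accounts
```

With `per_command_authz=False`, locking the account in `locked_accounts` changes nothing for a session that is already open; with `per_command_authz=True`, the very next command fails and the session closes.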

Generally, the great reset functionality isn’t available in a default configuration, as Identity systems aren’t required to deliver these capabilities. That doesn’t mean nothing can be done at all; knowing what is and what isn’t possible is vitally important, particularly during a breach. Zero Trust teaches the importance of “never trust, always verify”, yet we see here that on the topic of how long validated trust lasts, a lot of vital and commonly overlooked work remains to be done.

Privileged Access Management use cases

PAM secures the use of accounts with higher privileges, including the right to change the rights users have, their own included. These are administrator and root accounts, and their equivalents. For an attacker they are vital to move laterally through a network. Privileged accounts are thus the most common targets, and deploying a PAM solution centralizes what you defend, which implies more visibility and easier response. It potentially gives you one place to increase logging levels and reset all user sessions (instead of on every host separately!), it provides the functionality to record admin sessions, and it offers a single route to create emergency privileged accounts for extra support staff. These are all benefits of the concept of PAM, yet hardly what you do in case of a suspected breach.

Reset and clean up local admins

The first thing you should probably do with PAM is a global reset of local admin passwords and a removal of additional local accounts. Doing this is probably a mix of IGA, AM and PAM tools, so it warrants preparation and even trial runs.

Other actions with PAM

Depending on tool and deployment, you have other capabilities that could be really useful when fighting off a breach:

  • Reset service accounts (NPAs) with privileged access
  • Set approval workflows for high-privileged actions: enable (stricter) approval flows and business justification controls, integrated with ITSM (for PASM, PEDM and vendor access)
  • Minimize the number of assets on which an account can be used
  • Introduce new privileged accounts for specific role-based activities / assets (segmentation)
  • Use PAM-specific threat analytics available on integrated hosts
    • Detection of unusual logins, activities in sessions, access to assets without use of PAM, Golden Tickets / pass-the-hash (PtH) attacks
    • List all recent elevations of privileges (UAC bypasses, token impersonation and theft, among others)
  • More stringent RBAC- and PoLP-based privileged accounts
  • Replace standing access accounts with ephemeral accounts

PKI use cases

Encryption is the most common technology for protection and trust, and as such the core of cyber security. Having said that, for all its pervasiveness it is generally overlooked, rarely managed actively and rarely considered in risk analysis. Public PKI is the anchor of security for the public namespace, which means that in the age of cloud, all your security hinges on it. If it fails, you may lose all your data, applications, and control over your devices – IT, OT and IoT.

Really?

It is the worst case, but it can happen. Let’s dive into this. To manage your cloud tenant, your admin logs in with an e-mail address. That user account depends on e-mail for user services such as device enrollment, MFA set-up and forgotten-password flows. The identity of the e-mail server, as set in the DNS record, is ‘secured’ by a certificate provided by a trusted Certificate Authority. So, if that CA is compromised, control of all mail, including password recovery messages, can pass to the attacker. This could give the attacker control over the account managing your cloud.

This means that if you put ‘everything’ in the cloud and your public certificate provider gets compromised, you could lose it all; the root account of your tenant is a SPOF (single point of failure) in the supply chain. Of course, there are some more steps to the attack, but it could be that simple. This is why the EU legislator has a lot of attention for both the DNS-providers and the certificate authorities. But this won’t prevent compromises; at best it makes a fragile situation somewhat less breakable. And everyone should do their part, not just the government and the DNS and PKI companies.

PKI and Cyber Resilience thus means being prepared for the compromise of a public CA. Considering that NIS2 explicitly mentions these topics, this would be the best place to start. Other use cases to consider covering are:

  • Root compromise external trust anchors
    • Move to another provider
  • Root compromise internal trust anchors
    • CA Trust reset for DNS
    • CA Trust reset for code signing
    • CA Trust reset for Deep Packet Inspection
  • Revoke (all) user keys
    • Emergency reissue of PKI private keys
      • MDM / AD / AAD
      • TLS/DNS (DANE)
      • 802.1X
      • S/MIME
      • DLP
      • Code signing
      • SSH keys

Conclusion and next steps

This post hopefully helps you to understand this vast and mostly uncharted territory. In your drive for more cyber resilience, it is a good idea to enlist the help of the specialists responsible for your identity systems. Do a workshop, at least. And if you need any help at all, we are ready for you. Call us or mail us.
