InfraRunBook

    AWS IAM Roles Policies and Best Practices

    Cloud
    Published: Apr 8, 2026
    Updated: Apr 8, 2026

    A senior engineer's guide to AWS IAM roles and policies — covering trust policies, permission boundaries, SCPs, cross-account access, and real-world least-privilege patterns.

    What IAM Roles and Policies Actually Are

    AWS Identity and Access Management (IAM) is the backbone of access control in AWS. Everything that touches your AWS environment — a developer running CLI commands, an EC2 instance reading from S3, a Lambda function writing to DynamoDB — goes through IAM. And at the heart of IAM are two concepts: roles and policies.

    An IAM policy is a JSON document that defines permissions. It answers one question: what actions are allowed or denied on which resources? An IAM role is an identity that can be assumed by a trusted entity — it's essentially a set of permissions bundled together with a trust relationship that says who's allowed to use it.

    The distinction between roles and users trips people up early on. An IAM user has long-term credentials — an access key ID and secret. A role has no credentials of its own. Instead, when an entity assumes a role, AWS Security Token Service (STS) issues temporary credentials with a configurable expiry. This is fundamentally more secure, and it's the direction AWS has been pushing for years. If you're still creating IAM users with static access keys for application workloads, you're carrying unnecessary risk.

    How Policies Work: The JSON Structure

    Every IAM policy is built from the same building blocks. Understanding these cold will save you hours of debugging permission errors.

    A policy document contains one or more statements. Each statement has an Effect (either Allow or Deny), an Action listing the API operations being permitted or denied, a Resource specifying the ARN the action applies to, an optional Condition block that must evaluate true for the statement to apply, and a Principal field used in resource-based policies and trust policies to specify who the policy targets.

    Here's a straightforward example — a policy that allows read-only access to a specific S3 bucket used by a deployment pipeline at solvethenetwork.com:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowS3ReadDeployArtifacts",
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::solvethenetwork-deploy-artifacts",
            "arn:aws:s3:::solvethenetwork-deploy-artifacts/*"
          ]
        }
      ]
    }

    Notice the two resource ARNs. s3:ListBucket applies to the bucket itself, while s3:GetObject applies to objects within it. This is a classic gotcha: if you only specify the /* ARN, ListBucket calls will fail with an access denied even though the intent seems obvious. I've watched engineers burn a solid afternoon on exactly that.
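
    To see why both ARNs are needed, here is a minimal Python sketch that approximates IAM's wildcard matching with the standard library's fnmatchcase. It is an illustration only: real IAM matching has its own rules, resource_matches is a hypothetical helper, and the object key is a made-up example rather than anything from the pipeline.

```python
from fnmatch import fnmatchcase

# Resource patterns from the policy above.
BUCKET_ARN = "arn:aws:s3:::solvethenetwork-deploy-artifacts"
OBJECTS_ARN = "arn:aws:s3:::solvethenetwork-deploy-artifacts/*"


def resource_matches(patterns, requested_arn):
    """Return True if any policy resource pattern covers the requested ARN.

    fnmatchcase only approximates IAM wildcards (it also treats [] as a
    character class, which IAM does not).
    """
    return any(fnmatchcase(requested_arn, p) for p in patterns)


# ListBucket is checked against the bucket ARN, GetObject against an object ARN.
list_target = BUCKET_ARN
get_target = BUCKET_ARN + "/releases/app.tar.gz"  # hypothetical object key

both = [BUCKET_ARN, OBJECTS_ARN]
objects_only = [OBJECTS_ARN]

print(resource_matches(both, list_target))          # True
print(resource_matches(both, get_target))           # True
print(resource_matches(objects_only, list_target))  # False: the gotcha
```

    The last line is the failure mode described above: the /* pattern requires a slash and a key, so it never covers the bare bucket ARN that ListBucket targets.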

    Trust Policies: The Other Half of the Equation

    Every IAM role has two policy attachments that matter: the permission policy (what the role can do) and the trust policy (who can assume the role). Most people learn permissions first and then get confused when their role assumption fails. Always check the trust policy.

    The trust policy is a resource-based policy attached directly to the role. It uses the sts:AssumeRole action, or sts:AssumeRoleWithWebIdentity for OIDC federation. Here's a trust policy that allows an EC2 instance to assume a role:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "ec2.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }

    And here's one for cross-account access — allowing a role in a separate CI/CD AWS account to assume a deployment role in the production account:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::123456789012:role/infrarunbook-admin-cicd-role"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "solvethenetwork-prod-deploy-2024"
            }
          }
        }
      ]
    }

    The ExternalId condition is there to prevent the confused deputy problem, a scenario where an attacker tricks a trusted third-party service into performing actions on their behalf using your role. I've seen teams skip this on internal tooling because it seemed unnecessary. It's a small addition that eliminates a real attack vector.
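
    The effect of the condition can be sketched in a few lines of Python. This is a toy check, not how STS is implemented: can_assume is a hypothetical helper, and it only understands the single-principal, StringEquals shape used in the cross-account policy above.

```python
CICD_ROLE = "arn:aws:iam::123456789012:role/infrarunbook-admin-cicd-role"

TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": CICD_ROLE},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"sts:ExternalId": "solvethenetwork-prod-deploy-2024"}
            },
        }
    ],
}


def can_assume(trust_policy, caller_arn, external_id=None):
    """Toy trust-policy check: the principal must match and, if the statement
    requires an ExternalId, the caller must supply the same value."""
    for stmt in trust_policy["Statement"]:
        if stmt["Effect"] != "Allow" or stmt["Action"] != "sts:AssumeRole":
            continue
        if stmt["Principal"].get("AWS") != caller_arn:
            continue
        required = (
            stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId")
        )
        if required is None or required == external_id:
            return True
    return False


print(can_assume(TRUST_POLICY, CICD_ROLE, "solvethenetwork-prod-deploy-2024"))  # True
print(can_assume(TRUST_POLICY, CICD_ROLE))                   # False: no ExternalId
print(can_assume(TRUST_POLICY, CICD_ROLE, "guessed-value"))  # False: wrong ExternalId
```

    The trusted principal alone is not enough: a caller that cannot supply the agreed ExternalId is turned away even though its ARN matches.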

    Policy Types and When to Use Each

    AWS gives you three categories of policies, and choosing the right one matters for auditability and long-term maintainability.

    AWS Managed Policies are maintained by AWS and cover common use cases like AmazonS3ReadOnlyAccess or AdministratorAccess. They're convenient, but they're almost always too broad. AmazonS3ReadOnlyAccess grants read access to every S3 bucket in your account. That might be acceptable for a developer sandbox; it's not acceptable for an application role in production that only needs to touch one bucket.

    Customer Managed Policies are the ones you write and maintain. They're reusable — you can attach them to multiple roles — and they're naturally suited for infrastructure-as-code tools like Terraform or CloudFormation. This is the right choice for anything non-trivial in a real environment. Write them once, version them, test them, and attach them explicitly.

    Inline Policies are embedded directly into a single role, user, or group and don't exist as independent objects. I rarely recommend them. They make auditing painful because there's no central view, and they get silently deleted when the parent identity is deleted. The one case where they make sense is when you need an enforced one-to-one relationship between a policy and a role that should genuinely never be reused elsewhere.

    How IAM Evaluates Requests: The Decision Logic

    Understanding the evaluation order is the thing that will unblock you when a permission looks correct but still gets denied.

    AWS evaluates every request with the same decision logic. First, if any applicable policy contains an explicit Deny (identity-based policy, SCP, permission boundary, session policy, or resource-based policy), the request is denied. Full stop. Second, absent a deny, the request needs an explicit Allow that covers the action, and every guardrail in play (SCP, permission boundary, session policy) must also allow it. Third, if nothing allows the request, the default is an implicit deny. Nothing gets through without an explicit allow.

    The phrase "explicit deny wins" is the single most important thing to internalize about IAM. A deny statement is absolute: no allow in any other policy can override it. Service Control Policies in AWS Organizations rely on the same rule. They define the ceiling of what's possible in member accounts, and no IAM permission granted inside those accounts can punch through that ceiling.
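
    The three outcomes can be sketched as a small Python function. This is a deliberately simplified model, assuming a flat list of statements: it ignores conditions, principals, and the layering of SCPs, boundaries, and session policies, and it uses case-sensitive fnmatchcase as a stand-in for IAM's own matching. The bucket name and statements are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def applies(stmt, action, resource):
    """True if the statement's Action and Resource patterns cover the request."""
    return any(fnmatchcase(action, a) for a in _as_list(stmt["Action"])) and any(
        fnmatchcase(resource, r) for r in _as_list(stmt["Resource"])
    )


def evaluate(statements, action, resource):
    """Explicit deny beats allow; no matching allow means implicit deny."""
    matching = [s for s in statements if applies(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in matching):
        return "explicit deny"
    if any(s["Effect"] == "Allow" for s in matching):
        return "allow"
    return "implicit deny"


STATEMENTS = [
    {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket/*"},
    {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "arn:aws:s3:::example-bucket/*"},
]
obj = "arn:aws:s3:::example-bucket/report.csv"

print(evaluate(STATEMENTS, "s3:GetObject", obj))      # allow
print(evaluate(STATEMENTS, "s3:DeleteObject", obj))   # explicit deny
print(evaluate(STATEMENTS, "dynamodb:PutItem", obj))  # implicit deny
```

    Note that the broad s3:* allow does not rescue DeleteObject: the deny is checked first and ends the evaluation.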

    In my experience, most debugging sessions come down to one of three things: the action isn't present in the identity-based policy, the resource ARN doesn't match what the request is actually targeting, or a condition isn't being satisfied. The AWS IAM Policy Simulator is criminally underused — it lets you test exactly what a role or user can do without triggering the actual API call and then hunting through CloudTrail for the deny event.

    Real-World Example: EC2 Instance Profile for an Application Server

    Let's say you're running an application on sw-infrarunbook-01, an EC2 instance that needs to read configuration from AWS Secrets Manager, write logs to CloudWatch, and consume messages from an SQS queue. Here's how you'd configure this with proper least-privilege scoping.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowSecretsManagerRead",
          "Effect": "Allow",
          "Action": [
            "secretsmanager:GetSecretValue",
            "secretsmanager:DescribeSecret"
          ],
          "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:solvethenetwork/app/prod/*"
        },
        {
          "Sid": "AllowCloudWatchLogs",
          "Effect": "Allow",
          "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:/solvethenetwork/app/*"
        },
        {
          "Sid": "AllowSQSConsume",
          "Effect": "Allow",
          "Action": [
            "sqs:ReceiveMessage",
            "sqs:DeleteMessage",
            "sqs:GetQueueAttributes"
          ],
          "Resource": "arn:aws:sqs:us-east-1:123456789012:solvethenetwork-app-queue"
        }
      ]
    }

    You attach this policy to a role with the EC2 trust policy shown earlier, then attach that role to an instance profile and associate it with sw-infrarunbook-01. The EC2 instance gets temporary credentials automatically via the instance metadata service (IMDS), rotated in the background by AWS. The application picks them up via the SDK without any hardcoded keys anywhere.

    Every resource ARN here is scoped precisely. The app can only access secrets under its own path prefix, logs under its own log group, and exactly one SQS queue. If that instance is ever compromised, the blast radius is tightly contained.

    Real-World Example: Lambda Execution Role

    Lambda functions are another place where I see overly permissive roles consistently. The AWS console creates a default execution role if you don't specify one, and it often includes broader permissions than the function actually needs. Always define the execution role explicitly.

    Here's a minimal execution role for a Lambda function that processes S3 events and writes results to DynamoDB:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowBasicLambdaLogging",
          "Effect": "Allow",
          "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "arn:aws:logs:*:123456789012:log-group:/aws/lambda/solvethenetwork-processor:*"
        },
        {
          "Sid": "AllowS3ReadInput",
          "Effect": "Allow",
          "Action": "s3:GetObject",
          "Resource": "arn:aws:s3:::solvethenetwork-input-data/*"
        },
        {
          "Sid": "AllowDynamoDBWrite",
          "Effect": "Allow",
          "Action": [
            "dynamodb:PutItem",
            "dynamodb:UpdateItem"
          ],
          "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/solvethenetwork-results"
        }
      ]
    }

    Three distinct permission blocks: logging, reading input, writing output. Nothing else. If that Lambda function is ever exploited through a dependency vulnerability or injection attack, the attacker's options are strictly limited to those three operations on those specific resources.

    Conditions: The Underused Power Feature

    IAM conditions let you add context-aware restrictions to your policies. They're one of the most powerful features in IAM and are consistently underused in environments I've audited.

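    You can require MFA for sensitive IAM operations, which is useful for protecting your role and policy management surface. In an Allow statement, use Bool rather than BoolIfExists: BoolIfExists evaluates to true when aws:MultiFactorAuthPresent is absent from the request, which is the case for calls signed with long-term access keys, so it would let those calls through without MFA.

    {
      "Effect": "Allow",
      "Action": [
        "iam:DeleteRole",
        "iam:DetachRolePolicy",
        "iam:DeleteRolePolicy"
      ],
      "Resource": "*",
      "Condition": {
        "Bool": {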
          "aws:MultiFactorAuthPresent": "true"
        }
      }
    }

    You can restrict actions to specific source IP ranges, which is useful for limiting console access to your corporate network or VPN egress IPs. aws:SourceIp is compared against the caller's public IP address, so the ranges below are public placeholders (RFC 5737 documentation blocks); RFC 1918 private ranges such as 10.0.0.0/8 would never match:

    {
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": [
            "203.0.113.0/24",
            "198.51.100.0/24"
          ]
        },
        "Bool": {
          "aws:ViaAWSService": "false"
        }
      }
    }

    Note the aws:ViaAWSService condition. Without it, you'll block legitimate service-to-service calls that originate from AWS services on your behalf, which produces confusing failures. That's a real operational trap. Requests that arrive through a VPC endpoint don't carry aws:SourceIp at all; use aws:VpcSourceIp for those. You can also use aws:RequestedRegion to prevent resources from being created outside your approved regions, a solid control for data residency requirements that complements SCPs when you want per-role granularity.
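
    How the two conditions in the source-IP deny combine can be sketched with Python's ipaddress module. This is a toy model: deny_applies is a hypothetical helper, and the ranges are RFC 5737 documentation placeholders standing in for your public egress addresses. Condition operators within one statement are ANDed, so the deny fires only when the caller is outside every listed range and the call was not made by an AWS service on your behalf.

```python
from ipaddress import ip_address, ip_network

# Placeholder public egress ranges (RFC 5737 documentation blocks).
ALLOWED_RANGES = [ip_network("203.0.113.0/24"), ip_network("198.51.100.0/24")]


def deny_applies(source_ip, via_aws_service):
    """Model the Deny statement's Condition block.

    NotIpAddress is true when the IP is outside ALL listed ranges;
    the Bool check is true when aws:ViaAWSService is false.
    Both must hold for the deny to take effect.
    """
    outside_allowed = not any(ip_address(source_ip) in net for net in ALLOWED_RANGES)
    return outside_allowed and not via_aws_service


print(deny_applies("203.0.113.10", via_aws_service=False))  # False: inside an allowed range
print(deny_applies("192.0.2.5", via_aws_service=False))     # True: outside, direct call
print(deny_applies("192.0.2.5", via_aws_service=True))      # False: service call is exempt
```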

    Best Practices That Actually Matter

    Use roles everywhere, users nowhere — for workloads. EC2 gets instance profiles. Lambda gets execution roles. ECS tasks get task roles. Humans doing CLI work should assume roles from a central identity provider via SAML or OIDC federation through AWS IAM Identity Center, not use personal IAM users with static access keys. Static keys get leaked in git commits, CI pipeline logs, Docker image layers, and Slack messages. Temporary STS credentials expire and dramatically limit the damage window when something goes wrong.

    Scope resource ARNs explicitly. Using "Resource": "*" in a permission policy is a red flag. Some IAM actions, like ec2:DescribeInstances, genuinely don't support resource-level permissions and require a wildcard. But for anything that does support it (S3, Secrets Manager, DynamoDB, SQS, KMS), scope to the specific ARN or a deliberate path prefix. The effort is minimal and the security improvement is significant.
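
    A quick way to find wildcard resources during review is a small lint pass over the policy document. The sketch below is one possible approach, not an official tool: wildcard_statements and the sample policy are hypothetical. A flagged statement is a prompt for review rather than proof of a problem, since some actions only work with a wildcard.

```python
def wildcard_statements(policy):
    """Return the Sid (or position) of each Allow statement whose Resource is '*'."""
    flagged = []
    for index, stmt in enumerate(policy.get("Statement", [])):
        resources = stmt.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        if stmt.get("Effect") == "Allow" and "*" in resources:
            flagged.append(stmt.get("Sid", f"statement[{index}]"))
    return flagged


SAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DescribeInstances",
            "Effect": "Allow",
            "Action": "ec2:DescribeInstances",
            "Resource": "*",
        },
        {
            "Sid": "ReadOneQueue",
            "Effect": "Allow",
            "Action": "sqs:ReceiveMessage",
            "Resource": "arn:aws:sqs:us-east-1:123456789012:solvethenetwork-app-queue",
        },
    ],
}

print(wildcard_statements(SAMPLE_POLICY))  # ['DescribeInstances']
```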

    Use permission boundaries for delegated administration. If you're giving a platform team or a product team the ability to create and manage their own IAM roles for their CI/CD pipelines, permission boundaries prevent privilege escalation. The boundary acts as an absolute ceiling: even if a team creates a role and attaches AdministratorAccess to it, the boundary restricts what that role can actually do. This only holds if the team's own permissions require the boundary on every role they create, which is what the iam:PermissionsBoundary condition key on iam:CreateRole is for. Without boundaries in place, delegated IAM administration is a privilege escalation path waiting to be walked.
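
    The ceiling behaviour is an intersection: an action is effective only if both the identity policy and the boundary allow it. A minimal Python sketch, assuming both are reduced to lists of action patterns (the boundary contents here are hypothetical, and resources, conditions, and explicit denies are ignored):

```python
from fnmatch import fnmatchcase


def allowed_by(action_patterns, action):
    """True if any pattern in the list covers the action."""
    return any(fnmatchcase(action, pattern) for pattern in action_patterns)


def effective_allow(identity_actions, boundary_actions, action):
    """An action is effective only when identity policy AND boundary allow it."""
    return allowed_by(identity_actions, action) and allowed_by(boundary_actions, action)


identity = ["*"]                        # AdministratorAccess-style allow
boundary = ["s3:*", "sqs:*", "logs:*"]  # hypothetical boundary for a product team

print(effective_allow(identity, boundary, "s3:GetObject"))    # True
print(effective_allow(identity, boundary, "iam:CreateUser"))  # False: outside the boundary
```

    The boundary grants nothing by itself: an action listed in the boundary but missing from the identity policy is still denied.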

    Implement AWS Organizations with SCPs. Service Control Policies are your most powerful control mechanism. They operate above IAM and can't be overridden by anything in member accounts. Use them to block root account API usage, enforce approved regions, prevent disabling CloudTrail, prevent leaving the organization, and restrict access to high-risk services like IAM Identity Center configuration changes. These are the guardrails that hold even when someone misconfigures an IAM policy in a member account.

    Audit regularly. IAM Access Analyzer identifies resources shared with external entities — including unintended public exposure. The credential report shows last-used dates for all IAM users and access keys. In my experience, anything unused for 90 days should be reviewed. Anything unused for 180 days should be removed without hesitation. Dormant credentials are a liability, not a convenience.
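
    The 90 and 180 day thresholds are easy to apply to a downloaded credential report. The sketch below assumes the report's user, access_key_1_active, and access_key_1_last_used_date columns, with N/A for keys that have never been used. The sample rows are fabricated for illustration, classify_keys is a hypothetical helper, and the current date is pinned so the output is reproducible.

```python
import csv
import io
from datetime import datetime, timezone

# Fabricated sample rows; a real report has many more columns.
SAMPLE_REPORT = """user,access_key_1_active,access_key_1_last_used_date
alice,true,2026-03-20T10:00:00+00:00
bob,true,2025-12-15T08:30:00+00:00
carol,true,2025-07-01T00:00:00+00:00
dave,true,N/A
"""

NOW = datetime(2026, 4, 8, tzinfo=timezone.utc)  # pinned for a repeatable example


def classify_keys(report_csv, now, review_days=90, remove_days=180):
    """Map each user with an active key to ok / review / remove / never-used."""
    results = {}
    for row in csv.DictReader(io.StringIO(report_csv)):
        if row["access_key_1_active"] != "true":
            continue
        last_used = row["access_key_1_last_used_date"]
        if last_used == "N/A":
            results[row["user"]] = "never-used"
            continue
        idle_days = (now - datetime.fromisoformat(last_used)).days
        if idle_days >= remove_days:
            results[row["user"]] = "remove"
        elif idle_days >= review_days:
            results[row["user"]] = "review"
        else:
            results[row["user"]] = "ok"
    return results


print(classify_keys(SAMPLE_REPORT, NOW))
# {'alice': 'ok', 'bob': 'review', 'carol': 'remove', 'dave': 'never-used'}
```

    A real pass would also cover access_key_2 and password last-used dates; they are left out here to keep the sketch short.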

    Common Misconceptions

    "An explicit allow in a member account overrides an SCP deny from the parent organization." It doesn't. SCPs are evaluated before identity-based policies in the chain. If the SCP denies an action, IAM never gets a chance to allow it. I've watched teams spend hours debugging why a policy that looks correct still produces access denied errors, only to find an SCP that was put in place months earlier by a different team and forgotten about. Always check SCPs first when debugging in an Organizations setup.

    "Resource-based policies and identity-based policies are interchangeable." They're not. For same-account access, either an identity-based policy or a resource-based policy that grants the access is sufficient. For cross-account access, you generally need both — the identity-based policy in the source account allowing the action, and the resource-based policy on the target resource allowing the principal from the source account. Miss either side and the access fails. S3 bucket policies and KMS key policies are the most common places this catches people.

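    The same-account versus cross-account rule reduces to OR versus AND. A minimal sketch, assuming no explicit deny is involved and setting aside service-specific exceptions (access_granted is a hypothetical helper):

```python
def access_granted(same_account, identity_allows, resource_allows):
    """Same account: either policy type is enough. Cross-account: both are required."""
    if same_account:
        return identity_allows or resource_allows
    return identity_allows and resource_allows


# Same account: a bucket policy alone can grant access.
print(access_granted(True, identity_allows=False, resource_allows=True))   # True

# Cross-account: the bucket policy alone is not enough.
print(access_granted(False, identity_allows=False, resource_allows=True))  # False
print(access_granted(False, identity_allows=True, resource_allows=True))   # True
```
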
    "Inline policies are more secure than managed policies." The security of a policy comes from what it says, not how it's attached. A narrowly scoped customer managed policy is more auditable, more maintainable, and easier to review in a security assessment than a sprawl of inline policies embedded across dozens of roles. The one real advantage of inline policies — that they can't be accidentally reused — is better handled with careful naming conventions and code review.

    "IAM groups are sufficient for managing developer access in multi-account setups." Groups made sense in the early days when everyone worked in a single AWS account. In a modern multi-account environment — separate accounts for dev, staging, prod, logging, and security — managing access via groups in each individual account doesn't scale and produces inconsistency. Use AWS IAM Identity Center with permission sets, and manage group membership in your identity provider. That's the current standard approach and it's genuinely better.

    "The root account is fine to use for routine admin tasks." The root user has unrestricted access that no IAM policy inside the account can constrain. The only control that applies to it is an SCP, and only in member accounts; the management account's root user is outside the reach of SCPs. There are a handful of operations that only root can perform, such as closing the account, changing the support tier, and enabling MFA on the root user itself. Those are the only legitimate uses. Everything else should be done through roles. Enable MFA on root, ideally with a hardware key, store the credentials somewhere secure and rarely accessed, and treat it as a break-glass account.


    IAM is one of those foundational areas where small habits — scoping resources, using roles instead of users, adding conditions where they add value — compound into a meaningfully more defensible environment over time. The primitives are consistent once you've internalized them. The evaluation logic, the trust policy structure, the policy types — they all follow the same rules everywhere. Build that mental model solidly and you'll spend a lot less time puzzling over access denied errors and a lot more time building things that matter.

    Frequently Asked Questions

    What is the difference between an IAM role and an IAM user?

    An IAM user has long-term credentials (an access key ID and secret access key) permanently associated with it. An IAM role has no credentials of its own — instead, when a trusted entity assumes the role, AWS STS issues temporary credentials with a configurable expiration. Roles are preferred for application workloads because temporary credentials limit the damage window if they are ever exposed.

    How does a trust policy differ from a permission policy in IAM?

    A permission policy defines what actions a role is allowed or denied to perform on which AWS resources. A trust policy is a resource-based policy attached to the role itself that defines which principals (AWS services, accounts, or identities) are permitted to assume the role via sts:AssumeRole. Both must be correctly configured: a role whose trust policy does not name the caller can never be assumed by that caller, regardless of what its permission policies allow.

    What is an IAM permission boundary and when should I use it?

    A permission boundary is a managed policy attached to an IAM role or user that sets the maximum permissions that identity can ever have, regardless of what other policies are attached. Even if you attach AdministratorAccess to a role that has a restrictive boundary, the boundary wins. Permission boundaries are most valuable when delegating IAM administration to teams — they prevent those teams from creating roles with permissions that exceed what the boundary allows, eliminating privilege escalation paths.

    Can a Service Control Policy deny override an explicit Allow in an IAM policy?

    Yes. Service Control Policies (SCPs) are evaluated before identity-based policies in the IAM authorization chain. An explicit Deny in an SCP cannot be overridden by any Allow in any IAM policy attached within a member account. SCPs define the ceiling of what is possible in an AWS Organizations member account, making them the strongest preventive control available in a multi-account AWS environment.

    What is the confused deputy problem in AWS IAM and how does ExternalId help?

    The confused deputy problem occurs when an attacker tricks a legitimate third-party service (which has permission to assume your role) into performing actions on their behalf using your role's permissions. The ExternalId condition in a trust policy mitigates this by requiring the caller to pass an identifier that the third party ties to your account specifically. AWS does not treat the ExternalId as a secret; its protection comes from the third party only ever sending your identifier for your requests. If an attacker asks the service to use your role, the service passes the attacker's identifier, not yours, so the role assumption fails.
