
Terraform Data Sources and Dependencies: Implicit vs. Explicit


Your complete guide to understanding how Terraform discovers, fetches, and connects infrastructure—and how to control it when automatic dependency detection isn't enough.

📅 Published: Feb 2026
⏱️ Estimated Reading Time: 22 minutes
🏷️ Tags: Terraform Data Sources, Dependencies, Resource Graph, Implicit Dependencies, Explicit Dependencies


🧠 Introduction: The Invisible Graph

Terraform's Superpower

When you run terraform apply, something magical happens. Terraform doesn't just execute your configuration line by line. It builds a dependency graph—a map of every resource, data source, and module, connected by their relationships to each other.

hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id = aws_vpc.main.id  # ← Terraform sees this reference
  cidr_block = "10.0.1.0/24"
}

Terraform sees: "The subnet needs the VPC ID. The VPC must exist before the subnet can be created."

This is implicit dependency detection. You don't tell Terraform "create the VPC first, then the subnet." You just declare what each resource needs, and Terraform figures out the order automatically.


The Two Types of Dependencies

|  | Implicit Dependencies | Explicit Dependencies |
|---|---|---|
| How they're created | Automatically, via references | Manually, via `depends_on` |
| What you write | `vpc_id = aws_vpc.main.id` | `depends_on = [aws_vpc.main]` |
| When to use | Always, whenever possible | When Terraform can't infer the relationship |
| Reliability | Perfect when references exist | As reliable as your documentation |

Understanding the difference between implicit and explicit dependencies—and knowing when to use each—is what separates intermediate Terraform users from experts.


🔍 Data Sources: Reading, Not Creating

What Are Data Sources?

A data source is Terraform's way of reading information from your providers without creating or modifying anything. It's like a read-only query.

hcl
# This DOES NOT create anything
data "aws_ami" "ubuntu" {
  most_recent = true
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
  
  owners = ["099720109477"]  # Canonical
}

# This uses the data
resource "aws_instance" "web" {
  ami = data.aws_ami.ubuntu.id  # ← Reference to data source
  instance_type = "t2.micro"
}

Think of data sources as asking questions:

  • "What's the most recent Ubuntu AMI?" (data.aws_ami)

  • "What does this VPC look like?" (data.aws_vpc)

  • "Who am I authenticated as?" (data.aws_caller_identity)

  • "What's this secret's value?" (data.aws_secretsmanager_secret_version)
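
The caller-identity question is handy in practice. Here's a minimal sketch (bucket name hypothetical) that folds the account ID into a globally unique bucket name:

hcl
data "aws_caller_identity" "current" {}

resource "aws_s3_bucket" "artifacts" {
  # Account IDs are unique, so this name won't collide across accounts
  bucket = "artifacts-${data.aws_caller_identity.current.account_id}"
}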


When to Use Data Sources

Scenario 1: Referencing Existing Infrastructure

hcl
# You have a VPC that was created outside Terraform
data "aws_vpc" "existing" {
  id = "vpc-12345678"  # Hard-coded ID—better to pass it in as a variable
}

# Now you can use it like any other resource
resource "aws_subnet" "new" {
  vpc_id     = data.aws_vpc.existing.id
  cidr_block = "10.0.1.0/24"
}

Scenario 2: Fetching Dynamic Information

hcl
# Always get the latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]
  
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
  
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

Scenario 3: Reading Configuration from External Systems

hcl
# Read secrets from AWS Secrets Manager
data "aws_secretsmanager_secret" "db_password" {
  name = "/prod/database/password"
}

data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = data.aws_secretsmanager_secret.db_password.id
}

# Use the secret
resource "aws_db_instance" "main" {
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}

Scenario 4: Environment Discovery

hcl
# Discover availability zones in the current region
data "aws_availability_zones" "available" {
  state = "available"
}

# Use them dynamically
resource "aws_subnet" "public" {
  count = 3
  
  vpc_id               = aws_vpc.main.id
  cidr_block           = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone_id = data.aws_availability_zones.available.zone_ids[count.index]
}

Scenario 5: Multi-account/region lookups

hcl
# Read VPC from a different AWS account
provider "aws" {
  alias = "network"
  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/NetworkReadOnly"
  }
}

data "aws_vpc" "shared" {
  provider = aws.network
  tags = {
    Environment = "production"
    Purpose     = "shared-services"
  }
}

Data Source vs. Resource: The Critical Distinction

|  | Resource | Data Source |
|---|---|---|
| Creates infrastructure | ✅ Yes | ❌ No |
| Destroys infrastructure | ✅ Yes | ❌ No |
| Updates infrastructure | ✅ Yes | ❌ No |
| Costs money | Usually | No |
| Can be a target of `depends_on` | ✅ Yes | ✅ Yes (rarely needed) |
| Can be referenced | ✅ Yes | ✅ Yes |
| Can fail | ✅ Yes | ✅ Yes |
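
One row deserves a caveat: a data source accepts depends_on, but doing so defers the read until apply time, so plans show the data source's consumers as "(known after apply)". A hedged sketch (resource names assumed):

hcl
data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [aws_vpc.main.id]
  }

  # Forces the read to wait for the subnets—and pushes it to apply time
  depends_on = [aws_subnet.public]
}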

This distinction is crucial: If you use a data source for something that doesn't exist yet, Terraform can't create it—it will just fail.

hcl
# This will FAIL if the VPC doesn't exist
data "aws_vpc" "main" {
  tags = {
    Name = "main-vpc"
  }
}

# This will CREATE the VPC if it doesn't exist
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  tags = {
    Name = "main-vpc"
  }
}

🔗 Implicit Dependencies: The Magic of References

How Terraform Builds the Graph

Every time you reference one resource from another, Terraform creates an implicit dependency.

hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id  # ← DEPENDENCY: subnet → vpc
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "web" {
  subnet_id = aws_subnet.public.id  # ← DEPENDENCY: instance → subnet
  ami       = data.aws_ami.ubuntu.id  # ← DEPENDENCY: instance → data source
}

The dependency graph:

text
aws_vpc.main ──> aws_subnet.public ──> aws_instance.web
                                            ▲
data.aws_ami.ubuntu ────────────────────────┘

Terraform reads these references and guarantees:

  1. The VPC is created before the subnet

  2. The subnet is created before the instance

  3. The AMI data is fetched before the instance is created

All of this happens automatically. You never need to write "create this, then create that."
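
You can inspect the graph Terraform builds for yourself. A quick sketch of the workflow (exact node labels vary by Terraform version):

text
$ terraform graph > graph.dot      # emits the graph in DOT format
$ dot -Tsvg graph.dot > graph.svg  # render it with Graphviz

Edges point from dependent to dependency, so expect lines along the lines of "aws_subnet.public" -> "aws_vpc.main".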


References That Create Dependencies

Resource-to-resource references:

hcl
vpc_id = aws_vpc.main.id  # dependency on aws_vpc.main

Resource-to-module references:

hcl
vpc_id = module.vpc.vpc_id  # dependency on module.vpc

Module-to-resource references:

hcl
# Inside module
resource "aws_instance" "this" {
  subnet_id = var.subnet_id  # NOT a dependency—it's just a value
}

# The dependency is created where the value is PROVIDED, not where it's CONSUMED

Important: Variable references inside a module do NOT create dependencies. The dependency is created where the argument value is provided:

hcl
# root/main.tf
module "compute" {
  source    = "./modules/compute"
  subnet_id = aws_subnet.public.id  # ← DEPENDENCY CREATED HERE!
}

Data source references:

hcl
ami = data.aws_ami.ubuntu.id  # dependency on data.aws_ami.ubuntu

Provider aliases:

hcl
provider = aws.west  # dependency on provider configuration

What DOESN'T Create Dependencies

String interpolation without resource references:

hcl
name = "web-server-${var.environment}"  # No dependency

Literal values:

hcl
cidr_block = "10.0.0.0/16"  # No dependency

Variable references:

hcl
instance_type = var.instance_type  # No dependency

Local values:

hcl
resource_name = local.name_prefix  # No dependency

Count and for_each meta-arguments:

hcl
count = var.instance_count  # No dependency

🧩 Complex Reference Patterns

References Through Local Values

Local values don't create dependencies by themselves, but they pass through the dependencies of whatever they reference.

hcl
locals {
  # This doesn't create a dependency—it's just computing a string
  instance_name = "${var.project}-${var.environment}-web"
  
  # This DOES create a dependency when used
  vpc_info = {
    id         = aws_vpc.main.id
    cidr_block = aws_vpc.main.cidr_block
  }
}

resource "aws_instance" "web" {
  name = local.instance_name  # No dependency
  vpc_id = local.vpc_info.id  # ← DEPENDENCY (through local)
}

The dependency is created at the point of REFERENCE, not the point of DEFINITION.
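
To make that concrete—a sketch (resource names assumed) where only the referencing resource inherits the ordering:

hcl
locals {
  vpc_id = aws_vpc.main.id  # DEFINITION: no ordering effect on its own
}

resource "aws_internet_gateway" "main" {
  vpc_id = local.vpc_id  # REFERENCE: this resource now waits for the VPC
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-logs"  # never touches the local—no VPC dependency
}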


References Through Module Outputs

This is where implicit dependencies get powerful—and sometimes confusing.

hcl
# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

output "vpc_id" {
  value = aws_vpc.this.id  # Exports the dependency
}

# root/main.tf
module "vpc" {
  source = "./modules/vpc"
  cidr_block = "10.0.0.0/16"
}

module "compute" {
  source = "./modules/compute"
  vpc_id = module.vpc.vpc_id  # ← DEPENDENCY on module.vpc
}

The dependency is preserved through the module boundary. When you reference module.vpc.vpc_id, you implicitly depend on everything that module created.


References Through for_each and count

When you use for_each or count, references still create dependencies—an indexed reference like aws_subnet.public[count.index] ties each instance to its matching upstream instance, while a splat expression depends on the entire collection.

hcl
resource "aws_subnet" "public" {
  count = 3
  
  vpc_id = aws_vpc.main.id  # DEPENDENCY: each subnet → vpc
}

resource "aws_instance" "web" {
  count = 3
  
  subnet_id = aws_subnet.public[count.index].id  # DEPENDENCY: each instance → its subnet
}

But be careful with this pattern: for_each keys must be known at plan time, so deriving them from resource attributes fails on a fresh run with an error that the for_each value "depends on resource attributes that cannot be determined until apply":

hcl
locals {
  subnet_ids = aws_subnet.public[*].id  # DEPENDENCY on all subnets
}

resource "aws_instance" "web" {
  for_each = toset(local.subnet_ids)  # DEPENDENCY on local.subnet_ids → all subnets
  
  subnet_id = each.value
}

🛠️ Explicit Dependencies: When Magic Isn't Enough

The Problem Implicit Dependencies Can't Solve

Sometimes resources depend on each other without directly referencing each other's attributes.

hcl
# Example 1: IAM role and policy
resource "aws_iam_role" "lambda" {
  name = "lambda-execution-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "s3_access" {
  name = "s3-access-policy"
  role = aws_iam_role.lambda.name  # ← This creates a dependency
  
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = ["s3:GetObject", "s3:ListBucket"]
        Effect = "Allow"
        Resource = [
          aws_s3_bucket.data.arn,  # ← DEPENDENCY on bucket
          "${aws_s3_bucket.data.arn}/*"
        ]
      }
    ]
  })
}

# BUT: The bucket doesn't reference the role or policy
resource "aws_s3_bucket" "data" {
  bucket = "my-app-data"
  
  # No reference to the IAM policy that depends on this bucket!
}

Terraform sees: the policy depends on the bucket (reference to its ARN) and on the role (reference to its name). But nothing depends on the policy itself.

That gap matters. Terraform considers the bucket ready the moment it exists, even though the permissions governing it haven't been created yet. Anything that starts using the bucket in that window operates without the intended policy in place.


depends_on: The Manual Override

depends_on tells Terraform: "Trust me, this resource needs to wait for that one."

hcl
# NOTE: putting depends_on = [aws_iam_role_policy.s3_access] on the bucket
# itself would be a cycle—the policy already references the bucket's ARN.
# Put the explicit dependency on the CONSUMER instead:
resource "aws_lambda_function" "processor" {
  function_name = "data-processor"
  role          = aws_iam_role.lambda.arn  # implicit dependency on the role only
  # ... handler, runtime, and packaging arguments omitted ...

  # EXPLICIT DEPENDENCY: don't create this function until its S3
  # permissions are attached to the role
  depends_on = [
    aws_iam_role_policy.s3_access
  ]
}

Now the dependency graph is complete:

text
aws_s3_bucket.data ──┐
aws_iam_role.lambda ─┴──> aws_iam_role_policy.s3_access ──> aws_lambda_function.processor

This isn't a circular dependency—it's a creation-order requirement. The policy needs the bucket ARN, so the bucket exists before the policy. The function references only the role, so without depends_on Terraform could create it before the policy is attached; the explicit dependency closes that gap.


Common Use Cases for depends_on

1. IAM policies and the resources they protect

hcl
resource "aws_s3_bucket" "logs" {
  bucket = "app-logs"
}

resource "aws_iam_role_policy" "log_writer" {
  role = aws_iam_role.app.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "s3:PutObject"
      Resource = "${aws_s3_bucket.logs.arn}/*"  # Needs bucket ARN
    }]
  })
}

# The writer references only the role, so wait for the policy explicitly.
# (Don't put depends_on on the bucket—the policy references the bucket,
# and that would be a cycle.)
resource "aws_instance" "log_shipper" {
  ami                  = data.aws_ami.ubuntu.id
  instance_type        = "t2.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  depends_on = [aws_iam_role_policy.log_writer]
}

2. DNS records and load balancers (creation order)

hcl
resource "aws_lb" "main" {
  name = "app-lb"
  # ... configuration
}

resource "aws_route53_record" "app" {
  zone_id = var.zone_id
  name    = "app.example.com"
  type    = "A"
  
  alias {
    name                   = aws_lb.main.dns_name
    zone_id                = aws_lb.main.zone_id
    evaluate_target_health = true
  }
}

# Wait for DNS to propagate before proceeding
resource "null_resource" "wait_for_dns" {
  depends_on = [aws_route53_record.app]
  
  provisioner "local-exec" {
    command = "sleep 60"
  }
}

3. Resources with eventual consistency

hcl
resource "aws_ecr_repository" "app" {
  name = "my-app"
}

resource "null_resource" "wait_for_ecr" {
  depends_on = [aws_ecr_repository.app]
  
  provisioner "local-exec" {
    command = <<-EOF
      echo "Waiting for ECR repository to be fully available..."
      sleep 30
    EOF
  }
}

resource "null_resource" "docker_push" {
  depends_on = [null_resource.wait_for_ecr]
  
  provisioner "local-exec" {
    command = "docker push ${aws_ecr_repository.app.repository_url}:latest"
  }
}

4. Cross-module dependencies without data exchange

hcl
module "networking" {
  source = "./modules/networking"
}

module "monitoring" {
  source = "./modules/monitoring"
  
  # No reference to networking module, but needs it to exist
  depends_on = [module.networking]
}

5. Bootstrapping infrastructure with chicken-egg problems

hcl
# Bootstrapping EKS: easy to tangle into a cycle if the roles reference each other
resource "aws_iam_role" "eks_cluster" {
  name = "eks-cluster-role"
  assume_role_policy = data.aws_iam_policy_document.eks_assume_role.json
}

resource "aws_eks_cluster" "main" {
  name     = "my-cluster"
  role_arn = aws_iam_role.eks_cluster.arn
  vpc_config {
    subnet_ids = module.vpc.subnet_ids
  }
  
  # EKS cluster must be created before node group
}

resource "aws_iam_role" "eks_node_group" {
  name = "eks-node-group-role"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume_role.json

  # Independent of the cluster role—no cross-reference needed
}

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_role_arn   = aws_iam_role.eks_node_group.arn
  subnet_ids      = module.vpc.subnet_ids
  
  # This automatically depends on the cluster
  # No explicit depends_on needed
}

When NOT to Use depends_on

❌ When you can use an implicit reference instead

hcl
# BAD: Unnecessary explicit dependency
resource "aws_subnet" "public" {
  vpc_id = aws_vpc.main.id
  # This already creates an implicit dependency!
}

resource "aws_instance" "web" {
  subnet_id = aws_subnet.public.id
  # This already creates an implicit dependency!
  
  depends_on = [
    aws_vpc.main,      # ❌ UNNECESSARY
    aws_subnet.public, # ❌ UNNECESSARY
  ]
}

❌ To force ordering between unrelated resources

hcl
# BAD: These resources don't actually depend on each other
resource "aws_s3_bucket" "logs" {
  bucket = "app-logs"
}

resource "aws_dynamodb_table" "sessions" {
  name = "user-sessions"
  
  depends_on = [aws_s3_bucket.logs]  # ❌ WHY? No relationship!
}

❌ As a substitute for proper module composition

hcl
# BAD: Module should expose outputs, not rely on depends_on
module "vpc" {
  source = "./modules/vpc"
}

module "eks" {
  source = "./modules/eks"
  
  # This shouldn't be necessary—EKS module should accept vpc_id
  depends_on = [module.vpc]
}

❌ To create artificial dependencies for "safety"

hcl
# BAD: This doesn't make anything safer
resource "aws_instance" "web" {
  # ... config ...
  
  depends_on = [
    aws_s3_bucket.logs,        # No relationship
    aws_dynamodb_table.data,   # No relationship  
    aws_iam_role.lambda,       # No relationship
  ]
}

🔄 Dependency Cycles and How to Break Them

What Is a Dependency Cycle?

A dependency cycle occurs when Terraform detects that Resource A depends on Resource B, and Resource B depends on Resource A (directly or indirectly).

hcl
# Example of a cycle
resource "aws_security_group" "web" {
  name = "web-sg"
  
  ingress {
    from_port = 80
    to_port   = 80
    protocol  = "tcp"
    # Can't reference itself during creation!
    security_groups = [aws_security_group.web.id]  # ❌ CYCLE!
  }
}

# Terraform error:
# Error: Cycle: aws_security_group.web, aws_security_group.web

This is the most common Terraform error that beginners can't resolve. The solution isn't to force it—it's to restructure.


Breaking Self-Referential Cycles

Problem: A security group needs to reference itself for internal communication.

Solution 1: Create the group, then add rules separately

hcl
# Step 1: Create empty security group
resource "aws_security_group" "web" {
  name = "web-sg"
}

# Step 2: Add rule referencing the now-existing group
resource "aws_security_group_rule" "web_self" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  security_group_id = aws_security_group.web.id
  source_security_group_id = aws_security_group.web.id  # Now this works!
}

Solution 2: Use self attribute (provider-specific)

hcl
resource "aws_security_group" "web" {
  name = "web-sg"
  
  ingress {
    from_port = 80
    to_port   = 80
    protocol  = "tcp"
    self      = true  # Special attribute for self-reference
  }
}

Breaking Cross-Resource Cycles

Problem: Two resources need to reference each other during creation.

hcl
# This creates a cycle
resource "aws_instance" "web" {
  user_data = <<-EOF
    #!/bin/bash
    echo "Web server" > /tmp/index.html
    nohup python3 -m http.server 80 &
  EOF
  
  vpc_security_group_ids = [aws_security_group.web.id]
}

resource "aws_security_group" "web" {
  name = "web-sg"
  
  ingress {
    from_port = 80
    to_port   = 80
    protocol  = "tcp"
    cidr_blocks = ["${aws_instance.web.public_ip}/32"]  # ❌ CYCLE!
  }
}

Solution: Use a static IP or separate resource

hcl
# Create a static IP
resource "aws_eip" "web" {
  domain = "vpc"  # was "vpc = true" before AWS provider v5
}

resource "aws_security_group" "web" {
  name = "web-sg"
  
  ingress {
    from_port = 80
    to_port   = 80
    protocol  = "tcp"
    cidr_blocks = ["${aws_eip.web.public_ip}/32"]
  }
}

resource "aws_instance" "web" {
  user_data = <<-EOF
    #!/bin/bash
    echo "Web server" > /tmp/index.html
    nohup python3 -m http.server 80 &
  EOF
  
  vpc_security_group_ids = [aws_security_group.web.id]
}

resource "aws_eip_association" "web" {
  instance_id   = aws_instance.web.id
  allocation_id = aws_eip.web.id
}

Breaking Cross-Module Cycles

Problem: Modules chain through each other's outputs, and sprinkling module-level depends_on on top tangles the graph—one cross-reference away from a cycle.

hcl
module "vpc" {
  source = "./modules/vpc"
  # No dependencies
}

module "eks" {
  source = "./modules/eks"
  vpc_id = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids
}

module "monitoring" {
  source = "./modules/monitoring"
  cluster_name = module.eks.cluster_name

  # Redundant: module.eks already depends on module.vpc—and depends_on
  # on a module also delays every data source inside it until apply
  depends_on = [module.vpc]
}

module "backup" {
  source = "./modules/backup"
  cluster_name = module.eks.cluster_name

  depends_on = [module.vpc]  # Same redundancy
}

Solution: Use data sources to decouple

hcl
# Instead of passing cluster_name from eks module,
# read it directly in dependent modules

# modules/backup/main.tf
variable "cluster_name" {
  description = "Name of EKS cluster"
  type        = string
}

data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

# Now backup module can access VPC info without creating a cycle
resource "aws_iam_role" "backup" {
  # Use data.aws_eks_cluster.this.arn, not module.eks.cluster_arn
}

🎯 Real-World Dependency Patterns

Pattern 1: The Provisioner Dependency

hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  
  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",
      "sudo systemctl enable nginx",
      "sudo systemctl start nginx"
    ]
    
    connection {
      type        = "ssh"
      user        = "ubuntu"
      private_key = file("~/.ssh/id_rsa")
      host        = self.public_ip
    }
  }
}

resource "aws_route53_record" "web" {
  zone_id = var.zone_id
  name    = "web.${var.domain}"
  type    = "A"
  ttl     = 300
  records = [aws_instance.web.public_ip]
  
  # Wait for nginx to be installed before creating DNS record
  depends_on = [aws_instance.web]
}

Better: Use null_resource with explicit dependency

hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}

resource "null_resource" "web_provision" {
  triggers = {
    instance_id = aws_instance.web.id
  }
  
  provisioner "remote-exec" {
    # ... same as above ...
  }
  
  depends_on = [aws_instance.web]
}

resource "aws_route53_record" "web" {
  zone_id = var.zone_id
  name    = "web.${var.domain}"
  type    = "A"
  ttl     = 300
  records = [aws_instance.web.public_ip]
  
  # Now explicitly wait for provisioning to complete
  depends_on = [null_resource.web_provision]
}

Pattern 2: The Bootstrapping Cycle

Problem: You need an S3 bucket for Terraform state, but you need Terraform state to create the S3 bucket.

Solution 1: Manual bootstrap (one-time)

hcl
# bootstrap/main.tf
# Run this once with local state

resource "aws_s3_bucket" "terraform_state" {
  bucket = "company-terraform-state"
}

# Versioning and encryption are standalone resources in AWS provider v4+
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
  
  attribute {
    name = "LockID"
    type = "S"
  }
}

output "bucket_arn" {
  value = aws_s3_bucket.terraform_state.arn
}
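
The one-time workflow, sketched as shell steps (run from the bootstrap directory first; flags are standard Terraform CLI):

text
# 1. Create the state bucket and lock table using local state
$ terraform init
$ terraform apply

# 2. Add the backend "s3" block to the main configuration, then
#    move the local state into the bucket it just created
$ terraform init -migrate-state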

Solution 2: Partial configuration with remote state data source

hcl
# main.tf
terraform {
  backend "s3" {
    # These values can be omitted and supplied at init time
    # via -backend-config (partial configuration)
    bucket = "company-terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-west-2"
    
    dynamodb_table = "terraform-state-locks"
    encrypt        = true
  }
}

# Read the outputs of the separate bootstrap configuration's state
data "terraform_remote_state" "bootstrap" {
  backend = "s3"
  
  config = {
    bucket = "company-terraform-state"
    key    = "bootstrap/terraform.tfstate"
    region = "us-west-2"
  }
}

# Use the bootstrap outputs
resource "aws_iam_policy" "state_access" {
  name = "terraform-state-access"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:DeleteObject",
          "s3:ListBucket"
        ]
        Resource = [
          data.terraform_remote_state.bootstrap.outputs.bucket_arn,
          "${data.terraform_remote_state.bootstrap.outputs.bucket_arn}/*"
        ]
      }
    ]
  })
}

Pattern 3: The Application Configuration Cycle

Problem: Application needs database connection string. Database connection string includes database endpoint. Database endpoint is only known after creation.

Option 1: Store the connection string in SSM Parameter Store

hcl
resource "aws_db_instance" "main" {
  allocated_storage   = 20
  engine              = "postgres"
  engine_version      = "14.7"
  instance_class      = "db.t3.micro"
  db_name             = "app"
  username            = "dbadmin"  # "admin" is reserved for RDS PostgreSQL
  password            = random_password.db.result
  skip_final_snapshot = true
}

resource "aws_ssm_parameter" "db_connection" {
  name  = "/${var.environment}/database/connection_string"
  type  = "SecureString"  # the value embeds a password
  value = "postgresql://${aws_db_instance.main.username}:${random_password.db.result}@${aws_db_instance.main.endpoint}/${aws_db_instance.main.db_name}"

  # No depends_on needed—the value already references the instance
}

# Application reads from SSM at startup

Option 2: User data script with runtime discovery

hcl
data "aws_region" "current" {}

data "aws_caller_identity" "current" {}

resource "aws_instance" "app" {
  user_data = <<-EOF
    #!/bin/bash
    DB_ENDPOINT=$(aws rds describe-db-instances \
      --db-instance-identifier ${aws_db_instance.main.id} \
      --region ${data.aws_region.current.name} \
      --query 'DBInstances[0].Endpoint.Address' \
      --output text)
    
    echo "DATABASE_URL=postgresql://${aws_db_instance.main.username}:${random_password.db.result}@$DB_ENDPOINT/${aws_db_instance.main.db_name}" >> /etc/environment
  EOF
  
  iam_instance_profile = aws_iam_instance_profile.rds_read_only.name

  # No depends_on needed—user_data already references the DB instance
}

🧪 Practice Exercises

Exercise 1: Identify Dependencies

Task: Look at this configuration and identify all implicit and explicit dependencies.

hcl
data "aws_availability_zones" "available" {}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count = 3
  
  vpc_id               = aws_vpc.main.id
  cidr_block           = "10.0.${count.index}.0/24"
  availability_zone_id = data.aws_availability_zones.available.zone_ids[count.index]
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  count = 3
  
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "web" {
  count = 2
  
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.public[count.index].id
  vpc_security_group_ids = [aws_security_group.web.id]
  
  tags = {
    Name = "web-${count.index}"
  }
  
  depends_on = [
    aws_internet_gateway.main
  ]
}

data "aws_ami" "ubuntu" {
  most_recent = true
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
  
  owners = ["099720109477"]
}

Answer:

Implicit dependencies:

  • aws_subnet.public → aws_vpc.main (reference to vpc_id)

  • aws_subnet.public → data.aws_availability_zones.available (reference to zone_ids)

  • aws_internet_gateway.main → aws_vpc.main (reference to vpc_id)

  • aws_route_table.public → aws_vpc.main (reference to vpc_id)

  • aws_route_table.public → aws_internet_gateway.main (reference to gateway_id)

  • aws_route_table_association.public → aws_subnet.public (reference to subnet_id)

  • aws_route_table_association.public → aws_route_table.public (reference to route_table_id)

  • aws_security_group.web → aws_vpc.main (reference to vpc_id)

  • aws_instance.web → data.aws_ami.ubuntu (reference to ami.id)

  • aws_instance.web → aws_subnet.public (reference to subnet_id)

  • aws_instance.web → aws_security_group.web (reference to security group ID)

Explicit dependencies:

  • aws_instance.web → aws_internet_gateway.main (depends_on)


Exercise 2: Fix a Dependency Cycle

Problem: This configuration has a cycle. Identify it and fix it.

hcl
resource "aws_lb" "main" {
  name               = "app-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.lb.id]
  subnets            = aws_subnet.public[*].id
}

resource "aws_security_group" "lb" {
  name   = "lb-sg"
  vpc_id = aws_vpc.main.id
  
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
  }
  
  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
  }
  
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
}

resource "aws_security_group_rule" "lb_to_web" {
  type                     = "ingress"
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
  security_group_id        = aws_security_group.web.id
  source_security_group_id = aws_security_group.lb.id
}

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id
  
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    cidr_blocks     = []
    security_groups = [aws_security_group.lb.id]  # Reference to lb SG
  }
}

resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id
  
  health_check {
    path = "/health"
    port = "traffic-port"
  }
}

resource "aws_lb_listener" "web" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"
  
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

Answer:

Trace every reference:

text
aws_lb.main ──> aws_security_group.lb
aws_security_group.lb ──> aws_vpc.main
aws_security_group.web ──> aws_security_group.lb
aws_security_group_rule.lb_to_web ──> aws_security_group.web, aws_security_group.lb
aws_lb_target_group.web ──> aws_vpc.main
aws_lb_listener.web ──> aws_lb.main, aws_lb_target_group.web

Every edge points the same direction, so there is no cycle—the exercise is a trick question. A cycle would appear only if aws_security_group.lb also gained an inline rule referencing aws_security_group.web (web → lb → web).

The configuration does have a real problem, though: the inline ingress on aws_security_group.web and the standalone aws_security_group_rule.lb_to_web define the same rule, and mixing inline rules with aws_security_group_rule resources on the same group makes them fight over state on every apply.

The fix: whenever two security groups need to reference each other, define both groups without inline rules and express every rule as a separate aws_security_group_rule resource.
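
A sketch of that pattern (group names hypothetical): define both groups with no inline rules, then express the mutual references as standalone rule resources—both groups exist before either rule is created, so there is no cycle.

hcl
resource "aws_security_group" "a" {
  name   = "sg-a"
  vpc_id = aws_vpc.main.id
  # No inline rules—also avoids inline rules fighting with rule resources
}

resource "aws_security_group" "b" {
  name   = "sg-b"
  vpc_id = aws_vpc.main.id
}

# a admits traffic from b...
resource "aws_security_group_rule" "a_from_b" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.a.id
  source_security_group_id = aws_security_group.b.id
}

# ...and b admits traffic from a
resource "aws_security_group_rule" "b_from_a" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.b.id
  source_security_group_id = aws_security_group.a.id
}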


Exercise 3: Optimize Dependencies

Task: This configuration is full of unnecessary explicit dependencies—and one of them is an outright cycle. Remove them.

hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  
  depends_on = [aws_internet_gateway.main]  # ❌ Worse than unnecessary—a cycle! (the IGW references this VPC)
}

resource "aws_subnet" "public" {
  count = 3
  
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.${count.index}.0/24"
  
  depends_on = [aws_vpc.main]  # ❌ Unnecessary (already implicit)
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  
  depends_on = [aws_vpc.main]  # ❌ Unnecessary (already implicit)
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
  
  depends_on = [
    aws_vpc.main,                 # ❌ Unnecessary (implicit from vpc_id)
    aws_internet_gateway.main,    # ❌ Unnecessary (implicit from gateway_id)
  ]
}

resource "aws_route_table_association" "public" {
  count = 3
  
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
  
  depends_on = [
    aws_subnet.public,           # ❌ Unnecessary (implicit from subnet_id)
    aws_route_table.public,      # ❌ Unnecessary (implicit from route_table_id)
  ]
}

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  depends_on = [aws_vpc.main]  # ❌ Unnecessary (implicit from vpc_id)
}

resource "aws_instance" "web" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.public[0].id
  vpc_security_group_ids = [aws_security_group.web.id]
  
  depends_on = [
    aws_subnet.public,          # ❌ Unnecessary (implicit from subnet_id)
    aws_security_group.web,     # ❌ Unnecessary (implicit from security group reference)
    aws_internet_gateway.main,  # ⚠️ Maybe necessary if the instance needs internet at launch
    aws_route_table.public,     # ❌ Unnecessary (already have IGW dependency)
  ]
}

Clean version:

hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count = 3
  
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.${count.index}.0/24"
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  count = 3
  
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "web" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.public[0].id
  vpc_security_group_ids = [aws_security_group.web.id]
  
  # Only keep explicit dependencies that can't be inferred
  depends_on = [aws_internet_gateway.main]  # Kept because instance needs internet at launch
}

📋 Dependency Management Best Practices

Do's and Don'ts

✅ DO rely on implicit dependencies whenever possible. They're automatic, self-documenting, and never get out of sync.

✅ DO use depends_on sparingly and only when necessary. Every explicit dependency is a maintenance burden.

✅ DO document why you need explicit dependencies. Future maintainers need to know why the relationship isn't implicit.

✅ DO test your dependency graph. Use terraform graph to visualize and verify dependencies.

✅ DO use data sources to break cycles. Reading existing infrastructure instead of passing references can eliminate cycles.

❌ DON'T add depends_on to data sources unless you must. It forces the read to be deferred until apply time, turning known values into "(known after apply)" and making your plans much less useful.

❌ DON'T use depends_on = [module.xxx] when you could use module outputs. If a module doesn't expose what you need, fix the module.

❌ DON'T create circular dependencies. Restructure your configuration to avoid them.

❌ DON'T assume depends_on solves eventual consistency issues. It only ensures creation order, not that the resource is ready to use.
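If you genuinely need a delay on top of creation ordering, one common workaround is the time_sleep resource from the hashicorp/time provider. A sketch (the duration and resource names are illustrative, and this only delays creation rather than verifying readiness):

```hcl
# Wait 30 seconds after the IGW exists before launching the instance.
# Note: this is a blind delay, not a readiness check.
resource "time_sleep" "wait_for_igw" {
  create_duration = "30s"

  depends_on = [aws_internet_gateway.main]
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"

  depends_on = [time_sleep.wait_for_igw]
}
```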


Using terraform graph to Visualize Dependencies

bash
# Generate a DOT file
terraform graph > graph.dot

# Convert to PNG (requires GraphViz)
dot -Tpng graph.dot > graph.png

# View graph
open graph.png  # macOS
xdg-open graph.png  # Linux

What to look for:

  • Cycles — loops in the graph that Terraform can't resolve

  • Missing dependencies — resources that should be connected but aren't

  • Unnecessary dependencies — connections that don't need to exist

  • Long chains — resources that depend on many other resources


🎓 Summary: Master the Graph

Terraform's dependency graph is its most powerful feature—and its most misunderstood.

| | Implicit Dependencies | Explicit Dependencies |
|---|---|---|
| How to create | Reference another resource's attributes | depends_on = [resource.xxx] |
| When to use | Always, by default | When implicit detection fails |
| Maintenance | Automatic | Manual |
| Self-documenting | Yes | No |
| Can create cycles | Rarely | Yes, if used carelessly |

Data sources are read-only windows into your infrastructure. They can create dependencies when their results are used, but they never modify infrastructure.

The mark of an expert Terraform user is knowing when to trust the automatic dependency detection—and when to override it. Most of the time, you trust it. The rest of the time, you have a specific, documented reason not to.


🔗 Master Terraform Dependencies with Hands-on Labs

Understanding dependencies is the key to mastering Terraform. Practice identifying, fixing, and optimizing dependencies in real scenarios.

👉 Practice dependency management with interactive labs and real cloud infrastructure at:
https://devops.trainwithsky.com/

Our platform provides:

  • Dependency graph visualization exercises

  • Cycle detection and resolution challenges

  • Data source configuration labs

  • Complex multi-module dependency scenarios

  • Real-time validation of your dependency graphs


Frequently Asked Questions

Q: Can Terraform create resources in parallel?

A: Yes! By default Terraform walks the graph with up to 10 concurrent operations (tunable with -parallelism). Dependencies create serialization points: dependent resources wait for their dependencies to finish.

Q: How does Terraform detect dependencies in for_each and count?

A: References inside for_each and count expressions create dependencies on the entire collection, not individual elements.
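To illustrate with the subnet example from earlier: dependency tracking is per resource, not per instance, so even an indexed reference waits for the whole collection. A sketch:

```hcl
resource "aws_route_table_association" "public" {
  count = 3

  # Even though each association uses only one subnet, this indexed
  # reference depends on the entire aws_subnet.public collection, so
  # every association waits for all three subnets.
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
```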

Q: Do data sources create dependencies?

A: Yes, when you reference a data source's attributes in a resource, that creates an implicit dependency on the data source being read successfully.
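For example, the Ubuntu AMI lookup used throughout this post creates exactly that kind of edge (the owner ID is Canonical's commonly cited AWS account; the filter value is illustrative):

```hcl
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical (assumption: verify for your region)

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  # Referencing the data source's attribute creates an implicit
  # dependency: the AMI lookup must succeed before this instance
  # can be planned or created.
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}
```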

Q: Can I create a dependency on a module output?

A: Yes! When you reference a module output, you implicitly depend on everything that module creates.
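A sketch, assuming a hypothetical ./modules/network module that exposes a public_subnet_id output:

```hcl
module "network" {
  source = "./modules/network"  # hypothetical module path
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"

  # Referencing the output makes this instance depend on everything
  # inside the module that contributes to public_subnet_id.
  subnet_id = module.network.public_subnet_id
}
```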

Q: Why does terraform plan sometimes reorder resources?

A: Terraform always respects dependencies, but independent resources may be reordered for execution efficiency.

Q: How do I debug "Error: Cycle" messages?

A: Use terraform graph to visualize the cycle. Look for resources that reference each other directly or indirectly. Break the cycle by:

  • Moving one reference to a separate resource

  • Using data sources instead of direct references

  • Restructuring your module boundaries

Q: Can I force Terraform to ignore certain dependencies?

A: No. Dependencies are fundamental to correctness. If you have a dependency you want to ignore, you have a design problem, not a tool limitation.


Struggling with a tricky dependency cycle? Not sure if you need depends_on? Share your configuration in the comments—our community of Terraform experts is here to help! 💬
