Terraform Data Sources and Dependencies: Implicit vs. Explicit
Your complete guide to understanding how Terraform discovers, fetches, and connects infrastructure—and how to control it when automatic dependency detection isn't enough.
📅 Published: Feb 2026
⏱️ Estimated Reading Time: 22 minutes
🏷️ Tags: Terraform Data Sources, Dependencies, Resource Graph, Implicit Dependencies, Explicit Dependencies
🧠 Introduction: The Invisible Graph
Terraform's Superpower
When you run terraform apply, something magical happens. Terraform doesn't just execute your configuration line by line. It builds a dependency graph—a map of every resource, data source, and module, connected by their relationships to each other.
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
    }

    resource "aws_subnet" "public" {
      vpc_id     = aws_vpc.main.id # ← Terraform sees this reference
      cidr_block = "10.0.1.0/24"
    }
Terraform sees: "The subnet needs the VPC ID. The VPC must exist before the subnet can be created."
This is implicit dependency detection. You don't tell Terraform "create the VPC first, then the subnet." You just declare what each resource needs, and Terraform figures out the order automatically.
The Two Types of Dependencies
| | Implicit Dependencies | Explicit Dependencies |
|---|---|---|
| How they're created | Automatically via references | Manually via depends_on |
| What you write | vpc_id = aws_vpc.main.id | depends_on = [aws_vpc.main] |
| When to use | Always, whenever possible | When Terraform can't infer the relationship |
| Reliability | Always in sync with the configuration | Only as accurate as you keep it |
Understanding the difference between implicit and explicit dependencies—and knowing when to use each—is what separates intermediate Terraform users from experts.
🔍 Data Sources: Reading, Not Creating
What Are Data Sources?
A data source is Terraform's way of reading information from your providers without creating or modifying anything. It's like a read-only query.
    # This DOES NOT create anything
    data "aws_ami" "ubuntu" {
      most_recent = true

      filter {
        name   = "name"
        values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
      }

      owners = ["099720109477"] # Canonical
    }

    # This uses the data
    resource "aws_instance" "web" {
      ami           = data.aws_ami.ubuntu.id # ← Reference to data source
      instance_type = "t2.micro"
    }
Think of data sources as asking questions:
"What's the most recent Ubuntu AMI?" (data.aws_ami)
"What does this VPC look like?" (data.aws_vpc)
"Who am I authenticated as?" (data.aws_caller_identity)
"What's this secret's value?" (data.aws_secretsmanager_secret_version)
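As a minimal sketch of the "Who am I authenticated as?" question, the block below reads the caller identity and exposes two of its attributes. The output names are illustrative and not part of the original configuration.

```hcl
# Reads the identity behind the current AWS credentials; creates nothing.
data "aws_caller_identity" "current" {}

# Output names are hypothetical, chosen for illustration.
output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "caller_arn" {
  value = data.aws_caller_identity.current.arn
}
```

Running terraform plan with this configuration queries AWS but proposes no infrastructure changes, which is the defining trait of a data source.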
When to Use Data Sources
Scenario 1: Referencing Existing Infrastructure
    # You have a VPC that was created outside Terraform
    data "aws_vpc" "existing" {
      id = "vpc-12345678" # Hard-coded ID; better to use a variable!
    }

    # Now you can use it like any other resource
    resource "aws_subnet" "new" {
      vpc_id     = data.aws_vpc.existing.id
      cidr_block = "10.0.1.0/24"
    }
Scenario 2: Fetching Dynamic Information
    # Always get the latest Amazon Linux 2 AMI
    data "aws_ami" "amazon_linux_2" {
      most_recent = true
      owners      = ["amazon"]

      filter {
        name   = "name"
        values = ["amzn2-ami-hvm-*-x86_64-gp2"]
      }

      filter {
        name   = "virtualization-type"
        values = ["hvm"]
      }
    }
Scenario 3: Reading Configuration from External Systems
    # Read secrets from AWS Secrets Manager
    data "aws_secretsmanager_secret" "db_password" {
      name = "/prod/database/password"
    }

    data "aws_secretsmanager_secret_version" "db_password" {
      secret_id = data.aws_secretsmanager_secret.db_password.id
    }

    # Use the secret
    resource "aws_db_instance" "main" {
      password = data.aws_secretsmanager_secret_version.db_password.secret_string
    }
Scenario 4: Environment Discovery
    # Discover availability zones in the current region
    data "aws_availability_zones" "available" {
      state = "available"
    }

    # Use them dynamically
    resource "aws_subnet" "public" {
      count                = 3
      vpc_id               = aws_vpc.main.id
      cidr_block           = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
      availability_zone_id = data.aws_availability_zones.available.zone_ids[count.index]
    }
Scenario 5: Multi-account/region lookups
    # Read VPC from a different AWS account
    provider "aws" {
      alias = "network"

      assume_role {
        role_arn = "arn:aws:iam::123456789012:role/NetworkReadOnly"
      }
    }

    data "aws_vpc" "shared" {
      provider = aws.network

      tags = {
        Environment = "production"
        Purpose     = "shared-services"
      }
    }
Data Source vs. Resource: The Critical Distinction
| | Resource | Data Source |
|---|---|---|
| Creates infrastructure | ✅ Yes | ❌ No |
| Destroys infrastructure | ✅ Yes | ❌ No |
| Updates infrastructure | ✅ Yes | ❌ No |
| Costs money | Usually | No |
| Can be target of depends_on | ✅ Yes | ✅ Yes (rarely needed) |
| Can be referenced | ✅ Yes | ✅ Yes |
| Can fail | ✅ Yes | ✅ Yes |
This distinction is crucial: If you use a data source for something that doesn't exist yet, Terraform can't create it—it will just fail.
    # This will FAIL if no VPC (or more than one) matches the tag
    data "aws_vpc" "main" {
      tags = {
        Name = "main-vpc"
      }
    }

    # This CREATES a new VPC and manages it; it never checks for an existing one
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"

      tags = {
        Name = "main-vpc"
      }
    }
🔗 Implicit Dependencies: The Magic of References
How Terraform Builds the Graph
Every time you reference one resource from another, Terraform creates an implicit dependency.
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
    }

    resource "aws_subnet" "public" {
      vpc_id     = aws_vpc.main.id # ← DEPENDENCY: subnet → vpc
      cidr_block = "10.0.1.0/24"
    }

    resource "aws_instance" "web" {
      subnet_id = aws_subnet.public.id   # ← DEPENDENCY: instance → subnet
      ami       = data.aws_ami.ubuntu.id # ← DEPENDENCY: instance → data source
    }
The dependency graph:
    aws_vpc.main ──> aws_subnet.public ──┬──> aws_instance.web
                                         │
    data.aws_ami.ubuntu ─────────────────┘
Terraform reads these references and guarantees:
The VPC is created before the subnet
The subnet is created before the instance
The AMI data is fetched before the instance is created
All of this happens automatically. You never need to write "create this, then create that."
References That Create Dependencies
Resource-to-resource references:
vpc_id = aws_vpc.main.id # dependency on aws_vpc.main
Resource-to-module references:
vpc_id = module.vpc.vpc_id # dependency on module.vpc
Module-to-resource references:
    # Inside the module
    resource "aws_instance" "this" {
      subnet_id = var.subnet_id # depends on the input variable, not on a named resource
    }
Important: Inside the module, the resource depends only on var.subnet_id. The link to a real resource is made where the caller supplies the value, and Terraform follows it through the variable:
    # root/main.tf
    module "compute" {
      source    = "./modules/compute"
      subnet_id = aws_subnet.public.id # ← DEPENDENCY CREATED HERE!
    }
Data source references:
ami = data.aws_ami.ubuntu.id # dependency on data.aws_ami.ubuntu
Provider aliases:
provider = aws.west # dependency on provider configuration
What DOESN'T Create Dependencies
String interpolation without resource references:
name = "web-server-${var.environment}" # No dependency
Literal values:
cidr_block = "10.0.0.0/16" # No dependency
Root input variable references:
instance_type = var.instance_type # No dependency
Local values that don't reference a resource:
resource_name = local.name_prefix # No dependency
count and for_each set from a plain value:
count = var.instance_count # No dependency
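The same argument can go either way depending on what its expression mentions. The sketch below assumes a counted aws_subnet.public resource like the one used elsewhere in this guide; the variable and the two aws_eip resource names are hypothetical.

```hcl
variable "instance_count" {
  type    = number
  default = 2
}

# No resource dependency: count comes from a plain input variable.
resource "aws_eip" "static" {
  count  = var.instance_count
  domain = "vpc"
}

# Dependency on aws_subnet.public: the count expression references the resource.
resource "aws_eip" "per_subnet" {
  count  = length(aws_subnet.public)
  domain = "vpc"
}
```

What matters is not which argument is being set but whether the expression on the right-hand side mentions a resource, data source, or module output.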
🧩 Complex Reference Patterns
References Through Local Values
A local value is a node in the graph like any other. A local built only from variables and literals adds no resource dependency, while a local that references a resource carries that dependency to everything that uses it.
    locals {
      # This doesn't create a dependency; it's just computing a string
      instance_name = "${var.project}-${var.environment}-web"

      # This local depends on aws_vpc.main and passes that dependency on
      vpc_info = {
        id         = aws_vpc.main.id
        cidr_block = aws_vpc.main.cidr_block
      }
    }

    resource "aws_security_group" "web" {
      name   = local.instance_name # No resource dependency
      vpc_id = local.vpc_info.id   # ← DEPENDENCY (through local)
    }
The local depends on aws_vpc.main, and any resource that references the local inherits that dependency.
References Through Module Outputs
This is where implicit dependencies get powerful—and sometimes confusing.
    # modules/vpc/main.tf
    resource "aws_vpc" "this" {
      cidr_block = var.cidr_block
    }

    output "vpc_id" {
      value = aws_vpc.this.id # Exports the dependency
    }

    # root/main.tf
    module "vpc" {
      source     = "./modules/vpc"
      cidr_block = "10.0.0.0/16"
    }

    module "compute" {
      source = "./modules/compute"
      vpc_id = module.vpc.vpc_id # ← DEPENDENCY on module.vpc
    }
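The dependency is preserved through the module boundary. When you reference module.vpc.vpc_id, you depend on that output and on whatever the output's value depends on (here, aws_vpc.this), not on every resource in the module. To wait for an entire module, use depends_on = [module.vpc].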
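If callers of an output should also wait for something the output's value does not mention, the output block itself accepts depends_on. A minimal sketch, assuming the module also contains an aws_internet_gateway.this resource (not shown in the original module); the file name is hypothetical.

```hcl
# modules/vpc/outputs.tf (hypothetical file name)
output "vpc_id" {
  value = aws_vpc.this.id

  # Callers that reference vpc_id now also wait for the internet gateway.
  # aws_internet_gateway.this is an assumed resource inside the module.
  depends_on = [aws_internet_gateway.this]
}
```

This keeps the ordering knowledge inside the module, so callers do not need a blanket depends_on on the whole module.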
References Through for_each and count
Terraform tracks dependencies per resource block, not per instance. A reference such as aws_subnet.public[count.index] depends on the whole aws_subnet.public block, so every instance waits for all of the subnets.
    resource "aws_subnet" "public" {
      count  = 3
      vpc_id = aws_vpc.main.id # DEPENDENCY: all subnets → vpc
    }

    resource "aws_instance" "web" {
      count     = 3
      subnet_id = aws_subnet.public[count.index].id # DEPENDENCY: all instances → all subnets
    }
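But be careful with this pattern: for_each keys must be known at plan time. Subnet IDs are not known until the subnets exist, so on a first apply this fails with an "Invalid for_each argument" error:
    locals {
      subnet_ids = aws_subnet.public[*].id # DEPENDENCY on all subnets
    }

    resource "aws_instance" "web" {
      for_each  = toset(local.subnet_ids) # DEPENDENCY on local.subnet_ids → all subnets
      subnet_id = each.value
    }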
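One way around the unknown-key problem is to key the map by something known at plan time and keep the subnet ID as the value. This is a sketch, assuming the counted aws_subnet.public and the data.aws_ami.ubuntu data source from earlier examples:

```hcl
resource "aws_instance" "web" {
  # Keys ("0", "1", "2") come from count and are known at plan time;
  # only the values (subnet IDs) are unknown until apply.
  for_each = { for idx, subnet in aws_subnet.public : tostring(idx) => subnet.id }

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  subnet_id     = each.value
}
```

The dependency on all subnets is unchanged; only the instance addresses (aws_instance.web["0"] and so on) no longer depend on values that do not exist yet.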
🛠️ Explicit Dependencies: When Magic Isn't Enough
The Problem Implicit Dependencies Can't Solve
Sometimes resources depend on each other without directly referencing each other's attributes.
    # Example 1: IAM role and policy
    resource "aws_iam_role" "lambda" {
      name = "lambda-execution-role"

      assume_role_policy = jsonencode({
        Version = "2012-10-17"
        Statement = [
          {
            Action = "sts:AssumeRole"
            Effect = "Allow"
            Principal = {
              Service = "lambda.amazonaws.com"
            }
          }
        ]
      })
    }

    resource "aws_iam_role_policy" "s3_access" {
      name = "s3-access-policy"
      role = aws_iam_role.lambda.name # ← This creates a dependency

      policy = jsonencode({
        Version = "2012-10-17"
        Statement = [
          {
            Action = ["s3:GetObject", "s3:ListBucket"]
            Effect = "Allow"
            Resource = [
              aws_s3_bucket.data.arn, # ← DEPENDENCY on bucket
              "${aws_s3_bucket.data.arn}/*"
            ]
          }
        ]
      })
    }

    # BUT: The bucket doesn't reference the role or policy
    resource "aws_s3_bucket" "data" {
      bucket = "my-app-data"
      # No reference to the IAM policy that depends on this bucket!
    }
Terraform sees: The policy depends on the bucket (implicit, through the ARN reference). The policy depends on the role (implicit, through the role name). The bucket depends on nothing.
This leaves a gap: anything that uses the role (a Lambda function, for example) references only aws_iam_role.lambda, never the policy. Terraform is free to create that consumer as soon as the role exists, in parallel with the policy, so the function can start running before its permissions are attached.
depends_on: The Manual Override
depends_on tells Terraform: "Trust me, this resource needs to wait for that one."
    # Hypothetical consumer of the role; runtime, handler, and package arguments omitted
    resource "aws_lambda_function" "processor" {
      function_name = "data-processor"
      role          = aws_iam_role.lambda.arn # implicit dependency on the role only

      # EXPLICIT DEPENDENCY: don't create the function until the policy is attached
      depends_on = [aws_iam_role_policy.s3_access]
    }
Now the dependency graph is complete:
    aws_s3_bucket.data  ──> aws_iam_role_policy.s3_access
    aws_iam_role.lambda ──> aws_iam_role_policy.s3_access ──> aws_lambda_function.processor
    aws_iam_role.lambda ──> aws_lambda_function.processor
Note the direction: depends_on goes on the consumer. Adding depends_on = [aws_iam_role_policy.s3_access] to the bucket instead would be an error, because the policy already references the bucket's ARN and Terraform would report a cycle.
Common Use Cases for depends_on
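1. IAM policies and the resources that rely on them
    resource "aws_s3_bucket" "logs" {
      bucket = "app-logs"
    }

    resource "aws_iam_role_policy" "log_writer" {
      role = aws_iam_role.app.name

      policy = jsonencode({
        Statement = [{
          Effect   = "Allow"
          Action   = "s3:PutObject"
          Resource = "${aws_s3_bucket.logs.arn}/*" # Needs bucket ARN
        }]
      })
    }

    # depends_on goes on the consumer; on the bucket it would create a cycle
    resource "aws_instance" "app" {
      iam_instance_profile = aws_iam_instance_profile.app.name # assumed defined elsewhere
      depends_on           = [aws_iam_role_policy.log_writer]
    }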
2. DNS records and load balancers (creation order)
    resource "aws_lb" "main" {
      name = "app-lb"
      # ... configuration
    }

    resource "aws_route53_record" "app" {
      zone_id = var.zone_id
      name    = "app.example.com"
      type    = "A"

      alias {
        name                   = aws_lb.main.dns_name
        zone_id                = aws_lb.main.zone_id
        evaluate_target_health = true
      }
    }

    # Wait for DNS to propagate before proceeding
    resource "null_resource" "wait_for_dns" {
      depends_on = [aws_route53_record.app]

      provisioner "local-exec" {
        command = "sleep 60"
      }
    }
3. Resources with eventual consistency
    resource "aws_ecr_repository" "app" {
      name = "my-app"
    }

    resource "null_resource" "wait_for_ecr" {
      depends_on = [aws_ecr_repository.app]

      provisioner "local-exec" {
        command = <<-EOF
          echo "Waiting for ECR repository to be fully available..."
          sleep 30
        EOF
      }
    }

    resource "null_resource" "docker_push" {
      depends_on = [null_resource.wait_for_ecr]

      provisioner "local-exec" {
        command = "docker push ${aws_ecr_repository.app.repository_url}:latest"
      }
    }
4. Cross-module dependencies without data exchange
    module "networking" {
      source = "./modules/networking"
    }

    module "monitoring" {
      source = "./modules/monitoring"

      # No reference to networking module, but needs it to exist
      depends_on = [module.networking]
    }
5. Bootstrapping infrastructure with chicken-egg problems
    # This looks like a chicken-and-egg problem, but the references only flow one way
    resource "aws_iam_role" "eks_cluster" {
      name               = "eks-cluster-role"
      assume_role_policy = data.aws_iam_policy_document.eks_assume_role.json
    }

    resource "aws_eks_cluster" "main" {
      name     = "my-cluster"
      role_arn = aws_iam_role.eks_cluster.arn

      vpc_config {
        subnet_ids = module.vpc.subnet_ids
      }
      # The cluster must exist before the node group
    }

    resource "aws_iam_role" "eks_node_group" {
      name               = "eks-node-group-role"
      assume_role_policy = data.aws_iam_policy_document.ec2_assume_role.json
      # Independent of the cluster role: neither role references the other
    }

    resource "aws_eks_node_group" "main" {
      cluster_name  = aws_eks_cluster.main.name
      node_role_arn = aws_iam_role.eks_node_group.arn
      subnet_ids    = module.vpc.subnet_ids
      # The cluster_name reference orders this after the cluster
      # No explicit depends_on needed
    }
When NOT to Use depends_on
❌ When you can use an implicit reference instead
    # BAD: Unnecessary explicit dependencies
    resource "aws_subnet" "public" {
      vpc_id = aws_vpc.main.id # This already creates an implicit dependency!
    }

    resource "aws_instance" "web" {
      subnet_id = aws_subnet.public.id # This already creates an implicit dependency!

      # A block may contain only one depends_on argument, so both entries share a list
      depends_on = [
        aws_vpc.main,      # ❌ UNNECESSARY
        aws_subnet.public, # ❌ UNNECESSARY
      ]
    }
❌ To force ordering between unrelated resources
    # BAD: These resources don't actually depend on each other
    resource "aws_s3_bucket" "logs" {
      bucket = "app-logs"
    }

    resource "aws_dynamodb_table" "sessions" {
      name       = "user-sessions"
      depends_on = [aws_s3_bucket.logs] # ❌ WHY? No relationship!
    }
❌ As a substitute for proper module composition
    # BAD: Module should expose outputs, not rely on depends_on
    module "vpc" {
      source = "./modules/vpc"
    }

    module "eks" {
      source = "./modules/eks"

      # This shouldn't be necessary; the EKS module should accept vpc_id
      depends_on = [module.vpc]
    }
❌ To create artificial dependencies for "safety"
    # BAD: This doesn't make anything safer
    resource "aws_instance" "web" {
      # ... config ...

      depends_on = [
        aws_s3_bucket.logs,      # No relationship
        aws_dynamodb_table.data, # No relationship
        aws_iam_role.lambda,     # No relationship
      ]
    }
🔄 Dependency Cycles and How to Break Them
What Is a Dependency Cycle?
A dependency cycle occurs when Terraform detects that Resource A depends on Resource B, and Resource B depends on Resource A (directly or indirectly).
    # Example of a cycle
    resource "aws_security_group" "web" {
      name = "web-sg"

      ingress {
        from_port = 80
        to_port   = 80
        protocol  = "tcp"
        # Can't reference itself during creation!
        security_groups = [aws_security_group.web.id] # ❌ CYCLE!
      }
    }

    # Terraform rejects this during validation:
    # Error: Self-referential block
    # Configuration for aws_security_group.web may not refer to itself.
Cycle errors stop many beginners in their tracks. The solution isn't to force it; it's to restructure.
Breaking Self-Referential Cycles
Problem: A security group needs to reference itself for internal communication.
Solution 1: Create the group, then add rules separately
    # Step 1: Create empty security group
    resource "aws_security_group" "web" {
      name = "web-sg"
    }

    # Step 2: Add rule referencing the now-existing group
    resource "aws_security_group_rule" "web_self" {
      type                     = "ingress"
      from_port                = 80
      to_port                  = 80
      protocol                 = "tcp"
      security_group_id        = aws_security_group.web.id
      source_security_group_id = aws_security_group.web.id # Now this works!
    }
Solution 2: Use self attribute (provider-specific)
    resource "aws_security_group" "web" {
      name = "web-sg"

      ingress {
        from_port = 80
        to_port   = 80
        protocol  = "tcp"
        self      = true # Special attribute for self-reference
      }
    }
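Recent versions of the AWS provider also offer a one-rule-per-resource alternative, aws_vpc_security_group_ingress_rule. The sketch below is a variant of Solution 1 using that resource, not something the original configuration uses; the rule name web_self is illustrative.

```hcl
resource "aws_security_group" "web" {
  name = "web-sg"
}

# The group is referenced from outside its own block,
# so there is no self-reference for Terraform to reject.
resource "aws_vpc_security_group_ingress_rule" "web_self" {
  security_group_id            = aws_security_group.web.id
  referenced_security_group_id = aws_security_group.web.id
  from_port                    = 80
  to_port                      = 80
  ip_protocol                  = "tcp"
}
```

As with aws_security_group_rule, avoid combining standalone rule resources with inline ingress or egress blocks on the same group.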
Breaking Cross-Resource Cycles
Problem: Two resources need to reference each other during creation.
    # This creates a cycle
    resource "aws_instance" "web" {
      user_data = <<-EOF
        #!/bin/bash
        echo "Web server" > /tmp/index.html
        nohup python3 -m http.server 80 &
      EOF

      vpc_security_group_ids = [aws_security_group.web.id]
    }

    resource "aws_security_group" "web" {
      name = "web-sg"

      ingress {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["${aws_instance.web.public_ip}/32"] # ❌ CYCLE!
      }
    }
Solution: Use a static IP or separate resource
    # Create a static IP
    resource "aws_eip" "web" {
      domain = "vpc" # replaces the deprecated `vpc = true` in AWS provider v5+
    }

    resource "aws_security_group" "web" {
      name = "web-sg"

      ingress {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["${aws_eip.web.public_ip}/32"]
      }
    }

    resource "aws_instance" "web" {
      user_data = <<-EOF
        #!/bin/bash
        echo "Web server" > /tmp/index.html
        nohup python3 -m http.server 80 &
      EOF

      vpc_security_group_ids = [aws_security_group.web.id]
    }

    resource "aws_eip_association" "web" {
      instance_id   = aws_instance.web.id
      allocation_id = aws_eip.web.id
    }
Breaking Cross-Module Cycles
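Problem: Module A depends on Module B, and Module B depends on Module A. The configuration below has no cycle yet, but its blanket depends_on on whole modules makes one easy to introduce: as soon as module.vpc consumes any output of monitoring or backup, the graph loops.
    module "vpc" {
      source = "./modules/vpc"
      # No dependencies
    }

    module "eks" {
      source     = "./modules/eks"
      vpc_id     = module.vpc.vpc_id
      subnet_ids = module.vpc.private_subnet_ids
    }

    module "monitoring" {
      source       = "./modules/monitoring"
      cluster_name = module.eks.cluster_name

      # Normally redundant (cluster_name already orders this after the EKS
      # cluster, which waits on the VPC) and a cycle risk if module.vpc ever
      # needs something from this module
      depends_on = [module.vpc]
    }

    module "backup" {
      source       = "./modules/backup"
      cluster_name = module.eks.cluster_name

      # Same problem as above
      depends_on = [module.vpc]
    }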
Solution: Use data sources to decouple
    # Pass only the cluster name (a plain string) into the dependent module
    # and look up everything else with a data source

    # modules/backup/main.tf
    variable "cluster_name" {
      description = "Name of EKS cluster"
      type        = string
    }

    data "aws_eks_cluster" "this" {
      name = var.cluster_name
    }

    # Now the backup module can read cluster and VPC details without referencing module.vpc
    resource "aws_iam_role" "backup" {
      # Use data.aws_eks_cluster.this.arn, not module.eks.cluster_arn
    }
🎯 Real-World Dependency Patterns
Pattern 1: The Provisioner Dependency
    resource "aws_instance" "web" {
      ami           = data.aws_ami.ubuntu.id
      instance_type = "t2.micro"

      provisioner "remote-exec" {
        inline = [
          "sudo apt-get update",
          "sudo apt-get install -y nginx",
          "sudo systemctl enable nginx",
          "sudo systemctl start nginx"
        ]

        connection {
          type        = "ssh"
          user        = "ubuntu"
          private_key = file("~/.ssh/id_rsa")
          host        = self.public_ip
        }
      }
    }

    resource "aws_route53_record" "web" {
      zone_id = var.zone_id
      name    = "web.${var.domain}"
      type    = "A"
      ttl     = 300
      records = [aws_instance.web.public_ip]

      # Redundant: the records reference already waits for the instance,
      # including its creation-time provisioner
      depends_on = [aws_instance.web]
    }
Better: Use null_resource with explicit dependency
    resource "aws_instance" "web" {
      ami           = data.aws_ami.ubuntu.id
      instance_type = "t2.micro"
    }

    resource "null_resource" "web_provision" {
      triggers = {
        instance_id = aws_instance.web.id # also creates the implicit dependency on the instance
      }

      provisioner "remote-exec" {
        # ... same as above, but with host = aws_instance.web.public_ip ...
      }
    }

    resource "aws_route53_record" "web" {
      zone_id = var.zone_id
      name    = "web.${var.domain}"
      type    = "A"
      ttl     = 300
      records = [aws_instance.web.public_ip]

      # Now explicitly wait for provisioning to complete
      depends_on = [null_resource.web_provision]
    }
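On Terraform 1.4 and later, the built-in terraform_data resource can play the same role as null_resource without an extra provider. This is a sketch of that substitution, not part of the original pattern; it assumes the aws_instance.web resource above and shortens the provisioning commands.

```hcl
resource "terraform_data" "web_provision" {
  # Replace this resource (and re-run the provisioner) whenever the instance is replaced.
  # The reference also creates the implicit dependency on the instance.
  triggers_replace = [aws_instance.web.id]

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",
    ]

    connection {
      type        = "ssh"
      user        = "ubuntu"
      private_key = file("~/.ssh/id_rsa")
      host        = aws_instance.web.public_ip
    }
  }
}
```

The DNS record would then use depends_on = [terraform_data.web_provision] in place of the null_resource reference.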
Pattern 2: The Bootstrapping Cycle
Problem: You need an S3 bucket for Terraform state, but you need Terraform state to create the S3 bucket.
Solution 1: Manual bootstrap (one-time)
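    # bootstrap/main.tf
    # Run this once with local state
    resource "aws_s3_bucket" "terraform_state" {
      bucket = "company-terraform-state"
    }

    # AWS provider v4+ configures versioning and encryption as separate resources
    resource "aws_s3_bucket_versioning" "terraform_state" {
      bucket = aws_s3_bucket.terraform_state.id

      versioning_configuration {
        status = "Enabled"
      }
    }

    resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
      bucket = aws_s3_bucket.terraform_state.id

      rule {
        apply_server_side_encryption_by_default {
          sse_algorithm = "AES256"
        }
      }
    }

    resource "aws_dynamodb_table" "terraform_locks" {
      name         = "terraform-state-locks"
      billing_mode = "PAY_PER_REQUEST"
      hash_key     = "LockID"

      attribute {
        name = "LockID"
        type = "S"
      }
    }

    output "bucket_arn" {
      value = aws_s3_bucket.terraform_state.arn
    }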
Solution 2: Partial configuration with remote state data source
    # main.tf
    terraform {
      backend "s3" {
        # Fully configured here; with partial configuration, some of these
        # values would be supplied at init time via -backend-config instead
        bucket         = "company-terraform-state"
        key            = "prod/terraform.tfstate"
        region         = "us-west-2"
        dynamodb_table = "terraform-state-locks"
        encrypt        = true
      }
    }

    # Reads the outputs of the separate bootstrap state (note the different key)
    data "terraform_remote_state" "bootstrap" {
      backend = "s3"

      config = {
        bucket = "company-terraform-state"
        key    = "bootstrap/terraform.tfstate"
        region = "us-west-2"
      }
    }

    # Use the bootstrap outputs
    resource "aws_iam_policy" "state_access" {
      name = "terraform-state-access"

      policy = jsonencode({
        Version = "2012-10-17"
        Statement = [
          {
            Effect = "Allow"
            Action = [
              "s3:GetObject",
              "s3:PutObject",
              "s3:DeleteObject",
              "s3:ListBucket"
            ]
            Resource = [
              data.terraform_remote_state.bootstrap.outputs.bucket_arn,
              "${data.terraform_remote_state.bootstrap.outputs.bucket_arn}/*"
            ]
          }
        ]
      })
    }
Pattern 3: The Application Configuration Cycle
Problem: Application needs database connection string. Database connection string includes database endpoint. Database endpoint is only known after creation.
    # Option 1: Store in SSM Parameter Store
    resource "aws_db_instance" "main" {
      allocated_storage   = 20
      engine              = "postgres"
      engine_version      = "14.7"
      instance_class      = "db.t3.micro"
      db_name             = "app"     # set so that db_name below is not empty
      username            = "dbadmin" # "admin" is a reserved name on RDS for PostgreSQL
      password            = random_password.db.result
      skip_final_snapshot = true
    }

    resource "aws_ssm_parameter" "db_connection" {
      name  = "/${var.environment}/database/connection_string"
      type  = "SecureString" # the value contains the password
      value = "postgresql://${aws_db_instance.main.username}:${random_password.db.result}@${aws_db_instance.main.endpoint}/${aws_db_instance.main.db_name}"
      # No depends_on needed: the references above already order this after the database
    }

    # Application reads from SSM at startup
Option 2: User data script with runtime discovery
    data "aws_region" "current" {}
    data "aws_caller_identity" "current" {}

    resource "aws_instance" "app" {
      user_data = <<-EOF
        #!/bin/bash
        # identifier, not id: in AWS provider v5+ id is the DBI resource ID
        DB_ENDPOINT=$(aws rds describe-db-instances \
          --db-instance-identifier ${aws_db_instance.main.identifier} \
          --region ${data.aws_region.current.name} \
          --query 'DBInstances[0].Endpoint.Address' \
          --output text)
        echo "DATABASE_URL=postgresql://${aws_db_instance.main.username}:${random_password.db.result}@$DB_ENDPOINT/${aws_db_instance.main.db_name}" >> /etc/environment
      EOF

      iam_instance_profile = aws_iam_instance_profile.rds_read_only.name

      # Redundant: the user_data references already order this after the database
      depends_on = [aws_db_instance.main]
    }
🧪 Practice Exercises
Exercise 1: Identify Dependencies
Task: Look at this configuration and identify all implicit and explicit dependencies.
    data "aws_availability_zones" "available" {}

    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
    }

    resource "aws_subnet" "public" {
      count                = 3
      vpc_id               = aws_vpc.main.id
      cidr_block           = "10.0.${count.index}.0/24"
      availability_zone_id = data.aws_availability_zones.available.zone_ids[count.index]
    }

    resource "aws_internet_gateway" "main" {
      vpc_id = aws_vpc.main.id
    }

    resource "aws_route_table" "public" {
      vpc_id = aws_vpc.main.id

      route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.main.id
      }
    }

    resource "aws_route_table_association" "public" {
      count          = 3
      subnet_id      = aws_subnet.public[count.index].id
      route_table_id = aws_route_table.public.id
    }

    resource "aws_security_group" "web" {
      name   = "web-sg"
      vpc_id = aws_vpc.main.id

      ingress {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }
    }

    resource "aws_instance" "web" {
      count                  = 2
      ami                    = data.aws_ami.ubuntu.id
      instance_type          = "t2.micro"
      subnet_id              = aws_subnet.public[count.index].id
      vpc_security_group_ids = [aws_security_group.web.id]

      tags = {
        Name = "web-${count.index}"
      }

      depends_on = [
        aws_internet_gateway.main
      ]
    }

    data "aws_ami" "ubuntu" {
      most_recent = true

      filter {
        name   = "name"
        values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
      }

      owners = ["099720109477"]
    }
Answer:
Implicit dependencies:
aws_subnet.public → aws_vpc.main (reference to vpc_id)
aws_subnet.public → data.aws_availability_zones.available (reference to zone_ids)
aws_internet_gateway.main → aws_vpc.main (reference to vpc_id)
aws_route_table.public → aws_vpc.main (reference to vpc_id)
aws_route_table.public → aws_internet_gateway.main (reference to gateway_id)
aws_route_table_association.public → aws_subnet.public (reference to subnet_id)
aws_route_table_association.public → aws_route_table.public (reference to route_table_id)
aws_security_group.web → aws_vpc.main (reference to vpc_id)
aws_instance.web → data.aws_ami.ubuntu (reference to ami.id)
aws_instance.web → aws_subnet.public (reference to subnet_id)
aws_instance.web → aws_security_group.web (reference to security group ID)
Explicit dependencies:
aws_instance.web→aws_internet_gateway.main(depends_on)
Exercise 2: Fix a Dependency Cycle
Problem: Decide whether this configuration has a cycle. If it does, identify it and fix it.
    resource "aws_lb" "main" {
      name               = "app-lb"
      internal           = false
      load_balancer_type = "application"
      security_groups    = [aws_security_group.lb.id]
      subnets            = aws_subnet.public[*].id
    }

    resource "aws_security_group" "lb" {
      name   = "lb-sg"
      vpc_id = aws_vpc.main.id

      ingress {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }

      ingress {
        from_port   = 443
        to_port     = 443
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }

      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }
    }

    resource "aws_security_group_rule" "lb_to_web" {
      type                     = "ingress"
      from_port                = 80
      to_port                  = 80
      protocol                 = "tcp"
      security_group_id        = aws_security_group.web.id
      source_security_group_id = aws_security_group.lb.id
    }

    resource "aws_security_group" "web" {
      name   = "web-sg"
      vpc_id = aws_vpc.main.id

      ingress {
        from_port       = 80
        to_port         = 80
        protocol        = "tcp"
        cidr_blocks     = []
        security_groups = [aws_security_group.lb.id] # Reference to lb SG
      }
    }

    resource "aws_lb_target_group" "web" {
      name     = "web-tg"
      port     = 80
      protocol = "HTTP"
      vpc_id   = aws_vpc.main.id

      health_check {
        path = "/health"
        port = "traffic-port"
      }
    }

    resource "aws_lb_listener" "web" {
      load_balancer_arn = aws_lb.main.arn
      port              = 80
      protocol          = "HTTP"

      default_action {
        type             = "forward"
        target_group_arn = aws_lb_target_group.web.arn
      }
    }
What's the cycle?
Trace every reference before deciding (A ──> B means A depends on B):
    aws_lb.main ──> aws_security_group.lb, aws_subnet.public
    aws_security_group.lb ──> aws_vpc.main
    aws_security_group.web ──> aws_security_group.lb, aws_vpc.main
    aws_security_group_rule.lb_to_web ──> aws_security_group.web, aws_security_group.lb
    aws_lb_target_group.web ──> aws_vpc.main
    aws_lb_listener.web ──> aws_lb.main, aws_lb_target_group.web
There is no cycle: every dependency flows in one direction and nothing points back. The exercise is a trick. A cycle would appear only if aws_security_group.lb also referenced aws_security_group.web in one of its inline rules, so that the two groups referenced each other.
The configuration does have a different flaw: aws_security_group.web declares the port 80 rule inline, and aws_security_group_rule.lb_to_web declares it again. The AWS provider documentation warns against mixing inline rules and standalone rule resources on the same group, so keep only one of them.
Fix for a hypothetical cycle: Use aws_security_group_rule resources to break circular references.
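To make the hypothetical fix concrete, here is a sketch of two security groups that need to mention each other. Both groups are created without inline rules, and the cross-references live in standalone rule resources; the rule names are illustrative, and aws_vpc.main is assumed from the exercise.

```hcl
resource "aws_security_group" "lb" {
  name   = "lb-sg"
  vpc_id = aws_vpc.main.id
}

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id
}

# The load balancer may send traffic to the web servers.
resource "aws_security_group_rule" "lb_egress_to_web" {
  type                     = "egress"
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
  security_group_id        = aws_security_group.lb.id
  source_security_group_id = aws_security_group.web.id
}

# The web servers accept traffic from the load balancer.
resource "aws_security_group_rule" "web_ingress_from_lb" {
  type                     = "ingress"
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
  security_group_id        = aws_security_group.web.id
  source_security_group_id = aws_security_group.lb.id
}
```

Neither group references the other, so both can be created first; each rule then depends on two groups that already exist, and the graph stays acyclic.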
Exercise 3: Optimize Dependencies
Task: This configuration is cluttered with unnecessary explicit dependencies, and one of them even creates a cycle. Remove them.
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
      depends_on = [aws_internet_gateway.main] # ❌ Wrong: creates a cycle, because the gateway references the VPC
    }

    resource "aws_subnet" "public" {
      count      = 3
      vpc_id     = aws_vpc.main.id
      cidr_block = "10.0.${count.index}.0/24"
      depends_on = [aws_vpc.main] # ❌ Unnecessary (already implicit)
    }

    resource "aws_internet_gateway" "main" {
      vpc_id     = aws_vpc.main.id
      depends_on = [aws_vpc.main] # ❌ Unnecessary (already implicit)
    }

    resource "aws_route_table" "public" {
      vpc_id = aws_vpc.main.id

      route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.main.id
      }

      depends_on = [
        aws_vpc.main,              # ❌ Unnecessary (implicit from vpc_id)
        aws_internet_gateway.main, # ❌ Unnecessary (implicit from gateway_id)
      ]
    }

    resource "aws_route_table_association" "public" {
      count          = 3
      subnet_id      = aws_subnet.public[count.index].id
      route_table_id = aws_route_table.public.id

      depends_on = [
        aws_subnet.public,      # ❌ Unnecessary (implicit from subnet_id)
        aws_route_table.public, # ❌ Unnecessary (implicit from route_table_id)
      ]
    }

    resource "aws_security_group" "web" {
      name   = "web-sg"
      vpc_id = aws_vpc.main.id

      ingress {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }

      depends_on = [aws_vpc.main] # ❌ Unnecessary (implicit from vpc_id)
    }

    resource "aws_instance" "web" {
      ami                    = data.aws_ami.ubuntu.id
      instance_type          = "t2.micro"
      subnet_id              = aws_subnet.public[0].id
      vpc_security_group_ids = [aws_security_group.web.id]

      depends_on = [
        aws_subnet.public,         # ❌ Unnecessary (implicit from subnet_id)
        aws_security_group.web,    # ❌ Unnecessary (implicit from security group reference)
        aws_internet_gateway.main, # ⚠️ Maybe necessary if the instance needs internet at launch
        aws_route_table.public,    # ❌ Unnecessary (already have IGW dependency)
      ]
    }
Clean version:
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
    }

    resource "aws_subnet" "public" {
      count      = 3
      vpc_id     = aws_vpc.main.id
      cidr_block = "10.0.${count.index}.0/24"
    }

    resource "aws_internet_gateway" "main" {
      vpc_id = aws_vpc.main.id
    }

    resource "aws_route_table" "public" {
      vpc_id = aws_vpc.main.id

      route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.main.id
      }
    }

    resource "aws_route_table_association" "public" {
      count          = 3
      subnet_id      = aws_subnet.public[count.index].id
      route_table_id = aws_route_table.public.id
    }

    resource "aws_security_group" "web" {
      name   = "web-sg"
      vpc_id = aws_vpc.main.id

      ingress {
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = ["0.0.0.0/0"]
      }
    }

    resource "aws_instance" "web" {
      ami                    = data.aws_ami.ubuntu.id
      instance_type          = "t2.micro"
      subnet_id              = aws_subnet.public[0].id
      vpc_security_group_ids = [aws_security_group.web.id]

      # Only keep explicit dependencies that can't be inferred
      depends_on = [aws_internet_gateway.main] # Kept because instance needs internet at launch
    }
📋 Dependency Management Best Practices
Do's and Don'ts
✅ DO rely on implicit dependencies whenever possible. They're automatic, self-documenting, and never get out of sync.
✅ DO use depends_on sparingly and only when necessary. Every explicit dependency is a maintenance burden.
✅ DO document why you need explicit dependencies. Future maintainers need to know why the relationship isn't implicit.
✅ DO test your dependency graph. Use terraform graph to visualize and verify dependencies.
✅ DO use data sources to break cycles. Reading existing infrastructure instead of passing references can eliminate cycles.
❌ DON'T add depends_on to a data source without a good reason. It can defer the read until apply, which leaves every value derived from it unknown at plan time.
❌ DON'T use depends_on = [module.xxx] when you could use module outputs. If a module doesn't expose what you need, fix the module.
❌ DON'T create circular dependencies. Restructure your configuration to avoid them.
❌ DON'T assume depends_on solves eventual consistency issues. It only ensures creation order, not that the resource is ready to use.
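When a resource exists but is not yet usable, ordering alone does not help and an explicit wait is needed. One hedged option is the time_sleep resource from the hashicorp/time provider. The sketch assumes the aws_iam_role.lambda and aws_iam_role_policy.s3_access resources from earlier; the worker function is hypothetical, and the 30-second delay is an arbitrary placeholder, not a recommended value.

```hcl
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Pause after the policy is created to give IAM time to propagate.
resource "time_sleep" "wait_for_iam" {
  depends_on      = [aws_iam_role_policy.s3_access]
  create_duration = "30s" # arbitrary; tune for your environment
}

# Hypothetical consumer: waits for the pause, not just for the policy.
resource "aws_lambda_function" "worker" {
  function_name = "worker"
  role          = aws_iam_role.lambda.arn
  # ... runtime, handler, and package arguments omitted ...

  depends_on = [time_sleep.wait_for_iam]
}
```

A fixed sleep is a blunt instrument. Where a provider offers its own retry or wait behavior, prefer that over a timer.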
Using terraform graph to Visualize Dependencies
    # Generate a DOT file
    terraform graph > graph.dot

    # Convert to PNG (requires Graphviz)
    dot -Tpng graph.dot > graph.png

    # View graph
    open graph.png     # macOS
    xdg-open graph.png # Linux
What to look for:
Cycles — loops in the graph that Terraform can't resolve
Missing dependencies — resources that should be connected but aren't
Unnecessary dependencies — connections that don't need to exist
Long chains — resources that depend on many other resources
🎓 Summary: Master the Graph
Terraform's dependency graph is its most powerful feature—and its most misunderstood.
| | Implicit Dependencies | Explicit Dependencies |
|---|---|---|
| How to create | Reference another resource's attributes | depends_on = [resource.xxx] |
| When to use | Always, by default | When implicit detection fails |
| Maintenance | Automatic | Manual |
| Self-documenting | Yes | No |
| Can create cycles | Rarely | Yes, if used carelessly |
Data sources are read-only windows into your infrastructure. They can create dependencies when their results are used, but they never modify infrastructure.
The mark of an expert Terraform user is knowing when to trust the automatic dependency detection—and when to override it. Most of the time, you trust it. The rest of the time, you have a specific, documented reason not to.
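A "specific, documented reason" can be as simple as a comment next to the depends_on. The sketch below reuses resource names from the exercises; the boot script it mentions is assumed and not shown.

```hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.public[0].id

  # WHY: the boot script (not shown) downloads packages on first start, so the
  # instance needs a working route to the internet. Nothing in this block
  # references the gateway, so Terraform cannot infer the ordering on its own.
  depends_on = [aws_internet_gateway.main]
}
```

If the reason ever stops being true, the comment tells the next maintainer that the depends_on can go.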
🔗 Master Terraform Dependencies with Hands-on Labs
Understanding dependencies is the key to mastering Terraform. Practice identifying, fixing, and optimizing dependencies in real scenarios.
👉 Practice dependency management with interactive labs and real cloud infrastructure at:
https://devops.trainwithsky.com/
Our platform provides:
Dependency graph visualization exercises
Cycle detection and resolution challenges
Data source configuration labs
Complex multi-module dependency scenarios
Real-time validation of your dependency graphs
Frequently Asked Questions
Q: Can Terraform create resources in parallel?
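A: Yes! Terraform walks the graph concurrently and creates independent resources in parallel (up to 10 operations at a time by default, adjustable with -parallelism=n). Dependencies create serialization points: dependent resources wait for their dependencies.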
Q: How does Terraform detect dependencies in for_each and count?
A: Terraform tracks dependencies per resource block. A reference to any instance of a resource that uses count or for_each creates a dependency on the entire collection, not on individual elements.
Q: Do data sources create dependencies?
A: Yes, when you reference a data source's attributes in a resource, that creates an implicit dependency on the data source being read successfully.
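The dependency can also run the other way: a data source whose arguments reference a managed resource waits for that resource. A minimal sketch, assuming the aws_vpc.main resource from earlier examples; the data source and output names are illustrative.

```hcl
# Because the filter references aws_vpc.main, this data source is read only
# after the VPC exists. On a first run that means during apply, not plan.
data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [aws_vpc.main.id]
  }
}

output "subnet_ids" {
  value = data.aws_subnets.in_vpc.ids
}
```

While the VPC is still to be created, the plan shows the result as known after apply, and anything built from it is unknown too.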
Q: Can I create a dependency on a module output?
A: Yes! When you reference a module output, you implicitly depend on that output and on everything its value depends on. To wait for everything a module creates, use depends_on = [module.xxx].
Q: Why does terraform plan sometimes reorder resources?
A: Terraform always respects dependencies, but independent resources may be reordered for execution efficiency.
Q: How do I debug "Error: Cycle" messages?
A: Use terraform graph to visualize the cycle. Look for resources that reference each other directly or indirectly. Break the cycle by:
Moving one reference to a separate resource
Using data sources instead of direct references
Restructuring your module boundaries
Q: Can I force Terraform to ignore certain dependencies?
A: No. Dependencies are fundamental to correctness. If you have a dependency you want to ignore, you have a design problem, not a tool limitation.