Core Principle

Terraform is a declarative provisioning tool. It builds a dependency graph from resources written in HCL, diffs the desired state against a persisted state file (refreshed against the real infrastructure by default), and produces an explicit plan of create, update, and destroy actions before anything runs. The model is closed-world: anything not in the config is either unmanaged or, if previously managed, will be destroyed.
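The closed-world diff can be sketched in a few lines of Python. This is a toy model, not Terraform's actual algorithm: the function name sketch_plan, the dict-based "config" and "state", and the resource addresses are all made up for illustration, and it assumes every dependency is itself part of the desired config. It needs Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter


def sketch_plan(desired, state, depends_on):
    """Return a list of (action, address) pairs.

    desired, state: dicts mapping a resource address to its attributes.
    depends_on: dict mapping an address in `desired` to the addresses
    that must exist before it (assumed to also be in `desired`).
    """
    # Build the dependency graph so prerequisites are visited first.
    graph = {addr: depends_on.get(addr, ()) for addr in desired}
    steps = []
    for addr in TopologicalSorter(graph).static_order():
        if addr not in state:
            steps.append(("create", addr))
        elif state[addr] != desired[addr]:
            steps.append(("update", addr))
    # Closed world: anything in state but absent from config is destroyed.
    for addr in sorted(set(state) - set(desired)):
        steps.append(("destroy", addr))
    return steps


# Hypothetical inputs: the bucket was managed before but is no longer in config.
desired = {
    "aws_vpc.main": {"cidr": "10.0.0.0/16"},
    "aws_subnet.a": {"cidr": "10.0.1.0/24"},
}
state = {
    "aws_vpc.main": {"cidr": "10.0.0.0/16"},
    "aws_s3_bucket.logs": {"bucket": "logs"},
}
depends_on = {"aws_subnet.a": ["aws_vpc.main"]}

steps = sketch_plan(desired, state, depends_on)
print(steps)
# [('create', 'aws_subnet.a'), ('destroy', 'aws_s3_bucket.logs')]
```

Deleting the bucket block from the config is enough to schedule a destroy; nothing in the remaining config mentions it. That is the closed-world behavior in miniature.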

Why This Matters

The plan/apply split is the feature. Most infrastructure mistakes are not “I wrote the wrong thing” but “I didn’t realize this change would also destroy that thing.” Running terraform plan makes the blast radius visible before anything is applied, which is qualitatively different from procedural tools that only reveal their effects by running.
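One way to make the blast radius hard to miss is to save the plan (terraform plan -out=tfplan), render it with terraform show -json tfplan, and list every resource whose actions include a delete. A minimal sketch, assuming the documented resource_changes / change.actions layout of that JSON; the function name destructive_changes and the sample data are hypothetical.

```python
import json


def destructive_changes(plan_json):
    """Return (address, kind) for every resource the plan would remove.

    kind is "replace" when the resource is deleted and recreated,
    "destroy" when it is only deleted.
    """
    plan = json.loads(plan_json)
    hits = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" in actions:
            kind = "replace" if "create" in actions else "destroy"
            hits.append((rc["address"], kind))
    return hits


# Hand-written sample shaped like `terraform show -json` output (not real data).
sample_plan = json.dumps({
    "resource_changes": [
        {"address": "aws_instance.web", "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["delete"]}},
    ]
})

destructive = destructive_changes(sample_plan)
for address, kind in destructive:
    print(f"{kind}: {address}")
```

A check like this could gate a CI pipeline so that any destroy or replace needs explicit sign-off, though where to put that gate is a team decision rather than anything Terraform prescribes.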

The state file is the trade-off: it’s the source of truth for “what is currently managed,” but it must be stored somewhere durable (S3, GCS, Terraform Cloud), locked during writes, and treated as sensitive (it can contain secrets in plaintext).
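To see why the state needs the same handling as a secrets store, it can help to scan a state file for attribute names that look sensitive. A rough sketch, assuming the version 4 JSON layout (resources → instances → attributes), which is an internal format and may change; the key-name heuristic, the function name, and the sample state are all made up for illustration.

```python
import json

# Heuristic substrings only (an assumption); real secrets can hide under any key.
SUSPECT_KEYS = ("password", "secret", "token", "private_key")


def plaintext_suspects(state_json):
    """Return (address, attribute) pairs whose names suggest a stored secret."""
    state = json.loads(state_json)
    found = []
    for res in state.get("resources", []):
        address = f'{res.get("type")}.{res.get("name")}'
        for inst in res.get("instances", []):
            for key, value in (inst.get("attributes") or {}).items():
                # Report the attribute name only, never the value itself.
                if value and any(s in key.lower() for s in SUSPECT_KEYS):
                    found.append((address, key))
    return found


# Hand-written sample shaped like a v4 state file (not real data).
sample_state = json.dumps({
    "version": 4,
    "resources": [
        {
            "type": "aws_db_instance",
            "name": "main",
            "instances": [
                {"attributes": {"username": "app", "password": "hunter2"}}
            ],
        }
    ],
})

suspects = plaintext_suspects(sample_state)
print(suspects)
# [('aws_db_instance.main', 'password')]
```

An empty result proves nothing; the point is only that whatever a provider returns, including generated passwords and keys, is written to state as plain JSON.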

Evidence/Examples

  • Cloud infrastructure: VMs, VPCs, subnets, IAM roles, DNS zones, managed RDS, S3 buckets. The original sweet spot.
  • App-level “API objects” via providers: Grafana dashboards, Keycloak realms, GitHub org settings, Vault policies, Postgres roles and grants.
  • Anti-pattern: managing relational schema (tables, columns) via Terraform. Destroy-and-recreate semantics are catastrophic for tables with data. Use a real migration tool (see Atlas (Schema as Code)).
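  • Anti-pattern: null_resource + local-exec to run imperative steps. Works until it doesn’t: the plan cannot show what the command will do, the provisioner runs only on create unless triggers force a rerun, and a failed run leaves the resource tainted.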

Implications

  • If you find yourself asking “will this remove things I didn’t mean to remove?” you want Terraform’s plan step. If you’re asking “how do I get Terraform to just run this one command?” you want Ansible instead.
  • Don’t co-manage the same object with two tools. Pick one owner per resource or live with permanent drift loops.
  • Provider quality varies wildly. The official providers for the major clouds (AWS, GCP, Azure) are robust; community providers can be thin or abandoned.
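  • Terraform’s license changed from MPL 2.0 to the Business Source License (BUSL 1.1) in August 2023, which prompted the OpenTofu fork as a community-governed alternative. The two remain largely compatible in configuration and state format for now but are expected to diverge.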

Questions

  • When is splitting Terraform into many small workspaces worth the orchestration cost vs one monolith with a slow plan?
  • What’s the right boundary between Terraform-managed app config and the app’s own config-as-code (Grafana provisioning, Keycloak realm import)? Both work; co-ownership is the failure mode.
  • How do teams handle Terraform state for ephemeral preview environments without state-file sprawl?