Deploying Keycloak on AWS ECS with Fargate using Terraform
Introduction
Keycloak is a popular open-source identity and access management solution that provides single sign-on, identity federation, social login, and much more. Deploying Keycloak in a production environment requires careful planning to ensure security, scalability, and high availability.
This guide walks through deploying Keycloak on AWS Elastic Container Service (ECS) with Fargate using Terraform. Fargate runs the containers without any EC2 instances for you to provision or patch, so the remaining work is the Keycloak configuration and the AWS resources around it.
Why This Approach?
ECS with Fargate is a serverless container platform: AWS operates the hosts, and you declare only the CPU, memory, and networking each task needs. Defining everything around it in Terraform keeps the infrastructure versioned, reviewable, and repeatable across environments.
Architecture Overview
Before diving into implementation details, here is the architecture we'll be building. It spreads every tier across multiple availability zones and keeps Keycloak and its database in private subnets:
Infrastructure Components
- Keycloak runs as a containerized application in ECS Fargate
- Aurora PostgreSQL provides a highly available database backend
- Application Load Balancer (ALB) distributes traffic across multiple instances
Security and Reliability
- Security groups and network ACLs control traffic flow
- Auto scaling adjusts the number of running Keycloak tasks as load varies
- CloudWatch provides monitoring and alerting
Prerequisites
Before You Begin
Make sure the following are in place before starting. Missing prerequisites can cause failed applies or leave parts of the infrastructure misconfigured.
You will need:
- AWS account with appropriate permissions
- Terraform (version 1.0 or newer)
- AWS CLI configured with appropriate credentials
- Registered domain name (optional but recommended for production)
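The snippets in the rest of this guide reference four input variables (`var.database_password`, `var.keycloak_admin_password`, `var.keycloak_hostname`, and `var.route53_zone_id`) without declaring them. A minimal sketch of the declarations, together with a provider setup, might look like the following. The provider version constraint and the descriptions are assumptions rather than part of the original configuration; the region is inferred from the availability zones used in the VPC section.

```hcl
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumed constraint, chosen to stay compatible with the 3.x VPC module
      # used in this guide; pin to whatever version you actually test with.
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = "us-east-1" # Matches the availability zones in the VPC section
}

variable "database_password" {
  description = "Master password for the Aurora cluster"
  type        = string
  sensitive   = true
}

variable "keycloak_admin_password" {
  description = "Password for the initial Keycloak admin user"
  type        = string
  sensitive   = true
}

variable "keycloak_hostname" {
  description = "Public hostname for Keycloak, for example auth.example.com"
  type        = string
}

variable "route53_zone_id" {
  description = "ID of the Route 53 hosted zone that contains keycloak_hostname"
  type        = string
}
```

Non-secret values can live in a file such as `prod.tfvars`. The two passwords are better supplied through `TF_VAR_database_password` and `TF_VAR_keycloak_admin_password` environment variables so they stay out of version control.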
VPC Setup
Let's start by creating a Virtual Private Cloud (VPC) with public and private subnets across multiple availability zones:
# VPC Configuration
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = "keycloak-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway     = true
  single_nat_gateway     = false # For production use multiple NAT gateways
  one_nat_gateway_per_az = true

  enable_vpn_gateway = false

  # Enable DNS support for VPC
  enable_dns_hostnames = true
  enable_dns_support   = true

  # Add tags for better resource management
  tags = {
    Environment = "production"
    Project     = "keycloak"
    Terraform   = "true"
  }
}

# Security Groups
resource "aws_security_group" "alb" {
  name        = "keycloak-alb-sg"
  description = "Security group for Keycloak ALB"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "HTTPS from internet"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP from internet (for redirects)"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
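The database security group in the next section refers to `aws_security_group.keycloak`, which this guide never defines. A sketch of what that group could look like, assuming the Keycloak tasks should accept traffic only from the load balancer on port 8080 (the group name and descriptions are illustrative):

```hcl
# Security group for the Keycloak Fargate tasks (name is an assumption)
resource "aws_security_group" "keycloak" {
  name        = "keycloak-service-sg"
  description = "Security group for Keycloak Fargate tasks"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description     = "Keycloak HTTP from the ALB only"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    description = "Outbound to the database, the image registry, and AWS APIs"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Referencing the ALB's security group instead of a CIDR range means the tasks stay unreachable from anything else in the VPC, even if subnets are later re-addressed.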
RDS Setup (Aurora PostgreSQL)
Keycloak requires a database to store configuration and user data. We'll use Amazon Aurora PostgreSQL for its high availability and performance:
# Database subnet group
resource "aws_db_subnet_group" "keycloak" {
  name       = "keycloak-db-subnet-group"
  subnet_ids = module.vpc.private_subnets

  tags = {
    Name = "Keycloak DB Subnet Group"
  }
}

# Database security group
resource "aws_security_group" "database" {
  name        = "keycloak-database-sg"
  description = "Security group for Keycloak database"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description     = "PostgreSQL from Keycloak service"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.keycloak.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Aurora PostgreSQL cluster
resource "aws_rds_cluster" "keycloak" {
  cluster_identifier      = "keycloak-cluster"
  engine                  = "aurora-postgresql"
  engine_version          = "13.7"
  database_name           = "keycloak"
  master_username         = "keycloak"
  master_password         = var.database_password # Use AWS Secrets Manager in production
  backup_retention_period = 7
  preferred_backup_window = "03:00-04:00"
  db_subnet_group_name    = aws_db_subnet_group.keycloak.name
  vpc_security_group_ids  = [aws_security_group.database.id]
  skip_final_snapshot     = true # Change for production

  tags = {
    Name = "Keycloak Aurora Cluster"
  }
}
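An `aws_rds_cluster` on its own only defines the storage layer and endpoints. Aurora needs at least one cluster instance before anything can connect to it. The following sketch adds two instances so a reader is available for failover; the instance count, identifiers, and instance class are assumptions to adjust for your workload.

```hcl
# Compute instances for the Aurora cluster (sizes and count are assumptions)
resource "aws_rds_cluster_instance" "keycloak" {
  count = 2 # One writer plus one reader that can be promoted on failover

  identifier           = "keycloak-instance-${count.index}"
  cluster_identifier   = aws_rds_cluster.keycloak.id
  engine               = aws_rds_cluster.keycloak.engine
  engine_version       = aws_rds_cluster.keycloak.engine_version
  instance_class       = "db.t3.medium" # Assumed size; choose based on load
  db_subnet_group_name = aws_db_subnet_group.keycloak.name

  tags = {
    Name = "Keycloak Aurora Instance ${count.index}"
  }
}
```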
ECS Cluster and Task Definition
Now let's create the ECS cluster and task definition to run Keycloak:
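# ECS Cluster
resource "aws_ecs_cluster" "keycloak" {
  name = "keycloak-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }

  tags = {
    Name = "Keycloak ECS Cluster"
  }
}

# ECS Task Definition
resource "aws_ecs_task_definition" "keycloak" {
  family                   = "keycloak"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "1024"
  memory                   = "2048"
  execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn

  container_definitions = jsonencode([
    {
      name      = "keycloak"
      image     = "quay.io/keycloak/keycloak:20.0.3"
      essential = true
      portMappings = [
        {
          containerPort = 8080
          hostPort      = 8080
          protocol      = "tcp"
        }
      ]
      environment = [
        { name = "KC_DB", value = "postgres" },
        # Build the JDBC URL from the cluster's writer endpoint instead of hard-coding it
        { name = "KC_DB_URL", value = "jdbc:postgresql://${aws_rds_cluster.keycloak.endpoint}:5432/keycloak" },
        { name = "KC_DB_USERNAME", value = "keycloak" },
        # Reference the variables directly; quoting them would pass the literal text "var...."
        # Plain environment values are readable in the task definition, so prefer the
        # container `secrets` field backed by Secrets Manager in production
        { name = "KC_DB_PASSWORD", value = var.database_password },
        { name = "KEYCLOAK_ADMIN", value = "admin" },
        { name = "KEYCLOAK_ADMIN_PASSWORD", value = var.keycloak_admin_password },
        # Production mode requires a hostname; TLS terminates at the ALB, so run in edge proxy mode
        { name = "KC_HOSTNAME", value = var.keycloak_hostname },
        { name = "KC_PROXY", value = "edge" },
        # Exposes /health, which the load balancer health check relies on
        { name = "KC_HEALTH_ENABLED", value = "true" }
      ]
      # "--optimized" only works with an image pre-built (kc.sh build) for this
      # configuration; the stock image is not, so let "start" build on boot
      command = ["start"]
      # Assumes a CloudWatch log group named /ecs/keycloak already exists
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = "/ecs/keycloak"
          "awslogs-region"        = "us-east-1"
          "awslogs-stream-prefix" = "keycloak"
        }
      }
    }
  ])
}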
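The task definition points at `aws_iam_role.ecs_task_execution_role`, and the verification commands later in this guide tail a log group called `/ecs/keycloak`, but neither resource is defined anywhere. A minimal sketch of both follows. The role name and the 30-day retention period are assumptions; the attached policy is the AWS managed policy for ECS task execution.

```hcl
# Trust policy that lets ECS tasks assume the execution role
data "aws_iam_policy_document" "ecs_tasks_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

# Execution role used by ECS to pull the image and write logs
resource "aws_iam_role" "ecs_task_execution_role" {
  name               = "keycloak-ecs-task-execution-role" # Assumed name
  assume_role_policy = data.aws_iam_policy_document.ecs_tasks_assume_role.json
}

resource "aws_iam_role_policy_attachment" "ecs_task_execution_role" {
  role       = aws_iam_role.ecs_task_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

# Log group that receives the container's output
resource "aws_cloudwatch_log_group" "keycloak" {
  name              = "/ecs/keycloak"
  retention_in_days = 30 # Assumed retention
}
```

The execution role is what ECS itself uses to start the task. If Keycloak ever needs to call AWS APIs directly, that would call for a separate task role via `task_role_arn`.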
Application Load Balancer (ALB) Setup
Let's configure the Application Load Balancer to distribute traffic across Keycloak instances:
# Application Load Balancer
resource "aws_lb" "keycloak" {
  name               = "keycloak-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = module.vpc.public_subnets

  # Keep enabled in production; set to false before running terraform destroy
  enable_deletion_protection = true

  tags = {
    Name = "Keycloak ALB"
  }
}

# Target Group
resource "aws_lb_target_group" "keycloak" {
  name        = "keycloak-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip" # Fargate tasks register by IP address

  health_check {
    enabled             = true
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
    interval            = 30
    path                = "/health" # Only served when Keycloak's health endpoints are enabled
    matcher             = "200"
    port                = "traffic-port"
    protocol            = "HTTP"
  }

  tags = {
    Name = "Keycloak Target Group"
  }
}

# HTTPS listener
resource "aws_lb_listener" "keycloak" {
  load_balancer_arn = aws_lb.keycloak.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  # Use the validated certificate so the listener is not created before issuance completes
  certificate_arn = aws_acm_certificate_validation.keycloak.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.keycloak.arn
  }
}

# HTTP listener that redirects to HTTPS (the ALB security group opens port 80 for this)
resource "aws_lb_listener" "keycloak_http" {
  load_balancer_arn = aws_lb.keycloak.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}
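Nothing so far actually runs the task definition or registers it with the target group; that is the job of an ECS service. The deployment commands later in this guide query a service named `keycloak-service`, so the sketch below uses that name. The desired count, the grace period (based on the 2-3 minute startup time mentioned under Deployment Tips), and the scaling limits and CPU target are all assumptions. `aws_security_group.keycloak` is the task security group that the database section references.

```hcl
# ECS service that runs Keycloak behind the load balancer
resource "aws_ecs_service" "keycloak" {
  name            = "keycloak-service"
  cluster         = aws_ecs_cluster.keycloak.id
  task_definition = aws_ecs_task_definition.keycloak.arn
  desired_count   = 2 # Assumed starting size
  launch_type     = "FARGATE"

  # Give Keycloak time to start before health checks can stop the task
  health_check_grace_period_seconds = 180

  network_configuration {
    subnets          = module.vpc.private_subnets
    security_groups  = [aws_security_group.keycloak.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.keycloak.arn
    container_name   = "keycloak"
    container_port   = 8080
  }

  # Let auto scaling own the task count after the first apply
  lifecycle {
    ignore_changes = [desired_count]
  }

  depends_on = [aws_lb_listener.keycloak]
}

# Register the service's task count as a scalable target
resource "aws_appautoscaling_target" "keycloak" {
  service_namespace  = "ecs"
  scalable_dimension = "ecs:service:DesiredCount"
  resource_id        = "service/${aws_ecs_cluster.keycloak.name}/${aws_ecs_service.keycloak.name}"
  min_capacity       = 2 # Assumed limits
  max_capacity       = 6
}

# Add or remove tasks to hold average CPU near the target value
resource "aws_appautoscaling_policy" "keycloak_cpu" {
  name               = "keycloak-cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.keycloak.service_namespace
  scalable_dimension = aws_appautoscaling_target.keycloak.scalable_dimension
  resource_id        = aws_appautoscaling_target.keycloak.resource_id

  target_tracking_scaling_policy_configuration {
    target_value = 70 # Assumed CPU utilization target, in percent

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}
```

Running more than one task also requires Keycloak's distributed cache discovery to be configured so that the tasks share sessions. That setup is outside the scope of this sketch.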
Route 53 and ACM Setup
Finally, let's configure DNS and SSL certificates:
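# SSL Certificate
resource "aws_acm_certificate" "keycloak" {
  domain_name       = var.keycloak_hostname
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "Keycloak SSL Certificate"
  }
}

# DNS records that prove domain ownership to ACM
resource "aws_route53_record" "keycloak_validation" {
  for_each = {
    for dvo in aws_acm_certificate.keycloak.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  zone_id         = var.route53_zone_id
  name            = each.value.name
  type            = each.value.type
  records         = [each.value.record]
  ttl             = 60
}

# Route 53 record for Keycloak
resource "aws_route53_record" "keycloak" {
  zone_id = var.route53_zone_id
  name    = var.keycloak_hostname
  type    = "A"

  alias {
    name                   = aws_lb.keycloak.dns_name
    zone_id                = aws_lb.keycloak.zone_id
    evaluate_target_health = true
  }
}

# Certificate validation (waits until ACM has issued the certificate)
resource "aws_acm_certificate_validation" "keycloak" {
  certificate_arn         = aws_acm_certificate.keycloak.arn
  validation_record_fqdns = [for record in aws_route53_record.keycloak_validation : record.fqdn]
}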
Deployment
To deploy the infrastructure, follow these steps:
# Initialize Terraform
terraform init

# Plan the deployment
terraform plan -var-file=prod.tfvars -out=tfplan

# Apply the deployment
terraform apply tfplan

# Verify the deployment
terraform output

# Check Keycloak service status
aws ecs describe-services --cluster keycloak-cluster --services keycloak-service
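Note that `terraform output` prints only values the configuration explicitly declares, and none of the snippets in this guide declare any. A few outputs that could be useful here, sketched with hypothetical output names:

```hcl
# Output names below are illustrative, not part of the original configuration
output "keycloak_url" {
  description = "Public URL of the Keycloak deployment"
  value       = "https://${var.keycloak_hostname}"
}

output "alb_dns_name" {
  description = "DNS name of the load balancer, useful before DNS has propagated"
  value       = aws_lb.keycloak.dns_name
}

output "database_endpoint" {
  description = "Writer endpoint of the Aurora cluster"
  value       = aws_rds_cluster.keycloak.endpoint
}
```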
Deployment Tips
The initial deployment may take 10-15 minutes as AWS provisions all resources. Monitor the ECS service in the AWS console to track container startup progress. Keycloak typically takes 2-3 minutes to fully initialize after the container starts.
Testing Your Deployment
Once deployment is complete, it's crucial to verify that everything is working correctly:
1. Basic Functionality
- Access Keycloak admin console
- Create a test realm and user
- Configure a simple client application
2. Advanced Testing
- Test authentication flows
- Verify logs in CloudWatch
- Test high availability scenarios
Quick Verification Commands
# Check if Keycloak is responding (requires Keycloak's health endpoints to be enabled)
curl -i https://your-keycloak-domain.com/health

# Test admin console access (a redirect to the console is expected)
curl -i https://your-keycloak-domain.com/admin/

# Check ECS service health
aws ecs describe-services --cluster keycloak-cluster --services keycloak-service --query 'services[0].runningCount'

# View container logs
aws logs tail /ecs/keycloak --follow
Conclusion
In this guide, we've covered how to deploy Keycloak on AWS ECS with Fargate using Terraform. This approach provides a scalable, highly available, and secure identity management solution without the overhead of managing underlying infrastructure.
Key Benefits Summary
This architecture provides a production-ready Keycloak deployment that follows AWS best practices for security, scalability, and reliability. The serverless approach with Fargate minimizes operational overhead while maintaining full control over your identity management system.
Infrastructure
- Serverless operations with Fargate
- High availability across multiple AZs
- Auto scaling based on demand
Security
- Secure network configuration
- SSL/TLS encryption
- Private subnets for sensitive components
Management
- Managed database with Aurora PostgreSQL
- Infrastructure as code with Terraform
- Monitoring with CloudWatch
Next Steps
- Customize Keycloak themes and configurations for your organization
- Integrate with your existing applications using OIDC or SAML
- Set up monitoring and alerting for production use
- Implement backup and disaster recovery procedures