@nrashok
Last active September 17, 2025 06:30
Automate stopping EC2 instances and scaling EKS NodeGroup using Lambda, Step Functions, SSM, and EventBridge
AWSTemplateFormatVersion: '2010-09-09'
Description: Automate stopping/starting EC2 and managed EKS NodeGroups with Step Functions + EventBridge + SSM
Parameters:
  InstanceIds:
    Type: CommaDelimitedList
    Default: i-07c33ff5a82b668cc,i-047a90a37986a669e
  ClusterName:
    Type: String
    Default: mr-blues-dev
  NodeGroups:
    Type: CommaDelimitedList
    Default: t3-spot-private
  Region:
    Type: String
    Default: ap-south-1
  StartSize:
    Type: Number
    Default: 2
  MinSize:
    Type: Number
    Default: 1
Resources:
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - arn:aws:iam::aws:policy/AmazonEC2FullAccess
        - arn:aws:iam::aws:policy/AWSStepFunctionsFullAccess
        - arn:aws:iam::aws:policy/AmazonSSMFullAccess
        - arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      Policies:
        - PolicyName: EKSCustomNodegroupScaling
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - eks:UpdateNodegroupConfig
                  - eks:DescribeNodegroup
                Resource:
                  - !Sub arn:aws:eks:${Region}:${AWS::AccountId}:nodegroup/${ClusterName}/*/*
  StopStartLambda:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.9
      Handler: index.lambda_handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Timeout: 300
      Environment:
        Variables:
          CLUSTER: !Ref ClusterName
          REGION: !Ref Region
          START_SIZE: !Ref StartSize
          MIN_SIZE: !Ref MinSize
      Code:
        ZipFile: |
          import boto3, os
          def get_ng_name(ng):
              if ng.startswith("arn:aws:eks:"):
                  # ARN format: arn:aws:eks:region:account:nodegroup/<cluster>/<name>/<uuid>
                  return ng.split("/")[2]  # extract <name>
              return ng
          def lambda_handler(event, context):
              action = event.get("Action")
              cluster_name = event.get("ClusterName", os.environ.get("CLUSTER"))
              nodegroups = event.get("NodeGroups", [])
              instances = event.get("InstanceIds", [])
              start_size = int(event.get("StartSize", os.environ.get("START_SIZE", "2")))
              min_size = int(event.get("MinSize", os.environ.get("MIN_SIZE", "1")))
              region = event.get("Region", os.environ.get("REGION", "ap-south-1"))
              ec2 = boto3.client("ec2", region_name=region)
              eks = boto3.client("eks", region_name=region)
              # Stop/start EC2 instances
              if instances:
                  if action == "stop":
                      ec2.stop_instances(InstanceIds=instances)
                  elif action == "start":
                      ec2.start_instances(InstanceIds=instances)
              # Scale managed nodegroups
              for ng in nodegroups:
                  ng_name = get_ng_name(ng)
                  if action == "stop":
                      eks.update_nodegroup_config(
                          clusterName=cluster_name,
                          nodegroupName=ng_name,
                          scalingConfig={
                              "minSize": 0,
                              "desiredSize": 0
                          }
                      )
                  elif action == "start":
                      eks.update_nodegroup_config(
                          clusterName=cluster_name,
                          nodegroupName=ng_name,
                          scalingConfig={
                              "minSize": min_size,
                              "desiredSize": start_size
                          }
                      )
              return {
                  "status": "success",
                  "action": action,
                  "instances": instances,
                  "nodegroups": nodegroups
              }
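  StepFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - states.amazonaws.com
                # The schedule rules pass this role as their target RoleArn,
                # so EventBridge must be able to assume it as well.
                - events.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: InvokeLambdaPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action: lambda:InvokeFunction
                Resource: !GetAtt StopStartLambda.Arn
              # Lets EventBridge start the state machine. A wildcard is used because
              # referencing AutomationStateMachine here would create a circular
              # dependency (the state machine itself depends on this role).
              - Effect: Allow
                Action: states:StartExecution
                Resource: !Sub arn:aws:states:${AWS::Region}:${AWS::AccountId}:stateMachine:*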
  AutomationStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      RoleArn: !GetAtt StepFunctionRole.Arn
      DefinitionString:
        Fn::Sub: |
          {
            "Comment": "Stop/Start EC2 and managed EKS NodeGroups",
            "StartAt": "StopStartAction",
            "States": {
              "StopStartAction": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "OutputPath": "$.Payload",
                "Parameters": {
                  "FunctionName": "${StopStartLambda}",
                  "Payload.$": "$"
                },
                "End": true
              }
            }
          }
  StopScheduleRule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: cron(0 16 * * ? *) # 16:00 UTC = 9:30 PM IST
      State: ENABLED
      Targets:
        - Arn: !Ref AutomationStateMachine
          Id: StopTarget
          RoleArn: !GetAtt StepFunctionRole.Arn
          Input: |
            {
              "Action": "stop",
              "ClusterName": "mr-blues-dev",
              "InstanceIds": ["i-07c33ff5a82b668cc","i-047a90a37986a669e"],
              "NodeGroups": ["t3-spot-private"],
              "Region": "ap-south-1"
            }
  StartScheduleRule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: cron(0 2 * * ? *) # 02:00 UTC = 7:30 AM IST
      State: ENABLED
      Targets:
        - Arn: !Ref AutomationStateMachine
          Id: StartTarget
          RoleArn: !GetAtt StepFunctionRole.Arn
          Input: |
            {
              "Action": "start",
              "ClusterName": "mr-blues-dev",
              "InstanceIds": ["i-07c33ff5a82b668cc","i-047a90a37986a669e"],
              "NodeGroups": ["t3-spot-private"],
              "Region": "ap-south-1",
              "StartSize": 2,
              "MinSize": 1
            }
  ManualTriggerSSMDocument:
    Type: AWS::SSM::Document
    Properties:
      DocumentType: Automation
      Content:
        schemaVersion: '0.3'
        description: "Manually trigger stop/start for EC2 + managed EKS NodeGroups"
        parameters:
          Action:
            type: String
            description: "Action to perform (stop/start)"
            allowedValues: ["stop", "start"]
          InstanceIds:
            type: StringList
            description: "List of EC2 instance IDs to start/stop"
            default: ["i-07c33ff5a82b668cc","i-047a90a37986a669e"]
          NodeGroups:
            type: StringList
            description: "Managed NodeGroup names or ARNs"
            default: ["t3-spot-private"]
          ClusterName:
            type: String
            description: "EKS Cluster name"
            default: "mr-blues-dev"
          StepFunctionArn:
            type: String
            description: "Step Function ARN"
        mainSteps:
          - name: InvokeStepFunction
            action: aws:executeScript
            inputs:
              Runtime: python3.8
              Handler: handler
              Script: |
                import boto3, json
                def handler(event, context):
                    sfn = boto3.client("stepfunctions")
                    response = sfn.start_execution(
                        stateMachineArn=event["StepFunctionArn"],
                        input=json.dumps({
                            "Action": event["Action"],
                            "ClusterName": event["ClusterName"],
                            "InstanceIds": event["InstanceIds"],
                            "NodeGroups": event["NodeGroups"],
                            "MinSize": event.get("MinSize", 1),
                            "StartSize": event.get("StartSize", 2)
                        })
                    )
                    # Return only the execution ARN; the full response also carries a
                    # datetime (startDate), which may not serialize as step output.
                    return {"executionArn": response["executionArn"]}
              InputPayload:
                Action: "{{Action}}"
                ClusterName: "{{ClusterName}}"
                InstanceIds: "{{InstanceIds}}"
                NodeGroups: "{{NodeGroups}}"
                StepFunctionArn: !Ref AutomationStateMachine
Outputs:
  StepFunctionArn:
    Description: ARN of the Step Function
    Value: !Ref AutomationStateMachine
  LambdaName:
    Description: Lambda function handling EC2/managed EKS actions
    Value: !Ref StopStartLambda
  ManualSSMDocumentName:
    Description: Name of the SSM document for manual trigger
    Value: !Ref ManualTriggerSSMDocument
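
The two schedule expressions in the template are written in UTC, while their inline comments give the time in IST (UTC+5:30). A minimal sketch of that conversion, in case the schedule needs to move. `ist_to_utc_cron` is a hypothetical helper and not part of the template; it assumes a daily schedule, so the date fields stay wildcards.

```python
IST_OFFSET_MINUTES = 5 * 60 + 30  # IST is UTC+05:30 and has no daylight saving


def ist_to_utc_cron(hour: int, minute: int) -> str:
    """Return a daily EventBridge cron expression (UTC) for an IST wall-clock time."""
    if not (0 <= hour < 24 and 0 <= minute < 60):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Shift to UTC and wrap around midnight; for a daily schedule the date does not matter.
    utc_minutes = (hour * 60 + minute - IST_OFFSET_MINUTES) % (24 * 60)
    utc_hour, utc_minute = divmod(utc_minutes, 60)
    # EventBridge cron fields: minutes hours day-of-month month day-of-week year
    return f"cron({utc_minute} {utc_hour} * * ? *)"


print(ist_to_utc_cron(21, 30))  # stop schedule:  cron(0 16 * * ? *)
print(ist_to_utc_cron(7, 30))   # start schedule: cron(0 2 * * ? *)
```

Both results match the `ScheduleExpression` values used by `StopScheduleRule` and `StartScheduleRule`.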

nrashok commented Sep 12, 2025

{
  "Action": "start",
  "ClusterName": "mr-blues-dev",
  "InstanceIds": ["i-07c33ff5a82b668cc", "i-047a90a37986a669e"],
  "NodeGroups": ["t3-s-ng"],
  "Region": "ap-south-1",
  "StartSize": 2,
  "MinSize": 1
}



nrashok commented Sep 12, 2025

{
  "Action": "stop",
  "ClusterName": "mr-blues-dev",
  "InstanceIds": ["i-07c33ff5a82b668cc", "i-047a90a37986a669e"],
  "NodeGroups": ["t3-s-ng"],
  "Region": "ap-south-1"
}


nrashok commented Sep 12, 2025

{
  "Action": "start",
  "ClusterName": "mr-blues-dev",
  "InstanceIds": [
    "i-07c33ff5a82b668cc",
    "i-047a90a37986a669e"
  ],
  "NodeGroups": [
    "t3-s-ng"
  ],
  "Region": "ap-south-1",
  "StartSize": 2,
  "MinSize": 2
}
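
The payloads above are inputs for the state machine. A small sketch of building one and sanity-checking it before an execution is started. `build_payload` and `start` are hypothetical helpers, the defaults simply mirror the values in these comments, and `start` assumes boto3 is installed with credentials that are allowed to call `states:StartExecution`.

```python
import json


def build_payload(action, cluster, instance_ids, nodegroups,
                  region="ap-south-1", start_size=2, min_size=1):
    """Build the JSON input that the Lambda behind the state machine reads."""
    if action not in ("stop", "start"):
        raise ValueError(f"Action must be 'stop' or 'start', got {action!r}")
    payload = {
        "Action": action,
        "ClusterName": cluster,
        "InstanceIds": list(instance_ids),
        "NodeGroups": list(nodegroups),
        "Region": region,
    }
    if action == "start":
        # A desired size below the minimum is not a valid nodegroup scaling config.
        if min_size > start_size:
            raise ValueError("MinSize must not exceed StartSize")
        payload["StartSize"] = start_size
        payload["MinSize"] = min_size
    return json.dumps(payload)


def start(state_machine_arn, payload):
    """Start one execution; boto3 is imported lazily so build_payload works without it."""
    import boto3
    sfn = boto3.client("stepfunctions")
    return sfn.start_execution(stateMachineArn=state_machine_arn, input=payload)
```

For example, `build_payload("stop", "mr-blues-dev", ["i-07c33ff5a82b668cc"], ["t3-s-ng"])` produces a stop payload without the size fields, matching the second comment.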
