@alvindaiyan
Created December 14, 2023 06:52
Scaling a SageMaker Endpoint to 0 and others
import os

import boto3

os.environ.setdefault('AWS_PROFILE', 'playground')  # your AWS profile name
endpoint_name = 'YOUR_ENDPOINT_NAME'  # your endpoint name

if __name__ == '__main__':
    sagemaker_client = boto3.client('sagemaker')

    # Inspect the endpoint first; the response lists its production variants.
    ep = sagemaker_client.describe_endpoint(
        EndpointName=endpoint_name
    )
    print(ep)

    # Request zero instances for the variant named 'main'.
    response = sagemaker_client.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=[
            {
                'VariantName': 'main',
                'DesiredInstanceCount': 0
            },
        ]
    )
    print(response)
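
The script above hardcodes the variant name 'main'. As a rough sketch, the request could instead be built from the `describe_endpoint` response, assuming that response carries a `ProductionVariants` list whose entries have a `VariantName` key. The helper name `build_scale_request` and the sample dictionary are hypothetical and only for illustration; no AWS call is made here.

```python
def build_scale_request(endpoint_description, desired_count=0):
    """Build keyword arguments for update_endpoint_weights_and_capacities.

    Hypothetical helper: assumes `endpoint_description` looks like a
    describe_endpoint response with 'EndpointName' and 'ProductionVariants'.
    """
    variants = endpoint_description.get('ProductionVariants', [])
    return {
        'EndpointName': endpoint_description['EndpointName'],
        'DesiredWeightsAndCapacities': [
            {
                'VariantName': variant['VariantName'],
                'DesiredInstanceCount': desired_count,
            }
            for variant in variants
        ],
    }


if __name__ == '__main__':
    # Made-up response fragment, not real endpoint data.
    sample_description = {
        'EndpointName': 'YOUR_ENDPOINT_NAME',
        'ProductionVariants': [{'VariantName': 'main'}],
    }
    print(build_scale_request(sample_description))
```

With a real client, the result could be passed along as `sagemaker_client.update_endpoint_weights_and_capacities(**build_scale_request(ep))`.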
alvindaiyan commented Dec 22, 2023

  1. Get the command-line details from the AWS Console:

Screenshot 2023-12-22 at 10 22 31

  2. Use the AWS CLI in CloudShell to run the update:

Screenshot 2023-12-22 at 10 27 22

aws sagemaker update-endpoint-weights-and-capacities --endpoint-name aigc-utils-endpoint --desired-weights-and-capacities '{"VariantName": "main", "DesiredInstanceCount": 0}'
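
For other endpoints or variants, the same command can be assembled programmatically so the JSON quoting stays correct. This is only a sketch: `build_cli_command` is a hypothetical helper name, and it produces the argument list without running anything.

```python
import json
import shlex


def build_cli_command(endpoint_name, variant_name, instance_count):
    """Return the AWS CLI invocation as an argument list (hypothetical helper)."""
    payload = json.dumps({
        'VariantName': variant_name,
        'DesiredInstanceCount': instance_count,
    })
    return [
        'aws', 'sagemaker', 'update-endpoint-weights-and-capacities',
        '--endpoint-name', endpoint_name,
        '--desired-weights-and-capacities', payload,
    ]


if __name__ == '__main__':
    # shlex.join quotes the JSON payload so the line can be pasted into a shell.
    print(shlex.join(build_cli_command('YOUR_ENDPOINT_NAME', 'main', 0)))
```

The list form can also be handed to `subprocess.run` directly, which avoids shell quoting altogether.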