Linux | Cloud | DevOps | Scripting

Monday, 26 August 2019

Take Backup of EC2 Instances via Automatic Snapshots using AWS Lambda Function and CloudWatch


To perform this exercise, we need to follow these steps:

1. Create an AWS EC2 instance and provide tag auto_snapshot=true
2. Create an IAM role
3. Prepare the code for Lambda Function
4. Create a Lambda Function
5. Test the code
6. Create a trigger for Lambda function to take a snapshot at a particular time
7. Verify the snapshot

Step 1: Create an AWS EC2 instance and provide tag 'auto_snapshot=true':

Create an AWS EC2 instance, or select an existing instance that we want to snapshot automatically, then give it the tag Key: 'auto_snapshot' and Value: 'true'.

AWS ➔ Services ➔ EC2 ➔ Launch instance ➔ [*] Free tier only ➔ select AWS Linux AMI ➔ Review and launch ➔ Launch.

Now, we need to provide the tag:

Select the instance ➔ Tags ➔ Add/Edit Tags. We didn't give the instance a name at launch, so we can add a Name tag here as well: click on Create Tag ➔ Key: Name ➔ Value: Linux_Server ➔ Create Tag ➔ Key: auto_snapshot ➔ Value: true ➔ Save.

Fig: Add/Edit the Tags

Tag keys and values are case sensitive, so make sure you enter them exactly as shown.
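
The same tags can also be applied from code instead of the console. Below is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The helper names `build_snapshot_tags` and `tag_instance` and the instance id in the usage comment are hypothetical; `create_tags` is the EC2 client call that does the work.

```python
# Sketch: apply the Name and auto_snapshot tags from Step 1 with boto3.
# Helper names and the instance id in the usage comment are hypothetical.

def build_snapshot_tags(name):
    """Return the two tags used in this walkthrough (keys are case sensitive)."""
    return [
        {"Key": "Name", "Value": name},
        {"Key": "auto_snapshot", "Value": "true"},
    ]


def tag_instance(ec2_client, instance_id, name="Linux_Server"):
    """Tag one instance; ec2_client is expected to be boto3.client('ec2')."""
    tags = build_snapshot_tags(name)
    ec2_client.create_tags(Resources=[instance_id], Tags=tags)
    return tags

# Usage (needs boto3 and AWS credentials):
#   import boto3
#   tag_instance(boto3.client("ec2"), "i-0123456789abcdef0")
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before pointing it at a real account.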

Step 2: Create an IAM role:

To grant permissions to the Lambda function, we need to create a role.

AWS ➔ Services ➔ IAM ➔ Roles ➔ Create role ➔ Lambda ➔ Next ➔ attach 'AmazonEC2FullAccess' (so that Lambda can freeze and take a snapshot of selected EC2 instances) and 'CloudWatchFullAccess' (so that Lambda can create/update logs) ➔ Next ➔ Role name: lambda_snap_role ➔ provide a suitable description ➔ Create role.
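
The two FullAccess policies are convenient for practice, but they grant far more than this function uses. As a rough sketch, the role could instead be created from code with a narrower inline policy. The action list below is an assumption, derived from the calls the Step 3 code makes and not verified against every code path, and the policy name `snapshot_permissions` is hypothetical.

```python
import json

# Trust policy that lets the Lambda service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Narrower alternative to the two FullAccess policies (an assumption,
# based on the EC2 calls made by the snapshot code in Step 3).
SNAPSHOT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeRegions",
                "ec2:DescribeInstances",
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot",
                "ec2:CreateTags",
            ],
            "Resource": "*",
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "*",
        },
    ],
}


def create_snapshot_role(iam_client, role_name="lambda_snap_role"):
    """Create the role and attach the inline policy.

    iam_client is expected to be boto3.client('iam').
    """
    iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName="snapshot_permissions",  # hypothetical policy name
        PolicyDocument=json.dumps(SNAPSHOT_POLICY),
    )
    return role_name
```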

Step 3: Prepare the code for Lambda Function:

The following code will be pasted into the 'Function code' area after creating the Lambda function:

import boto3
import datetime


RETENTION_DAYS = 2  # Delete snapshots after this many days

# The region list is looked up once, when Lambda loads the module
ec2 = boto3.client('ec2')
regions = ec2.describe_regions().get('Regions', [])
all_regions = [region['RegionName'] for region in regions]

def lambda_handler(event, context):
    # Work out the dates on every invocation: module-level code only runs
    # when Lambda loads the module, so a date computed there can go stale.
    today = datetime.date.today()
    today_string = today.strftime('%Y/%m/%d')
    delete_after_days = RETENTION_DAYS

    # Except after Monday (on Tuesday, weekday() == 1, at ~1am), since
    # Friday is only 2 'working' days away:
    if today.weekday() == 1:
        delete_after_days = delete_after_days + 2

    deletion_date = today - datetime.timedelta(days=delete_after_days)

    snapshot_counter = 0
    snap_size_counter = 0
    deletion_counter = 0
    deleted_size_counter = 0

    for region_name in all_regions:
      print('Instances in EC2 Region {0}:'.format(region_name))
      ec2 = boto3.resource('ec2', region_name=region_name)

      # We only want to look through instances with the following tag key value pair: auto_snapshot : true
      instances = ec2.instances.filter(
          Filters=[
              {'Name': 'tag:auto_snapshot', 'Values': ['true']}
                  ]
              )

      volume_ids = []
      for i in instances.all():

          name = i.id  # Fall back to the instance id if there is no Name tag
          for tag in i.tags or []:  # Get the name of the instance
              if tag['Key'] == 'Name':
                  name = tag['Value']

          print('Found tagged instance \'{1}\', id: {0}, state: {2}'.format(i.id, name, i.state['Name']))

          vols = i.volumes.all()  # Iterate through each instance's volumes
          for v in vols:
              print('{0} is attached to volume {1}, proceeding to snapshot'.format(name, v.id))
              volume_ids.append(v.id)  # append, not extend: extend would add the id one character at a time
              snapshot = v.create_snapshot(
                  Description = 'AutoSnapshot of {0}, on volume {1} - Created {2}'.format(name, v.id, today_string),
                  )
              snapshot.create_tags(  # Add the following tags to the new snapshot
                  Tags = [
                      {
                          'Key': 'auto_snap',
                          'Value': 'true'
                      },
                      {
                          'Key': 'volume',
                          'Value': v.id
                      },
                      {
                          'Key': 'CreatedOn',
                          'Value': today_string
                      },
                       {
                          'Key': 'Name',
                          'Value': '{} autosnap'.format(name)
                      }
                  ]
              )
              print('Snapshot completed')
              snapshot_counter += 1
              snap_size_counter += snapshot.volume_size

              # Now iterate through snapshots which carry the auto_snap tag
              snapshots = ec2.snapshots.filter(
                  Filters=[
                      {'Name': 'tag:auto_snap', 'Values': ['true']
                      }
                  ]
              )


              print('Checking for out of date snapshots for instance {0}...'.format(name))
              for snap in snapshots:
                  can_delete = False
                  created_on_string = None
                  snap_name = ''
                  for tag in snap.tags or []:  # Read each snapshot's created-on date,
                                               # name and auto_snap tag
                      if tag['Key'] == 'CreatedOn':
                          created_on_string = tag['Value']
                      if tag['Key'] == 'auto_snap':
                          if tag['Value'] == 'true':
                              can_delete = True
                      if tag['Key'] == 'Name':
                          snap_name = tag['Value']  # Kept separate so the instance name is not overwritten
                  if created_on_string is None:
                      continue  # Skip snapshots that have no CreatedOn tag
                  created_on = datetime.datetime.strptime(created_on_string, '%Y/%m/%d').date()

                  if created_on <= deletion_date and can_delete:
                      print('Snapshot id {0}, ({1}) from {2} is {3} or more days old... deleting'.format(snap.id, snap_name, created_on_string, delete_after_days))
                      deleted_size_counter += snap.volume_size
                      snap.delete()
                      deletion_counter += 1

    print('Made {0} snapshots totalling {1} GB; deleted {2} snapshots totalling {3} GB'.format(
        snapshot_counter, snap_size_counter, deletion_counter, deleted_size_counter))
    return
----------------------
The above code is taken from the following URL:
http://blog.keyrus.co.uk/backup_ec2_instances_automatic_snapshots.html
----------------------

In the above code, 'boto3' is the AWS SDK for Python, which lets us write code that interacts with AWS services.
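
The retention rule in the code can be tried out separately, without touching AWS. Below is a small sketch of the same date arithmetic as two plain functions. The names `deletion_cutoff` and `is_expired` are hypothetical and do not appear in the code above.

```python
import datetime


def deletion_cutoff(today, delete_after_days=2):
    """Return the date on or before which auto snapshots get deleted."""
    # weekday() == 1 is Tuesday: retention is extended by 2 days,
    # the same rule the Lambda code applies.
    if today.weekday() == 1:
        delete_after_days += 2
    return today - datetime.timedelta(days=delete_after_days)


def is_expired(tags, cutoff):
    """Decide from a snapshot's tag list whether it is due for deletion."""
    tag_map = {t["Key"]: t["Value"] for t in tags or []}
    # Only snapshots tagged auto_snap=true with a CreatedOn date qualify.
    if tag_map.get("auto_snap") != "true" or "CreatedOn" not in tag_map:
        return False
    created_on = datetime.datetime.strptime(
        tag_map["CreatedOn"], "%Y/%m/%d"
    ).date()
    return created_on <= cutoff


# Monday 26 August 2019: normal 2-day retention.
print(deletion_cutoff(datetime.date(2019, 8, 26)))  # 2019-08-24
# Tuesday 27 August 2019: retention extended to 4 days.
print(deletion_cutoff(datetime.date(2019, 8, 27)))  # 2019-08-23
```

Note that the comparison is `<=`, so a snapshot created exactly on the cutoff date is deleted as well.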

Step 4: Create a Lambda Function:

AWS ➔ Services ➔ Lambda ➔ Create function ➔ [*] Author from scratch ➔ Function name: auto_snapshot ➔ Runtime: Python 3.7 ➔ [*] Use an existing role: select the role which we created in step number 2, 'lambda_snap_role' ➔ Create function…

Once the function is created, the console lists all the services that this function can access. Right now, we can see that no trigger is attached.

Fig: Service list accessed by Lambda Function
…navigate to 'Function code' ➔ delete the default code and paste the code which we prepared in step number 3…
Fig: Location to paste the code
…navigate to Basic settings ➔ provide a proper Description ➔ set the timeout to at least 30 sec, otherwise the function will time out before the code finishes ➔ Save.

Fig: Update timeout period
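
The timeout can also be changed from code. Here is a minimal sketch, assuming boto3 and configured credentials. The helper name `set_timeout` is hypothetical; `update_function_configuration` with its `Timeout` parameter (in seconds) is the Lambda client call behind it.

```python
def set_timeout(lambda_client, function_name="auto_snapshot", seconds=30):
    """Set the function timeout; lambda_client is boto3.client('lambda')."""
    # Lambda accepts timeouts from 1 to 900 seconds (15 minutes).
    if not 1 <= seconds <= 900:
        raise ValueError("timeout must be between 1 and 900 seconds")
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )
    return seconds

# Usage (needs boto3 and AWS credentials):
#   import boto3
#   set_timeout(boto3.client("lambda"), "auto_snapshot", 30)
```

Since the code loops over every region, 30 seconds may still be too short on a slow run, so a larger value is worth trying if the test times out.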

Step 5: Test the code:

Click the 'Test' button in the Lambda function ➔ this will open a 'Configure test event' window ➔ Event name: provide any name ➔ Create.

At the top of the page, a notification confirms that the test event was created successfully. Now, click on 'Test' to test the code.

If the code is correct, the test will succeed and a snapshot will be created. If there is an error, go to the top of the page, where the execution result shows a link to the logs. You can check the error in the logs, or just click on Details for a brief description of the error.
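
The same test can be run from code with the Lambda client's `invoke` call. Here is a minimal sketch, assuming boto3 and configured credentials. The helper name `invoke_and_report` is hypothetical. When the function raises an error, the response contains a `FunctionError` key and the payload carries the error details.

```python
import json


def invoke_and_report(lambda_client, function_name="auto_snapshot"):
    """Invoke the function synchronously and return (succeeded, payload_text).

    lambda_client is expected to be boto3.client('lambda').
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps({}).encode("utf-8"),  # empty test event
    )
    succeeded = "FunctionError" not in response
    payload_text = response["Payload"].read().decode("utf-8")
    return succeeded, payload_text

# Usage (needs boto3 and AWS credentials):
#   import boto3
#   ok, body = invoke_and_report(boto3.client("lambda"))
#   print("succeeded" if ok else "failed", body)
```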

In the screenshot below, we can see 'Execution result: succeeded', so a snapshot should have been created.

Fig: Code logs
Navigate to AWS EC2 console for verification:

AWS ➔ Services ➔ EC2 ➔ Snapshots ➔ here, we can see that a snapshot is being created, which is in the pending state right now.

Fig: Snapshot creation
So, we can see our code is working fine.

Step 6: Create a trigger for Lambda function to take a snapshot at a particular time:

Now, we will schedule our code to run at a particular time, to take a snapshot. For this, navigate to the AWS Lambda service console and create a trigger for the Lambda function.

AWS ➔ Services ➔ Lambda ➔ Functions ➔ select Lambda Function 'auto_snapshot' ➔ click on Add trigger ➔ from drop-down list select trigger 'CloudWatch Events' ➔ from the drop-down list choose 'Create a new rule' ➔ Rule name: auto_snapshot_using_LambdaFunction ➔ provide a rule description accordingly ➔ Rule type: Schedule expression…

A rule can be of 2 types: 1. Event pattern and 2. Schedule expression. A schedule expression can be written either as a cron expression or as a rate expression. Here, we are going to take snapshots every 2 minutes, which is far too frequent for real use, but for practice it lets us see the result quickly. In our last practical, we used cron, so this time we will try rate.

…in Schedule expression, type 'rate(2 minutes)' ➔ Add.

Now, we can see that a trigger for CloudWatch Events has been added.
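
For reference, the trigger can also be set up from code. Here is a rough sketch, assuming boto3 and configured credentials. The helper names `rate_expression` and `schedule_function` are hypothetical, and the function ARN has to be supplied by the caller. The calls used are `put_rule` and `put_targets` on the CloudWatch Events client and `add_permission` on the Lambda client.

```python
def rate_expression(value, unit="minute"):
    """Build a rate() schedule expression, e.g. rate(2 minutes)."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be minute, hour or day")
    # The unit is singular for 1 and plural otherwise.
    suffix = "" if value == 1 else "s"
    return "rate({0} {1}{2})".format(value, unit, suffix)


def schedule_function(events_client, lambda_client, function_arn,
                      rule_name="auto_snapshot_using_LambdaFunction",
                      expression="rate(2 minutes)"):
    """Create the rule, allow it to invoke the function, and attach it.

    events_client is boto3.client('events'); lambda_client is
    boto3.client('lambda').
    """
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )
    # Let CloudWatch Events invoke the function. This call fails if the
    # same StatementId was already added, so it is not safe to repeat.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + "_invoke",  # hypothetical statement id
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

For a daily backup, a cron expression such as `cron(0 1 * * ? *)` (01:00 UTC every day) could be passed as `expression` in place of the 2-minute practice rate.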

Step 7: Verify the snapshot:

AWS ➔ Services ➔ EC2 ➔ Snapshots.

We can see a snapshot is being created, which is in the pending state right now. After some time, it will move to the completed state.
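
The same check can be done from code. Here is a minimal sketch, assuming boto3 and configured credentials. The helper name `summarize_auto_snapshots` is hypothetical. It filters on the 'auto_snap' tag that the Lambda code puts on every snapshot, and for brevity it ignores pagination, so it only reports the first page of results.

```python
def summarize_auto_snapshots(ec2_client):
    """Return (snapshot id, volume id, state) for each auto_snap snapshot.

    ec2_client is expected to be boto3.client('ec2') for one region.
    """
    response = ec2_client.describe_snapshots(
        OwnerIds=["self"],  # only snapshots owned by this account
        Filters=[{"Name": "tag:auto_snap", "Values": ["true"]}],
    )
    return [
        (snap["SnapshotId"], snap["VolumeId"], snap["State"])
        for snap in response["Snapshots"]
    ]

# Usage (needs boto3 and AWS credentials):
#   import boto3
#   for snap_id, vol_id, state in summarize_auto_snapshots(boto3.client("ec2")):
#       print(snap_id, vol_id, state)
```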

Enjoy!



