S3 incomplete multipart uploads can occupy a lot of storage in S3, and deleting them can reclaim that space, which may lead to significant cost savings. As of now, there is no direct way in the AWS Console or S3 Storage Browser to get the list of incomplete multipart uploads. Discover how to efficiently identify and eliminate incomplete multipart uploads through simple, actionable steps. Let us see how to do that in the steps below.
Contents
- Authenticate AWS with Keys
- Get the Incomplete Multipart Uploads
- Iterate through S3 Incomplete Multipart Uploads
- Full Code
Authenticate AWS with Keys
The below code authenticates to AWS with an access key and secret key, and also sets the page size (the maximum number of results returned in each pagination request) as well as the maximum number of pagination requests.
import boto3
import csv
import sys
"""
Step 1: Getting the arguments: Access Key, Secret Key, S3 Bucket Name and CSV File Name
Step 2: Authenticating to AWS
Step 3: Getting the multipart uploads
Step 4: Iterating through the multipart uploads using a paginator
Step 5: Writing the list to a CSV file
"""
#Arguments: Access Key, Secret Key, Bucket Name and CSV File Name
#pagesize refers to the maximum number of results returned in each pagination request
#pages refers to the maximum number of pagination requests
accesskey=sys.argv[1]
secretaccesskey=sys.argv[2]
bucketname=sys.argv[3]
csvname=sys.argv[4]
pagesize=1000
pages=100
#Authentication
session=boto3.Session(aws_access_key_id=accesskey,aws_secret_access_key=secretaccesskey)
s3=session.client('s3')
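For reference, the finished script is run with the four arguments in the order read above. The invocation below is only an illustration; the script file name, key values and bucket name are placeholders, not real credentials.
#Hypothetical invocation (script name, keys and bucket name are placeholders)
python list_incomplete_multipart_uploads.py <ACCESS_KEY> <SECRET_KEY> my-example-bucket incomplete_uploads.csv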
Get the Incomplete Multipart Uploads
The below code connects to the S3 bucket passed as an argument and uses the list_multipart_uploads function, together with a paginator, to get the incomplete multipart uploads from that bucket.
#Quick look at the raw response for the bucket passed as an argument
data=s3.list_multipart_uploads(Bucket=bucketname)
print(data)
s3multipartdata=[]
#The paginator passes the NextKeyMarker between pages automatically,
#so no manual loop over markers is needed
paginator=s3.get_paginator('list_multipart_uploads')
responses=paginator.paginate(Bucket=bucketname,PaginationConfig={'PageSize':pagesize,'MaxItems':pagesize*pages})
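Each page returned by the paginator contains an Uploads list, and each entry carries the fields used in the next step plus a few more, such as the UploadId that is needed if you later want to abort the upload. Rather than reading the full raw response printed above, a quick sketch like the following prints just the relevant fields of the first page; it assumes the s3, bucketname and pagesize variables defined earlier.
#Quick check: print the key fields of the first page of incomplete multipart uploads
#Assumes s3, bucketname and pagesize are defined as in the snippets above
firstpage=s3.list_multipart_uploads(Bucket=bucketname,MaxUploads=pagesize)
for upload in firstpage.get('Uploads',[]):
    print(upload['Key'],upload['UploadId'],upload['Initiated'],upload['Initiator']['ID'])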
Iterate through S3 Incomplete Multipart Uploads
Now that we have the incomplete multipart uploads, iterate through the response pages to collect the necessary details for each upload, such as who initiated it, the key (written to the CSV as PartName) and the date it was initiated, and add them to a list. The same list is then written to a CSV file.
#Iterating through each response page and the Uploads it contains
try:
    for response in responses:
        #A page without the 'Uploads' key has no incomplete multipart uploads
        incompleteuploads=response.get('Uploads',[])
        for row in incompleteuploads:
            #There are also other fields returned in the response, such as UploadId and StorageClass
            initiator=row['Initiator']['ID']
            partname=row['Key']
            date=row['Initiated']
            s3multipartdata.append([initiator,partname,date])
except Exception as error:
    print(error)
#Writing to CSV
headers=['Initiator','PartName','Date']
with open(csvname,'w',newline='') as csvfile:
    filewriter=csv.writer(csvfile,delimiter=',')
    filewriter.writerow(headers)
    for row in s3multipartdata:
        filewriter.writerow(row)
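Listing is only the first half of the clean-up. Once you have reviewed the CSV and confirmed the uploads are abandoned, the same client can remove them and free the storage used by the already-uploaded parts. The snippet below is a minimal sketch, assuming the s3 client, paginator and bucketname defined above; it aborts each upload with abort_multipart_upload, which needs the bucket, the object key and the UploadId.
#Minimal sketch: aborting the incomplete multipart uploads found by the paginator
#Assumes s3, paginator and bucketname are defined as in the snippets above
for response in paginator.paginate(Bucket=bucketname):
    for upload in response.get('Uploads',[]):
        #abort_multipart_upload needs the bucket, the object key and the UploadId
        s3.abort_multipart_upload(Bucket=bucketname,Key=upload['Key'],UploadId=upload['UploadId'])
Make sure the listed uploads really are abandoned before aborting them; aborting an upload that is still in progress will cause it to fail.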
Full Code
The below code gives you the complete script, which produces a CSV file with the list of incomplete multipart uploads for a specific S3 bucket. (The access key, secret key, bucket name and CSV file name should be passed as arguments when executing this file.)
############################################################
## Created by @RamaneanAWS/@RamaneanTech
##
#############################################################
import boto3
import csv
import sys
"""
Step 1: Getting the arguments: Access Key, Secret Key, S3 Bucket Name and CSV File Name
Step 2: Authenticating to AWS
Step 3: Getting the multipart uploads
Step 4: Iterating through the multipart uploads using a paginator
Step 5: Writing the list to a CSV file
"""
#Arguments: Access Key, Secret Key, Bucket Name and CSV File Name
#pagesize refers to the maximum number of results returned in each pagination request
#pages refers to the maximum number of pagination requests
accesskey=sys.argv[1]
secretaccesskey=sys.argv[2]
bucketname=sys.argv[3]
csvname=sys.argv[4]
pagesize=1000
pages=100
#Authentication
session=boto3.Session(aws_access_key_id=accesskey,aws_secret_access_key=secretaccesskey)
s3=session.client('s3')
#Getting the incomplete multipart uploads for the bucket
data=s3.list_multipart_uploads(Bucket=bucketname)
print(data)
s3multipartdata=[]
#The paginator passes the NextKeyMarker between pages automatically
paginator=s3.get_paginator('list_multipart_uploads')
responses=paginator.paginate(Bucket=bucketname,PaginationConfig={'PageSize':pagesize,'MaxItems':pagesize*pages})
#Iterating through each response page and the Uploads it contains
try:
    for response in responses:
        #A page without the 'Uploads' key has no incomplete multipart uploads
        incompleteuploads=response.get('Uploads',[])
        #Getting details of each S3 incomplete multipart upload
        for row in incompleteuploads:
            #There are also other fields returned in the response, such as UploadId and StorageClass
            initiator=row['Initiator']['ID']
            partname=row['Key']
            date=row['Initiated']
            s3multipartdata.append([initiator,partname,date])
except Exception as error:
    print(error)
#Writing to CSV with the incomplete multipart upload details
headers=['Initiator','PartName','Date']
with open(csvname,'w',newline='') as csvfile:
    filewriter=csv.writer(csvfile,delimiter=',')
    filewriter.writerow(headers)
    for row in s3multipartdata:
        filewriter.writerow(row)
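If you would rather have S3 clean these up automatically going forward, a bucket lifecycle rule with AbortIncompleteMultipartUpload can abort uploads a set number of days after they were initiated. The snippet below is a minimal sketch using the same client; the rule ID and the seven-day threshold are placeholder choices, not values taken from this script.
#Minimal sketch: lifecycle rule that aborts incomplete multipart uploads automatically
#The rule ID and DaysAfterInitiation value are placeholder choices
s3.put_bucket_lifecycle_configuration(
    Bucket=bucketname,
    LifecycleConfiguration={
        'Rules':[{
            'ID':'abort-incomplete-multipart-uploads',
            'Filter':{'Prefix':''},
            'Status':'Enabled',
            'AbortIncompleteMultipartUpload':{'DaysAfterInitiation':7}
        }]
    }
)
Note that this call replaces any existing lifecycle configuration on the bucket, so merge this rule with your existing rules if the bucket already has some.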