
How-to interact with S3 buckets from within your deployment

You may have your own S3 bucket from which you want to pull data into your deployment, and to which you subsequently want to push results back. This is possible because deployments allow external internet connections. In this How-to we show how to interact with your S3 bucket from within your deployment.

Setting up the right context

You can interact with your S3 bucket using the boto3 package. To this end, we add boto3 to the requirements.txt file in our deployment_package. Then, to be able to access the contents of our S3 bucket, we provide our deployment with the right parameters. We advise adding these parameters as environment variables, so that they can be changed without pushing a new deployment_package to our API. This also allows credentials to be encrypted.
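As a sketch, the requirements.txt inside the deployment_package could look as follows (pandas is included here because the snippets below use it; leave out anything your deployment does not need):

```
# requirements.txt inside the deployment_package
boto3
pandas
```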

Enabling interactions with your S3 bucket

After importing the os package using import os, we read the environment variables in the __init__ function of our deployment class, and subsequently create an s3 client that allows us to pull and push files from and to our S3 object storage:

def __init__(self):
    # Import credentials and other variables that are required to interact with your S3 bucket.
    AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
    AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
    AWS_REGION_NAME = os.environ['AWS_S3_REGION_NAME']
    self.BUCKET_NAME = os.environ['BUCKET_NAME']

    # Create the S3 client that allows interaction with the bucket.
    self.s3 = boto3.client(
        service_name='s3',
        region_name=AWS_REGION_NAME,
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY
    )

Reading from and writing to your S3 bucket

Say you have stored a 10 MB CSV file 10mb_test_file.csv inside your bucket test_bucket. One way of fetching the content of this CSV is by first loading the file as an object, and then reading the body of this object:

def request(self, data):
    # Requires import pandas as pd at the top of the deployment file.
    csv_obj = self.s3.get_object(Bucket=self.BUCKET_NAME,
                                 Key='10mb_test_file.csv')
    content = pd.read_csv(csv_obj['Body'], index_col=0)
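The Body returned by get_object is a file-like object, and pd.read_csv can consume any such stream. You can try the reading step locally without an S3 connection; here is a minimal sketch, where the sample data is made up for illustration:

```python
import io

import pandas as pd

# Stand-in for csv_obj['Body']: any file-like object behaves the same way.
fake_body = io.StringIO("id,value\n1,10\n2,20\n")

# Read the stream exactly as in the request() snippet above.
content = pd.read_csv(fake_body, index_col=0)

print(content)
```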

You may want to process the content of your test file and push the results back to your bucket as a CSV file. Say that you have stored your results in a file-like object called results, and want to send the results as a CSV to a path of choice:

def request(self, data):
    self.s3.upload_fileobj(Fileobj=results,
                           Bucket=self.BUCKET_NAME,
                           Key='path/results.csv')

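Note that upload_fileobj expects a binary file-like object. If your results live in a pandas DataFrame, one way to turn them into such an object is to encode the CSV text and wrap it in a buffer; a sketch, using an illustrative DataFrame:

```python
import io

import pandas as pd

# Illustrative results; in practice this is whatever request() computed.
results_df = pd.DataFrame({'score': [0.1, 0.9]}, index=['a', 'b'])

# Encode the CSV text to bytes and wrap it in a binary buffer,
# which is the kind of object upload_fileobj expects as Fileobj.
results = io.BytesIO(results_df.to_csv().encode('utf-8'))

# The buffer can now be passed to, e.g.:
# self.s3.upload_fileobj(Fileobj=results, Bucket=self.BUCKET_NAME, Key='path/results.csv')
```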
For more types of S3 interactions, see the Amazon S3 boto3 documentation.