AWS Data Wrangler

AWS Labs has a lot of open source code up on GitHub. In this post, we’re taking a look at AWS Data Wrangler. This project provides a smoother interface between python pandas DataFrames and various AWS Cloud data services.

I call out a few more details in the Twitter thread below…

AWS Data Wrangler is an interesting project from the AWS ProServe team

it aims to connect python pandas data frames to various AWS services

if you've been anywhere near a data science project, you've probably seen either the scipy, numpy, or pandas projects in python...or all three

they are awesome

one of the fundamental units of these projects is the DataFrame, pandas' labeled, two-dimensional table of data

the AWS Data Wrangler project lets you save DataFrames to various AWS data services

this could save a ton of time for your python projects
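To make that concrete, here is a minimal sketch of saving a DataFrame to Amazon S3 as Parquet. It assumes the library is installed from PyPI as `awswrangler`, that AWS credentials are already configured, and that the bucket exists. The bucket name, key, and both helper function names are made up for illustration and are not from the project.

```python
import pandas as pd


def build_sample_frame() -> pd.DataFrame:
    """Build a tiny DataFrame to stand in for real project data."""
    return pd.DataFrame(
        {
            "id": [1, 2, 3],
            "name": ["alpha", "beta", "gamma"],
            "score": [0.5, 0.75, 1.0],
        }
    )


def save_frame_to_s3(df: pd.DataFrame, path: str) -> None:
    """Write df to S3 as a single Parquet object at the given s3:// path.

    The import is kept local so this module still loads on machines
    where awswrangler is not installed.
    """
    import awswrangler as wr  # assumption: installed via `pip install awswrangler`

    wr.s3.to_parquet(df=df, path=path)


# Example call (hypothetical bucket and key, replace with your own):
# save_frame_to_s3(build_sample_frame(), "s3://my-example-bucket/demo/scores.parquet")
```

The point is the single `wr.s3.to_parquet` call: there is no manual serialization to a local file followed by a separate upload step.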

the repo has a broad spectrum of samples, all in Jupyter notebooks. I ❤️ that because it makes it easier to play with the code

find the tutorials at

this project installs via standard pip but is also available as a Lambda layer, in the AWS Glue shell, in AWS SageMaker Notebooks, and more

that flexibility is much appreciated

this library does more than just save and load data. the full API for it is up at

there are a lot of very useful data manipulation functions here
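As one illustration of going beyond plain save and load, the sketch below reads a Parquet object back into a DataFrame and runs an Amazon Athena SQL query straight into one. The choice of Athena is mine, not something the thread singles out. The helper names, paths, table, and database are hypothetical, and both calls need working AWS credentials plus existing resources.

```python
import pandas as pd


def load_frame_from_s3(path: str) -> pd.DataFrame:
    """Read a Parquet object (or prefix) from S3 into a DataFrame."""
    import awswrangler as wr  # local import: only needed when actually called

    return wr.s3.read_parquet(path=path)


def query_athena(sql: str, database: str) -> pd.DataFrame:
    """Run a SQL query on Athena and return the result set as a DataFrame."""
    import awswrangler as wr

    return wr.athena.read_sql_query(sql=sql, database=database)


# Example calls (hypothetical names, replace with your own):
# scores = load_frame_from_s3("s3://my-example-bucket/demo/scores.parquet")
# top = query_athena("SELECT * FROM scores WHERE score > 0.7", database="demo_db")
```

Either way the result is an ordinary pandas DataFrame, so the rest of a pandas, scipy, or numpy workflow does not need to change.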

all-in-all if you're using pandas, scipy, or numpy in your python project and your data is on AWS, you'll want to, at the very least, check out the AWS Data Wrangler at

