Azure Storage Blob - How to List Blobs and Download Blobs from an Azure Storage Container in Python (PyPI libs)
February 03, 2021
Introduction
In this tutorial we will see:
- How to instantiate different classes required for talking to Azure storage container
- How to authenticate
  - if we have account key
  - if we have sas_token
  - No Auth (Just container name)
- How to use with Proxy
- List Blobs for a storage container
  - Attributes of each blob object
  - Download blob
And, everything will be in Python
Prerequisites
This tutorial is based on Python 3.7.
PyPI Dependency
We require the azure-storage-blob package. The code is tested with version 12.7.1.
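Since the code here was tested against 12.7.1, it can help to confirm which version is installed before running it. A minimal sketch, assuming Python 3.8+ (where importlib.metadata is in the standard library; on 3.7 the importlib_metadata backport offers the same call). The helper name installed_version is only illustrative:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("azure-storage-blob")
if found is None:
    print("azure-storage-blob is not installed; try: pip install azure-storage-blob")
else:
    print(f"azure-storage-blob {found} is installed (tutorial tested with 12.7.1)")
```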
How to Authenticate and Instantiate
from azure.storage.blob import BlobServiceClient

# consider a dictionary container
container = {
    'account_name': 'your_account_name',
    'container_name': 'your_container_name',
    'sas_token': 'xxxxxxx'
}

# Build the account URL from the account name
account_url = f'https://{container["account_name"]}.blob.core.windows.net'

if "account_key" in container:
    blob_service = BlobServiceClient(
        account_url=account_url, credential=container["account_key"])
elif "sas_token" in container:
    blob_service = BlobServiceClient(
        account_url=account_url, credential=container["sas_token"])
else:
    # No credential: anonymous access
    blob_service = BlobServiceClient(account_url=account_url)

# Now get the container client, which has the list_blobs method
container_client = blob_service.get_container_client(container['container_name'])
In the code above, we simply instantiate the client classes required for the operation and authenticate.
In my example, I use a sas_token.
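The selection logic can also be pulled into small helpers so that the credential order is explicit and easy to test. This is only a sketch and not part of the original code: the names account_url_for and resolve_credential are hypothetical, and it assumes the public-cloud endpoint pattern https://<account>.blob.core.windows.net.

```python
def account_url_for(account_name):
    """Build the blob endpoint URL for a storage account (public-cloud pattern)."""
    return f"https://{account_name}.blob.core.windows.net"


def resolve_credential(container):
    """Pick a credential from the container dict.

    The order mirrors the tutorial: account key first, then SAS token,
    and None when neither is present (anonymous access).
    """
    if "account_key" in container:
        return container["account_key"]
    if "sas_token" in container:
        return container["sas_token"]
    return None


container = {
    "account_name": "your_account_name",
    "container_name": "your_container_name",
    "sas_token": "xxxxxxx",
}

print(account_url_for(container["account_name"]))
print(resolve_credential(container))

# With azure-storage-blob installed, the two values plug into the client:
# from azure.storage.blob import BlobServiceClient
# blob_service = BlobServiceClient(
#     account_url=account_url_for(container["account_name"]),
#     credential=resolve_credential(container),
# )
```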
Complete Example for Listing and Downloading Blobs (with Proxy Configuration)
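import os
from azure.storage.blob import BlobServiceClient

def _create_dirs(dest_path):
    """Make sure dest_path exists and is a directory."""
    if not os.path.exists(dest_path):
        os.makedirs(dest_path)
    elif not os.path.isdir(dest_path):
        # A regular file is in the way: remove it, then create the directory
        os.remove(dest_path)
        os.makedirs(dest_path)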
def _get_container_service(container):
    """Return a ContainerClient for the given container settings."""
    account_url = f'https://{container["account_name"]}.blob.core.windows.net'
    proxies = None
    if 'proxy' in container:
        # Proxies are keyed by URL scheme. The account URL is https, so the
        # proxy is registered for both schemes to make sure it is applied.
        proxies = {'http': container['proxy'], 'https': container['proxy']}
    # If 'proxy' isn't specified in container block, check if 'https_proxy' is set.
    elif 'https_proxy' in container:
        proxies = {'https': container['https_proxy']}
    # instantiate based upon credential
    if "account_key" in container:
        blob_service = BlobServiceClient(
            account_url=account_url, credential=container["account_key"], proxies=proxies)
    elif "sas_token" in container:
        blob_service = BlobServiceClient(
            account_url=account_url, credential=container["sas_token"], proxies=proxies)
    else:
        blob_service = BlobServiceClient(account_url=account_url, proxies=proxies)
    return blob_service.get_container_client(container['container_name'])
def download_blobs(container, dest_path):
    ## You might want to handle some exceptions here
    _create_dirs(dest_path)
    # Get the container client instance
    container_client = _get_container_service(container)
    # Note: list_blobs returns an iterator
    blob_list = container_client.list_blobs()
    for blob in blob_list:
        fname = os.path.join(dest_path, blob.name)
        print(f'Downloading {blob.name} to {fname}')
        # get blob client which has download_blob method
        blob_client = container_client.get_blob_client(blob)
        # create base dirs if they do not exist
        _create_dirs(os.path.dirname(fname))
        with open(fname, "wb") as download_file:
            download_file.write(blob_client.download_blob().readall())
## main starts here
local_dest_path = './container_blob'
container = {
    'account_name': 'your_account_name',
    'container_name': 'your_container_name',
    'sas_token': 'xxxxxxx'
}
download_blobs(container, local_dest_path)
The script above is straightforward. My container has nested directories and files; the code iterates over all blobs and downloads them one by one, recreating the directory structure locally.
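For large blobs or large containers, two small variations may be useful: listing only blobs under a name prefix, and streaming each blob to disk instead of holding it in memory with readall(). The sketch below assumes the name_starts_with parameter of list_blobs and the readinto() method of the object returned by download_blob(), both of which exist in azure-storage-blob 12.x as far as I know. The function names local_path_for and download_with_prefix are hypothetical. It also refuses blob names that would resolve outside the destination directory.

```python
import os


def local_path_for(dest_path, blob_name):
    """Map a blob name such as 'a/b/c.txt' to a path under dest_path."""
    root = os.path.abspath(dest_path)
    target = os.path.abspath(os.path.join(root, *blob_name.split("/")))
    # Reject names (e.g. containing '..') that would escape the destination
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"Blob name escapes destination: {blob_name!r}")
    return target


def download_with_prefix(container_client, dest_path, prefix=None):
    """Download blobs whose names start with prefix; return the local paths."""
    written = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        fname = local_path_for(dest_path, blob.name)
        os.makedirs(os.path.dirname(fname), exist_ok=True)
        blob_client = container_client.get_blob_client(blob.name)
        with open(fname, "wb") as download_file:
            # Stream into the file rather than loading the whole blob in memory
            blob_client.download_blob().readinto(download_file)
        written.append(fname)
    return written
```

Here container_client would be the object returned by _get_container_service(container), for example download_with_prefix(container_client, './container_blob', prefix='fdg/').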
Attributes of a Blob object
{
    'name': 'fdg/cert_discovery.fdg',
    'snapshot': None,
    'content': None,
    'properties': {
        'blob_type': 'BlockBlob',
        'last_modified': datetime.datetime(2019, 12, 2, 9, 42, 50, tzinfo=tzutc()),
        'etag': '0x8D7770BFF1CC8A1',
        'content_length': 423,
        'content_range': None,
        'append_blob_committed_block_count': None,
        'page_blob_sequence_number': None,
        'server_encrypted': True,
        'copy': {
            'id': None,
            'source': None,
            'status': None,
            'progress': None,
            'completion_time': None,
            'status_description': None
        },
        'content_settings': {
            'content_type': 'application/octet-stream',
            'content_encoding': None,
            'content_language': None,
            'content_disposition': None,
            'cache_control': None,
            'content_md5': '3ycLC3CutKkybJtlgvEdsQ=='
        },
        'lease': {
            'status': 'unlocked',
            'state': 'available',
            'duration': None
        },
        'blob_tier': None,
        'blob_tier_change_time': None,
        'blob_tier_inferred': False,
        'deleted_time': None,
        'remaining_retention_days': None,
        'creation_time': datetime.datetime(2019, 11, 28, 11, 52, 5, tzinfo=tzutc())
    },
    'metadata': None,
    'deleted': False
}
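The content_length and content_md5 attributes can be used to sanity-check a downloaded file. A minimal sketch, assuming a properties dictionary shaped like the 'properties' entry in the dump above, where content_md5 is the base64-encoded MD5 digest of the blob. Some SDK versions may expose the digest as raw bytes instead, which the helper also accepts. The name verify_download is hypothetical.

```python
import base64
import hashlib
import os


def verify_download(path, properties):
    """Check a local file against a blob's recorded size and MD5 (if present)."""
    expected_size = properties.get("content_length")
    if expected_size is not None and os.path.getsize(path) != expected_size:
        return False

    content_md5 = (properties.get("content_settings") or {}).get("content_md5")
    if content_md5 is None:
        return True  # no hash recorded, so only the size could be checked
    if isinstance(content_md5, str):
        # Assumption: strings hold the base64 form, as in the dump above
        content_md5 = base64.b64decode(content_md5)

    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large files are not read into memory at once
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.digest() == bytes(content_md5)
```

A False result suggests the file is truncated or corrupted and should be downloaded again.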
Usage with only Python library, not Azure libraries
For usage without Azure libraries, see: List and Download Azure blobs by Python Libraries
Let me know if you face any difficulties, and I will try to resolve them.