MinIO the object storage server

Posted on Wed 09 October 2019 in python

MinIO

MinIO is an open source object storage server that is compatible with the Amazon S3 cloud storage service. This post discusses how to set it up as a Docker container and how to access it from Python.

Installation of MinIO using Docker

Before installing MinIO, make sure that Docker is installed (see Docker Install Overview) and that your user is in the docker group.

Then pull the minio Docker image, generate an access key and a secret key, and finally run the Docker container:

# pull the required minio docker image
docker pull minio/minio

# create directory where data is stored on the host (this is up to you)
mkdir -p ${HOME}/.minio/data

# generate an access key and a secret key
export MINIO_ACCESS_KEY="$(pwgen -c1 20)"
export MINIO_SECRET_KEY="$(pwgen -c1 38)"
printf "AccessKey: %s\nSecretKey: %s\n" "${MINIO_ACCESS_KEY}" "${MINIO_SECRET_KEY}"

# run the minio docker container with generated keys and link the volume
docker run -d -p 9000:9000 --name minio1 \
    -e MINIO_ACCESS_KEY \
    -e MINIO_SECRET_KEY \
    -v ${HOME}/.minio/data:/data \
    minio/minio server /data

You can now log in to the web interface at http://127.0.0.1:9000 using the access key and secret key generated above.
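To check from a script that the container is up before logging in, one option is to query MinIO's liveness endpoint. The following is only a minimal sketch: the helper name is_minio_live and the timeout value are illustrative assumptions, and it relies on the /minio/health/live health check path, which may not be available on very old server versions.

```python
import urllib.error
import urllib.request


def is_minio_live(base_url="http://127.0.0.1:9000", timeout=2.0):
    """Return True if the MinIO liveness endpoint answers with HTTP 200."""
    # assumption: the server exposes the /minio/health/live health check path
    url = base_url.rstrip("/") + "/minio/health/live"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # connection refused, timeout, or an HTTP error status
        return False
```

Calling is_minio_live() without arguments checks the address used in the docker run command above.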

For further information about the setup see the MinIO Quickstart Guide.

Using MinIO with Python

To access MinIO from Python, the corresponding minio Python module must be installed. This can be done using pip:

pip install minio

The MinIO server supports many operations. The following covers some of the basic ones to give an idea of how to use the store.

First, the basic setup as well as the creation of a first bucket is shown.

import io
import json

import minio

# set the previously generated keys
ACCESS_KEY = "<your access key>"
SECRET_KEY = "<your secret key>"

# prepare the minio client
minio_client = minio.Minio(
    "127.0.0.1:9000",
    access_key=ACCESS_KEY,
    secret_key=SECRET_KEY,
    secure=False             # in production use https!
)

# create a new bucket, if not existing yet
if not minio_client.bucket_exists("mybucket"):
    minio_client.make_bucket("mybucket", location="eu-west-1")

# print all available buckets
bucket_names = [
    bucket.name
    for bucket in minio_client.list_buckets()
]
print(bucket_names)
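Instead of pasting the keys into the script, they could be read from the environment variables exported during the Docker setup. This is a small sketch under that assumption; the helper name load_minio_credentials is hypothetical and not part of the minio module.

```python
import os


def load_minio_credentials(env=None):
    """Read the access key and secret key from environment variables."""
    # assumption: the keys are exported as MINIO_ACCESS_KEY / MINIO_SECRET_KEY
    env = os.environ if env is None else env
    try:
        return env["MINIO_ACCESS_KEY"], env["MINIO_SECRET_KEY"]
    except KeyError as exc:
        missing = exc.args[0]
        raise RuntimeError(f"missing environment variable: {missing}") from None
```

When the script runs in the same shell session in which the keys were exported, ACCESS_KEY, SECRET_KEY = load_minio_credentials() can replace the two hard-coded assignments.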

Files can easily be put into the bucket and stored under a chosen object name:

minio_client.fput_object(
    bucket_name="mybucket",
    object_name="yourfile.txt",
    file_path="/path/to/yourfile.txt"
)
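The same call can be repeated to upload a whole directory tree. The sketch below is one possible way to do this; the function upload_directory and its prefix parameter are illustrative assumptions, and the client is passed in explicitly so that any object offering fput_object (such as minio_client) can be used.

```python
import os


def upload_directory(client, bucket_name, local_dir, prefix=""):
    """Upload every file below local_dir and return the object names used."""
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for filename in sorted(files):
            file_path = os.path.join(root, filename)
            # object names always use "/" as separator, regardless of the OS
            relative = os.path.relpath(file_path, local_dir).replace(os.sep, "/")
            object_name = prefix + relative
            client.fput_object(
                bucket_name=bucket_name,
                object_name=object_name,
                file_path=file_path,
            )
            uploaded.append(object_name)
    return uploaded
```

For example, upload_directory(minio_client, "mybucket", "/path/to/dir", prefix="docs/") would store /path/to/dir/a.txt as docs/a.txt.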

There is no dedicated function to check whether an object exists in a bucket; however, a query for the object's metadata can be used instead:

minio_client.stat_object(
    bucket_name="mybucket",
    object_name="yourfile.txt"
)

When the object does not exist, a minio.error.NoSuchKey exception is raised; otherwise the object's metadata is returned. (Newer releases of the minio module, 7.x and later, report this case as a minio.error.S3Error whose code is NoSuchKey.)

To fetch the object from the bucket and store it at the given file path:
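The metadata query can be wrapped into a small existence check. This is a sketch, not part of the minio module: the name object_exists is hypothetical, and the error handling assumes that a missing object is signalled either by an exception class called NoSuchKey or by an exception carrying code == "NoSuchKey", depending on the installed module version.

```python
def object_exists(client, bucket_name, object_name):
    """Return True if stat_object succeeds, False if the object is missing."""
    try:
        client.stat_object(bucket_name, object_name)
        return True
    except Exception as exc:
        # assumption: a missing key is reported either via an exception class
        # named NoSuchKey or via an error whose code attribute is "NoSuchKey"
        is_missing = (
            type(exc).__name__ == "NoSuchKey"
            or getattr(exc, "code", None) == "NoSuchKey"
        )
        if is_missing:
            return False
        # anything else (network problems, permissions, ...) is not swallowed
        raise
```

With the client from above, object_exists(minio_client, "mybucket", "yourfile.txt") then yields a plain boolean.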

# get object from the bucket
data = minio_client.fget_object(
    bucket_name="mybucket",
    object_name="yourfile.txt",
    file_path="/tmp/local_yourfile.txt"
)
print(data)

Putting an arbitrary object into a bucket is a little harder, since the data must be provided as a byte stream together with its length. For instance, to store a dictionary as JSON in the bucket:

def put_json(bucket_name, object_name, d):
    """
    jsonify a dict and write it as object to the bucket
    """
    # prepare data and corresponding data stream
    data = json.dumps(d).encode("utf-8")
    data_stream = io.BytesIO(data)
    data_stream.seek(0)

    # put data as object into the bucket
    minio_client.put_object(
        bucket_name=bucket_name,
        object_name=object_name,
        data=data_stream, length=len(data),
        content_type="application/json"
    )

# put the dictionary into the bucket
put_json("mybucket", "json/test.json", {"test": "me"})

Notice that the object name contains the path prefix json/. Path prefixes can be used to hierarchically structure the bucket. In this example, the json data will be stored in a separate subdirectory within the bucket.

To get the object back from the bucket do:

def get_json(bucket_name, object_name):
    """
    get stored json object from the bucket
    """
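    response = minio_client.get_object(bucket_name, object_name)
    try:
        # read the whole body and parse it as JSON
        return json.loads(response.data)
    finally:
        # release the underlying HTTP connection
        response.close()
        response.release_conn()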

j = get_json("mybucket", "json/test.json")
print(j)

To iterate over all objects stored under the json/ prefix of the bucket you can simply do:

for obj in minio_client.list_objects(
    "mybucket", prefix='json/', recursive=True
):
    print(obj.object_name, obj.size)
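The same listing can be used to summarise what is stored under a prefix. The following sketch counts the objects and adds up their sizes; the function name prefix_usage is an illustrative assumption, and the client is passed in so that any object offering list_objects can be used.

```python
def prefix_usage(client, bucket_name, prefix):
    """Return (object count, total size in bytes) below a prefix."""
    count = 0
    total_bytes = 0
    for obj in client.list_objects(bucket_name, prefix=prefix, recursive=True):
        count += 1
        # treat a missing size as zero bytes
        total_bytes += obj.size or 0
    return count, total_bytes
```

For instance, count, size = prefix_usage(minio_client, "mybucket", "json/") reports how much data lives in the json/ part of the bucket.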

Of course you can also delete an object from a bucket:

minio_client.remove_object("mybucket", "json/test.json")

To remove a bucket do:

minio_client.remove_bucket("mybucket")

Note that only empty buckets, i.e. buckets without any objects, can be removed.
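A bucket that still contains objects can therefore be emptied first. The sketch below combines the calls shown above; the function name remove_bucket_with_contents is hypothetical, and since the deletion cannot be undone it should be used with care.

```python
def remove_bucket_with_contents(client, bucket_name):
    """Delete every object in the bucket, then the bucket itself."""
    # collect the names first so that deleting does not interfere
    # with the ongoing listing
    object_names = [
        obj.object_name
        for obj in client.list_objects(bucket_name, recursive=True)
    ]
    for object_name in object_names:
        client.remove_object(bucket_name, object_name)
    client.remove_bucket(bucket_name)
    # number of objects that were deleted
    return len(object_names)
```

Called as remove_bucket_with_contents(minio_client, "mybucket"), it returns how many objects were removed before the bucket itself was deleted.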

The functionality presented here is of course only a fraction of what is available. Please consult the complete Python client API reference for more information.