Othoz Gcsfs Save

Google Cloud Storage filesystem for PyFilesystem2

Project README

GCSFS

A Python filesystem abstraction of Google Cloud Storage (GCS) implemented as a PyFilesystem2 <https://github.com/PyFilesystem/pyfilesystem2>__ extension.

.. image:: https://img.shields.io/pypi/v/fs-gcsfs.svg :target: https://pypi.org/project/fs-gcsfs/

.. image:: https://img.shields.io/pypi/pyversions/fs-gcsfs.svg :target: https://pypi.org/project/fs-gcsfs/

.. image:: https://travis-ci.org/Othoz/gcsfs.svg?branch=master :target: https://travis-ci.org/Othoz/gcsfs

.. image:: https://readthedocs.org/projects/fs-gcsfs/badge/?version=latest :target: https://fs-gcsfs.readthedocs.io/en/latest/?badge=latest

With GCSFS, you can interact with Google Cloud Storage <https://cloud.google.com/storage/>__ as if it was a regular filesystem.

Apart from the nicer interface, this will highly decouple your code from the underlying storage mechanism: Exchanging the storage backend with an in-memory filesystem <https://pyfilesystem2.readthedocs.io/en/latest/reference/memoryfs.html>__ for testing or any other filesystem like S3FS <https://github.com/pyfilesystem/s3fs>__ becomes as easy as replacing gs://bucket_name with mem:// or s3://bucket_name.

For a full reference on all the PyFilesystem possibilities, take a look at the PyFilesystem Docs <https://pyfilesystem2.readthedocs.io/en/latest/index.html>__!

Documentation

  • GCSFS Documentation <http://fs-gcsfs.readthedocs.io/en/latest/>__
  • PyFilesystem Wiki <https://www.pyfilesystem.org>__
  • PyFilesystem Reference <https://docs.pyfilesystem.org/en/latest/reference/base.html>__

Installing

Install the latest GCSFS version by running::

$ pip install fs-gcsfs

Or in case you are using conda::

$ conda install -c conda-forge fs-gcsfs

Examples

Instantiating a filesystem on Google Cloud Storage (for a full reference visit the Documentation <http://fs-gcsfs.readthedocs.io/en/latest/index.html#reference>__):

.. code-block:: python

from fs_gcsfs import GCSFS
gcsfs = GCSFS(bucket_name="mybucket")

Alternatively you can use a FS URL <https://pyfilesystem2.readthedocs.io/en/latest/openers.html>__ to open up a filesystem:

.. code-block:: python

from fs import open_fs
gcsfs = open_fs("gs://mybucket/root_path?project=test&api_endpoint=http%3A//localhost%3A8888&strict=False")

Supported query parameters are:

  • project (str): Google Cloud project to use
  • api_endpoint (str): URL-encoded endpoint that will be passed to the GCS client's client_options <https://googleapis.dev/python/google-api-core/latest/client_options.html#google.api_core.client_options.ClientOptions>__
  • strict ("True" or "False"): Whether GCSFS will be opened in strict mode

You can use GCSFS like your local filesystem:

.. code-block:: python

>>> from fs_gcsfs import GCSFS
>>> gcsfs = GCSFS(bucket_name="mybucket")
>>> gcsfs.tree()
├── foo
│   ├── bar
│   │   ├── file1.txt
│   │   └── file2.csv
│   └── baz
│       └── file3.txt
└── file4.json
>>> gcsfs.listdir("foo")
["bar", "baz"]
>>> gcsfs.isdir("foo/bar")
True

Uploading a file is as easy as:

.. code-block:: python

from fs_gcsfs import GCSFS
gcsfs = GCSFS(bucket_name="mybucket")
with open("local/path/image.jpg", "rb") as local_file:
    with gcsfs.open("path/on/bucket/image.jpg", "wb") as gcs_file:
        gcs_file.write(local_file.read())

You can even sync an entire bucket on your local filesystem by using PyFilesystem's utility methods:

.. code-block:: python

from fs_gcsfs import GCSFS
from fs.osfs import OSFS
from fs.copy import copy_fs

gcsfs = GCSFS(bucket_name="mybucket")
local_fs = OSFS("local/path")

copy_fs(gcsfs, local_fs)

For exploring all the possibilities of GCSFS and other filesystems implementing the PyFilesystem interface, we recommend visiting the official PyFilesystem Docs <https://pyfilesystem2.readthedocs.io/en/latest/index.html>__!

Development

To develop on this project make sure you have pipenv <https://pipenv.readthedocs.io/en/latest/>__ installed and run the following from the root directory of the project::

$ pipenv install --dev --three

This will create a virtualenv with all packages and dev-packages installed.

Tests

All CI tests run against an actual GCS bucket provided by Othoz <http://othoz.com/>__.

In order to run the tests against your own bucket, make sure to set up a Service Account <https://cloud.google.com/iam/docs/service-accounts>__ with all necessary permissions:

  • storage.objects.get
  • storage.objects.list
  • storage.objects.create
  • storage.objects.update
  • storage.objects.delete

All five permissions listed above are e.g. included in the predefined Cloud Storage IAM Role <https://cloud.google.com/storage/docs/access-control/iam-roles>__ roles/storage.objectAdmin.

Expose your bucket name as an environment variable $TEST_BUCKET and run the tests via::

$ pipenv run pytest

Note that the tests mostly wait for I/O, therefore it makes sense to highly parallelize them with xdist <https://github.com/pytest-dev/pytest-xdist>__, e.g. by running the tests with::

$ pipenv run pytest -n 10

Credits

Credits go to S3FS <https://github.com/PyFilesystem/s3fs>__ which was the main source of inspiration and shares a lot of code with GCSFS.

Open Source Agenda is not affiliated with "Othoz Gcsfs" Project. README Source: Othoz/gcsfs
Stars
42
Open Issues
11
Last Commit
6 months ago
Repository
License
MIT

Open Source Agenda Badge

Open Source Agenda Rating