Google Cloud Storage filesystem for PyFilesystem2
=================================================
A Python filesystem abstraction of Google Cloud Storage (GCS) implemented as a `PyFilesystem2 <https://github.com/PyFilesystem/pyfilesystem2>`__ extension.
.. image:: https://img.shields.io/pypi/v/fs-gcsfs.svg
    :target: https://pypi.org/project/fs-gcsfs/

.. image:: https://img.shields.io/pypi/pyversions/fs-gcsfs.svg
    :target: https://pypi.org/project/fs-gcsfs/

.. image:: https://travis-ci.org/Othoz/gcsfs.svg?branch=master
    :target: https://travis-ci.org/Othoz/gcsfs

.. image:: https://readthedocs.org/projects/fs-gcsfs/badge/?version=latest
    :target: https://fs-gcsfs.readthedocs.io/en/latest/?badge=latest
With GCSFS, you can interact with `Google Cloud Storage <https://cloud.google.com/storage/>`__ as if it were a regular filesystem.
Apart from the nicer interface, this largely decouples your code from the underlying storage mechanism: exchanging the storage backend with an
`in-memory filesystem <https://pyfilesystem2.readthedocs.io/en/latest/reference/memoryfs.html>`__ for testing, or with any other
filesystem like `S3FS <https://github.com/pyfilesystem/s3fs>`__, becomes as easy as replacing ``gs://bucket_name`` with ``mem://`` or ``s3://bucket_name``.
For a full reference on all the PyFilesystem possibilities, take a look at the
`PyFilesystem Docs <https://pyfilesystem2.readthedocs.io/en/latest/index.html>`__!
Documentation
-------------

- `GCSFS Documentation <http://fs-gcsfs.readthedocs.io/en/latest/>`__
- `PyFilesystem Wiki <https://www.pyfilesystem.org>`__
- `PyFilesystem Reference <https://docs.pyfilesystem.org/en/latest/reference/base.html>`__
Installing
----------

Install the latest GCSFS version by running::

    $ pip install fs-gcsfs

Or, in case you are using conda::

    $ conda install -c conda-forge fs-gcsfs
Examples
--------

Instantiating a filesystem on Google Cloud Storage (for a full reference visit the
`Documentation <http://fs-gcsfs.readthedocs.io/en/latest/index.html#reference>`__):
.. code-block:: python

    from fs_gcsfs import GCSFS

    gcsfs = GCSFS(bucket_name="mybucket")
Alternatively, you can use a `FS URL <https://pyfilesystem2.readthedocs.io/en/latest/openers.html>`__ to open up a filesystem:
.. code-block:: python

    from fs import open_fs

    gcsfs = open_fs("gs://mybucket/root_path?project=test&api_endpoint=http%3A//localhost%3A8888&strict=False")
Supported query parameters are:

- ``project`` (str): Google Cloud project to use
- ``api_endpoint`` (str): URL-encoded endpoint that will be passed to the GCS client's `client_options <https://googleapis.dev/python/google-api-core/latest/client_options.html#google.api_core.client_options.ClientOptions>`__
- ``strict`` ("True" or "False"): Whether GCSFS will be opened in strict mode

You can use GCSFS like your local filesystem:
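Since the ``api_endpoint`` value must be URL-encoded before it is embedded in the FS URL, one way to build such a URL is with the standard library (a sketch; the endpoint and bucket name are illustrative):

```python
from urllib.parse import quote

# quote() percent-encodes ":" but leaves "/" alone by default, which
# yields exactly the encoded form used in the FS URL above.
endpoint = quote("http://localhost:8888")
fs_url = f"gs://mybucket/root_path?project=test&api_endpoint={endpoint}&strict=False"

print(fs_url)
# gs://mybucket/root_path?project=test&api_endpoint=http%3A//localhost%3A8888&strict=False
```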
.. code-block:: python

    >>> from fs_gcsfs import GCSFS
    >>> gcsfs = GCSFS(bucket_name="mybucket")
    >>> gcsfs.tree()
    ├── foo
    │   ├── bar
    │   │   ├── file1.txt
    │   │   └── file2.csv
    │   └── baz
    │       └── file3.txt
    └── file4.json
    >>> gcsfs.listdir("foo")
    ["bar", "baz"]
    >>> gcsfs.isdir("foo/bar")
    True
Uploading a file is as easy as:

.. code-block:: python

    from fs_gcsfs import GCSFS

    gcsfs = GCSFS(bucket_name="mybucket")
    with open("local/path/image.jpg", "rb") as local_file:
        with gcsfs.open("path/on/bucket/image.jpg", "wb") as gcs_file:
            gcs_file.write(local_file.read())
You can even sync an entire bucket to your local filesystem by using PyFilesystem's utility methods:

.. code-block:: python

    from fs_gcsfs import GCSFS
    from fs.osfs import OSFS
    from fs.copy import copy_fs

    gcsfs = GCSFS(bucket_name="mybucket")
    local_fs = OSFS("local/path")

    copy_fs(gcsfs, local_fs)
For exploring all the possibilities of GCSFS and other filesystems implementing the PyFilesystem interface, we recommend visiting the official
`PyFilesystem Docs <https://pyfilesystem2.readthedocs.io/en/latest/index.html>`__!
Development
-----------

To develop on this project, make sure you have `pipenv <https://pipenv.readthedocs.io/en/latest/>`__ installed
and run the following from the root directory of the project::

    $ pipenv install --dev --three

This will create a virtualenv with all packages and dev-packages installed.
All CI tests run against an actual GCS bucket provided by `Othoz <http://othoz.com/>`__.

In order to run the tests against your own bucket,
make sure to set up a `Service Account <https://cloud.google.com/iam/docs/service-accounts>`__ with all necessary permissions.
The required permissions are, e.g., included in the predefined `Cloud Storage IAM Role <https://cloud.google.com/storage/docs/access-control/iam-roles>`__ ``roles/storage.objectAdmin``.
Expose your bucket name as the environment variable ``$TEST_BUCKET`` and run the tests via::

    $ pipenv run pytest

Note that the tests mostly wait for I/O, so it makes sense to parallelize them heavily with `xdist <https://github.com/pytest-dev/pytest-xdist>`__, e.g. by running the tests with::

    $ pipenv run pytest -n 10
Credits
-------

Credits go to `S3FS <https://github.com/PyFilesystem/s3fs>`__, which was the main source of inspiration and shares a lot of code with GCSFS.