Connecting Users and Data
Galaxy has countless ways for users to connect with things that might be considered their “data” - file sources (aka “remote files”), object stores (aka “storage locations”), data libraries, the upload API, visualizations, display applications, custom tools, etc…
This document discusses the two of these that are most important to Galaxy administrators (file sources and object stores) and how to build Galaxy configurations that let users tie into various pieces of infrastructure, both local and publicly available.
Datasets vs Files
File sources in Galaxy are a sprawling concept, but essentially they provide users access to simple files (stored hierarchically in folders) that can be navigated and imported into Galaxy. Importing a “file” into Galaxy generally creates a copy of that file in a Galaxy “object store”. Once these files are stored in Galaxy, they become “datasets”. A Galaxy dataset is much more than a simple file - Galaxy datasets include various generic metadata, a datatype, datatype-specific metadata, and ownership and sharing rules managed by Galaxy.
Galaxy object stores (called “storage locations” in the UI) store datasets. Global object stores (accessible to all users) are configured with the galaxy.yml property object_store_config_file (or object_store_config for a configuration embedded right in galaxy.yml), which defaults to object_store_conf.xml or object_store_conf.yml if either is present in Galaxy’s configuration directory. Galaxy file sources provide users access to raw files. Global file sources are configured with the galaxy.yml property file_sources_config_file (or file_sources for embedded configurations), which defaults to file_sources_conf.yml if that file is present in Galaxy’s configuration directory.
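As a rough sketch of the two properties named above, a galaxy.yml fragment might look something like the following. The paths are hypothetical examples, not defaults; only the property names come from this document.

```yaml
# galaxy.yml (fragment) - paths are hypothetical examples
galaxy:
  # Global object stores ("storage locations"), available to all users.
  object_store_config_file: /srv/galaxy/config/object_store_conf.yml
  # Global file sources, available to all users.
  file_sources_config_file: /srv/galaxy/config/file_sources_conf.yml
```

If neither property is set, the defaults described above apply when the corresponding file exists in the configuration directory.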
Some of Galaxy’s most up-to-date and complete administrator documentation can be found in configuration sample files - this is definitely the case for object stores and file sources. The relevant sample configuration files include file_sources_conf.yml.sample and object_store_conf.sample.yml.
File sources and object stores configured with the above files are essentially available to all users of your Galaxy instance - hence this document describes them as “global” file sources and object stores. File source configurations do allow some templating, which lets a global file source be materialized differently for different users. For instance, you as an admin may set up a Dropbox file source and explicitly add custom user properties that allow that single Dropbox file source to read from a user’s preferences. Since there is just one Dropbox service and most people have only a single Dropbox account, this use case can be somewhat adequately addressed by the global file source and the global user preferences file. For a use case like Amazon S3 buckets, though, a single bucket file source that is parameterized one way is more clearly inadequate. For instance, users would very likely want to attach different buckets for different projects. Additionally, the Galaxy user interface doesn’t tie the user preferences to the particular file source, so this method introduces a significant education burden on your Galaxy instance. Finally, the templating available to file sources is not available for object stores - and allowing users to describe how they would like datasets stored and to pay for their own dataset storage are important use cases.
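To make the Dropbox case concrete, here is a hedged sketch of a global file source that reads a token from user preferences. The entry is modeled on the style of file_sources_conf.yml.sample; the id, label, and the preference key `dropbox|access_token` are assumptions for illustration and should be checked against the sample file for your Galaxy version.

```yaml
# file_sources_conf.yml (fragment) - illustrative sketch, not a verified configuration
- type: dropbox
  id: dropbox_example            # hypothetical id
  label: Dropbox files (configure access in user preferences)
  doc: Your Dropbox files, using a token stored in your user preferences.
  # Assumed preference key; the admin must separately declare a matching
  # entry in the global user preferences file.
  access_token: ${user.preferences['dropbox|access_token']}
```

Note that nothing in this entry tells the user where to enter the token, which is the education burden described above.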
This document describes Galaxy configuration template libraries that allow the administrator to set up templates for file sources and object stores that users may instantiate as they see fit. Users can instantiate multiple instances of any template, the template concept applies to both file source and object store plugins, and the user interface is unified and built from the template configuration file (you as the admin do not need to explicitly declare user preferences and your users do not need to navigate seemingly unrelated preferences to get plugins to work).
Object Store Templates
Galaxy’s object store templates are configured as a YAML list of template objects. This list can be placed in object_store_templates.yml in Galaxy’s configuration directory (or any path pointed to by the configuration option object_store_templates_config_file in galaxy.yml). Alternatively, the configuration can be placed directly into galaxy.yml using the object_store_templates configuration option.
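The two placement options just described could be sketched as follows. The file path is a hypothetical example, and the embedded template is the minimal disk template used elsewhere in this document.

```yaml
# Option 1 (galaxy.yml): point at a separate templates file.
# The path is a hypothetical example.
galaxy:
  object_store_templates_config_file: /srv/galaxy/config/object_store_templates.yml
---
# Option 2 (galaxy.yml): embed the template list directly.
galaxy:
  object_store_templates:
    - id: project_scratch
      name: Project Scratch
      version: 0
      description: Folder on institutional scratch disk area bound to your user.
      variables:
        project_name:
          type: path_component
          help: The name of your project scratch.
      configuration:
        type: disk
        files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
```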
Warning
Object store selection within Galaxy is available only when the primary object store is a distributed object store. All the other object stores provide stronger guarantees about how datasets are stored. Object store templates will not currently work if the primary object store (usually defined by object_store_config_file) is a simple disk object store, a hierarchical object store, or anything other than a distributed object store.
Ongoing discussion on this topic can be found at this Galaxy Discussions post (#18157).
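For orientation, a primary object store that satisfies this requirement might be sketched like this. The key names follow the style of object_store_conf.sample.yml as best understood here; the backend id and path are hypothetical, and the sample file remains the authoritative reference.

```yaml
# object_store_conf.yml - hypothetical sketch of a distributed primary object store
type: distributed
backends:
  - id: default                  # hypothetical backend id
    type: disk
    weight: 1
    files_dir: /srv/galaxy/database/objects   # hypothetical path
```

Even a single-backend distributed store like this one is enough to move away from a plain disk object store as the primary store.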
A minimal object store template might look something like:
- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
      - type: faster
      - type: less_secure
      - type: not_backed_up
Object Store Types
disk
This is the most basic sort of object store template: it just makes disk paths available to users for storing data. Paths can be built up from user-supplied variables, user details, supplied environment variables, etc. The simple example just uses a user-supplied project name and the user’s username to produce a unique path for each user-defined object store.
- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
      - type: faster
      - type: less_secure
      - type: not_backed_up
These sorts of object stores have no quota so be careful.
At runtime, after the configuration section of a disk template is expanded, the resulting dictionary is passed to Galaxy’s object store infrastructure and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
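As an illustration of that expansion, suppose a user with the hypothetical username alice instantiates the project_scratch template with the hypothetical project name rnaseq. The core of the expanded configuration would then be roughly:

```yaml
# Expanded disk configuration (sketch; "alice" and "rnaseq" are made-up values)
type: disk
files_dir: /scratch/for_galaxy/alice/rnaseq
```

Because the path is derived from both the username and the project name, two users choosing the same project name still end up in separate directories.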
boto3
Object stores of the type boto3 can be used to access a wide variety of S3-compatible storage services, including AWS S3. How you template them can result in widely different experiences for your users and can address a wide variety of use cases.
Here is an example that is tailored for a specific storage service (e.g. CloudFlare R2) and exposes just the pieces of data CloudFlare users would need.
# https://developers.cloudflare.com/r2/examples/aws/boto3/
- id: cloudflare
  version: 0
  name: CloudFlare R2
  description: |
    This template can be used to connect to your [CloudFlare R2](https://developers.cloudflare.com/r2/)
    storage. To use these templates you will need to generate
    [CloudFlare R2 access tokens](https://developers.cloudflare.com/r2/api/s3/tokens/).
    Following that tutorial, you should have an "Account ID", an "Access Key ID", and a
    "Secret Access Key".
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        An Access Key ID generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
    account_id:
      label: Account ID
      type: string
      help: |
        Your account ID as available in the [CloudFlare dashboard](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/).
    bucket:
      label: Bucket
      type: string
      help: |
        The name of a bucket you've created to store your Galaxy data. Documentation for how to create buckets
        can be found in [this part of the CloudFlare R2 documentation](https://developers.cloudflare.com/r2/buckets/create-buckets/).
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        A Secret Access Key generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: 'https://{{ variables.account_id }}.r2.cloudflarestorage.com/'
Templates can be much more generic or much less generic than this.
In one direction, all the bells and whistles could be exposed to your Galaxy users to allow them to connect to any S3 compatible storage. This requires a lot more sophistication from your users but also allows them to connect to many more services. This template is available here:
- id: generic_s3
  version: 0
  name: Any S3 Compatible Storage
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3 - Galaxy calls these ``generic_s3``
    storage locations. This template configuration allows using such a service as a Galaxy storage
    location as long as you are able to find the connection details and have the relevant credentials.
    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describes the user
        that is accessing the data. The [Amazon documentation] calls these an "access key ID",
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``. Internally to Galaxy, we often just call
        this the ``access_key``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket for namespacing
        a grouping of your data.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here
        you would substitute your CloudFlare account ID into the endpoint URL and use that value.
        So if your account ID was ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.
        The [Amazon documentation] calls these a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: '{{ variables.endpoint_url }}'
On the other hand, you might run a small lab with a dedicated MinIO storage service and just trust your users to define individual buckets by name:
- id: lab_minio_storage
  version: 0
  name: Lab Storage
  description: Connect to our lab's local MinIO storage service.
  variables:
    bucket:
      type: string
      help: The bucket to connect to.
  configuration:
    type: boto3
    auth:
      access_key: 'XXXXXXXXfillinaccess'
      secret_key: 'YYYYYYYYfillinsecret'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: 'https://storage.ourawesomelab.org:9000'
    badges:
      - type: slower
      - type: less_secure
      - type: less_stable
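If you just want to target AWS S3 and let your users utilize it as quickly and easily as possible, that template might look like the aws_s3 template listed under the ready-to-use production templates below, which uses the boto3 type without a connection section.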
At runtime, after the configuration section of a boto3 template is expanded, the resulting dictionary is passed to Galaxy’s object store infrastructure and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
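As a sketch of that expansion for the CloudFlare R2 template above, assume a user enters the account ID galactian and a bucket named my-galaxy-data. All values here are made-up placeholders, including the keys.

```yaml
# Expanded boto3 configuration (sketch with placeholder values)
type: boto3
auth:
  access_key: EXAMPLEACCESSKEYID       # from variables.access_key
  secret_key: EXAMPLESECRETACCESSKEY   # from secrets.secret_key
bucket:
  name: my-galaxy-data                 # from variables.bucket
connection:
  endpoint_url: https://galactian.r2.cloudflarestorage.com/
```

The user only ever sees the three variables and one secret; the endpoint URL structure stays under the administrator's control.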
azure_blob
Here is a “production grade” Azure template that can essentially be used to connect to any Azure storage container.
- id: azure
  version: 0
  name: Azure Blob Storage
  description: |
    This template allows storing datasets in [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure_blob
    auth:
      account_name: '{{ variables.account_name }}'
      account_key: '{{ secrets.account_key }}'
    container:
      name: '{{ variables.container_name }}'
  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container. More information on containers can be found
        in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers).
    account_name:
      label: Storage Account Name
      type: string
      help: |
        The name of your Azure Blob Storage account. More information on storage accounts can be found in the
        [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview).
  secrets:
    account_key:
      label: Account Key
      help: |
        The Azure Blob Storage account key to use to access your Azure Blob Storage data. More information
        on account keys can be found in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage).
This template might be adapted to hide connection details from, say, users of an individual lab and just expose what container they should use. That might look something like:
- id: lab_azure_storage
  version: 0
  name: Azure Blob Storage for our Lab
  description: |
    This template allows storing datasets in [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure_blob
    auth:
      account_name: 'XXXXXXXXfillinaccount'
      account_key: 'XXXXXXXXXfillinkey'
    container:
      name: '{{ variables.container_name }}'
  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container in our lab space. Contact
        our lab Galaxy admin awesomelabsnate@ourawesomelab.org if you're unsure
        what container you should store your data in.
This example is a little contrived, though: if a small lab or institution has just a few containers, it would likely be a much easier user experience to wrap them all in a Galaxy hierarchical object store, document them there, and make them available to your whole Galaxy instance.
At runtime, after the configuration section of an azure_blob template is expanded, the resulting dictionary is passed to Galaxy’s object store infrastructure and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
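For the generic Azure template above, the expanded result might look roughly like this. The account name, key, and container name are made-up placeholders.

```yaml
# Expanded azure_blob configuration (sketch with placeholder values)
type: azure_blob
auth:
  account_name: examplestorageaccount  # from variables.account_name
  account_key: EXAMPLEACCOUNTKEY       # from secrets.account_key
container:
  name: galaxy-datasets                # from variables.container_name
```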
aws_s3 (Legacy)
Object stores of the type aws_s3 can be used to treat AWS Simple Storage Service (S3) buckets as Galaxy object stores. See the Amazon documentation for information on S3, how to create buckets, and how to create access keys.
- id: aws_s3_legacy
  version: 0
  name: Amazon Web Services S3 Storage (Legacy)
  description: |
    Amazon's Simple Storage Service (S3) is Amazon's primary cloud storage service.
    More information on S3 can be found in [Amazon's documentation](https://aws.amazon.com/s3/).
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        A security credential for interacting with AWS services can be created from your
        AWS web console. Creating an "Access Key" creates a pair of keys used to identify
        and authenticate access to your AWS account - the first part of the pair is
        "Access Key ID" and should be entered here. The second part of your key is the secret
        part called the "Secret Access Key". Place that in the secure part of this form below.
    bucket:
      label: Bucket
      type: string
      help: |
        The [AWS S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. You will need to create a bucket to use in your AWS web console before
        using this form.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        See the documentation above for "Access Key ID" for information about access key pairs.
  configuration:
    type: aws_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
The aws_s3 object store is older and more well tested than the boto3 object store, but the boto3 object store is built using a newer, more robust, and more feature-rich client library, so it should probably be the object store you use instead of this one.
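Judging from the templates in this document, moving this particular template from aws_s3 to boto3 only requires changing the type in the configuration section; the auth and bucket keys are the same in both. A sketch of the boto3 variant of the configuration section:

```yaml
# Configuration section only; variables and secrets are unchanged.
configuration:
  type: boto3          # was: aws_s3
  auth:
    access_key: '{{ variables.access_key }}'
    secret_key: '{{ secrets.secret_key }}'
  bucket:
    name: '{{ variables.bucket }}'
```

Whether other aws_s3 options carry over unchanged is not covered here and should be checked against the object store sample configuration.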
At runtime, after the configuration section of an aws_s3 template is expanded, the resulting dictionary is passed to Galaxy’s object store infrastructure and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
generic_s3 (Legacy)
Object stores of the type generic_s3 can be used to access a wide variety of S3-compatible storage services. How you template them can result in widely different experiences for your users and can address a wide variety of use cases.
Here is an example that is tailored for a specific storage service (e.g. CloudFlare R2) and exposes just the pieces of data CloudFlare users would need.
# https://developers.cloudflare.com/r2/examples/aws/boto3/
- id: cloudflare_legacy
  version: 0
  name: CloudFlare R2
  description: |
    This template can be used to connect to your [CloudFlare R2](https://developers.cloudflare.com/r2/)
    storage. To use these templates you will need to generate
    [CloudFlare R2 access tokens](https://developers.cloudflare.com/r2/api/s3/tokens/).
    Following that tutorial, you should have an "Account ID", an "Access Key ID", and a
    "Secret Access Key".
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        An Access Key ID generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
    account_id:
      label: Account ID
      type: string
      help: |
        Your account ID as available in the [CloudFlare dashboard](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/).
    bucket:
      label: Bucket
      type: string
      help: |
        The name of a bucket you've created to store your Galaxy data. Documentation for how to create buckets
        can be found in [this part of the CloudFlare R2 documentation](https://developers.cloudflare.com/r2/buckets/create-buckets/).
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        A Secret Access Key generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
  configuration:
    type: generic_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      host: '{{ variables.account_id }}.r2.cloudflarestorage.com'
      port: 443
      is_secure: true
Templates can be much more generic or much less generic than this.
In one direction, all the bells and whistles could be exposed to your Galaxy users to allow them to connect to any S3 compatible storage. This requires a lot more sophistication from your users but also allows them to connect to many more services. This template is available here:
- id: generic_s3_legacy
  version: 0
  name: Any S3 Compatible Storage (Legacy)
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3 - Galaxy calls these ``generic_s3``
    storage locations. This template configuration allows using such a service as a Galaxy storage
    location as long as you are able to find the connection details and have the relevant credentials.
    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describes the user
        that is accessing the data. The [Amazon documentation] calls these an "access key ID",
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``. Internally to Galaxy, we often just call
        this the ``access_key``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket for namespacing
        a grouping of your data.
    host:
      label: Connection Host
      type: string
      help: |
        The [hostname](https://en.wikipedia.org/wiki/Hostname) used to connect to the target
        S3 compatible service.
        If the documentation for your storage service has something called an ``endpoint_url``,
        this can be used to determine this value. For instance, the CloudFlare documentation
        describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``. Here
        you would substitute your CloudFlare account ID into the endpoint and shave off the ``https://``,
        so if your account ID was ``galactian``, you would enter ``galactian.r2.cloudflarestorage.com``.
    port:
      label: Connection Port
      type: integer
      default: 443
      help: |
        The [port](https://en.wikipedia.org/wiki/Port_(computer_networking)) used to connect
        to the target S3 compatible service. This might be ``443`` if you cannot find a relevant
        port - this is the default for secure HTTP connections.
        If the documentation for your storage service has something called an ``endpoint_url``,
        this can be used to determine this value. The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``.
        The ``:9000`` here indicates this port should be specified as ``9000``. Alternatively, the
        CloudFlare documentation describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``.
        Here there is no number at the end of the URL so the port is ``443`` as long as the URL starts
        with ``https``.
    connection_path:
      label: Connection Path
      type: string
      default: ""
      help: |
        This is an advanced configuration option and it is very likely best to just keep this empty
        for most storage services. If specified, it will be the prefix in the URL for the S3 compatible
        API after the host and port to reach the target API.
    secure:
      label: Use HTTPS?
      type: boolean
      default: true
      help: |
        This is an advanced configuration option and if this option is not checked, you should not assume
        your data is secure at all. This should only ever be unchecked while testing new or experimental
        services with data and keys you do not care about.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.
        The [Amazon documentation] calls these a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: generic_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      host: '{{ variables.host }}'
      port: '{{ variables.port }}'
      is_secure: '{{ variables.secure }}'
      conn_path: '{{ variables.connection_path }}'
On the other hand, you might run a small lab with a dedicated MinIO storage service and just trust your users to define individual buckets by name:
- id: lab_minio_storage_legacy
  version: 0
  name: Lab Storage (Legacy)
  description: Connect to our lab's local MinIO storage service.
  variables:
    bucket:
      type: string
      help: The bucket to connect to.
  configuration:
    type: generic_s3
    auth:
      access_key: 'XXXXXXXXfillinaccess'
      secret_key: 'YYYYYYYYfillinsecret'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      host: 'storage.ourawesomelab.org'
      port: 9000
      is_secure: true
    badges:
      - type: slower
      - type: less_secure
      - type: less_stable
At runtime, after the configuration section of a generic_s3 template is expanded, the resulting dictionary is passed to Galaxy’s object store infrastructure and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
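As a sketch, expanding the cloudflare_legacy template above for the hypothetical account ID galactian and a made-up bucket name would give something like the following. Compared with the boto3 type, the connection is described by host, port, and is_secure rather than a single endpoint URL.

```yaml
# Expanded generic_s3 configuration (sketch with placeholder values)
type: generic_s3
auth:
  access_key: EXAMPLEACCESSKEYID       # from variables.access_key
  secret_key: EXAMPLESECRETACCESSKEY   # from secrets.secret_key
bucket:
  name: my-galaxy-data                 # from variables.bucket
connection:
  host: galactian.r2.cloudflarestorage.com
  port: 443
  is_secure: true
```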
YAML Syntax
Ready To Use Production Object Store Templates
These templates are sufficiently generic that they may make sense for a variety of Galaxy instances, address a variety of potential use cases, and do not need any additional tailoring, parameterization, or other customization. They assume your Galaxy instance has a Vault configured and that you’re comfortable with it storing your users’ secrets.
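For reference, a database-backed Vault setup might be sketched as below. This is an assumption-laden illustration based on Galaxy's Vault documentation as recalled here: the option name vault_config_file, the keys type, path_prefix, and encryption_keys, the file path, and the placeholder key should all be verified against the Vault documentation for your Galaxy version.

```yaml
# galaxy.yml (fragment) - assumed option name, hypothetical path
galaxy:
  vault_config_file: /srv/galaxy/config/vault_conf.yml
---
# vault_conf.yml - sketch of a database-backed vault (assumed key names)
type: database
path_prefix: /galaxy
encryption_keys:
  - REPLACE_WITH_A_GENERATED_KEY   # placeholder, not a usable key
```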
Allow Users to Define Azure Blob Storage as Object Stores
- id: azure
  version: 0
  name: Azure Blob Storage
  description: |
    This template allows storing datasets in [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure_blob
    auth:
      account_name: '{{ variables.account_name }}'
      account_key: '{{ secrets.account_key }}'
    container:
      name: '{{ variables.container_name }}'
  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container. More information on containers can be found
        in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers).
    account_name:
      label: Storage Account Name
      type: string
      help: |
        The name of your Azure Blob Storage account. More information on storage accounts can be found in the
        [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview).
  secrets:
    account_key:
      label: Account Key
      help: |
        The Azure Blob Storage account key to use to access your Azure Blob Storage data. More information
        on account keys can be found in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage).
Allow Users to Define Generic S3 Compatible Storage Services as Object Stores
- id: generic_s3
  version: 0
  name: Any S3 Compatible Storage
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3 - Galaxy calls these ``generic_s3``
    storage locations. This template configuration allows using such a service as a Galaxy storage
    location as long as you are able to find the connection details and have the relevant credentials.
    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describes the user
        that is accessing the data. The [Amazon documentation] calls these an "access key ID",
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``. Internally to Galaxy, we often just call
        this the ``access_key``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket for namespacing
        a grouping of your data.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here
        you would substitute your CloudFlare account ID into the endpoint URL and use that value.
        So if your account ID was ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.
        The [Amazon documentation] calls these a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: '{{ variables.endpoint_url }}'
Allow Users to Define AWS S3 Buckets as Object Stores
- id: aws_s3
  version: 0
  name: Amazon Web Services S3 Storage
  description: |
    Amazon's Simple Storage Service (S3) is Amazon's primary cloud storage service.
    More information on S3 can be found in [Amazon's documentation](https://aws.amazon.com/s3/).
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        A security credential for interacting with AWS services can be created from your
        AWS web console. Creating an "Access Key" creates a pair of keys used to identify
        and authenticate access to your AWS account - the first part of the pair is
        "Access Key ID" and should be entered here. The second part of your key is the secret
        part called the "Secret Access Key". Place that in the secure part of this form below.
    bucket:
      label: Bucket
      type: string
      help: |
        The [AWS S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. You will need to create a bucket to use in your AWS web console before
        using this form.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        See the documentation above for "Access Key ID" for information about access key pairs.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
Allow Users to Define Google Cloud Provider S3 Interop Storage Buckets as Object Stores
This template includes descriptions of how to generate HMAC keys used by this interoperability layer provided by Google and lots of links to relevant Google Cloud Storage documentation.
# https://cloud.google.com/storage/docs/aws-simple-migration
- id: gcp_s3_interop
version: 0
name: Google Cloud Storage
description: |
This template can be used to connect to your [Google Cloud Storage](https://cloud.google.com/storage).
To use these templates you will need to generate
[HMAC Keys](https://cloud.google.com/storage/docs/authentication/hmackeys) - these
can be linked to your user or a service account. Additionally, you will need to define
a [default Google cloud project](https://cloud.google.com/storage/docs/aws-simple-migration#defaultproj)
to allow Galaxy to access your Google Cloud Storage via the interfaces
described by this template.
variables:
access_key:
label: Access ID
type: string
help: |
This will be given to you by Google when you generate [HMAC Keys](https://cloud.google.com/storage/docs/authentication/hmackeys)
to use your storage.
bucket:
label: Bucket
type: string
help: |
The name of a [bucket](https://cloud.google.com/storage/docs/buckets) you've created to store your Galaxy data. Documentation for how to create buckets
can be found in [this part of the Google Cloud Storage documentation](https://cloud.google.com/storage/docs/creating-buckets).
secrets:
secret_key:
label: Secret Key
help: |
This will be given to you by Google when you generate [HMAC Keys](https://cloud.google.com/storage/docs/authentication/hmackeys)
to use your storage. It should be 40 characters long and look something like the example used
in the Google documentation - `bGoa+V7g/yqDXvKRqq+JTFn4uQZbPiQJo4pf9RzJ`.
configuration:
type: boto3
auth:
access_key: '{{ variables.access_key }}'
secret_key: '{{ secrets.secret_key }}'
bucket:
name: '{{ variables.bucket }}'
connection:
endpoint_url: 'https://storage.googleapis.com/'
File Source Templates
Galaxy’s file source templates are configured as a YAML list of template objects. This list can be placed in file_source_templates.yml in Galaxy’s configuration directory (or any path pointed to by the configuration option file_source_templates_config_file in galaxy.yml).
Alternatively, the configuration can be placed directly into galaxy.yml using the file_source_templates configuration option.
File Source Types
posix
The syntax for the configuration section of posix templates looks like this.
At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
s3fs
- id: s3fs
version: 0
name: S3 Compatible Storage with Credentials
description: |
The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
of an unofficial standard for cloud storage across a variety of vendors and services.
Many vendors offer storage APIs compatible with S3. This template configuration allows
using such service as a Galaxy storage location as long as you are able to find the
connection details and have the relevant credentials.
Given the amount of information needed to connect to such a service, this is a bit of an
advanced template and probably should not be used to connect to a service if a more
specific template is available.
variables:
access_key:
label: Access Key ID
type: string
help: |
The less secure part of your access tokens or access keys that describe the user
that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes these as ``aws_access_key_id``.
bucket:
label: Bucket
type: string
help: |
The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
store your datasets in. How to setup buckets for your storage will vary from service to service
but all S3 compatible storage services should have the concept of a bucket to namespace
a grouping of your data together with.
endpoint_url:
label: S3-Compatible API Endpoint
type: string
help: |
If the documentation for your storage service has something called an ``endpoint_url``, enter that value here.
For instance, the CloudFlare documentation describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``. Here
you would substitute your CloudFlare account ID into the endpoint URL and use that value.
So if your account ID was ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
this value would be entered here.
secrets:
secret_key:
label: Secret Access Key
help: |
The secret key used to connect to the S3 compatible storage for the given access key.
The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
this the ``secret_key``.
configuration:
type: s3fs
endpoint_url: '{{ variables.endpoint_url }}'
key: '{{ variables.access_key }}'
secret: '{{ secrets.secret_key }}'
bucket: '{{ variables.bucket }}'
- id: s3fs
version: 1
name: S3 Compatible Storage with Credentials
description: |
The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
of an unofficial standard for cloud storage across a variety of vendors and services.
Many vendors offer storage APIs compatible with S3. This template configuration allows
using such service as a Galaxy storage location as long as you are able to find the
connection details and have the relevant credentials.
Given the amount of information needed to connect to such a service, this is a bit of an
advanced template and probably should not be used to connect to a service if a more
specific template is available.
variables:
access_key:
label: Access Key ID
type: string
help: |
The less secure part of your access tokens or access keys that describe the user
that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes these as ``aws_access_key_id``.
bucket:
label: Bucket
type: string
help: |
The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
store your datasets in. How to setup buckets for your storage will vary from service to service
but all S3 compatible storage services should have the concept of a bucket to namespace
a grouping of your data together with.
endpoint_url:
label: S3-Compatible API Endpoint
type: string
help: |
If the documentation for your storage service has something called an ``endpoint_url``, enter that value here.
For instance, the CloudFlare documentation describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``. Here
you would substitute your CloudFlare account ID into the endpoint URL and use that value.
So if your account ID was ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
this value would be entered here.
writable:
label: Writable?
type: boolean
help: Is this a bucket you have permission to write to?
secrets:
secret_key:
label: Secret Access Key
help: |
The secret key used to connect to the S3 compatible storage for the given access key.
The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
this the ``secret_key``.
configuration:
type: s3fs
endpoint_url: '{{ variables.endpoint_url }}'
key: '{{ variables.access_key }}'
secret: '{{ secrets.secret_key }}'
bucket: '{{ variables.bucket }}'
writable: '{{ variables.writable }}'
- id: aws_public
version: 0
name: Amazon Web Services Public Bucket
description: Setup anonymous access to a public AWS bucket.
configuration:
type: s3fs
bucket: "{{ variables.bucket }}"
writable: false
anon: true
variables:
bucket:
label: Bucket
type: string
help: |
The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
anonymously access.
- id: aws_private
version: 0
name: Amazon Web Services Private Bucket
description: Setup access to a private AWS bucket using a secret access key.
configuration:
type: s3fs
bucket: "{{ variables.bucket }}"
writable: "{{ variables.writable }}"
secret: "{{ secrets.secret_key }}"
key: "{{ variables.access_key }}"
variables:
access_key:
label: Access Key ID
type: string
help: |
The "access key ID" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
bucket:
label: Bucket
type: string
help: |
The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
access. This should be a bucket the user described by the Access Key ID has access to.
writable:
label: Writable?
type: boolean
help: Is this a bucket you have permission to write to?
secrets:
secret_key:
label: Secret Access Key
help: |
The "secret access key" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
ftp
- id: ftp
version: 0
name: An FTP Server
description: |
This template allows connecting to FTP servers. This file source plugin should
support FTP and FTPS servers.
configuration:
type: ftp
host: "{{ variables.host }}"
user: "{{ variables.user }}"
port: "{{ variables.port }}"
passwd: "{{ secrets.password }}"
writable: "{{ variables.writable }}"
variables:
host:
label: FTP Host
type: string
help: Host of FTP Server to connect to.
user:
label: FTP User
type: string
help: |
Username to connect with. Leave this blank to connect to the server
anonymously (if allowed by target server).
writable:
label: Writable?
type: boolean
help: Is this an FTP server you have permission to write to?
port:
label: FTP Port
type: integer
help: Port used to connect to the FTP server.
default: 21
secrets:
password:
label: FTP Password
help: |
Password to connect to FTP server with. Leave this blank to connect
to the server anonymously (if allowed by target server).
The syntax for the configuration section of ftp templates looks like this.
At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
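To make the idea of expanding a configuration template into a concrete dictionary more tangible, here is a minimal Python sketch applied to the ftp template above. It is not Galaxy's implementation: Galaxy uses Jinja, whereas this stand-in only understands plain ``{{ namespace.key }}`` expressions. The helper names (expand_value, expand_configuration) and the sample user values are hypothetical, and the choice to keep native types for values that consist of a single expression is an assumption of this sketch.

```python
import re

# Matches one "{{ namespace.key }}" expression. This is a simplified stand-in
# for Jinja: filters, conditionals and other Jinja logic are not supported.
_EXPR = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def expand_value(value, context):
    """Expand a single configuration value against the context."""
    if not isinstance(value, str):
        return value  # literals such as `writable: false` pass through unchanged
    whole = _EXPR.fullmatch(value.strip())
    if whole:
        # Assumption of this sketch: a value that is exactly one expression
        # keeps the native type (int, bool, ...) of the looked-up value.
        return context[whole.group(1)][whole.group(2)]
    return _EXPR.sub(lambda m: str(context[m.group(1)][m.group(2)]), value)


def expand_configuration(configuration, context):
    """Expand every value of a flat configuration dictionary."""
    return {key: expand_value(value, context) for key, value in configuration.items()}


# The `configuration` section of the ftp template shown above.
ftp_configuration = {
    "type": "ftp",
    "host": "{{ variables.host }}",
    "user": "{{ variables.user }}",
    "port": "{{ variables.port }}",
    "passwd": "{{ secrets.password }}",
    "writable": "{{ variables.writable }}",
}

# Hypothetical values a user might have entered in the form.
context = {
    "variables": {"host": "ftp.example.org", "user": "alice", "port": 21, "writable": False},
    "secrets": {"password": "not-a-real-password"},
}

expanded = expand_configuration(ftp_configuration, context)
print(expanded)
```

The resulting dictionary has the same shape as an ftp entry that an administrator could write by hand in file_sources_conf.yml, which is the point the paragraph above makes.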
azure
The syntax for the configuration section of azure templates looks like this.
At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
webdav
The syntax for the configuration section of webdav templates looks like this.
At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
dropbox
The syntax for the configuration section of dropbox templates looks like this.
At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
YAML Syntax
Ready To Use Production File Source Templates
These templates are sufficiently generic that they may make sense for a variety of Galaxy instances, address a variety of potential use cases, and do not need any additional tailoring, parameterization, or other customization. These (mostly) assume your Galaxy instance has a Vault configured and you are comfortable with it storing your users’ secrets.
Allow Users to Define Generic FTP Servers as File Sources
- id: ftp
version: 0
name: An FTP Server
description: |
This template allows connecting to FTP servers. This file source plugin should
support FTP and FTPS servers.
configuration:
type: ftp
host: "{{ variables.host }}"
user: "{{ variables.user }}"
port: "{{ variables.port }}"
passwd: "{{ secrets.password }}"
writable: "{{ variables.writable }}"
variables:
host:
label: FTP Host
type: string
help: Host of FTP Server to connect to.
user:
label: FTP User
type: string
help: |
Username to connect with. Leave this blank to connect to the server
anonymously (if allowed by target server).
writable:
label: Writable?
type: boolean
help: Is this an FTP server you have permission to write to?
port:
label: FTP Port
type: integer
help: Port used to connect to the FTP server.
default: 21
secrets:
password:
label: FTP Password
help: |
Password to connect to FTP server with. Leave this blank to connect
to the server anonymously (if allowed by target server).
Allow Users to Define Azure Blob Storage as File Sources
- id: azure
version: 0
name: Azure Blob Storage
description: |
This template allows connecting to [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
configuration:
type: azure
container_name: "{{ variables.container_name }}"
account_name: "{{ variables.account_name }}"
account_key: "{{ secrets.account_key }}"
namespace_type: "{{ 'hierarchical' if variables.hierarchical else 'flat' }}"
writable: "{{ variables.writable }}"
variables:
container_name:
label: Container Name
type: string
help: |
The name of your Azure Blob Storage container. More information on containers can be found
in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers).
account_name:
label: Storage Account Name
type: string
help: |
The name of your Azure Blob Storage account. More information on storage accounts can be found in the
[Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview).
hierarchical:
label: Hierarchical?
type: boolean
default: true
help: |
Is this storage hierarchical (e.g. does it use an Azure Data Lake Storage Gen2 hierarchical namespace)?
More information on Data Lake Storage namespaces can be found in the
[Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-namespace).
writable:
label: Writable?
type: boolean
default: true
help: Allow Galaxy to write data to this Azure Blob Storage container.
secrets:
account_key:
label: Account Key
help: |
The Azure Blob Storage account key to use to access your Azure Blob Storage data. More information
on account keys can be found in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage).
Allow Users to Define Generic S3 Compatible Storage as File Sources
- id: s3fs
version: 0
name: S3 Compatible Storage with Credentials
description: |
The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
of an unofficial standard for cloud storage across a variety of vendors and services.
Many vendors offer storage APIs compatible with S3. This template configuration allows
using such service as a Galaxy storage location as long as you are able to find the
connection details and have the relevant credentials.
Given the amount of information needed to connect to such a service, this is a bit of an
advanced template and probably should not be used to connect to a service if a more
specific template is available.
variables:
access_key:
label: Access Key ID
type: string
help: |
The less secure part of your access tokens or access keys that describe the user
that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes these as ``aws_access_key_id``.
bucket:
label: Bucket
type: string
help: |
The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
store your datasets in. How to setup buckets for your storage will vary from service to service
but all S3 compatible storage services should have the concept of a bucket to namespace
a grouping of your data together with.
endpoint_url:
label: S3-Compatible API Endpoint
type: string
help: |
If the documentation for your storage service has something called an ``endpoint_url``, enter that value here.
For instance, the CloudFlare documentation describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``. Here
you would substitute your CloudFlare account ID into the endpoint URL and use that value.
So if your account ID was ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
this value would be entered here.
secrets:
secret_key:
label: Secret Access Key
help: |
The secret key used to connect to the S3 compatible storage for the given access key.
The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
this the ``secret_key``.
configuration:
type: s3fs
endpoint_url: '{{ variables.endpoint_url }}'
key: '{{ variables.access_key }}'
secret: '{{ secrets.secret_key }}'
bucket: '{{ variables.bucket }}'
- id: s3fs
version: 1
name: S3 Compatible Storage with Credentials
description: |
The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
of an unofficial standard for cloud storage across a variety of vendors and services.
Many vendors offer storage APIs compatible with S3. This template configuration allows
using such service as a Galaxy storage location as long as you are able to find the
connection details and have the relevant credentials.
Given the amount of information needed to connect to such a service, this is a bit of an
advanced template and probably should not be used to connect to a service if a more
specific template is available.
variables:
access_key:
label: Access Key ID
type: string
help: |
The less secure part of your access tokens or access keys that describe the user
that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes these as ``aws_access_key_id``.
bucket:
label: Bucket
type: string
help: |
The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
store your datasets in. How to setup buckets for your storage will vary from service to service
but all S3 compatible storage services should have the concept of a bucket to namespace
a grouping of your data together with.
endpoint_url:
label: S3-Compatible API Endpoint
type: string
help: |
If the documentation for your storage service has something called an ``endpoint_url``, enter that value here.
For instance, the CloudFlare documentation describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``. Here
you would substitute your CloudFlare account ID into the endpoint URL and use that value.
So if your account ID was ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
this value would be entered here.
writable:
label: Writable?
type: boolean
help: Is this a bucket you have permission to write to?
secrets:
secret_key:
label: Secret Access Key
help: |
The secret key used to connect to the S3 compatible storage for the given access key.
The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
this the ``secret_key``.
configuration:
type: s3fs
endpoint_url: '{{ variables.endpoint_url }}'
key: '{{ variables.access_key }}'
secret: '{{ secrets.secret_key }}'
bucket: '{{ variables.bucket }}'
writable: '{{ variables.writable }}'
Allow Users to Define Publicly Accessible AWS S3 Buckets as File Sources
- id: aws_public
version: 0
name: Amazon Web Services Public Bucket
description: Setup anonymous access to a public AWS bucket.
configuration:
type: s3fs
bucket: "{{ variables.bucket }}"
writable: false
anon: true
variables:
bucket:
label: Bucket
type: string
help: |
The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
anonymously access.
Allow Users to Define Private AWS S3 Buckets as File Sources
- id: aws_private
version: 0
name: Amazon Web Services Private Bucket
description: Setup access to a private AWS bucket using a secret access key.
configuration:
type: s3fs
bucket: "{{ variables.bucket }}"
writable: "{{ variables.writable }}"
secret: "{{ secrets.secret_key }}"
key: "{{ variables.access_key }}"
variables:
access_key:
label: Access Key ID
type: string
help: |
The "access key ID" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
bucket:
label: Bucket
type: string
help: |
The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
access. This should be a bucket the user described by the Access Key ID has access to.
writable:
label: Writable?
type: boolean
help: Is this a bucket you have permission to write to?
secrets:
secret_key:
label: Secret Access Key
help: |
The "secret access key" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
Allow Users to Define WebDAV Servers as File Sources
- id: webdav
version: 0
name: WebDAV
description: |
The WebDAV protocol is a simple way to access files over the internet. This template
configuration allows you to connect to a WebDAV server.
variables:
url:
label: Server Domain (e.g. https://myowncloud.org)
type: string
help: |
The domain of the WebDAV server you are connecting to. This should be the full URL
including the protocol (http or https) and the domain name.
root:
label: WebDAV server Path (should end with /remote.php/webdav, e.g. /a/sub/path/remote.php/webdav)
type: string
help: |
The full server path to the WebDAV service. Ensure the path includes /remote.php/webdav.
login:
label: Username
type: string
help: |
The username to use to connect to the WebDAV server. This should be the username you use
to log in to the WebDAV server.
writable:
label: Writable?
type: boolean
default: false
help: Allow Galaxy to write data to this WebDAV server.
secrets:
password:
label: Password
help: |
The password to use to connect to the WebDAV server. This should be the password you use
to log in to the WebDAV server.
configuration:
type: webdav
url: '{{ variables.url }}'
root: '{{ variables.root }}'
login: '{{ variables.login }}'
writable: '{{ variables.writable }}'
password: '{{ secrets.password }}'
Production OAuth 2.0 File Source Templates
Unlike the examples in the previous section, these examples require a bit of configuration on the part of the admin: obtaining client credentials from the external service and registering an OAuth 2.0 redirection callback with the remote service.
Dropbox
Once you have OAuth 2.0 client credentials from Dropbox (called oauth2_client_id and oauth2_client_secret here), the following configuration can be used to configure your Galaxy instance to enable Dropbox.
- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: "{{ environment.oauth2_client_id }}"
    oauth2_client_secret: "{{ environment.oauth2_client_secret }}"
    writable: true
  environment:
    oauth2_client_id:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_ID
    oauth2_client_secret:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_SECRET
To use this template - you’ll need to make your credentials available to Galaxy’s web and job handler processes using the environment variables GALAXY_DROPBOX_APP_CLIENT_ID and GALAXY_DROPBOX_APP_CLIENT_SECRET. Your jobs themselves do not require these secrets to be set and will not be given the secrets.
If you’d rather set these secrets explicitly, you can place them directly in the configuration. If your configuration file is managed by Ansible, these secrets could potentially be populated from your Ansible vault.
- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: abcdefgh
    oauth2_client_secret: ijklmnopqr
To obtain the OAuth 2.0 credentials from Dropbox, you’ll need to navigate to your Dropbox Apps and create a new app for your Galaxy instance with the “Create app” button.
The only option available is “Scoped access” and this works fine for typical Galaxy use cases. You will however want to click “Full Dropbox” to request full access to your users’ accounts. You will also need to give your “App” a name here; this should likely be something related to your Galaxy instance’s name.
After your app is created, you’ll be presented with a management screen for it. The first thing you’ll want to do is navigate to the permissions tab and enable permissions to read and write to files and directories so the file source plugin works properly.
Next, navigate back to the “Settings” tab. You’ll need to register a callback for your Galaxy instance (it will need HTTPS enabled). This should be the URL to your Galaxy instance with oauth2_callback appended to it.
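As a small illustration of the callback rule just described, the following Python sketch builds the URL to register from a Galaxy base URL. The function name and the example host are hypothetical, and the sketch assumes that "appended" means adding oauth2_callback as the final path segment of the instance URL.

```python
from urllib.parse import urlsplit


def oauth2_callback_url(galaxy_url: str) -> str:
    """Return the callback URL to register for a Galaxy instance URL."""
    parts = urlsplit(galaxy_url)
    if parts.scheme != "https":
        # The text above notes that the Galaxy instance needs HTTPS enabled.
        raise ValueError("the callback must be registered for an HTTPS Galaxy URL")
    # Assumption: oauth2_callback is appended as the last path segment.
    return galaxy_url.rstrip("/") + "/oauth2_callback"


# galaxy.example.org is a placeholder host, not a real instance.
print(oauth2_callback_url("https://galaxy.example.org/"))
```

An instance served under a path prefix would follow the same pattern, with the prefix preserved in front of oauth2_callback.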
Finally, you’ll be able to find the oauth2_client_id and oauth2_client_secret to configure your Galaxy with on this settings page.
Until you have 50 users, your App will be considered a “development” application. The upshot of this is that your users will get a scary message during authorization, and there seems to be no way around this. 50 users would definitely be considered a production Galaxy instance but Dropbox operates on a different scale.
For more information on what Dropbox considers a “development” app versus a “production” app - check out the Dropbox documentation.
Playing Nicer with Ansible
Many large instances of Galaxy are configured with Ansible and much of the existing administrator documentation leverages Ansible. The configuration template files use Jinja templating and so does Ansible by default. This can make it unclear when templates (strings starting with {{ and ending with }}) are being evaluated. Ansible templates are evaluated at deploy time and the configuration objects describing plugins are evaluated at Galaxy runtime.
The easiest way to fix this is probably to store these template files in your Ansible setup as plain files and not templates. If you’d like to use Ansible templating to build up these files you’ll very likely need to tell either Galaxy or Ansible to use something other than {{ and }} for templating variables. This can be done by placing a directive at the top of your template that is consumed by Ansible. For instance, to have [% and %] used instead of {{ and }} by Ansible at deploy time, the file could start with:
#jinja2:variable_start_string:'[%' , variable_end_string:'%]'
In this case, variables wrapped by [% and %] are expanded by Ansible and use the Ansible environment and {{ and }} are reserved for Galaxy templating.
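The two evaluation stages can be illustrated with a short Python sketch. It uses a regular expression as a simplified stand-in for Jinja variable substitution with configurable delimiters, so it is neither Ansible's nor Galaxy's actual code. The render helper, the data_root and project_name variables, and the sample values are all hypothetical.

```python
import re


def render(text, values, start="{{", end="}}"):
    """Replace `start name end` markers with values[name]; other text is untouched.

    A simplified stand-in for Jinja variable expansion: only bare names are
    supported, with no filters or expressions.
    """
    pattern = re.compile(re.escape(start) + r"\s*(\w+)\s*" + re.escape(end))
    return pattern.sub(lambda m: str(values[m.group(1)]), text)


# A line from a hypothetical template file that mixes both delimiter styles.
template_line = "root: '[% data_root %]/{{ project_name }}'"

# Deploy time: only the [% ... %] markers are expanded.
deployed = render(template_line, {"data_root": "/data/projects"}, start="[%", end="%]")

# Runtime: the {{ ... }} markers that survived the deploy step are expanded.
runtime = render(deployed, {"project_name": "rnaseq"})

print(deployed)
print(runtime)
```

Because each stage only recognizes its own delimiters, the deploy step leaves the runtime markers in the file untouched, which is the separation the directive above is meant to achieve.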
Alternatively, Galaxy can be configured to use a custom template on a per-configuration object basis by setting the template_start and/or template_end variables.
The following template chunk shows how to override the templating Galaxy does for a particular object store configuration. Similar templating overrides work for file source plugin templates.
- id: project_disk
  version: 0
  name: Project Disk
  description: |
    Disk in our institutional ``/data`` directory for your user's project.
  configuration:
    type: posix
    root: '/data/projects/@= user.username | ensure_path_component =@/@= variables.project_name | ensure_path_component =@'
    template_start: '@='
    template_end: '=@'
  variables:
    project_name:
      type: path_component
      help: Project name used in path.
https://github.com/ansible/ansible/pull/75306
https://stackoverflow.com/questions/12083319/add-custom-tokens-in-jinja2-e-g-somevar
Jinja Template Reference
Galaxy configuration file templating uses Jinja to template values and connect inputs, configuration, and the runtime environment into concrete configuration YAML blocks.
Jinja is fairly straightforward to learn, but this document provides many examples that you can probably adapt to whatever you’re interested in building without needing to dig deeply into Jinja. However, this section does outline what Galaxy injects into the Jinja environment to serve as a reference.
Even the most exotic configurations will likely only scratch the surface of what Jinja allows and implements. The only relevant Jinja documentation you’ll need in these cases is probably just those documents on variables, filters, and the list of builtin filters.
variables
This typed dictionary object is populated with user-supplied values defined via the variables section of the configuration template and filled in by the user when they created a new object store or file source.
secrets
This is a dictionary of strings populated with user-supplied secrets defined via the secrets section of the configuration template and filled in by the user when they created a new object store or file source.
A deep dive into these can be found in the User Secrets section of this document.
environment
This dictionary object is populated with admin-supplied values defined via the environment section of the configuration template.
A deep dive into these can be found in the Admin Secrets section of this document.
- id: admin_secret_directory
  version: 0
  name: Secret Directory with Defaults
  description: A directory constructed from admin secrets or defaults.
  configuration:
    type: posix
    root: /path/to/data/{{ environment.var }}/{{ environment.sec }}
  environment:
    var:
      type: variable
      variable: GALAXY_SECRET_HOME_VAR
      default: default_var
    sec:
      type: secret
      vault_key: "secret_directory_file_source/my_secret"
      default: default_sec
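One way to picture how the environment section above could turn into the environment dictionary is the following Python sketch. It is an illustration under stated assumptions rather than Galaxy's implementation: the resolve_environment function is hypothetical, the vault is modeled as a plain dictionary, and raising an error when neither a value nor a default exists is a choice made for this sketch.

```python
import os


def resolve_environment(environment, environ=os.environ, vault=None):
    """Resolve an `environment` section into a name -> value dictionary.

    `vault` is a plain dict standing in for a real secret store (assumption).
    """
    vault = vault or {}
    resolved = {}
    for name, spec in environment.items():
        if spec["type"] == "variable":
            # Read from the process environment, falling back to the default.
            value = environ.get(spec["variable"], spec.get("default"))
        elif spec["type"] == "secret":
            # Read from the vault stand-in, falling back to the default.
            value = vault.get(spec["vault_key"], spec.get("default"))
        else:
            raise ValueError(f"unknown environment type: {spec['type']}")
        if value is None:
            raise KeyError(f"no value or default available for {name!r}")
        resolved[name] = value
    return resolved


# The `environment` section of the admin_secret_directory template above.
environment_section = {
    "var": {"type": "variable", "variable": "GALAXY_SECRET_HOME_VAR", "default": "default_var"},
    "sec": {
        "type": "secret",
        "vault_key": "secret_directory_file_source/my_secret",
        "default": "default_sec",
    },
}

# The environment variable is set here, the vault entry is not.
resolved = resolve_environment(
    environment_section, environ={"GALAXY_SECRET_HOME_VAR": "from_env"}, vault={}
)
print(resolved)
```

With these inputs the variable comes from the (simulated) process environment while the secret falls back to its default, so the root in the template would expand to /path/to/data/from_env/default_sec.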
user
This dictionary object exposes information about the user configuring and using a target template configuration. These values are populated from the galaxy_user table of the Galaxy database.
The current properties exposed include:
Key | Description |
---|---|
username | string corresponding to the username of the Galaxy user |
email | string corresponding to the email of the Galaxy user |
id | integer primary key of user object in the Galaxy database |
The simple example of project scratch storage used to describe these concepts made use of the Galaxy user’s username to generate unique paths.
- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
    - type: faster
    - type: less_secure
    - type: not_backed_up
ensure_path_component
This Jinja filter will fail template evaluation if the value it is applied to is not a simple directory name, that is, if it contains ".." or "/" or in some other way might be used to attempt path exploitation or cause odd path-related bugs. This is useful when producing paths for disk object stores or posix file sources.
When taking inputs from users, setting the type to path_component instead of string allows the client to catch potential issues well before this point, but many path components might be built from environment variables, usernames, or similar sources that are not explicitly user inputs.
An example of an object store template that uses this is the simple scratch example that was used to introduce concepts at the start of the object store template documentation above.
- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
    - type: faster
    - type: less_secure
    - type: not_backed_up
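To make the intent of ensure_path_component concrete, here is a minimal Python sketch of the kind of check such a filter performs. It is not Galaxy's implementation: the function name check_path_component and the exact set of allowed characters are assumptions made for illustration.

```python
import re

# Assumption: a "simple directory name" is limited to letters, digits, ".", "_" and "-".
_SIMPLE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def check_path_component(value: str) -> str:
    """Return value unchanged if it is a single safe path component, else raise."""
    # Reject "." and anything containing "..", which could climb out of the target directory.
    if value == "." or ".." in value:
        raise ValueError(f"not a simple path component: {value!r}")
    # Reject separators, whitespace, empty strings, and anything else outside the allowed set.
    if not _SIMPLE_NAME.fullmatch(value):
        raise ValueError(f"not a simple path component: {value!r}")
    return value


# Build a path the way the template above does, validating each piece first.
files_dir = "/scratch/for_galaxy/{}/{}".format(
    check_path_component("alice"),
    check_path_component("my_project"),
)
```

With a check like this, a username or project name such as ../etc fails loudly at evaluation time instead of silently producing a path outside the scratch area.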
asbool
This Jinja filter will use Galaxy configuration style logic to convert string values into boolean ones.
When taking inputs from users, setting the type to boolean is sufficient to ensure a variable is boolean, but “secrets”, environment variables, and many other things are likely to be of type string even though they are used in a template that expects boolean values.
An example of an object store template that uses this filter is the simple MinIO example, which applies it to the secure environment parameter.
- id: minio
  version: 0
  name: Institutional S3 Storage
  description: Connect to our institutional MinIO storage service.
  variables:
    access_key:
      type: string
      help: A description of the user account used to connect to your storage.
    bucket:
      type: string
      help: The bucket to connect to.
  secrets:
    secret_key:
      help: The secret key used to connect to MinIO with for the given access key.
  environment:
    host:
      type: variable
      variable: GALAXY_MINIO_HOST
      default: localhost
    port:
      type: variable
      variable: GALAXY_MINIO_PORT
      default: "9000"
    secure:
      type: variable
      variable: GALAXY_MINIO_IS_SECURE
      default: "true"
    connection_path:
      type: variable
      variable: GALAXY_MINIO_CONNECTION_PATH
      default: ""
  configuration:
    type: generic_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
      use_reduced_redundancy: false
    connection:
      host: '{{ environment.host }}'
      port: '{{ environment.port | int }}'
      is_secure: '{{ environment.secure | asbool }}'
      conn_path: '{{ environment.connection_path }}'
    badges:
    - type: slower
    - type: less_secure
    - type: less_stable
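The reason a dedicated filter is needed can be seen in a few lines of Python. The sketch below is illustrative only: as_bool is a hypothetical stand-in for the asbool filter, and the accepted spellings are an assumption about what "Galaxy configuration style logic" covers.

```python
# Assumption: these spellings are treated as true/false; the real filter may accept a different set.
_TRUE = {"true", "yes", "on", "y", "t", "1"}
_FALSE = {"false", "no", "off", "n", "f", "0"}


def as_bool(value) -> bool:
    """Convert a configuration-style string such as "true" or "off" to a bool."""
    if isinstance(value, str):
        normalized = value.strip().lower()
        if normalized in _TRUE:
            return True
        if normalized in _FALSE:
            return False
        raise ValueError(f"cannot interpret {value!r} as a boolean")
    return bool(value)


# Environment variables always arrive as strings, so naive truthiness is misleading:
naive = bool("false")        # any non-empty string is truthy in Python
converted = as_bool("false")  # the string is interpreted by its meaning instead
port = int("9000")            # the same concern applies to the port, hence the int filter
```

Without the conversion, a value of GALAXY_MINIO_IS_SECURE=false would still be treated as a non-empty (truthy) string.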
Connecting Configuration Templates to Secrets
User Secrets
Most of the examples in this document use secrets of one kind or another. For instance, in the FTP example - the password field is a secret.
- id: ftp
  version: 0
  name: An FTP Server
  description: |
    This template allows connecting to FTP servers. This file source plugin should
    support FTP and FTPS servers.
  configuration:
    type: ftp
    host: "{{ variables.host }}"
    user: "{{ variables.user }}"
    port: "{{ variables.port }}"
    passwd: "{{ secrets.password }}"
    writable: "{{ variables.writable }}"
  variables:
    host:
      label: FTP Host
      type: string
      help: Host of FTP Server to connect to.
    user:
      label: FTP User
      type: string
      help: |
        Username to connect with. Leave this blank to connect to the server
        anonymously (if allowed by target server).
    writable:
      label: Writable?
      type: boolean
      help: Is this an FTP server you have permission to write to?
    port:
      label: FTP Port
      type: integer
      help: Port used to connect to the FTP server.
      default: 21
  secrets:
    password:
      label: FTP Password
      help: |
        Password to connect to FTP server with. Leave this blank to connect
        to the server anonymously (if allowed by target server).
Instead of being saved in the database in plain text, Galaxy will use a configured Vault to store this data. Check out Galaxy admin documentation on Storing secrets in the vault for descriptions of how to configure a vault. Most interesting user defined file sources and/or object stores will require a Galaxy Vault.
In this FTP example, a new Vault key will be created for each FTP instance the user creates. The user file source APIs and management user interface will be responsible for orchestration of storing and updating secrets. The Vault key for this password will be something like:
/galaxy/user/<user_id>/file_source_config/<file_source_instance_uuid>/password
Here user_id is the primary key of the User object in the database and file_source_instance_uuid is the uuid value of the corresponding row in the user_file_source table in the database.
User defined object stores are stored in a similar fashion but at:
/galaxy/user/<user_id>/object_store_config/<object_store_instance_uuid>/<secret_name>
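The two key layouts above differ only in their third path segment. As a small illustration, the helper below assembles such keys in Python; the function name user_secret_vault_key and the UUID value are hypothetical, and only the key layout itself is taken from this document.

```python
def user_secret_vault_key(user_id: int, kind: str, instance_uuid: str, secret_name: str) -> str:
    """Build the Vault key for one secret of one user-defined instance."""
    # Only the two segments named in this document are accepted.
    if kind not in ("file_source_config", "object_store_config"):
        raise ValueError(f"unknown instance kind: {kind!r}")
    return f"/galaxy/user/{user_id}/{kind}/{instance_uuid}/{secret_name}"


# Hypothetical user id and instance UUID, for illustration only.
ftp_password_key = user_secret_vault_key(
    42, "file_source_config", "0f0e6d1c-3a5b-4c7d-9e2f-1a2b3c4d5e6f", "password"
)
```

Because the instance UUID is part of the key, two FTP instances created by the same user keep their passwords under separate keys.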
During the creation of an object store or file source, the secrets will be appended to the generated form as password fields.
After an object store has been created, a user has the option to edit the settings in the UI. Most of the settings appear in a simple form - but the secrets are managed and updated individually in the “Secrets” tab.
Admin Secrets
Administrators may define secrets that are available to all users and aren’t parameterized on a per-instance basis. These secrets can be injected into template instances through Vault keys or through environment variables.
Each template may optionally define an environment key where these can be defined. The following template entry describes a file source that injects the environment variable GALAXY_SECRET_HOME_VAR into the template as environment.var and injects the Vault key secret_directory_file_source/my_secret into the template as environment.sec.
This template uses these variables to construct a root path for a posix file source, but the same secrets could just as easily store cloud keys and configure an S3 object store.
- id: admin_secret_directory
  version: 0
  name: Secret Directory
  description: A directory constructed from admin secrets.
  configuration:
    type: posix
    root: /path/to/data/{{ environment.var }}/{{ environment.sec }}
  environment:
    var:
      type: variable
      variable: GALAXY_SECRET_HOME_VAR
    sec:
      type: secret
      vault_key: "secret_directory_file_source/my_secret"
If you’d like to make the target secrets optional, default values can also be set up. The following block demonstrates the same configuration but with default values of default_var for var and default_sec for sec. These will be used if the target Vault key is absent or the target environment variable is not defined at runtime.
- id: admin_secret_directory
  version: 0
  name: Secret Directory with Defaults
  description: A directory constructed from admin secrets or defaults.
  configuration:
    type: posix
    root: /path/to/data/{{ environment.var }}/{{ environment.sec }}
  environment:
    var:
      type: variable
      variable: GALAXY_SECRET_HOME_VAR
      default: default_var
    sec:
      type: secret
      vault_key: "secret_directory_file_source/my_secret"
      default: default_sec
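The fallback behaviour described above can be sketched in plain Python. This is an illustration of the lookup order only, not Galaxy's code: resolve_environment is a hypothetical helper, the Vault is modelled as a simple dictionary, and raising an error when neither a value nor a default exists is an assumption.

```python
from typing import Dict, Mapping, Optional


def resolve_environment(
    spec: Mapping[str, Mapping[str, str]],
    environ: Mapping[str, str],
    vault: Mapping[str, str],
) -> Dict[str, str]:
    """Resolve each environment entry to a string, falling back to its default."""
    resolved: Dict[str, str] = {}
    for name, entry in spec.items():
        value: Optional[str]
        if entry["type"] == "variable":
            value = environ.get(entry["variable"])   # process environment lookup
        elif entry["type"] == "secret":
            value = vault.get(entry["vault_key"])    # Vault lookup (a dict stands in here)
        else:
            raise ValueError(f"unknown environment entry type: {entry['type']}")
        if value is None:
            value = entry.get("default")             # optional default from the template
        if value is None:
            # Assumption: a missing value without a default is treated as an error.
            raise KeyError(f"no value or default available for environment.{name}")
        resolved[name] = value
    return resolved


spec = {
    "var": {"type": "variable", "variable": "GALAXY_SECRET_HOME_VAR", "default": "default_var"},
    "sec": {"type": "secret", "vault_key": "secret_directory_file_source/my_secret", "default": "default_sec"},
}
# The environment variable is set, but the Vault key is absent, so sec uses its default.
environment = resolve_environment(spec, environ={"GALAXY_SECRET_HOME_VAR": "lab_a"}, vault={})
root = f"/path/to/data/{environment['var']}/{environment['sec']}"
```

Here root evaluates to /path/to/data/lab_a/default_sec, mixing an admin-supplied value with a template default.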
OAuth 2.0 Enabled Configurations
OAuth 2.0 has become an industry standard for allowing users of various services (e.g. Dropbox or Google Drive) to grant other services (e.g. Galaxy) fine-grained access to their resources on those services. There is a bit of a dance the services need to do, but the result can be a fairly nice end-user experience. The framework for configuring user defined data access templates can support OAuth 2.0.
Galaxy keeps track of which plugin types (currently only file source types) require OAuth2 to work properly and will implicitly take care of authorization redirection, saving refresh tokens, etc. One such type is dropbox. Here is the production Dropbox template distributed with Galaxy.
- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: "{{ environment.oauth2_client_id }}"
    oauth2_client_secret: "{{ environment.oauth2_client_secret }}"
    writable: true
  environment:
    oauth2_client_id:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_ID
    oauth2_client_secret:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_SECRET
Template definitions for OAuth2 enabled plugin types include oauth2_client_id and oauth2_client_secret in the configuration (as shown in the following specification and in the above examples).
The above example defines these secrets using environment variables, but they can also be stored in Galaxy’s Vault explicitly by the admin or written right into the configuration files, as shown in the next two examples:
- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: "{{ environment.oauth2_client_id }}"
    oauth2_client_secret: "{{ environment.oauth2_client_secret }}"
    writable: true
  environment:
    oauth2_client_id:
      type: secret
      vault_key: "dropbox_file_source/client_id"
    oauth2_client_secret:
      type: secret
      vault_key: "dropbox_file_source/client_secret"

- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: abcdefgh
    oauth2_client_secret: ijklmnopqr
Looking at the configuration objects that get generated at runtime from these templates, though, oauth2_client_id and oauth2_client_secret no longer appear and have instead been replaced with an oauth2_access_token parameter. Galaxy will take care of stripping out the client (e.g. Galaxy server) information and replacing it with short-term access tokens generated for the user’s resources.
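The substitution described above amounts to a small dictionary transformation. The following Python sketch shows the shape of it under stated assumptions: to_runtime_configuration is a hypothetical helper and the token value is a placeholder, not something Galaxy produces in this form.

```python
# Keys that identify the client (the Galaxy server) rather than the user.
CLIENT_KEYS = ("oauth2_client_id", "oauth2_client_secret")


def to_runtime_configuration(template_configuration: dict, access_token: str) -> dict:
    """Drop the client credentials and add a short-term access token instead."""
    runtime = {key: value for key, value in template_configuration.items() if key not in CLIENT_KEYS}
    runtime["oauth2_access_token"] = access_token
    return runtime


expanded_template = {
    "type": "dropbox",
    "oauth2_client_id": "abcdefgh",
    "oauth2_client_secret": "ijklmnopqr",
    "writable": True,
}
# "short-lived-token" is a placeholder for a token obtained on the user's behalf.
runtime_configuration = to_runtime_configuration(expanded_template, "short-lived-token")
```

The plugin therefore only ever sees a token scoped to the user's resources, never the server's client secret.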
Normally, a UUID is created for each user configured instance object and this is used to store the template’s explicitly listed secrets in Galaxy’s Vault. For OAuth 2.0 plugin types, before users are even prompted for configuration metadata they are redirected to the remote service and prompted to authorize Galaxy to act on their behalf when using the remote service. If they authorize this, the remote service will send an authorization code to https://<galaxy_url>/oauth2_callback along with state information to recover which instance is being configured. At this point, Galaxy will fetch a refresh token from the remote resource using the supplied authorization code. The refresh token is stored in the Vault in a key associated with the UUID of the object that will be created when the user finishes the creation process. Specifically, it is stored at:
/galaxy/user/<user_id>/file_source_config/<file_source_instance_uuid>/_oauth2_refresh_token
Here the _ prefix on the final path component indicates that Galaxy is managing this secret itself, instead of it being listed explicitly in a secrets section of the template configuration like the explicit Vault secrets discussed earlier in this document.
Galaxy knows how to fetch an access token from this refresh token, and it is the access token that is actually used to interact with the remote resource. This is the oauth2_access_token property that is injected into the configuration object shown above and passed along to the actual object store or file source plugin implementation.
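For readers unfamiliar with the refresh step, the sketch below builds the form body that the OAuth 2.0 specification (RFC 6749, section 6) defines for exchanging a refresh token for an access token. It only constructs the request body and sends nothing; the function name is hypothetical, the token and client values are placeholders, and sending the client credentials in the body rather than in an HTTP Basic header is an assumption that depends on the remote service.

```python
from urllib.parse import parse_qs, urlencode


def refresh_request_body(refresh_token: str, client_id: str, client_secret: str) -> str:
    """Return the form-encoded body of an OAuth 2.0 refresh token request."""
    return urlencode(
        {
            "grant_type": "refresh_token",    # fixed value defined by RFC 6749
            "refresh_token": refresh_token,   # the value kept under _oauth2_refresh_token
            "client_id": client_id,           # assumption: credentials sent in the body
            "client_secret": client_secret,
        }
    )


# Placeholder values for illustration only.
body = refresh_request_body("stored-refresh-token", "abcdefgh", "ijklmnopqr")
fields = parse_qs(body)
```

The token endpoint's JSON response contains an access_token field, which corresponds to the oauth2_access_token value handed to the plugin.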