Connecting Users and Data

Galaxy has countless ways for users to connect with things that might be considered their “data” - file sources (aka “remote files”), object stores (aka “storage locations”), data libraries, the upload API, visualizations, display applications, custom tools, etc…

This document discusses the two of these that matter most to Galaxy administrators (file sources and object stores) and how to build Galaxy configurations that let users tie into various pieces of infrastructure, both local and publicly available.

Datasets vs Files

File sources in Galaxy are a sprawling concept but essentially they provide users access to simple files (stored hierarchically in folders) that can be navigated and imported into Galaxy. Importing a “file” into Galaxy generally creates a copy of that file in a Galaxy “object store”. Once these files are stored in Galaxy, they become “datasets”. A Galaxy dataset is much more than a simple file - Galaxy datasets include various generic metadata, a datatype, datatype-specific metadata, and ownership and sharing rules managed by Galaxy.

Galaxy object stores (called “storage locations” in the UI) store datasets. Global object stores (accessible to all users) are configured with the galaxy.yml property object_store_config_file (or object_store_config for a configuration embedded directly in galaxy.yml), which defaults to object_store_conf.xml or object_store_conf.yml if either is present in Galaxy’s configuration directory. Galaxy file sources provide users access to raw files. Global file sources are configured with the galaxy.yml property file_sources_config_file (or file_sources for embedded configurations), which defaults to file_sources_conf.yml if that file is present in Galaxy’s configuration directory.
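As a minimal sketch, the two galaxy.yml properties named above might be set as follows. The file names are simply the documented defaults, and nesting the options under a top-level galaxy key is assumed from the usual galaxy.yml layout.

```yaml
# galaxy.yml (sketch; file names are the documented defaults, adjust as needed)
galaxy:
  # Global object stores ("storage locations") shared by all users.
  object_store_config_file: object_store_conf.yml
  # Global file sources ("remote files") shared by all users.
  file_sources_config_file: file_sources_conf.yml
```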

Some of Galaxy’s most up-to-date and complete administrator documentation can be found in configuration sample files - this is definitely the case for object stores and file sources. The relevant sample configuration files include file_sources_conf.yml.sample and object_store_conf.sample.yml.

File sources and object stores configured with the above files are available to all users of your Galaxy instance - hence this document describes them as “global” file sources and object stores. File source configurations do allow some templating, which lets a single global file source be materialized differently for different users. For instance, you as an admin may set up a Dropbox file source and explicitly add custom user properties that allow that single Dropbox file source to read from a user’s preferences. Since there is just one Dropbox service and most people only have a single Dropbox account, this use case can be reasonably well addressed by the global file source and the global user preferences file.

For a use case like Amazon S3 buckets, though, a single bucket file source that is parameterized one way is more clearly inadequate: users would very likely want to attach different buckets for different projects. Additionally, the Galaxy user interface doesn’t tie the user preferences to the particular file source, so this method introduces a huge education burden on your Galaxy instance. Finally, the templating available to file sources is not available for object stores - and allowing users to describe how they would like datasets stored and to pay for their own dataset storage are important use cases.
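For illustration, the Dropbox case described above might be configured roughly like the following sketch of a global file source. The `dropbox|access_token` preference key and the `${user.preferences[...]}` expression are recalled from Galaxy's sample configuration rather than taken from this document, so treat them as assumptions and check file_sources_conf.yml.sample for the exact form.

```yaml
# file_sources_conf.yml (sketch; verify keys against file_sources_conf.yml.sample)
- type: dropbox
  id: dropbox
  label: Dropbox files
  doc: Your Dropbox files - configure an access token in your user preferences.
  # Assumed preference key; it has to match an entry the admin declares in the
  # global user preferences file.
  access_token: ${user.preferences['dropbox|access_token']}
```

Note that nothing in this file tells the user where to enter the token, which is the education burden mentioned above.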

This document describes Galaxy configuration template libraries that allow the administrator to set up templates for file sources and object stores that users may instantiate as they see fit. Users can instantiate multiple instances of any template, the template concept applies to both file source and object store plugins, and the user interface is unified and driven by the template configuration file (you as the admin do not need to explicitly declare user preferences and your users do not need to navigate seemingly unrelated preferences to get plugins to work).

Object Store Templates

Galaxy’s object store templates are configured as a YAML list of template objects. This list can be placed in object_store_templates.yml in Galaxy’s configuration directory (or any path pointed to by the configuration option object_store_templates_config_file in galaxy.yml). Alternatively, the configuration can be placed directly into galaxy.yml using the object_store_templates configuration option.
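A sketch of the two placements described above. Only the option names come from this document; the nesting under a top-level galaxy key is assumed from the usual galaxy.yml layout.

```yaml
# galaxy.yml (sketch)
galaxy:
  # Option 1: point at a separate file holding the YAML list of templates.
  object_store_templates_config_file: object_store_templates.yml

  # Option 2 (instead of option 1): embed the same list directly.
  # object_store_templates:
  #   - id: project_scratch
  #     name: Project Scratch
  #     version: 0
  #     # remaining keys exactly as they would appear in the standalone file
```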

Warning

Object store selection within Galaxy is available only when the primary object store is a distributed object store. All the other object stores provide stronger guarantees about how datasets are stored. Object Store Templates will not currently work if the primary object store (usually defined by object_store_config_file) is a simple disk object store, a hierarchical object store, or anything other than a distributed object store.

Ongoing discussion on this topic can be found at this Galaxy Discussions post (#18157).
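For orientation, a primary object store that satisfies this requirement might look roughly like the sketch below. The backend ids, weights, and paths are hypothetical, and the key names are recalled from Galaxy's sample configuration rather than from this document, so verify them against object_store_conf.sample.yml.

```yaml
# object_store_conf.yml (sketch; ids, weights, and paths are hypothetical)
type: distributed
backends:
  - id: default
    type: disk
    weight: 1
    files_dir: /data/galaxy/objects
  - id: secondary
    type: disk
    weight: 1
    files_dir: /data2/galaxy/objects
```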

A minimal object store template might look something like:

- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
    - type: faster
    - type: less_secure
    - type: not_backed_up

Object Store Types

disk

This is the most basic sort of object store template; it just makes disk paths available to users for storing data. Paths can be built up from the user-supplied variables, user details, supplied environment variables, etc. The simple example just uses a user-supplied project name and the user’s username to produce a unique path for each user-defined object store.

- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
    - type: faster
    - type: less_secure
    - type: not_backed_up

These sorts of object stores have no quota so be careful.

The syntax for the configuration section of disk templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s object store infrastructure looks like this and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
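As a hypothetical illustration of that expansion, suppose a user whose username is alice instantiates the project_scratch template above with project_name set to rnaseq. The expanded configuration would then be expected to look roughly like this; the username and project name are invented, and any keys Galaxy may add internally are not shown.

```yaml
# Expected result of expanding project_scratch for user "alice" (hypothetical)
type: disk
files_dir: /scratch/for_galaxy/alice/rnaseq
badges:
- type: faster
- type: less_secure
- type: not_backed_up
```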

boto3

Object stores of the type boto3 can be used to access a wide variety of S3 compatible storage services including AWS S3. How you template them can result in widely different experiences for your users and can result in addressing a wide variety of use cases.

Here is an example that is tailored for a specific storage service (e.g. CloudFlare R2) and exposes just the pieces of data CloudFlare users would need.

# https://developers.cloudflare.com/r2/examples/aws/boto3/
- id: cloudflare
  version: 0
  name: CloudFlare R2
  description: |
    This template can be used to connect to your [CloudFlare R2](https://developers.cloudflare.com/r2/)
    storage. To use these templates you will need to generate
    [CloudFlare R2 access tokens](https://developers.cloudflare.com/r2/api/s3/tokens/).
    Following that tutorial, you should have an "Account ID", an "Access Key ID", and a
    "Secret Access Key".
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        An Access Key ID generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
    account_id:
      label: Account ID
      type: string
      help: |
        Your account ID as available in the [CloudFlare dashboard](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/).
    bucket:
      label: Bucket
      type: string
      help: |
        The name of a bucket you've created to store your Galaxy data. Documentation for how to create buckets
        can be found in [this part of the CloudFlare R2 documentation](https://developers.cloudflare.com/r2/buckets/create-buckets/).
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        A Secret Access Key generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: 'https://{{ variables.account_id}}.r2.cloudflarestorage.com/'

Templates can be much more generic or much less generic than this.

In one direction, all the bells and whistles could be exposed to your Galaxy users to allow them to connect to any S3 compatible storage. This requires a lot more sophistication from your users but also allows them to connect to many more services. This template is available here:

- id: generic_s3
  version: 0
  name: Any S3 Compatible Storage
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3 - Galaxy calls these ``generic_s3``
    storage locations. This template configuration allows using such a service as a Galaxy storage
    location as long as you are able to find the connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation] calls these an "access key ID",
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``. Internally to Galaxy, we often just call
        this the ``access_key``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data together with.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here you would substitute your
        CloudFlare account ID into the endpoint URL and use that value. So if your account ID was
        ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation] calls these a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: '{{ variables.endpoint_url }}'

On the other hand, you might run a small lab with a dedicated MinIO storage service and just trust your users to define individual buckets by name:

- id: lab_minio_storage
  version: 0
  name: Lab Storage
  description: Connect to our lab's local MinIO storage service.
  variables:
    bucket:
      type: string
      help: The bucket to connect to.
  configuration:
    type: boto3
    auth:
      access_key: 'XXXXXXXXfillinaccess'
      secret_key: 'YYYYYYYYfillinsecret'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: 'https://storage.ourawesomelab.org:9000'
    badges:
    - type: slower
    - type: less_secure
    - type: less_stable

If you want to just target AWS S3 and let your users utilize that as quickly and easily as possible, that template might look like this:
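A sketch of such a template, adapted from the legacy aws_s3 example elsewhere in this document with the type switched to boto3. The shortened help texts are paraphrases, and leaving out the connection section on the expectation that the client then targets AWS itself is an assumption to verify against your Galaxy version.

```yaml
- id: aws_s3
  version: 0
  name: Amazon Web Services S3 Storage
  description: |
    Amazon's Simple Storage Service (S3) is Amazon's primary cloud storage service.
    More information on S3 can be found in [Amazon's documentation](https://aws.amazon.com/s3/).
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: The "Access Key ID" half of an AWS access key pair.
    bucket:
      label: Bucket
      type: string
      help: An existing AWS S3 bucket to store your datasets in.
  secrets:
    secret_key:
      label: Secret Access Key
      help: The "Secret Access Key" half of the same AWS access key pair.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    # No connection section: assumed to fall back to the default AWS endpoint.
```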

The syntax for the configuration section of boto3 templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s object store infrastructure looks like this and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
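As a hypothetical illustration, if a user instantiates the lab_minio_storage template above with bucket set to sequencing-runs, the expanded configuration would be expected to look roughly like this. The bucket name is invented, and any keys Galaxy may add internally are not shown.

```yaml
# Expected result of expanding lab_minio_storage (bucket name is hypothetical)
type: boto3
auth:
  access_key: XXXXXXXXfillinaccess
  secret_key: YYYYYYYYfillinsecret
bucket:
  name: sequencing-runs
connection:
  endpoint_url: https://storage.ourawesomelab.org:9000
badges:
- type: slower
- type: less_secure
- type: less_stable
```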

azure_blob

Here is a “production grade” Azure template that can be essentially used to connect to any Azure storage container.

- id: azure
  version: 0
  name: Azure Blob Storage
  description: |
    This template allows storing datasets in [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure_blob
    auth:
      account_name: '{{ variables.account_name }}'
      account_key: '{{ secrets.account_key}}'
    container:
      name: '{{ variables.container_name }}'
  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container. More information on containers can be found
        in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers).
    account_name:
      label: Storage Account Name
      type: string
      help: |
        The name of your Azure Blob Storage account. More information on storage accounts can be found in the
        [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview).
  secrets:
    account_key:
      label: Account Key
      help: |
        The Azure Blob Storage account key to use to access your Azure Blob Storage data. More information
        on account keys can be found in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage).

This template might be adapted to hide connection details from, say, users of an individual lab and just expose which container they should use. That might look something like:

- id: lab_azure_storage
  version: 0
  name: Azure Blob Storage for our Lab
  description: |
    This template allows storing datasets in [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure_blob
    auth:
      account_name: 'XXXXXXXXfillinaccount'
      account_key: 'XXXXXXXXXfillinkey'

    container:
      name: '{{ variables.container_name }}'

  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container in our lab space. Contact
        our lab Galaxy admin awesomelabsnate@ourawesomelab.org if you're unsure
        what container you should store your data in. 

This example is a little contrived, though. If a small lab or institution has just a few containers, it would likely be a much easier user experience to wrap them all in a Galaxy hierarchical object store, document them there, and make them available to your whole Galaxy instance.

The syntax for the configuration section of azure_blob templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s object store infrastructure looks like this and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).
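As a hypothetical illustration, if a user instantiates the lab_azure_storage template above with container_name set to galaxy-data, the expanded configuration would be expected to look roughly like this. The container name is invented, and any keys Galaxy may add internally are not shown.

```yaml
# Expected result of expanding lab_azure_storage (container name is hypothetical)
type: azure_blob
auth:
  account_name: XXXXXXXXfillinaccount
  account_key: XXXXXXXXXfillinkey
container:
  name: galaxy-data
```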

aws_s3 (Legacy)

Object stores of the type aws_s3 can be used to treat AWS Simple Storage Service (S3) buckets as Galaxy object stores. See Amazon documentation for information on S3, how to create buckets, and how to create access keys.

- id: aws_s3_legacy
  version: 0
  name: Amazon Web Services S3 Storage (Legacy)
  description: |
    Amazon's Simple Storage Service (S3) is Amazon's primary cloud storage service.
    More information on S3 can be found in [Amazon's documentation](https://aws.amazon.com/s3/).
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        A security credential for interacting with AWS services can be created from your
        AWS web console. Creating an "Access Key" creates a pair of keys used to identify
        and authenticate access to your AWS account - the first part of the pair  is
        "Access Key ID" and should be entered here. The second part of your key is the secret
        part called the "Secret Access Key". Place that in the secure part of this form below.
    bucket:
      label: Bucket
      type: string
      help: |
        The [AWS S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. You will need to create a bucket to use in your AWS web console before
        using this form.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        See the documentation above for "Access Key ID" for information about access key pairs.

  configuration:
    type: aws_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'

The aws_s3 object store is older and better tested than the boto3 object store, but the boto3 object store is built using a newer, more robust, and more feature-rich client library, so it should probably be the object store you use instead of this one.

The syntax for the configuration section of aws_s3 templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s object store infrastructure looks like this and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).

generic_s3 (Legacy)

Object stores of the type generic_s3 can be used to access a wide variety of S3 compatible storage services. How you template them can result in widely different experiences for your users and can result in addressing a wide variety of use cases.

Here is an example that is tailored for a specific storage service (e.g. CloudFlare R2) and exposes just the pieces of data CloudFlare users would need.

# https://developers.cloudflare.com/r2/examples/aws/boto3/
- id: cloudflare_legacy
  version: 0
  name: CloudFlare R2
  description: |
    This template can be used to connect to your [CloudFlare R2](https://developers.cloudflare.com/r2/)
    storage. To use these templates you will need to generate
    [CloudFlare R2 access tokens](https://developers.cloudflare.com/r2/api/s3/tokens/).
    Following that tutorial, you should have an "Account ID", an "Access Key ID", and a
    "Secret Access Key".
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        An Access Key ID generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
    account_id:
      label: Account ID
      type: string
      help: |
        Your account ID as available in the [CloudFlare dashboard](https://developers.cloudflare.com/fundamentals/setup/find-account-and-zone-ids/).
    bucket:
      label: Bucket
      type: string
      help: |
        The name of a bucket you've created to store your Galaxy data. Documentation for how to create buckets
        can be found in [this part of the CloudFlare R2 documentation](https://developers.cloudflare.com/r2/buckets/create-buckets/).
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        A Secret Access Key generated according to the
        [CloudFlare R2 access tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/).
  configuration:
    type: generic_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      host: '{{ variables.account_id}}.r2.cloudflarestorage.com'
      port: 443
      is_secure: true

Templates can be much more generic or much less generic than this.

In one direction, all the bells and whistles could be exposed to your Galaxy users to allow them to connect to any S3 compatible storage. This requires a lot more sophistication from your users but also allows them to connect to many more services. This template is available here:

- id: generic_s3_legacy
  version: 0
  name: Any S3 Compatible Storage (Legacy)
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3 - Galaxy calls these ``generic_s3``
    storage locations. This template configuration allows using such a service as a Galaxy storage
    location as long as you are able to find the connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation] calls these an "access key ID",
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``. Internally to Galaxy, we often just call
        this the ``access_key``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data together with.
    host:
      label: Connection Host
      type: string
      help: |
        The [hostname](https://en.wikipedia.org/wiki/Hostname) used to connect to the target
        S3 compatible service.

        If the documentation for your storage service has something called an ``endpoint_url``,
        this can be used to determine this value. For instance, the CloudFlare documentation
        describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``. Here
        you would substitute your CloudFlare account ID into the endpoint and shave off the ``https://``,
        so if your account ID was ``galactian``, you would enter ``galactian.r2.cloudflarestorage.com``.
    port:
      label: Connection Port
      type: integer
      default: 443
      help: |
        The [port](https://en.wikipedia.org/wiki/Port_(computer_networking)) used to connect
        to the target S3 compatible service. This might be ``443`` if you cannot find a relevant
        port - this is the default for secure HTTP connections.

        If the documentation for your storage service has something called an ``endpoint_url``,
        this can be used to determine this value. The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``.
        The ``:9000`` here indicates this port should be specified as ``9000``. Alternatively, the
        CloudFlare documentation describes its endpoints as ``https://<accountid>.r2.cloudflarestorage.com``.
        Here there is no number at the end of the URL so the port is ``443`` as long as the URL starts
        with ``https``.
    connection_path:
      label: Connection Path
      type: string
      default: ""
      help: |
        This is an advanced configuration option and it is very likely best to just keep this empty
        for most storage services. If specified, it will be the prefix in the URL for the S3 compatible
        API after the host and port to reach the target API.
    secure:
      label: Use HTTPS?
      type: boolean
      default: true
      help: |
        This is an advanced configuration option and if this option is not checked, you should not assume
        your data is secure at all. This should only ever be unchecked during testing new or experimental
        services with data and keys you do not care about.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation] calls these a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: generic_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      host: '{{ variables.host }}'
      port: '{{ variables.port  }}'
      is_secure: '{{ variables.secure }}'
      conn_path: '{{ variables.connection_path }}'

On the other hand, you might run a small lab with a dedicated MinIO storage service and just trust your users to define individual buckets by name:

- id: lab_minio_storage_legacy
  version: 0
  name: Lab Storage (Legacy)
  description: Connect to our lab's local MinIO storage service.
  variables:
    bucket:
      type: string
      help: The bucket to connect to.
  configuration:
    type: generic_s3
    auth:
      access_key: 'XXXXXXXXfillinaccess'
      secret_key: 'YYYYYYYYfillinsecret'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      host: 'storage.ourawesomelab.org'
      port: 9000
      is_secure: true
    badges:
    - type: slower
    - type: less_secure
    - type: less_stable

The syntax for the configuration section of generic_s3 templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s object store infrastructure looks like this and should match a subset of what you’d be able to add directly to object_store_conf.yml (Galaxy’s global object store configuration).

YAML Syntax

galaxy.objectstore.templates.models

Ready To Use Production Object Store Templates

These templates are sufficiently generic that they may make sense for a variety of Galaxy instances, address a variety of potential use cases, and do not need any additional tailoring, parameterization, or other customization. They assume your Galaxy instance has a Vault configured and that you’re comfortable with it storing your users’ secrets.
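For orientation, a Vault setup might look roughly like the sketch below. None of these option names appear in this document: vault_config_file, the database vault type, and the encryption_keys list are recalled from Galaxy's Vault documentation and should be treated as assumptions to verify there. The key value is a placeholder.

```yaml
# galaxy.yml (sketch; option name is an assumption, see the Vault documentation)
galaxy:
  vault_config_file: vault_conf.yml
```

```yaml
# vault_conf.yml (sketch; keys are assumptions, the key value is a placeholder)
type: database
path_prefix: /galaxy
encryption_keys:
  - REPLACE_WITH_A_GENERATED_KEY
```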

Allow Users to Define Azure Blob Storage as Object Stores

- id: azure
  version: 0
  name: Azure Blob Storage
  description: |
    This template allows storing datasets in [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure_blob
    auth:
      account_name: '{{ variables.account_name }}'
      account_key: '{{ secrets.account_key}}'
    container:
      name: '{{ variables.container_name }}'
  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container. More information on containers can be found
        in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers).
    account_name:
      label: Storage Account Name
      type: string
      help: |
        The name of your Azure Blob Storage account. More information on storage accounts can be found in the
        [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview).
  secrets:
    account_key:
      label: Account Key
      help: |
        The Azure Blob Storage account key to use to access your Azure Blob Storage data. More information
        on account keys can be found in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage).

Allow Users to Define Generic S3 Compatible Storage Services as Object Stores

- id: generic_s3
  version: 0
  name: Any S3 Compatible Storage
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3 - Galaxy calls these ``generic_s3``
    storage locations. This template configuration allows using such a service as a Galaxy storage
    location as long as you are able to find the connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation] calls these an "access key ID",
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``. Internally to Galaxy, we often just call
        this the ``access_key``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data together with.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here you would substitute your
        CloudFlare account ID into the endpoint URL and use that value. So if your account ID was
        ``galactian``, you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: '{{ variables.endpoint_url }}'

Allow Users to Define AWS S3 Buckets as Object Stores

- id: aws_s3
  version: 0
  name: Amazon Web Services S3 Storage
  description: |
    Amazon's Simple Storage Service (S3) is Amazon's primary cloud storage service.
    More information on S3 can be found in [Amazon's documentation](https://aws.amazon.com/s3/).
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        A security credential for interacting with AWS services can be created from your
        AWS web console. Creating an "Access Key" creates a pair of keys used to identify
        and authenticate access to your AWS account - the first part of the pair is the
        "Access Key ID" and should be entered here. The second part of your key is the secret
        part called the "Secret Access Key". Place that in the secure part of this form below.
    bucket:
      label: Bucket
      type: string
      help: |
        The [AWS S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. You will need to create a bucket to use in your AWS web console before
        using this form.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        See the documentation for "Access Key ID" above for information about access key pairs.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'

Allow Users to Define Google Cloud Provider S3 Interop Storage Buckets as Object Stores

This template includes descriptions of how to generate HMAC keys used by this interoperability layer provided by Google and lots of links to relevant Google Cloud Storage documentation.

# https://cloud.google.com/storage/docs/aws-simple-migration
- id: gcp_s3_interop
  version: 0
  name: Google Cloud Storage
  description: |
    This template can be used to connect to your [Google Cloud Storage](https://cloud.google.com/storage).
    To use these templates you will need to generate
    [HMAC Keys](https://cloud.google.com/storage/docs/authentication/hmackeys) - these
    can be linked to your user or a service account. Additionally, you will need to define
    a [default Google cloud project](https://cloud.google.com/storage/docs/aws-simple-migration#defaultproj)
    to allow Galaxy to access your Google Cloud Storage via the interfaces
    described by this template.
  variables:
    access_key:
      label: Access ID
      type: string
      help: |
        This will be given to you by Google when you generate [HMAC Keys](https://cloud.google.com/storage/docs/authentication/hmackeys)
        to use your storage.
    bucket:
      label: Bucket
      type: string
      help: |
        The name of a [bucket](https://cloud.google.com/storage/docs/buckets) you've created to store your Galaxy data. Documentation for how to create buckets
        can be found in [this part of the Google Cloud Storage documentation](https://cloud.google.com/storage/docs/creating-buckets).
  secrets:
    secret_key:
      label: Secret Key
      help: |
        This will be given to you by Google when you generate [HMAC Keys](https://cloud.google.com/storage/docs/authentication/hmackeys)
        to use your storage. It should be 40 characters long and look something like the example used
        in the Google documentation - `bGoa+V7g/yqDXvKRqq+JTFn4uQZbPiQJo4pf9RzJ`.
  configuration:
    type: boto3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
    connection:
      endpoint_url: 'https://storage.googleapis.com/'

File Source Templates

Galaxy’s file source templates are configured as a YAML list of template objects. This list can be placed in file_source_templates.yml in Galaxy’s configuration directory (or at any path pointed to by the configuration option file_source_templates_config_file in galaxy.yml). Alternatively, the configuration can be placed directly into galaxy.yml using the file_source_templates configuration option.
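
As a rough sketch of these two options, the setting could look like either of the YAML documents below. The file path is a hypothetical example, the top-level galaxy: key assumes the usual layout of galaxy.yml, and the aws_public template from this document is used only as an illustrative entry. Only one of the two forms would be used in a given galaxy.yml.

```yaml
# Option 1: galaxy.yml points at a separate template file.
# The path is a hypothetical example, not a Galaxy default.
galaxy:
  file_source_templates_config_file: /srv/galaxy/config/my_file_source_templates.yml
---
# Option 2: the template list is embedded directly in galaxy.yml.
# The entry mirrors the aws_public template shown in this document.
galaxy:
  file_source_templates:
    - id: aws_public
      version: 0
      name: Amazon Web Services Public Bucket
      description: Set up anonymous access to a public AWS bucket.
      configuration:
        type: s3fs
        bucket: "{{ variables.bucket }}"
        writable: false
        anon: true
      variables:
        bucket:
          label: Bucket
          type: string
          help: The Amazon Web Services bucket to anonymously access.
```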

File Source Types

posix

The syntax for the configuration section of posix templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).

s3fs

- id: s3fs
  version: 0
  name: S3 Compatible Storage with Credentials
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3. This template configuration allows
    using such a service as a Galaxy storage location as long as you are able to find the
    connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
        calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here you would substitute your CloudFlare
        account ID into the endpoint URL and use that value. So if your account ID was ``galactian``,
        you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: s3fs
    endpoint_url: '{{ variables.endpoint_url }}'
    key: '{{ variables.access_key }}'
    secret: '{{ secrets.secret_key }}'
    bucket: '{{ variables.bucket }}'

- id: s3fs
  version: 1
  name: S3 Compatible Storage with Credentials
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3. This template configuration allows
    using such a service as a Galaxy storage location as long as you are able to find the
    connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
        calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here you would substitute your CloudFlare
        account ID into the endpoint URL and use that value. So if your account ID was ``galactian``,
        you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
    writable:
      label: Writable?
      type: boolean
      help: Is this a bucket you have permission to write to?
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: s3fs
    endpoint_url: '{{ variables.endpoint_url }}'
    key: '{{ variables.access_key }}'
    secret: '{{ secrets.secret_key }}'
    bucket: '{{ variables.bucket }}'
    writable: '{{ variables.writable }}'
- id: aws_public
  version: 0
  name: Amazon Web Services Public Bucket
  description: Set up anonymous access to a public AWS bucket.
  configuration:
    type: s3fs
    bucket: "{{ variables.bucket }}"
    writable: false
    anon: true
  variables:
    bucket:
      label: Bucket
      type: string
      help: |
        The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        anonymously access.
- id: aws_private
  version: 0
  name: Amazon Web Services Private Bucket
  description: Set up access to a private AWS bucket using a secret access key.
  configuration:
    type: s3fs
    bucket: "{{ variables.bucket }}"
    writable: "{{ variables.writable }}"
    secret: "{{ secrets.secret_key }}"
    key: "{{ variables.access_key }}"
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The "access key ID" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
    bucket:
      label: Bucket
      type: string
      help: |
        The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        access. This should be a bucket the user described by the Access Key ID has access to.
    writable:
      label: Writable?
      type: boolean
      help: Is this a bucket you have permission to write to?
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The "secret access key" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
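
As a hedged illustration of that expansion, the sketch below shows what the configuration section of the aws_public template above might resolve to. It assumes a user entered the hypothetical bucket name example-public-bucket, and it shows only the keys from the template’s configuration section.

```yaml
# Hypothetical expansion of the aws_public template's configuration section.
# "example-public-bucket" is an invented value standing in for user input.
type: s3fs
bucket: example-public-bucket   # from variables.bucket
writable: false                 # fixed in the template
anon: true                      # fixed in the template
```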

ftp

- id: ftp
  version: 0
  name: An FTP Server
  description: |
    This template allows connecting to FTP servers. This file source plugin should
    support FTP and FTPS servers.
  configuration:
    type: ftp
    host: "{{ variables.host }}"
    user: "{{ variables.user }}"
    port: "{{ variables.port }}"
    passwd: "{{ secrets.password }}"
    writable: "{{ variables.writable }}"
  variables:
    host:
      label: FTP Host
      type: string
      help: Host of FTP Server to connect to.
    user:
      label: FTP User
      type: string
      help: |
        Username to connect with. Leave this blank to connect to the server
        anonymously (if allowed by target server).
    writable:
      label: Writable?
      type: boolean
      help: Is this an FTP server you have permission to write to?
    port:
      label: FTP Port
      type: integer
      help: Port used to connect to the FTP server.
      default: 21
  secrets:
    password:
      label: FTP Password
      help: |
        Password to connect to FTP server with. Leave this blank to connect
        to the server anonymously (if allowed by target server).

The syntax for the configuration section of ftp templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).
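
To make the expansion concrete, here is a hedged sketch of what the expanded configuration might look like for the ftp template above. It assumes a user entered the hypothetical values shown (the host, user name, and password placeholder are invented for illustration) and shows only the keys from the template’s configuration section.

```yaml
# Hypothetical result of expanding the ftp template's configuration section.
# Every value is an invented example of user input, except port 21, which is
# the default declared in the template.
type: ftp
host: ftp.example.org        # from variables.host
user: alice                  # from variables.user
port: 21                     # from variables.port
passwd: "<password the user stored as a secret>"   # from secrets.password
writable: false              # from variables.writable
```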

azure

The syntax for the configuration section of azure templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).

webdav

The syntax for the configuration section of webdav templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).

dropbox

The syntax for the configuration section of dropbox templates looks like this.

At runtime, after the configuration template is expanded, the resulting dictionary passed to Galaxy’s file source plugin infrastructure looks like this and should match a subset of what you’d be able to add directly to file_sources_conf.yml (Galaxy’s global file source configuration).

YAML Syntax

galaxy.files.templates.models

Ready To Use Production File Source Templates

These templates are sufficiently generic that they may make sense for a variety of Galaxy instances, address a variety of potential use cases, and do not need any additional tailoring, parameterization, or other customization. They (mostly) assume your Galaxy instance has a Vault configured and that you are comfortable with it storing your users’ secrets.

Allow Users to Define Generic FTP Servers as File Sources

- id: ftp
  version: 0
  name: An FTP Server
  description: |
    This template allows connecting to FTP servers. This file source plugin should
    support FTP and FTPS servers.
  configuration:
    type: ftp
    host: "{{ variables.host }}"
    user: "{{ variables.user }}"
    port: "{{ variables.port }}"
    passwd: "{{ secrets.password }}"
    writable: "{{ variables.writable }}"
  variables:
    host:
      label: FTP Host
      type: string
      help: Host of FTP Server to connect to.
    user:
      label: FTP User
      type: string
      help: |
        Username to connect with. Leave this blank to connect to the server
        anonymously (if allowed by target server).
    writable:
      label: Writable?
      type: boolean
      help: Is this an FTP server you have permission to write to?
    port:
      label: FTP Port
      type: integer
      help: Port used to connect to the FTP server.
      default: 21
  secrets:
    password:
      label: FTP Password
      help: |
        Password to connect to FTP server with. Leave this blank to connect
        to the server anonymously (if allowed by target server).

Allow Users to Define Azure Blob Storage as File Sources

- id: azure
  version: 0
  name: Azure Blob Storage
  description: |
    This template allows connecting to [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction).
  configuration:
    type: azure
    container_name: "{{ variables.container_name }}"
    account_name: "{{ variables.account_name }}"
    account_key: "{{ secrets.account_key }}"
    namespace_type: "{{ 'hierarchical' if variables.hierarchical else 'flat' }}"
    writable: "{{ variables.writable }}"
  variables:
    container_name:
      label: Container Name
      type: string
      help: |
        The name of your Azure Blob Storage container. More information on containers can be found
        in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers).
    account_name:
      label: Storage Account Name
      type: string
      help: |
        The name of your Azure Blob Storage account. More information on storage accounts can be found in the
        [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview).
    hierarchical:
      label: Hierarchical?
      type: boolean
      default: true
      help: |
        Is this storage hierarchical (i.e. does it use an Azure Data Lake Storage Gen2 hierarchical namespace)?
        More information on Data Lake Storage namespaces can be found in the
        [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-namespace).
    writable:
      label: Writable?
      type: boolean
      default: true
      help: Allow Galaxy to write data to this Azure Blob Storage container.
  secrets:
    account_key:
      label: Account Key
      help: |
        The Azure Blob Storage account key to use to access your Azure Blob Storage data. More information
        on account keys can be found in the [Azure Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage).
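
The namespace_type entry in the template above uses an inline Jinja conditional rather than a plain variable reference. As a sketch of how that expression resolves under standard Jinja behaviour (only this one key is shown), the two possible outcomes are:

```yaml
# User leaves "Hierarchical?" checked (the template's default of true):
namespace_type: hierarchical
---
# User unchecks "Hierarchical?" (variables.hierarchical is false):
namespace_type: flat
```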

Allow Users to Define Generic S3 Compatible Storage as File Sources

- id: s3fs
  version: 0
  name: S3 Compatible Storage with Credentials
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3. This template configuration allows
    using such a service as a Galaxy storage location as long as you are able to find the
    connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
        calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here you would substitute your CloudFlare
        account ID into the endpoint URL and use that value. So if your account ID was ``galactian``,
        you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: s3fs
    endpoint_url: '{{ variables.endpoint_url }}'
    key: '{{ variables.access_key }}'
    secret: '{{ secrets.secret_key }}'
    bucket: '{{ variables.bucket }}'

- id: s3fs
  version: 1
  name: S3 Compatible Storage with Credentials
  description: |
    The APIs used to connect to Amazon's S3 (Simple Storage Service) have become something
    of an unofficial standard for cloud storage across a variety of vendors and services.
    Many vendors offer storage APIs compatible with S3. This template configuration allows
    using such a service as a Galaxy storage location as long as you are able to find the
    connection details and have the relevant credentials.

    Given the amount of information needed to connect to such a service, this is a bit of an
    advanced template and probably should not be used to connect to a service if a more
    specific template is available.
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The less secure part of your access tokens or access keys that describe the user
        that is accessing the data. The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html)
        calls these an "access key ID", the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes these as ``aws_access_key_id``.
    bucket:
      label: Bucket
      type: string
      help: |
        The [bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        store your datasets in. How to set up buckets for your storage will vary from service to service
        but all S3 compatible storage services should have the concept of a bucket to namespace
        a grouping of your data.
    endpoint_url:
      label: S3-Compatible API Endpoint
      type: string
      help: |
        If the documentation for your storage service has something called an ``endpoint_url``,
        enter that value here. For instance, the CloudFlare documentation describes its endpoints
        as ``https://<accountid>.r2.cloudflarestorage.com``. Here you would substitute your CloudFlare
        account ID into the endpoint URL and use that value. So if your account ID was ``galactian``,
        you would enter ``https://galactian.r2.cloudflarestorage.com``.
        The [MinIO](https://min.io/docs/minio/linux/integrations/aws-cli-with-minio.html)
        documentation describes the endpoint URL for its Play service as ``https://play.min.io:9000``;
        this whole value would be entered here.
    writable:
      label: Writable?
      type: boolean
      help: Is this a bucket you have permission to write to?
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The secret key used to connect to the S3 compatible storage for the given access key.

        The [Amazon documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html) calls this a "secret access key" and
        the [CloudFlare documentation](https://developers.cloudflare.com/r2/examples/aws/boto3/)
        describes it as ``aws_secret_access_key``. Internally to Galaxy, we often just call
        this the ``secret_key``.
  configuration:
    type: s3fs
    endpoint_url: '{{ variables.endpoint_url }}'
    key: '{{ variables.access_key }}'
    secret: '{{ secrets.secret_key }}'
    bucket: '{{ variables.bucket }}'
    writable: '{{ variables.writable }}'

Allow Users to Define Publicly Accessible AWS S3 Buckets as File Sources

- id: aws_public
  version: 0
  name: Amazon Web Services Public Bucket
  description: Set up anonymous access to a public AWS bucket.
  configuration:
    type: s3fs
    bucket: "{{ variables.bucket }}"
    writable: false
    anon: true
  variables:
    bucket:
      label: Bucket
      type: string
      help: |
        The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        anonymously access.

Allow Users to Define Private AWS S3 Buckets as File Sources

- id: aws_private
  version: 0
  name: Amazon Web Services Private Bucket
  description: Set up access to a private AWS bucket using a secret access key.
  configuration:
    type: s3fs
    bucket: "{{ variables.bucket }}"
    writable: "{{ variables.writable }}"
    secret: "{{ secrets.secret_key }}"
    key: "{{ variables.access_key }}"
  variables:
    access_key:
      label: Access Key ID
      type: string
      help: |
        The "access key ID" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
    bucket:
      label: Bucket
      type: string
      help: |
        The [Amazon Web Services Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) to
        access. This should be a bucket the user described by the Access Key ID has access to.
    writable:
      label: Writable?
      type: boolean
      help: Is this a bucket you have permission to write to?
  secrets:
    secret_key:
      label: Secret Access Key
      help: |
        The "secret access key" as defined in the [Amazon Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).

Allow Users to Define WebDAV Servers as File Sources

- id: webdav
  version: 0
  name: WebDAV
  description: |
    The WebDAV protocol is a simple way to access files over the internet. This template
    configuration allows you to connect to a WebDAV server.
  variables:
    url:
      label: Server Domain (e.g. https://myowncloud.org)
      type: string
      help: |
        The domain of the WebDAV server you are connecting to. This should be the full URL
        including the protocol (http or https) and the domain name.
    root:
      label: WebDAV server Path (should end with /remote.php/webdav, e.g. /a/sub/path/remote.php/webdav)
      type: string
      help: |
        The full server path to the WebDAV service. Ensure the path includes /remote.php/webdav.
    login:
      label: Username
      type: string
      help: |
        The username to use to connect to the WebDAV server. This should be the username you use
        to log in to the WebDAV server.
    writable:
      label: Writable?
      type: boolean
      default: false
      help: Allow Galaxy to write data to this WebDAV server.
  secrets:
    password:
      label: Password
      help: |
        The password to use to connect to the WebDAV server. This should be the password you use
        to log in to the WebDAV server.
  configuration:
    type: webdav
    url: '{{ variables.url }}'
    root: '{{ variables.root }}'
    login: '{{ variables.login }}'
    writable: '{{ variables.writable }}'
    password: '{{ secrets.password }}'

Production OAuth 2.0 File Source Templates

Unlike the examples in the previous section, these examples require a bit of configuration on the part of the admin: client credentials must be obtained from the external service and an OAuth 2.0 redirection callback must be registered with the remote service.

Dropbox

Once you have OAuth 2.0 client credentials from Dropbox (called oauth2_client_id and oauth2_client_secret here), the following configuration can be used to configure your Galaxy instance to enable Dropbox.

- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: "{{ environment.oauth2_client_id }}"
    oauth2_client_secret: "{{ environment.oauth2_client_secret }}"
    writable: true
  environment:
    oauth2_client_id:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_ID
    oauth2_client_secret:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_SECRET

To use this template - you’ll need to make your credentials available to Galaxy’s web and job handler processes using the environment variables GALAXY_DROPBOX_APP_CLIENT_ID and GALAXY_DROPBOX_APP_CLIENT_SECRET. Your jobs themselves do not require these secrets to be set and will not be given the secrets.

If you’d prefer to configure these secrets explicitly, you can place them directly in the configuration. If your configuration file is managed by Ansible, these secrets could potentially be populated from your Ansible vault.

- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: abcdefgh
    oauth2_client_secret: ijklmnopqr

To obtain the OAuth 2.0 credentials from Dropbox, you’ll need to navigate to your Dropbox Apps and create a new app for your Galaxy instance with the “Create app” button.

Screenshot of Dropbox

The only option available is “Scoped access” and this works fine for typical Galaxy use cases. You will, however, want to click “Full Dropbox” to request full access to your users’ accounts. You will also need to give your “App” a name here; this should likely be something related to your Galaxy instance’s name.

Screenshot of Dropbox

After your app is created, you’ll be presented with a management screen for it. The first thing you’ll want to do is navigate to the permissions tab and enable permissions to read and write to files and directories so the file source plugin works properly:

Screenshot of Dropbox

Next, navigate back to the “Settings” tab. You’ll need to register a callback for your Galaxy instance (it will need HTTPS enabled). This should be the URL to your Galaxy instance with oauth2_callback appended to it.

Screenshot of Dropbox
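The callback URL to register can be derived mechanically from your instance URL. The sketch below is illustrative only: the function name oauth2_callback_url and the example host are assumptions, and the HTTPS check simply mirrors the requirement stated above.

```python
def oauth2_callback_url(galaxy_url: str) -> str:
    """Build the redirect URL to register with the remote service.

    Hypothetical helper: appends ``oauth2_callback`` to the instance URL
    and refuses plain HTTP, since the callback needs HTTPS enabled.
    """
    base = galaxy_url.rstrip("/")
    if not base.startswith("https://"):
        raise ValueError("the callback must be served over HTTPS")
    return f"{base}/oauth2_callback"


# galaxy.example.org is a placeholder host, not a real instance.
print(oauth2_callback_url("https://galaxy.example.org/"))
```

If Galaxy is served under a URL prefix, the prefix is part of the instance URL and therefore part of the registered callback.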

Finally, you’ll be able to find the oauth2_client_id and oauth2_client_secret to configure your Galaxy instance with on this settings page.

Screenshot of Dropbox

Until you have 50 users, your App will be considered a “development” application. The upshot of this is that your users will get a scary message during authorization, and there seems to be no way around this. A Galaxy instance with 50 users would definitely be considered a production instance, but Dropbox operates on a different scale.

For more information on what Dropbox considers a “development” app versus a “production” app, check out the Dropbox documentation.

Playing Nicer with Ansible

Many large instances of Galaxy are configured with Ansible and much of the existing administrator documentation leverages Ansible. The configuration template files use Jinja templating, and so does Ansible by default. This can make it unclear when templates (strings starting with {{ and ending with }}) are being evaluated. Ansible templates are evaluated at deploy time, while the configuration objects describing plugins are evaluated at Galaxy runtime.

The easiest way to fix this is probably to have Ansible deploy these template files as plain files and not as templates. If you’d like to use Ansible templating to build up these files, you’ll very likely need to tell either Galaxy or Ansible to use something other than {{ and }} for templating variables. For Ansible, this can be done by placing a directive at the top of your template. For instance, to have [% and %] used instead of {{ and }} by Ansible at deploy time, the file could start with:

#jinja2:variable_start_string:'[%' , variable_end_string:'%]'

In this case, variables wrapped by [% and %] are expanded by Ansible and use the Ansible environment and {{ and }} are reserved for Galaxy templating.
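The two evaluation stages can be pictured with a small stand-in. The sketch below does not use Jinja or Ansible at all: it is a deliberately simplified regular-expression substitution, and the names expand, data_root, and alice are invented for illustration. It only shows why distinct delimiters let the deploy-time pass leave the runtime markers untouched.

```python
import re


def expand(text, start, end, values):
    """Replace ``start NAME end`` markers with ``values[NAME]``.

    Simplified stand-in for a template engine: no filters, no expressions.
    Markers that use other delimiters are left exactly as they are.
    """
    pattern = re.compile(re.escape(start) + r"\s*([\w.]+)\s*" + re.escape(end))
    return pattern.sub(lambda match: str(values[match.group(1)]), text)


source = "root: '[% data_root %]/{{ user.username }}'"

# Deploy time (the role Ansible plays): only [% ... %] is expanded.
deployed = expand(source, "[%", "%]", {"data_root": "/data/projects"})
print(deployed)  # root: '/data/projects/{{ user.username }}'

# Runtime (the role Galaxy plays): {{ ... }} is expanded per user.
runtime = expand(deployed, "{{", "}}", {"user.username": "alice"})
print(runtime)  # root: '/data/projects/alice'
```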

Alternatively, Galaxy can be configured to use custom template delimiters on a per-configuration object basis by setting the template_start and/or template_end variables.

The following template chunk shows how to override the templating Galaxy does for a particular object store configuration. Similar templating overrides work for file source plugin templates.

- id: project_disk
  version: 0
  name: Project Disk
  description: |
    Disk in our institutional ``/data`` directory for your user's project.
  configuration:
    type: posix
    root: '/data/projects/@= user.username | ensure_path_component =@/@= variables.project_name | ensure_path_component =@'
    template_start: '@='
    template_end: '=@'
  variables:
    project_name:
      type: path_component
      help: Project name used in path.

See also:

• https://github.com/ansible/ansible/pull/75306
• https://stackoverflow.com/questions/12083319/add-custom-tokens-in-jinja2-e-g-somevar

Jinja Template Reference

Galaxy configuration file templating uses Jinja to template values and connect inputs, configuration, and the runtime environment into concrete configuration YAML blocks.

Jinja is fairly straightforward to learn, but this document provides plenty of examples and you can probably adapt them to whatever you’re interested in building without really needing to dig deeply into Jinja. However, this section does outline what Galaxy injects into the Jinja environment to serve as a reference.

Even the most exotic configurations will likely only scratch the surface of what Jinja allows and implements. The only relevant Jinja documentation you’ll need in these cases is probably just those documents on variables, filters, and the list of builtin filters.

variables

This typed dictionary object is populated with user-supplied values defined via the variables section of the configuration template and filled in by the user when they create a new object store or file source.

secrets

This is a dictionary of strings populated with user-supplied secrets defined via the secrets section of the configuration template and filled in by the user when they create a new object store or file source.

A deep dive into these can be found in the User Secrets section of this document.

environment

This dictionary object is populated with admin-supplied values defined via the environment section of the configuration template.

A deep dive into these can be found in the Admin Secrets section of this document.

- id: admin_secret_directory
  version: 0
  name: Secret Directory with Defaults
  description: A directory constructed from admin secrets or defaults.
  configuration:
    type: posix
    root: /path/to/data/{{ environment.var }}/{{ environment.sec }}
  environment:
    var:
      type: variable
      variable: GALAXY_SECRET_HOME_VAR
      default: default_var
    sec:
      type: secret
      vault_key: "secret_directory_file_source/my_secret"
      default: default_sec

user

This dictionary object exposes information about the user configuring and using a target template configuration. These values are populated from the galaxy_user table of the Galaxy database. The properties currently exposed are:

Key        Description
username   string corresponding to the username of the Galaxy user
email      string corresponding to the email of the Galaxy user
id         integer primary key of the user object in the Galaxy database

The simple example of project scratch storage used to describe these concepts made use of the Galaxy user’s username to generate unique paths.

- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
    - type: faster
    - type: less_secure
    - type: not_backed_up

ensure_path_component

This Jinja filter will fail template evaluation if the value it is applied to is not a simple directory name, for example if it contains .. or / or might in some other way be used to attempt path exploitation or cause odd path-related bugs. This is useful when producing paths for disk object stores or posix file sources.

When taking inputs from users, setting the type of path_component instead of string allows the client to validate potential issues way before this point, but many path components might be built from environment variables or usernames or sources like this that are not explicitly user inputs.

An example of an object store template that uses this is the simple scratch example that was used to introduce concepts at the start of the object store template documentation above.

- id: project_scratch
  name: Project Scratch
  version: 0
  description: Folder on institutional scratch disk area bound to your user.
  variables:
    project_name:
      type: path_component
      help: The name of your project scratch.
  configuration:
    type: disk
    files_dir: '/scratch/for_galaxy/{{ user.username | ensure_path_component }}/{{ variables.project_name | ensure_path_component }}'
    badges:
    - type: faster
    - type: less_secure
    - type: not_backed_up
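To make the intent of ensure_path_component concrete, here is a minimal sketch of the kind of check such a filter performs. This is not Galaxy’s implementation: it is an assumption-laden illustration that rejects only the cases named above (.. and /) plus the empty string, which is an extra assumption added here.

```python
def ensure_path_component(value: str) -> str:
    """Return ``value`` unchanged if it looks like a single directory name.

    Illustrative stand-in only; the real filter may apply different rules.
    Raising an exception is what makes template evaluation fail.
    """
    if value == "" or ".." in value or "/" in value:
        raise ValueError(f"not a simple path component: {value!r}")
    return value


# A plain name passes through untouched.
print(ensure_path_component("my_project"))

# Traversal attempts are rejected before a path is ever built.
for suspicious in ("../etc", "a/b", ".."):
    try:
        ensure_path_component(suspicious)
    except ValueError as error:
        print(f"rejected: {error}")
```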

asbool

This Jinja filter will use Galaxy configuration style logic to convert string values into boolean ones.

When taking inputs from users, setting the type of boolean is sufficient to ensure a variable is boolean, but “secrets” and environment variables and many other things are likely to be of type string but should be used in a template that expects boolean values.

An example of an object store template that uses this is the secure environment parameter in the simple MinIO example.

- id: minio
  version: 0
  name: Institutional S3 Storage
  description: Connect to our institutional MinIO storage service.
  variables:
    access_key:
      type: string
      help: A description of the user account used to connect to your storage.
    bucket:
      type: string
      help: The bucket to connect to.
  secrets:
    secret_key:
      help: The secret key used to connect to MinIO with for the given access key.
  environment:
    host:
      type: variable
      variable: GALAXY_MINIO_HOST
      default: localhost
    port:
      type: variable
      variable: GALAXY_MINIO_PORT
      default: "9000"
    secure:
      type: variable
      variable: GALAXY_MINIO_IS_SECURE
      default: "true"
    connection_path:
      type: variable
      variable: GALAXY_MINIO_CONNECTION_PATH
      default: ""
  configuration:
    type: generic_s3
    auth:
      access_key: '{{ variables.access_key }}'
      secret_key: '{{ secrets.secret_key }}'
    bucket:
      name: '{{ variables.bucket }}'
      use_reduced_redundancy: false
    connection:
      host: '{{ environment.host }}'
      port: '{{ environment.port | int  }}'
      is_secure: '{{ environment.secure | asbool }}'
      conn_path: '{{ environment.connection_path }}'
    badges:
    - type: slower
    - type: less_secure
    - type: less_stable
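The conversion asbool performs can be sketched in a few lines. The accepted spellings below are an assumption made for illustration and may not match Galaxy’s exact list; the point is that string values such as "true" from an environment variable become real booleans, and unrecognized strings fail loudly instead of silently counting as true.

```python
# Assumed spellings; Galaxy's own list may differ.
TRUTHY = {"true", "yes", "on", "y", "t", "1"}
FALSY = {"false", "no", "off", "n", "f", "0"}


def asbool(value):
    """Convert configuration-style strings to booleans (illustrative sketch)."""
    if isinstance(value, str):
        normalized = value.strip().lower()
        if normalized in TRUTHY:
            return True
        if normalized in FALSY:
            return False
        raise ValueError(f"string is not a recognized boolean: {value!r}")
    return bool(value)


print(asbool("true"))   # True
print(asbool("False"))  # False
```

Without such a filter, the non-empty string "false" would be truthy in Python, which is exactly the mistake the filter guards against.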

Connecting Configuration Templates to Secrets

User Secrets

Most of the examples in this document use secrets of one kind or another. For instance, in the FTP example - the password field is a secret.

- id: ftp
  version: 0
  name: An FTP Server
  description: |
    This template allows connecting to FTP servers. This file source plugin should
    support FTP and FTPS servers.
  configuration:
    type: ftp
    host: "{{ variables.host }}"
    user: "{{ variables.user }}"
    port: "{{ variables.port }}"
    passwd: "{{ secrets.password }}"
    writable: "{{ variables.writable }}"
  variables:
    host:
      label: FTP Host
      type: string
      help: Host of FTP Server to connect to.
    user:
      label: FTP User
      type: string
      help: |
        Username to connect with. Leave this blank to connect to the server
        anonymously (if allowed by target server).
    writable:
      label: Writable?
      type: boolean
      help: Is this an FTP server you have permission to write to?
    port:
      label: FTP Port
      type: integer
      help: Port used to connect to the FTP server.
      default: 21
  secrets:
    password:
      label: FTP Password
      help: |
        Password to connect to FTP server with. Leave this blank to connect
        to the server anonymously (if allowed by target server).

Instead of being saved in the database in plain text, Galaxy will use a configured Vault to store this data. Check out Galaxy admin documentation on Storing secrets in the vault for descriptions of how to configure a vault. Most interesting user defined file sources and/or object stores will require a Galaxy Vault.

In this FTP example, a new Vault key will be created for each FTP instance the user creates. The user file source APIs and management user interface will be responsible for orchestration of storing and updating secrets. The Vault key for this password will be something like:

/galaxy/user/<user_id>/file_source_config/<file_source_instance_uuid>/password

Here user_id is the primary key of the User object in the database and file_source_instance_uuid is the uuid value of the corresponding row in the user_file_source table in the database.

User defined object stores are stored in a similar fashion but at:

/galaxy/user/<user_id>/object_store_config/<object_store_instance_uuid>/<secret_name>
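The two key layouts above differ only in their middle component, which the following sketch makes explicit. The function name user_secret_vault_key and the sample identifiers are hypothetical; Galaxy builds these keys internally and this is only a way to reason about where a given secret ends up.

```python
def user_secret_vault_key(user_id, kind, instance_uuid, secret_name):
    """Compose the Vault key for a user-defined secret (illustrative only).

    ``kind`` is the middle path component from the layouts above.
    """
    if kind not in ("file_source_config", "object_store_config"):
        raise ValueError(f"unknown configuration kind: {kind!r}")
    return f"/galaxy/user/{user_id}/{kind}/{instance_uuid}/{secret_name}"


# Sample values are made up for the example.
example_uuid = "11111111-2222-3333-4444-555555555555"
print(user_secret_vault_key(42, "file_source_config", example_uuid, "password"))
print(user_secret_vault_key(42, "object_store_config", example_uuid, "secret_key"))
```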

During the creation of an object store or file source, the secrets will be appended to the generated form as password fields.

After an object store has been created, a user has the option to edit the settings in the UI. Most of the settings appear in a simple form - but the secrets are managed and updated individually in the “Secrets” tab.

Admin Secrets

Administrators may define secrets that are available to all users and aren’t parameterized on a per-instance basis. These secrets can be injected into template instances through Vault keys or through environment variables.

Each template may optionally define an environment key where these can be defined. The following template entry describes a file source that injects the environment variable GALAXY_SECRET_HOME_VAR into the template as environment.var and injects the Vault key secret_directory_file_source/my_secret into the template as environment.sec. This template uses these variables to construct a root path for a posix file source, but the same secrets could just as easily store cloud keys and configure an S3 object store.

- id: admin_secret_directory
  version: 0
  name: Secret Directory
  description: A directory constructed from admin secrets.
  configuration:
    type: posix
    root: /path/to/data/{{ environment.var }}/{{ environment.sec }}
  environment:
    var:
      type: variable
      variable: GALAXY_SECRET_HOME_VAR
    sec:
      type: secret
      vault_key: "secret_directory_file_source/my_secret"

If you’d like to make the target secrets optional, default values can also be set up. The following block demonstrates the same configuration but with default values of default_var for var and default_sec for sec. These will be used if the target Vault keys are absent or the target environment variable is not defined at runtime.

- id: admin_secret_directory
  version: 0
  name: Secret Directory with Defaults
  description: A directory constructed from admin secrets or defaults.
  configuration:
    type: posix
    root: /path/to/data/{{ environment.var }}/{{ environment.sec }}
  environment:
    var:
      type: variable
      variable: GALAXY_SECRET_HOME_VAR
      default: default_var
    sec:
      type: secret
      vault_key: "secret_directory_file_source/my_secret"
      default: default_sec
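The lookup order described here (the real value first, then the default) can be sketched as follows. This is a simplified model, not Galaxy’s code: the Vault is represented by a plain dictionary, the function name resolve_environment is hypothetical, and raising KeyError when neither a value nor a default exists is an assumption about the failure mode.

```python
# Mirrors the environment section of the template above.
ENTRIES = {
    "var": {
        "type": "variable",
        "variable": "GALAXY_SECRET_HOME_VAR",
        "default": "default_var",
    },
    "sec": {
        "type": "secret",
        "vault_key": "secret_directory_file_source/my_secret",
        "default": "default_sec",
    },
}


def resolve_environment(entries, environ, vault):
    """Resolve each entry from the environment or Vault, else its default."""
    resolved = {}
    for name, entry in entries.items():
        if entry["type"] == "variable":
            value = environ.get(entry["variable"])
        elif entry["type"] == "secret":
            value = vault.get(entry["vault_key"])
        else:
            raise ValueError(f"unknown environment entry type: {entry['type']!r}")
        if value is None:
            if "default" not in entry:
                raise KeyError(f"no value and no default for {name!r}")
            value = entry["default"]
        resolved[name] = value
    return resolved


# Nothing is defined, so both defaults are used.
print(resolve_environment(ENTRIES, environ={}, vault={}))
```

The resulting dictionary plays the role of the environment object in the template, so the root above would expand to /path/to/data/default_var/default_sec when nothing is configured.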

OAuth 2.0 Enabled Configurations

OAuth 2.0 has become an industry standard for allowing users of various services (e.g. Dropbox or Google Drive) to grant other services (e.g. Galaxy) fine-grained access to those services. There is a bit of a dance the services need to do but the result can be a fairly nice end-user experience. The framework for configuring user defined data access templates can support OAuth 2.0.

Galaxy keeps track of which plugin types (currently only file source types) require OAuth2 to work properly and will implicitly take care of authorization redirection, saving refresh tokens, etc. One such type is dropbox. Here is the production Dropbox template distributed with Galaxy.

- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: "{{ environment.oauth2_client_id }}"
    oauth2_client_secret: "{{ environment.oauth2_client_secret }}"
    writable: true
  environment:
    oauth2_client_id:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_ID
    oauth2_client_secret:
      type: variable
      variable: GALAXY_DROPBOX_APP_CLIENT_SECRET

OAuth2 enabled plugin types include template definitions that include oauth2_client_id and oauth2_client_secret in the configuration (as shown in the following specification and in the above examples).

The above example defines these secrets using environment variables, but they can be stored in Galaxy’s Vault explicitly by the admin or written right to the configuration files as shown in the next two examples:

- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: "{{ environment.oauth2_client_id }}"
    oauth2_client_secret: "{{ environment.oauth2_client_secret }}"
    writable: true
  environment:
    oauth2_client_id:
      type: secret
      vault_key: "dropbox_file_source/client_id"
    oauth2_client_secret:
      type: secret
      vault_key: "dropbox_file_source/client_secret"

- id: dropbox
  name: Dropbox
  description: Connect to your Dropbox account to download and upload files.
  configuration:
    type: dropbox
    oauth2_client_id: abcdefgh
    oauth2_client_secret: ijklmnopqr

Looking at the configuration objects that get generated at runtime from these templates, though, oauth2_client_id and oauth2_client_secret no longer appear and have instead been replaced with an oauth2_access_token parameter. Galaxy will take care of stripping out the client (e.g. Galaxy server) information and replacing it with short-term access tokens generated for the user’s resources.
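The effect of that substitution on a configuration can be pictured as a simple dictionary transformation. The sketch below is only a model of the outcome: the function name runtime_configuration and the token value are made up, and Galaxy’s real code path is not shown in this document.

```python
CLIENT_KEYS = ("oauth2_client_id", "oauth2_client_secret")


def runtime_configuration(template_configuration, access_token):
    """Drop the client credentials and add a short-term access token.

    Illustrative model of the outcome described above, not Galaxy's code.
    """
    runtime = {
        key: value
        for key, value in template_configuration.items()
        if key not in CLIENT_KEYS
    }
    runtime["oauth2_access_token"] = access_token
    return runtime


expanded_template = {
    "type": "dropbox",
    "oauth2_client_id": "abcdefgh",
    "oauth2_client_secret": "ijklmnopqr",
    "writable": True,
}
# The token string is a placeholder.
print(runtime_configuration(expanded_template, "short-lived-token"))
```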

Normally, a UUID is created for each user configured instance object and this is used to store the template’s explicitly listed secrets in Galaxy’s Vault. For OAuth 2.0 plugin types, before users are even prompted for configuration metadata, they are redirected to the remote service and prompted to authorize Galaxy to act on their behalf when using the remote service. If they authorize this, the remote service will send an authorization code to https://<galaxy_url>/oauth2_callback along with state information to recover which instance is being configured. At this point, Galaxy will fetch a refresh token from the remote resource using the supplied authorization code. The refresh token is stored in the Vault in a key associated with the UUID of the object that will be created when the user finishes the creation process. Specifically it is stored at

/galaxy/user/<user_id>/file_source_config/<file_source_instance_uuid>/_oauth2_refresh_token

Here the _ prefix on the final key component indicates that Galaxy is managing this secret itself, instead of it being listed explicitly in a secrets section of the template configuration like the explicit Vault secrets discussed in this document.

Galaxy knows how to fetch an access token from this refresh token that is actually used to interact with the remote resource. This is the property oauth2_access_token that is injected into the configuration object shown above and passed along to the actual object store or file source plugin implementation.
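The two token requests in this flow follow the standard OAuth 2.0 token endpoint grants (RFC 6749, sections 4.1.3 and 6). The sketch below only builds the form-encoded request bodies and sends nothing over the network. The function names are hypothetical, and passing the client credentials in the body rather than via HTTP Basic authentication is an assumption; providers differ, and Galaxy’s actual implementation is not shown here.

```python
from urllib.parse import urlencode


def authorization_code_body(code, redirect_uri, client_id, client_secret):
    """Form body that exchanges an authorization code for tokens."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    })


def refresh_token_body(refresh_token, client_id, client_secret):
    """Form body that trades a stored refresh token for a new access token."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })


# All values below are placeholders.
print(authorization_code_body(
    "code-from-callback",
    "https://galaxy.example.org/oauth2_callback",
    "abcdefgh",
    "ijklmnopqr",
))
print(refresh_token_body("stored-refresh-token", "abcdefgh", "ijklmnopqr"))
```

The first request happens once, when the callback delivers the authorization code; the second is what allows a fresh oauth2_access_token to be produced later from the refresh token kept in the Vault.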