Galaxy Interactive Environments =============================== A GIE is a Docker container, launched by Galaxy, proxied by Galaxy, with some extra sugar inside the container to allow users to interact easily with their Galaxy histories. How GIEs Work ------------- A GIE is primarily composed of a Docker container, and the Galaxy visualization component. Galaxy visualization plugins are rendered using Mako templates and Mako templates in turn can run Python code. GIEs build upon visualization plugins, adding features to allow for container management and proxying. This Python code in the Mako templates is used to launch the Docker container within which a GIE runs. Once this container is launched, we notify a proxy built into Galaxy which helps coordinate a 1:1 mapping of users and their docker containers. Here's a simple diagram recapping the above: .. image:: interactive_environments.png Deploying GIEs -------------- Deploying GIEs is not a trivial operation. They have complex interactions with numerous services, you'll need to be a fairly competent SysAdmin to debug all of the possible problems that can occur during deployment. After the initial hurdle, most find that GIEs require little to no maintenance. An `Ansible `__ role for installing and managing GIEs can be found on `Github `__ and `Ansible Galaxy `__. Setting up the Proxy ^^^^^^^^^^^^^^^^^^^^ The Galaxy IE Proxy is a NodeJS+Sqlite3 application. The NodeJS that is installed by default into the Galaxy Virtualenv is suitable for an execution environment for the Galaxy IE Proxy. - Note that if you have NodeJS installed under Ubuntu, it often installs to ``/usr/bin/nodejs``, whereas ``npm`` expects it to be ``/usr/bin/node``. You may need to create that symlink yourself. Once Node and npm are ready to go, you'll need to install the dependencies .. code-block:: console $ cd $GALAXY_ROOT/lib/galaxy/web/proxy/js $ npm install Running ``node lib/main.js --help`` should produce some useful help text .. code-block:: console Usage: main [options] Options: -h, --help output usage information -V, --version output the version number --ip Public-facing IP of the proxy --port Public-facing port of the proxy --cookie Cookie proving authentication --sessions Routes file to monitor --verbose There are two ways to handle actually running the proxy. The first is to have Galaxy automatically launches the proxy as needed. This is the default configuration as of 2014. Alternately, the proxy can be stated manually or via a system such as Supervisord. Assuming that the ``$GALAXY_ROOT`` environment variable refers to the location of the Galaxy installation, the command for launching the proxy is: .. code-block:: console $ node $GALAXY_ROOT/lib/galaxy/web/proxy/js/lib/main.js --ip 0.0.0.0 \ --port 8800 --sessions $GALAXY_ROOT/database/session_map.sqlite \ --cookie galaxysession --verbose And this can be configured in your supervisord config by adding: .. code-block:: console [program:galaxy_nodejs_proxy] directory = GALAXY_ROOT command = GALAXY_ROOT/lib/galaxy/web/proxy/js/lib/main.js --sessions database/session_map.sqlite --ip 0.0.0.0 --port 8800 autostart = true autorestart = unexpected user = GALAXY_USER startsecs = 5 redirect_stderr = true where ``GALAXY_ROOT`` is the location of your Galaxy installation and ``GALAXY_USER`` is the username of the user that Galaxy runs as. Configuring the Proxy ^^^^^^^^^^^^^^^^^^^^^ Configuration is all managed in ``galaxy.yml``. The default arguments used for the proxy are: .. code-block:: yaml dynamic_proxy_manage: true dynamic_proxy_session_map: database/session_map.sqlite dynamic_proxy_bind_port: 8800 dynamic_proxy_bind_ip: 0.0.0.0 dynamic_proxy_debug: true As you can see most of these variables map directly to the command line arguments to the NodeJS script. There are a few extra parameters which will be needed if you run Galaxy behind an upstream proxy like nginx or Apache: .. code-block:: yaml dynamic_proxy_external_proxy: true dynamic_proxy_prefix: gie_proxy The first option says that you have Galaxy and the Galaxy NodeJS proxy wrapped in an upstream proxy like Apache or NGINX. This will cause Galaxy to connect users to the same port as Galaxy is being served on (so 80/443), rather than directing them to port 8800. The second option is closely entertwined with the first option. When Galaxy is accessed, it sets a cookie called ``galaxysession``. This cookie generally cannot be sent with requests to different domains and different ports, so Galaxy and the dynamic proxy must be accessible on the same port and protocol. In addition, the cookie is only accessible to URLs that share the same prefix as the Galaxy URL. For example, if you're running Galaxy under a URL like ``https://f.q.d.n/galaxy/``, the cookie is only accessible to URLs that look like ``https://f.q.d.n/galaxy/*``. The second (``dynamic_proxy_prefix``) option sets the URL path that's used to differentiate requests that should go through the proxy to those that should go to Galaxy. You will need to add special upstream proxy configuration to handle this, and you'll need to use the same ``dynamic_proxy_prefix`` in your ``galaxy.yml`` that you use in your URL routes. In the examples below, we assume that your Galaxy installation is available at a URL such as ``https://f.q.d.n/galaxy``. If instead it is available at a URL like ``https://f.q.d.n``, remove the ``/galaxy`` prefix from the examples. For example ``/galaxy/gie_proxy`` would become ``/gie_proxy``. Remember that ``gie_proxy`` is the value you use for the ``dynamic_proxy_prefix`` option. If you use a different value in that option you should change the examples accordingly. **Apache** .. code-block:: apache # Project Jupyter specific. Other IEs may require their own routes. ProxyPass /galaxy/gie_proxy/jupyter/ipython/api/kernels ws://localhost:8800/galaxy/gie_proxy/jupyter/ipython/api/kernels # Global GIE configuration ProxyPass /galaxy/gie_proxy http://localhost:8800/galaxy/gie_proxy ProxyPassReverse /galaxy/gie_proxy http://localhost:8800/galaxy/gie_proxy # Normal Galaxy configuration ProxyPass /galaxy http://localhost:8000/galaxy ProxyPassReverse /galaxy http://localhost:8000/galaxy Please note you will need to be using apache2.4 with ``mod_proxy_wstunnel``. **Nginx** .. code-block:: nginx # Global GIE configuration location /galaxy/gie_proxy { proxy_pass http://localhost:8800/galaxy/gie_proxy; proxy_redirect off; } # Project Jupyter specific. Other IEs may require their own routes. location ~ ^/galaxy/gie_proxy/jupyter/(?[^/]+)/api/kernels(?.*?)$ { proxy_pass http://localhost:8800/galaxy/gie_proxy/jupyter/$nbtype/api/kernels$rest; proxy_redirect off; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade"; } If you proxy static content, you may find the following rule useful for proxying to GIE and other visualization plugin static content. .. code-block:: nginx location ~ ^/static/plugins/(?.+?)/(?.+?)/static/(?.*?)$ { alias /path/to/galaxy-dist/config/plugins/$plug_type/$vis_name/static/$static_file; } Docker on Another Host ^^^^^^^^^^^^^^^^^^^^^^ You might want to run your IEs on a host different to the one that hosts your Galaxy webserver, since IEs on the same host as the webserver compete for resources with that webserver and introduce some security considerations which could be mitigated by moving containers to a separate host. This feature has been available since 15.07 and is used in production at the University of Freiburg and on usegalaxy.org. First you need to configure a second host to be Docker enabled. In the following we call this host ``gx-docker`` You need to start the Docker daemon and bind it to a TCP port, not to a Unix socket as is the default. For example you can start the daemon with .. code-block:: console $ docker -H 0.0.0.0:4243 -d On your client, the Galaxy webserver, you can now install a Docker client. This can also be done on older systems like Scientific-Linux, CentOS 6, which do not have Docker support by default. The client just talks to the Docker daemon on host ``gx-docker``, and does not run anything itself, locally. You can test your configuration for example by starting busybox from your client on the Docker host with .. code-block:: console $ docker -H tcp://gx-docker:4243 run -it busybox sh So far so good! Note, however, that unless restricted by a firewall, this mode of operation is insecure, as any client could connect and run containers on ``gx-docker``. If this is a concern at your site, follow the instructions in the Docker documentation to `Protect the Docker daemon socket `__. Now we need to configure Galaxy to use our new Docker host to start the Interactive Environments. For that we need to edit the Jupyter GIE configuration, ``jupyter.ini`` to use our custom docker host .. code-block:: ini [main] [docker] command = docker -H tcp://gx-docker:4243 {docker_args} docker_hostname = gx-docker Please adapt your ``command`` as needed. The Jupyter GIE supports getting and fetching Galaxy history datasets entirely through the Galaxy API so it is not necessary to share a filesystem with ``gx-docker``. However, other GIE plugins may still require this. For those GIE plugins, we need to configure a share mount point between the Docker host and Galaxy. Unfortunately, this can not be a NFS mount. Docker does not like NFS yet. You could for example use a sshfs mount with the following script .. code-block:: bash if mount | grep ^gx-docker:/var/tmp/gx-docker; then echo "/var/tmp/gx-docker already mounted." else sshfs gx-docker:/var/tmp/gx-docker /var/tmp/gx-docker echo 'Mounting ...' fi This will let Galaxy and the Docker host share temporary files. Docker Engine Swarm Mode ^^^^^^^^^^^^^^^^^^^^^^^^ As of Docker Engine version 1.12, Docker Engine can be configured to provide a cluster of Docker Engines in a configuration known as *Docker Engine swarm mode*. This replaces the previous and similarly named *Docker Swarm* clustering solution, which is not compatible with swarm mode. `The Docker Engine swarm mode documentation `__ fully explains the differences, but the major difference is that whereas under Docker Swarm one could run commands on the swarm with ``docker run``, Docker Engine swarm mode requires one to create persistent services with ``docker service create`` and to remove those services once no longer in use with ``docker service rm``. Galaxy supports both Docker Engine swarm mode and the legacy Docker Swarm system. Legacy Docker Swarm is supported without any special configuration, because the containers are still run with ``docker run`` as before. To support Docker Engine swarm mode, additional configuration is required. Begin by editing your GIE plugin's ini configuration file (e.g. ``jupyter.ini``) and set the ``docker_connect_port`` in addition to any other relevant options. Unless you are using a non-standard Docker image, the correct value for ``docker_connect_port`` should be suggested to you in the sample configuration file: .. code-block:: ini [docker] docker_connect_port = 8888 Note that your Galaxy server does not need to be a member of the swarm itself. It can use the method outlined above in the `Docker on Another Host`_ section to connect as a client to a Docker daemon acting as a swarm mode manager. Once configured, you should see that your GIE containers are started and run as services, which you can inspect using the ``docker service ls`` command and other ``docker service`` subcommands.