How SQLAlchemy sessions are handled in Galaxy

Access to the Session object registry

We have a central resource from which to grab the Session object registry (see sessionmaker in the SQLAlchemy documentation): app.model.session. You can use it directly as a proxy to the current session, call app.model.session() to get the proxied Session object, or use it as a context manager (e.g. with app.model.session() as session:), as detailed in the SQLAlchemy documentation.
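The three access patterns can be sketched with plain SQLAlchemy. This is a minimal illustration, not Galaxy code: `registry` is a hypothetical stand-in for app.model.session, the in-memory SQLite engine is an assumption made only so the example runs on its own, and SQLAlchemy 1.4 or later is assumed.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import scoped_session, sessionmaker

# Hypothetical stand-in for app.model.session (assumption: in-memory SQLite).
engine = create_engine("sqlite:///:memory:")
registry = scoped_session(sessionmaker(bind=engine))

# 1. Use the registry directly as a proxy to the current Session.
value_via_proxy = registry.execute(text("SELECT 1")).scalar()

# 2. Call the registry to get the proxied Session object itself.
session = registry()
value_via_call = session.execute(text("SELECT 1")).scalar()

# 3. Use the Session as a context manager; leaving the block closes it.
with registry() as ctx_session:
    value_via_ctx = ctx_session.execute(text("SELECT 1")).scalar()

# All three patterns reached the same Session, because they ran in one scope.
same_session = ctx_session is session

# Discard the Session held for the current scope.
registry.remove()
```

Note that the context manager in the third pattern closes the Session but does not remove it from the registry; that is a separate step.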

Session object registry scopes

The default scope of the Session object registry is thread-local. This means that accessing a Session through the Session object registry returns the same Session for as long as you stay in the same thread, while a different thread receives a different Session. This default scope can be changed if necessary.
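The thread-local behaviour is easy to observe with SQLAlchemy's scoped_session on its own. The following sketch uses a registry that is not bound to any database (an assumption for brevity; no queries are issued), and the names `registry` and `worker` are illustrative only.

```python
import threading

from sqlalchemy.orm import scoped_session, sessionmaker

# Thread-local registry; no engine is bound because no SQL is executed here.
registry = scoped_session(sessionmaker())

# Two lookups in the main thread return the very same Session object.
main_first = registry()
main_second = registry()

# A lookup in another thread returns a different Session object.
sessions_from_worker = []


def worker():
    sessions_from_worker.append(registry())


thread = threading.Thread(target=worker)
thread.start()
thread.join()

same_in_thread = main_first is main_second
different_across_threads = sessions_from_worker[0] is not main_first
```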

WSGI, ASGI, Celery

In the context of the web app (WSGI and ASGI) there is a clearly delineated lifespan for a session, and that lifespan isn’t tied to a thread. The scope of the Session object registry is set at the beginning of a web request using app.model.set_request_id(request_id). At the end of a web request we remove the scope using app.model.unset_request_id(request_id). This also closes and removes any active Session object. Similarly, we set the scope for each Celery task execution using set_request_id and unset_request_id.

By using this mechanism, we are certain that web requests and Celery tasks each receive a new Session object and that any resources held by the session are released.

Job and Workflow handlers

Job and Workflow handlers perform multiple pieces of business logic in separate threads, and care needs to be taken when deciding when to start a transaction, when to roll back, and whether and where a session lifespan can be introduced.

The following paragraph discusses the choices made in https://github.com/galaxyproject/galaxy/blob/064360dab053426ce166bb7274d6a3a5790eb7cc/lib/galaxy/jobs/runners/__init__.py#L835-L843:

The JobHandlerQueue.monitor method is executed as a thread within the job handler process, and all work that requires access to the database happens inside check_watched_items, which iterates over the jobs assigned to this handler process. By setting the scope to a new, random UUID around check_watched_items, we ensure that even if the session becomes corrupt, it is properly discarded and the next iteration of check_watched_items receives a new, clean session to work with. Because the work within JobHandlerQueue.monitor happens within a single thread, it is not technically necessary to set a custom scope, and one could simply call app.model.session.close(); however, there is very little cost associated with setting a custom scope, and it makes it very clear where the session lifespan starts and ends.
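The pattern of wrapping each iteration in a fresh scope can be sketched without any database at all. Everything below is a toy under stated assumptions: `ToySession` and `ToyModel` are hypothetical stand-ins for a SQLAlchemy Session and for app.model, `run_iteration` is an illustrative name, and the linked Galaxy code should be consulted for the real implementation.

```python
import uuid


class ToySession:
    """Stand-in for a Session; it only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ToyModel:
    """Hypothetical stand-in for app.model: one session per scope id."""

    def __init__(self):
        self._scope = None
        self._sessions = {}

    def set_request_id(self, request_id):
        self._scope = request_id

    def unset_request_id(self, request_id):
        # Close and forget the session that belonged to this scope.
        session = self._sessions.pop(request_id, None)
        if session is not None:
            session.close()
        self._scope = None

    def session(self):
        if self._scope not in self._sessions:
            self._sessions[self._scope] = ToySession()
        return self._sessions[self._scope]


def run_iteration(model, check_watched_items):
    """Run one monitor iteration inside its own randomly named scope."""
    scope_id = uuid.uuid4().hex
    model.set_request_id(scope_id)
    try:
        check_watched_items(model.session())
    except Exception:
        # A real handler would log the failure; the toy just moves on.
        pass
    finally:
        # The session is discarded whether or not the iteration failed.
        model.unset_request_id(scope_id)


model = ToyModel()
seen_sessions = []


def check_watched_items(session):
    seen_sessions.append(session)
    if len(seen_sessions) == 1:
        raise RuntimeError("simulated corrupt session")


run_iteration(model, check_watched_items)  # fails, session is discarded
run_iteration(model, check_watched_items)  # gets a new, clean session
```

The start and end of the session lifespan are visible in one place, which is the readability benefit the paragraph above describes.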

As a guiding principle, session state and database-related exception handling should be managed as high up in the call stack as possible.
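A small sketch of this principle, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database: the low-level helper only issues statements, while the top-level function owns the commit, the rollback, and the session lifespan. The `job` table and the names `add_job`, `process_batch`, and `count_jobs` are hypothetical and exist only for this example.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import Session

# Assumption: a throwaway in-memory database with a single toy table.
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(
        text("CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT NOT NULL)")
    )


def add_job(session, state):
    # Low-level helper: no commit, no rollback, no exception handling.
    session.execute(
        text("INSERT INTO job (state) VALUES (:state)"), {"state": state}
    )


def process_batch(states):
    """Top-level entry point: owns the transaction and the error handling."""
    with Session(engine) as session:
        try:
            for state in states:
                add_job(session, state)
            session.commit()
            return True
        except SQLAlchemyError:
            # One place decides what a database failure means for the batch.
            session.rollback()
            return False


def count_jobs():
    with Session(engine) as session:
        return session.execute(text("SELECT COUNT(*) FROM job")).scalar()


first_ok = process_batch(["new", "queued"])
# The None value violates NOT NULL, so the whole second batch is rolled back.
second_ok = process_batch(["running", None])
total_jobs = count_jobs()
```

Because the helper never commits on its own, the failed batch leaves no partial rows behind.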