quetzal.app package

quetzal.app.create_app(config_name=None)

Submodules

quetzal.app.background module

Background tasks

quetzal.app.background.backup_logs(app)
quetzal.app.background.hello()

quetzal.app.hacks module

Hacks needed to circumvent connexion validation

There is a bug in the connexion library concerning content negotiation and response validation. See https://github.com/zalando/connexion/issues/860

Until this issue is fixed, we need a way to avoid a false validation error when a request sends an ‘application/octet-stream’ Accept header to download a file.

class quetzal.app.hacks.CustomResponseValidator(operation, mimetype, validator=None)

Bases: connexion.decorators.response.ResponseValidator

validate_response_with_request(request, data, status_code, headers, url)

quetzal.app.models module

class quetzal.app.models.ApiKey(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Model

class quetzal.app.models.BaseMetadataKeys

Bases: enum.Enum

Set of metadata keys that exist in the base metadata family

The base metadata family is completely managed by Quetzal; a user cannot set or change its values (with the exception of the value for the path or filename keys). This enumeration defines the set of keys that exist in this family.

CHECKSUM = 'checksum'

MD5 checksum of the file

DATE = 'date'

Date when this file was created.

FILENAME = 'filename'

Filename, without its path component.

ID = 'id'

Unique file identifier.

PATH = 'path'

Path component of the filename.

SIZE = 'size'

Size in bytes of the file.

STATE = 'state'

State of the file; see FileState.

URL = 'url'

URL where this file is stored.
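As a minimal sketch, the keys above could be declared with the standard enum module as shown here. The member names and values are copied from the listing above; USER_EDITABLE and can_user_set are hypothetical helpers, not part of Quetzal, added only to illustrate the exception for the path and filename keys.

```python
from enum import Enum


class BaseMetadataKeys(Enum):
    """Keys of the base metadata family (values copied from the listing above)."""
    CHECKSUM = 'checksum'
    DATE = 'date'
    FILENAME = 'filename'
    ID = 'id'
    PATH = 'path'
    SIZE = 'size'
    STATE = 'state'
    URL = 'url'


# Hypothetical helper: the only base keys that a user may set or change.
USER_EDITABLE = {BaseMetadataKeys.PATH, BaseMetadataKeys.FILENAME}


def can_user_set(key: str) -> bool:
    """Return True when the given base metadata key may be changed by a user."""
    # BaseMetadataKeys(key) raises ValueError for a key outside the base family.
    return BaseMetadataKeys(key) in USER_EDITABLE
```

Looking a key up by value, as in BaseMetadataKeys('size'), also rejects names that do not belong to the base family.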

class quetzal.app.models.Family(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Model

Quetzal metadata family

In quetzal, metadata are organized in semantic groups that have a name and a version number. This is the definition of a metadata _family_. This class represents this definition. It is attached to a workspace, until the workspace is committed: at this point the family will be disassociated from the workspace to become global (available as public information).

id

Identifier and primary key of a family.

Type:int
name

Name of the family.

Type:str
version

Version of the family. Can be None during a workspace creation, and until its initialization, to express the latest available version.

Type:int
description

Human-readable description of the family and its contents, documentation, and any other useful comment.

Type:str
fk_workspace_id

Reference to the workspace that uses this family. When None, it means that this family and all its associated metadata is public.

Type:int

Extra attributes

metadata_set
All Metadata entries associated to this family.
increment()

Create a new family with the same name but next version number

The new family will be associated to the same workspace.
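The behavior of increment() can be illustrated with a small stand-in class. FamilySketch is hypothetical: it is a plain dataclass, not the SQLAlchemy model, and it assumes the version is already an integer (the case where version is None is not covered).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FamilySketch:
    """Hypothetical stand-in for the Family model."""
    name: str
    version: int
    fk_workspace_id: Optional[int] = None

    def increment(self) -> "FamilySketch":
        # Same name and same workspace, next version number.
        return replace(self, version=self.version + 1)
```

The original object is left untouched, so both versions of the family can coexist.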

class quetzal.app.models.FileState

Bases: enum.Enum

State of a Quetzal file

Quetzal files have a status, saved in their base metadata under the state key. It can only have the values defined in this enumeration.

DELETED = 'deleted'

File has been deleted.

Deleted files will have their metadata cleared when the workspace is committed.

If it was an already committed file, its contents will not be removed from the global data storage directory or bucket, but its metadata will be cleared. If it was a file that was not committed yet, it will be erased from its workspace data directory or bucket.

Deleted files are not considered in queries.

READY = 'ready'

File is ready

It has been uploaded, it can be downloaded, its metadata can be changed and when its workspace is committed, it will be moved to the global data storage directory or bucket.

TEMPORARY = 'temporary'

File is ready but temporary

Like READY, but this file will not be considered when the workspace is committed. That is, it will not be copied to the global data storage directory or bucket.
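The rules above can be summarized in code. This is a sketch only: the enumeration values are copied from this page, while is_queryable and is_copied_on_commit are hypothetical helper names that do not exist in Quetzal.

```python
from enum import Enum


class FileState(Enum):
    """File states (values copied from the listing above)."""
    DELETED = 'deleted'
    READY = 'ready'
    TEMPORARY = 'temporary'


def is_queryable(state: FileState) -> bool:
    """Deleted files are not considered in queries."""
    return state is not FileState.DELETED


def is_copied_on_commit(state: FileState) -> bool:
    """Only READY files are moved to the global data storage on commit."""
    return state is FileState.READY
```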

class quetzal.app.models.Metadata(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Model

Quetzal unstructured metadata

Quetzal defines metadata as a dictionary associated with a family. Families define the semantic organization and versioning of metadata, while this class gathers all the metadata key and values in a dictionary, represented as a JSON object.

id

Identifier and primary key of a metadata entry.

Type:int
id_file

Unique identifier of a file as a UUID number version 4. This identifier is also present and must be the same as the id entry in the json member.

Type:uuid.UUID
json

A json representation of metadata. Keys are metadata names and values are the related values. It may be a nested object if needed.

Type:dict

Extra attributes

family
The related Family associated to this metadata.
static get_latest(file_id, family)

Retrieve the latest metadata of a file under a particular family

static get_latest_global(file_id=None, family_name=None)

Retrieve the latest metadata of a file under a particular family

to_dict()

Return a dictionary representation of the metadata

Used to conform to the metadata details object on the OpenAPI specification.

Returns:Dictionary representation of this object.
Return type:dict
update(json)

Update the underlying json metadata with the values of a new one

This function takes the current json saved in this metadata object and updates it (like dict.update) with the new values found in the json input parameter. This does not remove any key; it adds new keys or changes any existing one.

Since SQLAlchemy does not detect changes on a JSONB column unless a new object is assigned to it, this function creates a new dictionary and replaces the previous one.

Changes still need to be committed through a DB session object.

Parameters:json (dict) – A new metadata object that will update over the existing one
Returns:
Return type:self
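The replace-instead-of-mutate approach described for update() can be sketched without a database. MetadataSketch is a hypothetical plain-Python stand-in for the model; it only shows that a new dictionary is assigned rather than the old one being modified in place.

```python
class MetadataSketch:
    """Hypothetical stand-in for the json member of the Metadata model."""

    def __init__(self, json):
        self.json = dict(json)

    def update(self, json):
        # Build a new dictionary so that change tracking based on
        # attribute assignment sees a different object.
        merged = dict(self.json)
        merged.update(json)  # adds new keys, overwrites existing ones
        self.json = merged
        return self
```

Because update() returns self, calls can be chained, and the previous dictionary is never altered.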
class quetzal.app.models.MetadataQuery(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Model

Query for metadata on Quetzal

Queries on Quetzal are temporarily saved as objects. This was initially conceived as a mechanism for easier and faster pagination, to avoid verifying that a query is valid every time, and possibly to compile these queries if needed.

id

Identifier and primary key of a query.

Type:int
dialect

Dialect used on this query.

Type:QueryDialect
code

String representation of the query. May change in the future.

Type:str
fk_workspace_id

Reference to the Workspace where this query is applied. If None, the query is applied on the global, committed metadata.

Type:int
fk_user_id

Reference to the User who created this query.

Type:int
static get_or_404(qid)

Get a query by id or raise an APIException

static get_or_create(dialect, code, workspace, owner)

Retrieve a query by its fields or create a new one

to_dict(results=None)

Create a dict representation of the query and its results

Used to conform to the OpenAPI specification of the paginable query results

Parameters:results (dict) – Results as a paginable object.
Returns:Dictionary representation of this object.
Return type:dict
class quetzal.app.models.QueryDialect

Bases: enum.Enum

Query dialects supported by Quetzal

class quetzal.app.models.Role(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Model

Authorization management role on Quetzal

Quetzal operations are protected by an authorization system based on roles. A user may have one to many roles; a role defines what operations the associated users can do.

Note that the n to n relationship of roles and users is implemented through the roles_users_table.

id

Identifier and primary key of a role.

Type:int
name

Unique name of the role.

Type:str
description

Human-readable description of the role.

Type:str

Extra attributes

users
Set of users associated with this role. This attribute is defined through a backref in User.
class quetzal.app.models.User(**kwargs)

Bases: flask_login.mixins.UserMixin, sqlalchemy.ext.declarative.api.Model

Quetzal user

Almost all operations on Quetzal can only be done with an authenticated user. This model defines the internal information that Quetzal needs for bookkeeping its users, permissions, emails, etc.

id

Identifier and primary key of a user.

Type:int
username

Unique string identifier of a user (e.g. admin, alice, bob).

Type:str
email

Unique e-mail address of a user.

Type:str
password_hash

Internal representation of the user password with salt.

Type:str
token

Unique, temporary authorization token.

Type:str
token_expiration

Expiration date of the authorization token.

Type:datetime
active

Whether this user is active (and consequently can perform operations) or not.

Type:bool

Extra attributes

roles
Set of Roles associated with this user.
workspaces
Set of Workspaces owned by this user.
queries
Set of Queries created by this user.
check_password(password)

Check if a password is correct.

Parameters:password (str) – The password to verify against the hash-salted stored password.
Returns:True when the provided password matches the hash-salted stored one.
Return type:bool
static check_token(token)

Retrieve a user by token

No user will be returned when the token is expired or does not exist.

Parameters:token (str) – Authorization token.
Returns:user – User with the provided token, or None when the token was not found or has expired.
Return type:User
get_token(expires_in=3600)

Create or retrieve an authorization token

When a user already has an authorization token, it returns it.

If there is no authorization token or the existing authorization token for this user is expired, this function will create a new one as a random string.

The changes on this instance are not propagated to the database (this must be done by the caller), but this instance is added to the current database session.

Parameters:expires_in (int) – Expiration time, in seconds from the current date, used when creating a new token.
Returns:The authorization token
Return type:str
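A minimal sketch of the token logic described above, using only the standard library. UserSketch is hypothetical and keeps its state in memory; the token length, the use of the secrets module and the UTC timestamps are assumptions, and no database session is involved.

```python
import secrets
from datetime import datetime, timedelta, timezone


class UserSketch:
    """Hypothetical stand-in for the token fields of the User model."""

    def __init__(self):
        self.token = None
        self.token_expiration = None

    def get_token(self, expires_in=3600):
        now = datetime.now(timezone.utc)
        # Reuse the existing token while it has not expired.
        if self.token and self.token_expiration and self.token_expiration > now:
            return self.token
        # Otherwise create a new random token valid for expires_in seconds.
        self.token = secrets.token_urlsafe(32)
        self.token_expiration = now + timedelta(seconds=expires_in)
        return self.token
```

Calling get_token() twice in a row returns the same string; a new token is generated only once the stored expiration date has passed.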
is_active

Property accessor for active.

Needed to conform to the flask_login.UserMixin interface.

revoke_token()

Revoke the authorization token

The changes on this instance are not propagated to the database (this must be done by the caller), but this instance is added to the current database session.

set_password(password)

Change the password of this user.

This function sets and stores the new password as a salt-hashed string.

The changes on this instance are not propagated to the database (this must be done by the caller), but this instance is added to the current database session.

Parameters:password (str) – The new password.
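The idea behind a salt-hashed password, as used by set_password() and check_password(), can be sketched with the standard library. This is an illustration under stated assumptions, not the scheme that Quetzal uses: the PBKDF2 algorithm, the iteration count, the storage format and the function names hash_password and verify_password are all hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 100_000) -> str:
    """Return an 'iterations$salt$digest' string for the given password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    return f'{iterations}${salt.hex()}${digest.hex()}'


def verify_password(password_hash: str, password: str) -> bool:
    """Return True when password matches the stored salt-hashed string."""
    iterations, salt_hex, digest_hex = password_hash.split('$')
    candidate = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison to avoid leaking information through timing.
    return hmac.compare_digest(candidate.hex(), digest_hex)
```

Because the salt is random, hashing the same password twice yields two different strings, yet both verify correctly.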
class quetzal.app.models.Workspace(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Model

Quetzal workspace

In Quetzal, all operations on files and metadata are sandboxed in workspaces. Workspaces define the exact metadata families and versions, which in turn provide a snapshot of what files and metadata are available. This is the basis of the reproducibility of datasets in Quetzal and of the traceability of data changes.

Workspaces also provide a storage directory or bucket where the user can upload new and temporary data files.

id

Identifier and primary key of a workspace.

Type:int
name

Short name for a workspace. Unique together with the owner’s username.

Type:str
_state

State of the workspace. Do not use directly, use its property accessors.

Type:WorkspaceState
description

Human-readable description of the workspace, its purpose, and any other useful comment.

Type:str
creation_date

Date when the workspace was created.

Type:datetime
temporary

When True, Quetzal will know that this workspace is intended for temporary operations and may be deleted automatically when not used for a while. When False, only its owner may delete it.

Type:bool
data_url

URL to the data directory or bucket where new files associated to this workspace will be saved.

Type:str
pg_schema_name

Used when creating structured views of the metadata, this schema name is the PostgreSQL schema where temporary tables exist with a copy of the unstructured metadata.

Type:str
fk_user_id

Owner of this workspace as a foreign key to a User.

Type:int
fk_last_metadata_id

Reference to the most recent Metadata object that had been committed at the time this workspace was created. This reference determines which global metadata entries are taken into account when determining the metadata in this workspace.

Type:int

Extra attributes

families
Set of Families (including their versions) used for this workspace.
queries
Set of Queries created on this workspace.
can_change_metadata

Returns True when metadata can be changed on the current workspace state

get_base_family()

Get the base family instance associated with this workspace

get_current_metadata()

Get the metadata that has been added or modified in this workspace

In contrast to get_previous_metadata(), this function only retrieves the metadata that has been changed on this workspace after its creation.

get_metadata()

Get a union of the previous and new metadata of this workspace

This function uses a combination of the results of get_previous_metadata() and get_current_metadata() to obtain the merged version of both. This represents the definitive metadata of each file, regardless of changes before or after the creation of this workspace.
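The merge performed by get_metadata() can be pictured with plain dictionaries. The sketch below assumes a hypothetical data shape (file id, then family name, then metadata dictionary) and assumes that an entry changed in the workspace replaces the global entry of the same file and family; the real function works on database queries and may resolve the merge differently.

```python
def merge_metadata(previous, current):
    """Merge previous (global) and current (workspace) metadata.

    Both arguments map file ids to {family name: metadata dict}.
    Entries in current take precedence over entries in previous.
    """
    # Copy one level deep so that the inputs are not modified.
    merged = {file_id: dict(families) for file_id, families in previous.items()}
    for file_id, families in current.items():
        merged.setdefault(file_id, {}).update(families)
    return merged
```

For example, a file whose base metadata was changed in the workspace keeps its untouched families from the global metadata and takes the changed family from the workspace.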

static get_or_404(wid)

Get a workspace by id or raise a quetzal.app.api.exceptions.ObjectNotFoundException

get_previous_metadata()

Get the global metadata of this workspace

The global metadata is the metadata that already has been committed, but it must also have a version value that is under the values declared for this workspace.

make_schema_name()

Generate a unique schema name for its internal structured metadata views

state

Property accessor for the workspace state

to_dict()

Return a dictionary representation of the workspace

This is used in particular to adhere to the OpenAPI specification of workspace details objects.

Returns:Dictionary representation of this object.
Return type:dict
class quetzal.app.models.WorkspaceState

Bases: enum.Enum

Status of a workspace.

Workspaces in Quetzal have a state that defines what operations can be performed on them. This addresses the need for long-running tasks that modify the workspace, such as initialization, committing, deleting, etc.

The transitions from one state to another are defined in this enumeration by the transitions() function. [Figure: diagram of the possible state transitions]

The verification of state transitions is implemented in the quetzal.app.models.Workspace.state property setter function.

COMMITTING = 'committing'

The workspace is committing its files and metadata.

The workspace will remain in this state until the committing routine finishes. No operation is possible until then.

CONFLICT = 'conflict'

The workspace detected a conflict during its commit routine.

The workspace will remain in this state until the administrator fixes this situation. No operation is possible.

DELETED = 'deleted'

The workspace has been deleted.

The instance of the workspace remains in the database for bookkeeping, but no operation is possible with it at this point.

DELETING = 'deleting'

The workspace is deleting its files and itself.

The workspace will remain in this state until the deleting routine finishes. No operation is possible.

INITIALIZING = 'initializing'

The workspace has just been created.

The workspace will remain in this state until the initialization routine finishes. No operation is possible until then.

INVALID = 'invalid'

The workspace has encountered an unexpected error.

The workspace will remain in this state until the administrator fixes this situation. No operation is possible.

READY = 'ready'

The workspace is ready.

The workspace can now be scanned, updated, committed or deleted. Files can be uploaded to it and their metadata can be changed.

SCANNING = 'scanning'

The workspace is updating its internal views.

The workspace will remain in this state until the scanning routine finishes. No operation is possible until then.

UPDATING = 'updating'

The workspace is updating its metadata version definition.

The workspace will remain in this state until the updating routine finishes. No operation is possible until then.
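The way a property setter can verify state transitions is sketched below. The enumeration values are copied from this page, but the ALLOWED table is an assumption pieced together from the state descriptions above (for instance, that a READY workspace can be scanned, updated, committed or deleted); the actual table is the one defined by transitions() and may differ. WorkspaceSketch is a hypothetical stand-in for the Workspace model.

```python
from enum import Enum


class WorkspaceState(Enum):
    """Workspace states (values copied from the listing above)."""
    COMMITTING = 'committing'
    CONFLICT = 'conflict'
    DELETED = 'deleted'
    DELETING = 'deleting'
    INITIALIZING = 'initializing'
    INVALID = 'invalid'
    READY = 'ready'
    SCANNING = 'scanning'
    UPDATING = 'updating'


S = WorkspaceState

# ASSUMED transition table, for illustration only.
ALLOWED = {
    S.INITIALIZING: {S.READY, S.INVALID},
    S.READY: {S.SCANNING, S.UPDATING, S.COMMITTING, S.DELETING},
    S.SCANNING: {S.READY, S.INVALID},
    S.UPDATING: {S.READY, S.INVALID},
    S.COMMITTING: {S.READY, S.CONFLICT, S.INVALID},
    S.DELETING: {S.DELETED, S.INVALID},
}


class WorkspaceSketch:
    """Hypothetical stand-in showing a validating state setter."""

    def __init__(self):
        self._state = S.INITIALIZING

    @property
    def state(self):
        return self._state

    @state.setter
    def state(self, new_state):
        # Reject any transition that the table does not list.
        if new_state not in ALLOWED.get(self._state, set()):
            raise ValueError(
                f'Invalid transition {self._state.value} -> {new_state.value}'
            )
        self._state = new_state
```

Keeping the check in the setter means that every assignment to state goes through the same validation, which is why the _state attribute should not be written directly.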

quetzal.app.models.roles_users_table = Table('roles_users', MetaData(bind=None), Column('fk_user_id', Integer(), ForeignKey('user.id'), table=<roles_users>), Column('fk_role_id', Integer(), ForeignKey('role.id'), table=<roles_users>), schema=None)

Auxiliary table associating users and roles

quetzal.app.routes module

quetzal.app.routes.favicon()
quetzal.app.routes.health()
quetzal.app.routes.index()

quetzal.app.security module

class quetzal.app.security.CommitWorkspacePermission(workspace_id)

Bases: flask_principal.Permission

class quetzal.app.security.ReadWorkspacePermission(workspace_id)

Bases: flask_principal.Permission

quetzal.app.security.WorkspaceNeed

alias of quetzal.app.security.workspace_need

class quetzal.app.security.WriteWorkspacePermission(workspace_id)

Bases: flask_principal.Permission

quetzal.app.security.load_identity(sender, identity)