Overview

This architecture overview is designed to give a visual guide to the structure of the new fraXses Springbok release.

The Platform Development team have gone to great lengths to ensure that fraXses is not locked into one platform provider whilst at the same time leveraging all the beneficial aspects of Kubernetes.

High Level Architecture

fraXses will run on any flavour of Kubernetes, but there are a few considerations that need to be addressed prior to installing the platform.

Storage

The fraXses platform attempts to leverage the native storage from each of the cloud providers (or CEPH if on prem). This allows the client to keep a view on the cost of storage as well as compute resources (in a managed service). The fraXses platform requires two types of storage, which in Kubernetes translate to persistent volumes with the following access modes (a sample claim for each mode is sketched after the list):

  1. ReadWriteOnce

    This is used by the majority of the containers in the cluster and normally maps to the standard block storage mechanism of the platform provider, for example AWS EBS (Elastic Block Store), Azure Disk or Google Persistent Disk. For on-prem installations we use CEPH RBD (RADOS Block Device).

  2. ReadWriteMany

    This is used by the parts of the fraXses platform that require shareable access to a file share, more akin to a shared network drive. This maps to AWS EFS (Elastic File System), and for on-prem installations CEPH provides CEPH-FS. For Azure and Google there is still an open question regarding the best practice for this type of storage. We do, however, have a backup in all cases, which is to supply Hadoop as the shared storage in the Kubernetes cluster.
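
The two access modes translate directly into PersistentVolumeClaim manifests. The sketch below is illustrative only: the claim names, storage class names (gp2, efs-sc) and sizes are assumptions and will vary with the provider and how the cluster has been provisioned.

    # ReadWriteOnce claim - block storage attached to a single node
    # (EBS / Azure Disk / Google Persistent Disk / CEPH RBD)
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: fraxses-data            # illustrative name
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: gp2         # provider-specific
      resources:
        requests:
          storage: 50Gi
    ---
    # ReadWriteMany claim - shared file storage mounted by many pods
    # (EFS / CEPH-FS, or Hadoop as the fallback)
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: fraxses-shared          # illustrative name
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: efs-sc      # provider-specific
      resources:
        requests:
          storage: 100Gi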

This is something that ALWAYS needs to be discussed with the client prior to installation.


Ingress and Certificates

How the various components of fraXses are accessed from outside the cluster is normally very dependent on the client. For on-prem installations we provide node ports to the components, but also ingresses via an nginx ingress controller, which provides a path prefix per component, e.g. /legoz for the legoz UI.
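
As an illustration of the on-prem pattern, a minimal nginx ingress for the legoz UI might look like the sketch below; the service name and port are assumptions and would need to match the actual legoz deployment.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: legoz
      annotations:
        nginx.ingress.kubernetes.io/rewrite-target: /
    spec:
      ingressClassName: nginx
      rules:
        - http:
            paths:
              - path: /legoz
                pathType: Prefix
                backend:
                  service:
                    name: legoz       # assumed service name
                    port:
                      number: 80      # assumed service port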

In managed installations this can be a little more involved depending on how the client wants access set up.

This is something that ALWAYS needs to be discussed with the client prior to installation.

Node Sizes

The discussion of node sizes is very similar to the discussion of VM sizes in the old microservices (like Leopard). There is baseline resource information that can be supplied by the platform team, but this is based on the size of nodes needed to run the platform and does not include the size of data queries etc. That is a separate discussion that needs to be had with the client. Whilst scaling horizontally in Kubernetes is easy, scaling vertically is more involved. As a concrete example, if the client has 5 nodes with 12GB each, they will be unable to pull a data query larger than 12GB because the available memory on any single node is too small. In this situation the nodes would need to be replaced with larger ones, cordoning and draining each one in turn, and this isn't automated to any degree.
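
To make the vertical-scaling constraint concrete, the illustrative pod spec below requests more memory than any single 12GB node can offer; the scheduler will leave it Pending no matter how many such nodes are in the cluster. The pod, container and image names are hypothetical.

    apiVersion: v1
    kind: Pod
    metadata:
      name: large-query
    spec:
      containers:
        - name: query-engine               # hypothetical container
          image: fraxses/query-engine      # hypothetical image
          resources:
            requests:
              memory: "16Gi"               # exceeds the 12GB per-node capacity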

This is something that ALWAYS needs to be discussed with the client prior to installation.