Training machine learning algorithms is a complex and demanding task. Google's TensorFlow API lets you run training jobs in a distributed or GPU-accelerated environment. You can also publish your models with TensorFlow Serving, which allows you to move quickly from training to production.
TensorFlow is installed on Finis Terrae in both GPU and CPU-only builds, so you can choose the configuration that best suits your needs. Distributed execution is also enabled, and you can always inspect your training metrics and computational graphs with TensorBoard.
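As a rough sketch of how these pieces fit together (a TensorFlow 2.x script, with no assumptions about the specific versions installed on Finis Terrae), a job can detect the GPUs it was granted, distribute training across them, and write metrics for TensorBoard:

```python
import numpy as np
import tensorflow as tf

# Detect the GPUs allocated to this job (an empty list on a CPU-only node).
gpus = tf.config.list_physical_devices("GPU")
print(f"Visible GPUs: {len(gpus)}")

# MirroredStrategy replicates the model across all visible GPUs,
# and falls back to a single CPU replica when none are available.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Log metrics and the computational graph for TensorBoard;
# afterwards, point `tensorboard --logdir logs` at this directory.
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="logs")

# Tiny synthetic data, just to make the sketch runnable end to end.
x = np.random.rand(128, 10).astype("float32")
y = np.random.rand(128, 1).astype("float32")
model.fit(x, y, epochs=1, batch_size=32, callbacks=[tb_callback], verbose=0)
```

The same script runs unchanged on a GPU node or a CPU-only node; the strategy simply picks up whatever devices the batch system made visible.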
Finis Terrae has been conceived for the efficient resolution of large parallel problems. To this end, Finis Terrae includes more than 7700 cores, distributed in nodes with two Intel Xeon E5 Haswell processors each; four of these nodes additionally include two NVIDIA Tesla K80 GPUs. The nodes are interconnected by an FDR InfiniBand low-latency network. They also have access to a high-performance parallel storage system based on Lustre, capable of simultaneously providing high capacity (760 TB net) and, above all, high performance (greater than 20 GB/s). Thus, calculations are not delayed by disk input/output operations, and are accelerated by GPUs when you request them.
Access to the system is through a VPN to guarantee security. A web portal is also available, and a remote visualization environment lets you easily analyze your results from your browser.
Consumption is billed by elapsed core-hour; each core-hour includes up to 6 GB of memory, and this amount can be increased by reserving additional cores. The service has a one-time setup fee of €200 per company (charged only the first time you contract a service), which includes 3 user accounts and up to 600 GB of storage (plus 200 GB of scratch). Additional user accounts are charged separately. Users can run any free software application, provided the corresponding license terms and conditions allow it (CESGA offers several preinstalled CUDA-enabled libraries and applications). Jobs are executed through a Slurm-based batch system without priority, and a maximum number of concurrently running jobs applies to this service.
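As an illustrative sketch only (the partition, module name, and resource limits below are assumptions; check the CESGA user documentation for the actual values), a job could be submitted to the Slurm batch system with a script like:

```shell
#!/bin/bash
#SBATCH --job-name=tf-train     # job name shown in the queue
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4       # 4 cores -> up to 24 GB of memory (6 GB/core)
#SBATCH --gres=gpu:1            # request one GPU (omit for CPU-only pricing)
#SBATCH --time=02:00:00

# Load a preinstalled TensorFlow environment (module name is an assumption).
module load tensorflow

srun python train.py
```

Submitting with `sbatch train_job.sh` queues the job without priority; omitting the `--gres` line keeps the job on CPU cores and the lower, GPU-free pricing.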
The published cost is based on CESGA's GPU services. A lower cost applies when you do not use GPUs, following the prices of CESGA's non-prioritized, prioritized, or exclusive HPC services. TensorFlow Serving is available on demand through a virtual machine (without GPU) running on CESGA's Cloud services; in that case, the prices of that service apply.
For more information, contact our Transfer department.