SURF – DTL Interest Group Compute Resources for Life Science Research Second meeting: April 22, 2015
Overview
• Summary of kick-off meeting
• Update on NFU Data4LifeSciences WP7
• This meeting: best practices HPC facilities NL
  - Usage
  - Maintenance
  - Support
  - Business model
• Compute needs are increasing
• How to accommodate peak capacity needs
Summary of kick-off meeting
• SIG goal set: share expertise in compute resources for life science research
• Discussions on scale-out models
  - Depending on the use cases, one or more scale-out models may be appropriate.
  - Relatively simple solutions that connect clusters at different locations would work with a fast network.
  - Complex infrastructures for high-throughput data analyses provide solutions for accounting, job submission and data provenance.
Sharing expertise & best practices: SURF-DTL Special Interest Group
• Kick-off: January 20th
• 43 members
• Year plan 2015 available
• Topics: scale-out models, best practices
NFU Data4LifeSciences WP7: facilities for high-throughput data processing
• Workgroup: UMCU, UMCG, VUmc, AMC, SURFsara, SURFnet
• Goals:
  - How to efficiently use compute resources in NL
  - Share software and package installations on compute facilities
• Share expertise and best practices in using compute resources (SIG)
• Work plan 2015 Q1-Q4:
  - Inventory of compute facilities NL
  - Criteria and requirements
  - Use cases
  - Pilots
  - Evaluation: business models, best practices
  - Advice / project plan for harmonization of compute resources NL
Scale-out models & AAI

model | status | involved
1. Federated cloud | Proof of concept, in pilot phase | UMCG, VUmc, SURFsara
2. Grid infrastructure | Production: Grid (LSG, EGI), evaluation | AMC, SURFsara, LUMC
3. Sharing clusters by federated IDs | Initiation, in pilot phase | UMCU, LUMC, SURFsara, SURFnet
4. Hybrid cluster-cloud: cluster in cloud | Initiation, in pilot phase | UMCU, LUMC, SURFsara

Workgroup: SURFsara (lead), UMCU, LUMC, AMC, VUmc, UMCG, SURFnet
Inventory of compute facilities – UMCs NL

UMC | facility | capacity | note
UMCG/RUG | Multiple decentralized clusters | 1,680 cores (total) | Facilities shared with RUG
LUMC | Shark cluster | 544 cores |
UMCU | Research HPC cluster | 720 cores | Expansion to 2,000 cores expected, also available for UU
VUmc | Multiple decentralized clusters, NCAgrid and cloud | 560 cores (total) |
others | .. | .. | ..
SURFsara | Multiple infrastructures: cluster, cloud, grid | 25,000 cores (excl. cloud) | Nationally available on request
Several service delivery models
• Local services
  - Provider: research ICT (integrated with or separate from central ICT, diagnostics) or central ICT
  - Mainly cluster computing
  - Support levels differ
  - Capacity differs per site, depending on research programs
• National services (when local services are not available or not part of core business)
  - Provider: SURFsara and partners (grid)
  - Different flavors of compute services available
  - High scalability (up to EU resources)
• Combination of local and national services
  - Different service delivery models possible
Service delivery model: example RCCS
• National Research Capacity Computing Service (RCCS)
• Platform as a Service
• Used for cohort genomics studies

What | Who
Maintenance | National (SURFsara)
User management: handle requests & access | Local PI (VU)
Account management | National (SURFsara)
Functional support | Local PI (VU)
Technical support | National (SURFsara)
Financed by | VU
Use of local facilities: trends
• Current utilization ranges from 50-85%, but averages 80-85%. 10-15% is available to scale up for overflow.
• Peak periods are often in the summer and over the Christmas holidays.
• Demand generally doubles each year, at least at UMCU and UMCG/RUG.
• Use cases
  - New use cases keep appearing. Researchers will increasingly seize the opportunity to use more computational resources; e.g., where 1,000 permutations used to be run, 10,000 permutations are now quickly run when resources are available.
  - In addition, more and more flavors appear based on workload, e.g. dedicated machines for molecular dynamics.
• LUMC and UMCU use a fairshare model based on budget.
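The budget-based fairshare idea above can be sketched as follows. This is a deliberately simplified illustration (real schedulers such as SLURM use more elaborate decay-weighted formulas); the function name, group labels and numbers are all hypothetical:

```python
# Hypothetical sketch of budget-based fairshare priority:
# a group's priority drops as its actual usage exceeds the
# share of the cluster its budget entitles it to.
def fairshare_priority(budget_share, used_core_hours, total_core_hours):
    """Higher value = schedule sooner. budget_share is a fraction (0..1)."""
    if total_core_hours == 0:
        return budget_share
    return budget_share - used_core_hours / total_core_hours

# Illustrative groups: (budget share, core-hours used this period)
groups = {"A": (0.5, 600), "B": (0.3, 200), "C": (0.2, 200)}
total = sum(used for _, used in groups.values())
priorities = {g: fairshare_priority(share, used, total)
              for g, (share, used) in groups.items()}
# Group A consumed 60% of cycles against a 50% budget share, so it
# ranks last; group B is under-consuming its share and ranks first.
```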
Example: Project ALS MinE
• Amyotrophic lateral sclerosis: 200,000 patients worldwide
• Project: 22,500 genomes to analyze
• Too big for local resources (UMCU HPC facility)
• Use the Life Science Grid
• 4,000 samples: 1,080,000 core hours (123.2 core years), 320 TB • 5,500 samples: 1,485,000 core hours (2015), 880 TB • 11,400 samples: 3,078,000 core hours or 351 core years (2016), 1.8 PB
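The core-hour figures above convert to core-years as follows; a quick sanity check, assuming an average year of 365.25 days (8,766 hours):

```python
HOURS_PER_YEAR = 24 * 365.25  # 8,766 hours in an average year

def core_years(core_hours):
    """Convert a core-hour total to core-years."""
    return core_hours / HOURS_PER_YEAR

print(round(core_years(1_080_000), 1))  # 4,000 samples  -> 123.2 core-years
print(round(core_years(3_078_000), 1))  # 11,400 samples -> 351.1 core-years
```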
Compute resource needs • Manage needs & capacity - Close contact with research groups / projects in pipeline - Gather needs: for how long, how much - Determine capacity and infrastructure needs • Balance peak capacity by using central or shared resources
SURFsara compute services and applications in life science research

Service | Description | Applications | Characteristics
HPC Cartesius | High-performance / capability computing | Simulations and prediction models, e.g. cell simulations | National supercomputer with accelerators for specific applications
Research Capacity Computing Service | Batch processing, capacity computing | Parallel applications, e.g. genetics, RNAseq and medical imaging analyses | Directly scalable from local cluster environment; serviced environment at application level
HPC Cloud | Virtualized / cloud compute infrastructure | High-throughput workflows, e.g. Galaxy, interactive visualizations, large-memory applications | Dynamic and flexible, for any workflow and OS; self-service IaaS
Grid, Life Science Grid | Large-scale distributed cluster computing | Large-scale cohort genetics, RNAseq analyses, imputations, molecular modeling | The sky is the limit; requires job submission protocols, middleware
Big data services, e.g. Hadoop, NoSQL | Data-centric computing taking advantage of data locality | Mining of large amounts of data, e.g. annotations, 'big data' analytics | Hadoop data processing requires MapReduce-enabled algorithms

(Services range from data/I-O intensive to CPU-cycle intensive.)
Considerations …
• Managing the needs
  - Use cases
  - Different infrastructures required
  - Complexity of multi-center studies
• Finances
  - Cores versus costs
  - Hardware + FTEs (maintenance + support)
  - [Chart: total costs as a function of number of cores]
• Business model / delivery model: SLAs etc.
• Expertise: maintenance & support
NFU Data4LifeSciences WP7: future work
• Work out optimal use of compute resources NL
  - Architecture: scale-out model
  - Usage
  - Maintenance
  - Support level
  - Business model
• Benchmarking of use cases
  - Walltime
  - Memory usage
  - CPU usage
  - IOPS: input and output operations per second
• Data & software parallelisation
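A minimal sketch of how the walltime, CPU-time and peak-memory metrics listed above could be captured for one pipeline step, using only the Python standard library (the helper name and the toy workload are illustrative; IOPS would typically be measured separately, e.g. with iostat):

```python
import resource
import time

def benchmark(func, *args):
    """Run func and report walltime, CPU time and peak memory of this process."""
    start = time.perf_counter()
    before = resource.getrusage(resource.RUSAGE_SELF)
    result = func(*args)
    after = resource.getrusage(resource.RUSAGE_SELF)
    return result, {
        "walltime_s": time.perf_counter() - start,
        "cpu_s": (after.ru_utime + after.ru_stime)
                 - (before.ru_utime + before.ru_stime),
        # ru_maxrss is the process-lifetime peak; KiB on Linux, bytes on macOS.
        "peak_rss": after.ru_maxrss,
    }

# Toy workload standing in for a real analysis step:
_, stats = benchmark(lambda: sum(i * i for i in range(10**6)))
```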
This meeting: discussion
• Reflect on findings so far
• How to efficiently use local and national services
  - Usage
  - Maintenance
  - Support
  - Business model & governance
• Best practices, e.g. documentation by Ansible playbook
• Focus on particular topics for next meeting?
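As an illustration of the Ansible-playbook approach to documenting cluster setup as executable best practice, a minimal hypothetical playbook is sketched below. The module names (`ansible.builtin.package`, `ansible.builtin.mount`) are real Ansible modules; the host group, package names and NFS export are assumptions:

```yaml
# Hypothetical playbook: a cluster node's software setup documented as code.
- name: Configure compute node for genomics pipelines
  hosts: compute_nodes          # assumed inventory group
  become: true
  tasks:
    - name: Install common bioinformatics tools
      ansible.builtin.package:
        name:
          - samtools            # assumed package names
          - bwa
        state: present

    - name: Mount shared scratch space
      ansible.builtin.mount:
        path: /scratch
        src: storage.example.org:/scratch   # assumed NFS export
        fstype: nfs
        state: mounted
```

Because the playbook is idempotent, it doubles as both documentation and a reproducible installation procedure for new nodes.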
Agenda
• 13:15 Welcome & introduction
• 13:30 HPC Genome Coordination Center Groningen/RUG - Pieter Neerincx
• 13:55 Compute resources WeNMR - Alexandre Bonvin
• 14:20 Cluster computing for brain imaging - Keith Cover
• 14:45 Tea & coffee break
• 15:05 Computing for the Life Sciences - Patrick Kemmeren
• 15:30 HPC WUR - Hendrik-Jan Megens
• 15:55 HPC LUMC - Martijn Vermaat
• 16:20 Discussion
• 16:45 Drinks