Senior HPC Infrastructure Engineer
  £100,000 - £130,000 per annum + £130,000
  Winchester, Hampshire
  Permanent, full-time


Your new company
Join a pioneering organisation at the forefront of AI and High Performance Computing (HPC) infrastructure. With a strong focus on innovation and ethical computing, this company is building scalable, GPU-optimised environments that support cutting-edge research and enterprise workloads.

Your new role
This is a fully remote, hands-on technical role where you'll lead the design, deployment, and optimisation of large-scale AI and HPC clusters. You'll architect end-to-end solutions across compute, storage, and networking - working closely with internal teams, OEMs, and external suppliers to deliver high-performance infrastructure.

You'll be responsible for creating detailed technical designs, including hardware specifications, data centre layouts, cabling, and power/cooling requirements.

You'll install and tune Linux-based operating systems, configure SLURM job schedulers, and optimise high-speed networking technologies such as InfiniBand and RoCE.

The role also involves scripting and automation (Ansible, Terraform), troubleshooting complex distributed systems, and mentoring junior engineers and service teams. This is an ideal opportunity for someone who thrives in project-led infrastructure work and wants to shape the future of AI and HPC platforms.

What you'll need to succeed
To be successful in this role, you'll bring:

* HPC Cluster Expertise: Proven experience designing, deploying, and scaling large HPC environments (hundreds to thousands of nodes).
* SLURM Scheduler Configuration: Deep understanding of SLURM partitions, priorities, and resource management.
* Networking: Strong knowledge of high-performance networking (InfiniBand, RoCE, RDMA) and troubleshooting interconnectivity issues.
* Linux Systems: Advanced Linux administration skills, including performance tuning and OS-level troubleshooting.
* Storage Systems: Experience with parallel/distributed file systems (e.g. Lustre, Ceph, WEKA, VAST).
* Automation & Scripting: Proficiency in Bash, Python, and tools like Ansible and Terraform for deployment and maintenance.
* Monitoring & Resilience: Experience implementing monitoring solutions and ensuring high availability and security compliance.
* Documentation & Mentoring: Excellent written communication skills and a collaborative approach to mentoring and knowledge sharing.

Desirable Experience

* Containerisation in HPC (Singularity, Docker, Apptainer)
* Familiarity with AI/ML workflows, GPU-aware MPI, and NVLink
* Experience in cloud, academic, or research environments
* Vendor hardware validation and data centre planning

What you'll get in return

* Share options.
* Unlimited holiday policy.
* 100% Remote working.
* Fantastic opportunities to develop - they make a habit of promoting in-house.
* A great team with a passion for working collaboratively.
* Enhanced family-friendly policies.
* A truly flexible workplace!

What you need to do now
If you're interested in this role, click 'apply now' to forward an up-to-date copy of your CV, or call us now.
If this job isn't quite right for you, but you are looking for a new position, please contact us for a confidential discussion about your career.

Hays Specialist Recruitment Limited acts as an employment agency for permanent recruitment and as an employment business for the supply of temporary workers. By applying for this job you accept the T&Cs, Privacy Policy and Disclaimers, which can be found at hays.co.uk


Advertiser: Agency

Reference: 4714745

Posted on: 2025-10-13 15:02:15

Copyright © 1999 - 2025 JIK SOFTWARE LTD