HPC Guru (Twitter)
👍 Slop by @thedeadline - with a name like that, it has to be good :-)
#HPC #Slurm #Top twitter.com/thedeadline/st…
Twitter
Douglas Eadline
TTY slop (Slurm Top) for Limulus HPC systems is about done. Slurm queue in top pane, wwtop in bottom pane. Select a job, hit return & only nodes for job displayed. Move to Host pane, select a host, hit return, a top snapshot from the node is displayed. Does…
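slop itself is a TTY/TUI tool; purely as a rough Python sketch of the plain Slurm and SSH calls that the flow described above rests on (the job ID, passwordless SSH, and host access here are assumptions, not part of slop):

```python
import subprocess

def job_nodes(job_id: str) -> list[str]:
    """Return the expanded node list for a Slurm job (squeue %N + scontrol show hostnames)."""
    nodelist = subprocess.run(
        ["squeue", "-j", job_id, "-h", "-o", "%N"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    if not nodelist:
        return []
    return subprocess.run(
        ["scontrol", "show", "hostnames", nodelist],
        capture_output=True, text=True, check=True,
    ).stdout.split()

def top_snapshot(host: str) -> str:
    """One-shot batch-mode top from a node over SSH (assumes passwordless SSH)."""
    return subprocess.run(
        ["ssh", host, "top", "-b", "-n", "1"],
        capture_output=True, text=True, check=True,
    ).stdout

if __name__ == "__main__":
    for host in job_nodes("12345"):            # 12345 is a placeholder job ID
        print(f"--- {host} ---")
        print("\n".join(top_snapshot(host).splitlines()[:5]))  # just the summary lines
```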
HPC Guru (Twitter)
RT @johnclinford: The AHUG Cloud Hackathon for #Arm #HPC has started! 36 teams, 61 clusters, 110 apps/mini-apps, 5 days, 4 AWS, #graviton2, #efa, #spack, #reframe, #slurm ... it's awesome!
HPC Guru (Twitter)
RT @hpcjoe: Fellow #HPC ers who use #SLURM for their schedulers, I'm curious about something. How many of you are using REST API based tooling to interact with it (as in submit/delete/status jobs) to any great extent? Or are you using just CLI/C-API?
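For comparison with the CLI route, here is a minimal sketch of the REST route against slurmrestd, assuming JWT auth is enabled and a token is already in SLURM_JWT (from `scontrol token`); the API version path, payload field names, and URL are installation-dependent assumptions, so check the OpenAPI spec served by your slurmrestd:

```python
import os
import requests  # third-party: pip install requests

# Assumptions: slurmrestd reachable at this URL (default port 6820), JWT auth
# enabled, and an API version path matching the site's Slurm release.
SLURMRESTD = os.environ.get("SLURMRESTD_URL", "http://localhost:6820")
API_VER = "v0.0.39"  # adjust to the deployed Slurm version

headers = {
    "X-SLURM-USER-NAME": os.environ["USER"],
    "X-SLURM-USER-TOKEN": os.environ["SLURM_JWT"],  # from `scontrol token`
    "Content-Type": "application/json",
}

payload = {
    # Field names and placement (e.g. of "script" and "environment") differ
    # slightly across API versions; treat this as illustrative only.
    "script": "#!/bin/bash\nsrun hostname\n",
    "job": {
        "name": "rest-demo",
        "partition": "debug",
        "current_working_directory": "/tmp",
        "environment": ["PATH=/usr/bin:/bin"],
    },
}

resp = requests.post(
    f"{SLURMRESTD}/slurm/{API_VER}/job/submit",
    headers=headers, json=payload, timeout=30,
)
resp.raise_for_status()
print(resp.json().get("job_id"))

# CLI equivalent, for comparison:
#   sbatch --partition=debug --wrap="srun hostname"
```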
insideHPC.com (Twitter)
Check out Lenovo's new article on insideHPC: "Lenovo Maximizes HPC Resources via Partnership with SchedMD and Slurm Workload Manager"
wp.me/p3RLHQ-o7J
@Lenovodc @Lenovo @SchedMD #Slurm #HPC #HPCAI #AI #clusters #workloadmanager
High-Performance Computing News Analysis | insideHPC
Lenovo Maximizes HPC Resources via Partnership with SchedMD and Slurm Workload Manager
[SPONSORED GUEST ARTICLE] In HPC, leveraging compute resources to the maximum is a constant goal and a constant source of pressure. The higher the usage [...]
insideHPC.com (Twitter)
Check out Lenovo's new article on insideHPC: "Lenovo Maximizes HPC Resources via Partnership with SchedMD and Slurm Workload Manager" wp.me/p3RLHQ-o7J
@Lenovo @Lenovodc @SchedMD #HPC #HPCclusters @slurmwlm #slurm
HPC Guru (Twitter)
#AWS believes it has finally created a cloud service that will break through with skeptical #HPC and #supercomputing customers
Parallel Computing Service: @awscloud adds support for the #Slurm scheduler to ease the transition to #cloud
https://www.hpcwire.com/2024/08/29/aws-perfects-cloud-service-for-supercomputing-customers/
via @HPCwire
HPC Guru (Twitter)
ML clusters @Meta:
o Use #Slurm on top of bare-metal allocations
o MTTF of 1024-GPU jobs is 7.9 hours - ~2 orders-of-magnitude lower than 8-GPU jobs (47.7 days)
o Restart time is 5-20 minutes after a failure
o Meta rediscovers idea of “forward progress” as defined by NNSA
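The ~2-orders-of-magnitude gap quoted above is roughly what independent per-GPU failures would predict, since job-level MTTF then scales as 1/N_GPUs; a quick arithmetic check on the reported numbers (the independence assumption is mine, not from the thread):

```python
import math

# Check the MTTF figures above, assuming independent per-GPU failures,
# so job-level MTTF scales roughly as 1/N_GPUs.
mttf_8gpu_h = 47.7 * 24                 # 8-GPU MTTF: 47.7 days -> hours
scale = 1024 / 8                        # 128x more GPUs per job
predicted_1024_h = mttf_8gpu_h / scale  # ~8.9 h predicted
observed_1024_h = 7.9                   # reported value

ratio = mttf_8gpu_h / observed_1024_h   # ~145x, i.e. ~2 orders of magnitude
print(f"predicted 1024-GPU MTTF: {predicted_1024_h:.1f} h")
print(f"observed  1024-GPU MTTF: {observed_1024_h:.1f} h")
print(f"gap vs 8-GPU jobs: {ratio:.0f}x (~{math.log10(ratio):.1f} orders of magnitude)")
```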
HPC Guru (Twitter)
#AWS Parallel Computing Service (PCS) now supports accounting with #Slurm version 24.11
https://aws.amazon.com/about-aws/whats-new/2025/05/aws-pcs-accounting-slurm-version-24-11/
#HPC via @awscloud
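Once Slurm accounting (slurmdbd) is in place, per-job records come back through sacct; a minimal client-side sketch of pulling and parsing them, with an illustrative field list and time window (see `sacct --helpformat` for the full set of fields):

```python
import subprocess

# Illustrative field list; adjust to whatever your site tracks.
FIELDS = "JobID,JobName,Partition,State,Elapsed,AllocTRES"

def recent_jobs(since: str = "now-7days") -> list[dict]:
    """Pull recent job accounting records via sacct (requires accounting/slurmdbd)."""
    out = subprocess.run(
        ["sacct", "--allusers", "--noheader", "--parsable2",
         f"--starttime={since}", f"--format={FIELDS}"],
        capture_output=True, text=True, check=True,
    ).stdout
    names = FIELDS.split(",")
    return [dict(zip(names, line.split("|"))) for line in out.splitlines()]

if __name__ == "__main__":
    for row in recent_jobs():
        print(row["JobID"], row["State"], row["Elapsed"])
```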