HPC Guru (Twitter)
.@Arm-powered numerical weather prediction: Porting & running the @ECMWF model on the #Fugaku #supercomputer
- 32GB of #HBM2 is not quite sufficient
- #FP16 is probably not usable as a general purpose compute format
#HPC #PASC22 via @s_e_hatfield
-----------
@s_e_hatfield:
And that's a wrap on my #PASC22 poster. Poster 48, next Tuesday at 9am, Foyer 2nd Floor. Come and say hi ;) https://t.co/yQfOBRHLgT
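The FP16 caveat in the tweet above can be made concrete with two textbook properties of the IEEE half-precision format. The NumPy snippet below is only an illustration of those properties, not anything from the poster or the Fugaku port itself; the field values (surface pressure in Pa, a 0.05 K temperature increment) are generic examples.

    import numpy as np

    # 1) Dynamic range: FP16 tops out at 65504, so e.g. surface pressure
    #    in Pa already overflows unless fields are rescaled first.
    p_surface = np.float16(101325.0)      # typical surface pressure in Pa
    print(p_surface)                      # -> inf

    # 2) Precision: with a ~10-bit mantissa, small tendencies are absorbed
    #    when added to large background values.
    T_bg = np.float16(300.0)              # background temperature in K
    dT = np.float16(0.05)                 # physically meaningful increment
    print((T_bg + dT) - T_bg)             # -> 0.0, the increment is lost

    # The same update in FP32 keeps the increment.
    print(np.float32(300.0) + np.float32(0.05) - np.float32(300.0))  # -> ~0.05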
HPC Guru (Twitter)
RT @HatemLtaief: @HPC_Guru @Arm @ECMWF @s_e_hatfield But #FP32 accumulation with #FP16 operands is usable, as long as you cherry-pick where to apply it. #climate #nvidia
https://ieeexplore.ieee.org/document/9442267/
This is indeed a missing kernel on #Fugaku in the #Fujitsu scientific library; the #ARM performance library does not have it either.
ieeexplore.ieee.org
Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach With PaRSEC
Geostatistical modeling, one of the prime motivating applications for exascale computing, is a technique for predicting desired quantities from geographically distributed data, based on statistical models and optimization of parameters.
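A minimal NumPy sketch of what "FP32 accumulation with FP16 operands" buys on a dot product. It only emulates the semantics of an FP16-input/FP32-accumulate kernel (the Tensor-Core-style recipe) rather than calling any Fujitsu, Arm, or NVIDIA library, and the vector size and data are arbitrary illustrations.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4096
    a = rng.standard_normal(n)
    b = rng.standard_normal(n)

    a16, b16 = a.astype(np.float16), b.astype(np.float16)   # operands stored in FP16
    ref = float(np.dot(a, b))                                # FP64 reference

    # Kernel A: FP16 operands, FP16 accumulator (everything in half precision).
    acc16 = np.float16(0.0)
    for x, y in zip(a16, b16):
        acc16 = np.float16(acc16 + x * y)

    # Kernel B: FP16 operands, FP32 accumulator. Each FP16*FP16 product is
    # exact in FP32, so only the final accumulation rounds at FP32 precision.
    acc32 = np.float32(0.0)
    for x, y in zip(a16, b16):
        acc32 = np.float32(acc32 + np.float32(x) * np.float32(y))

    print("FP16 operands + FP16 accumulate, error vs FP64:", abs(float(acc16) - ref))
    print("FP16 operands + FP32 accumulate, error vs FP64:", abs(float(acc32) - ref))

The remaining error in kernel B comes almost entirely from quantizing the operands to FP16, which is exactly why the choice of where to apply it has to be cherry-picked per kernel.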
HPC Guru (Twitter)
RT @ProjectPhysX: How much floating-point precision do you need for lattice Boltzmann #CFD? #FP64 is overkill in most cases. #FP32/#FP16 mixed-precision works with almost equal accuracy at ¼ the memory demand and is 4x-10x faster on #GPU.
🧵1/7
Big @SFB1357 #PhD paper👉 https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats https://twitter.com/ProjectPhysX/status/1552225695044190212/photo/1
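For the lattice Boltzmann case, the basic mixed-precision pattern is to store the distribution functions (DDFs) in 16 bits, halving memory demand and traffic, while doing the collision arithmetic in FP32. The sketch below is my own minimal single-node D2Q9 BGK illustration of that pattern using plain IEEE FP16; it is not code from the linked paper, which also evaluates customized 16-bit formats beyond IEEE FP16.

    import numpy as np

    # D2Q9 lattice: standard weights and discrete velocities
    w  = np.array([4/9] + [1/9]*4 + [1/36]*4, dtype=np.float32)
    cx = np.array([0, 1, 0, -1, 0, 1, -1, -1, 1], dtype=np.float32)
    cy = np.array([0, 0, 1, 0, -1, 1, 1, -1, -1], dtype=np.float32)

    def collide_bgk(f16, tau=0.6):
        """BGK collision on one node: DDFs stored in FP16, arithmetic in FP32."""
        f = f16.astype(np.float32)                        # decompress DDFs to FP32
        rho = f.sum()                                     # density
        ux = (f * cx).sum() / rho                         # velocity
        uy = (f * cy).sum() / rho
        cu = cx * ux + cy * uy
        u2 = ux * ux + uy * uy
        feq = w * rho * (1 + 3*cu + 4.5*cu*cu - 1.5*u2)   # 2nd-order equilibrium
        f += (feq - f) / np.float32(tau)                  # relax toward equilibrium
        return f.astype(np.float16)                       # compress back for storage

    f16 = w.astype(np.float16)        # fluid at rest with rho = 1
    f16 = collide_bgk(f16)
    print(f16.dtype, "-> 2 bytes per DDF instead of 4 (FP32) or 8 (FP64)")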