HPC Guru (Twitter)
RT @ProjectPhysX: @ProfMatsuoka At this point, arithmetic performance is so far ahead of memory bandwidth that it makes sense - even for non-AI workloads - to use "spare" arithmetic cycles to do data compression, such that the algorithm uses less memory and bandwidth.
https://t.co/3i9YdeXnOr
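A quick back-of-the-envelope illustrating the retweet above (the throughput and bandwidth figures are assumed example values for a hypothetical GPU, not vendor specifications): dividing peak FP32 arithmetic throughput by memory bandwidth gives the number of operations a kernel can spend per byte moved before it stops being bandwidth-bound, which is the budget available for "free" in-register compression and decompression.

```cpp
#include <cstdio>

int main() {
    // Assumed example figures for a hypothetical current GPU (not vendor specs):
    const double peak_fp32_tflops  = 60.0;  // peak FP32 throughput [TFLOPs/s]
    const double mem_bandwidth_tbs = 3.0;   // peak memory bandwidth [TB/s]

    // Roofline ridge point: FLOPs that can execute per byte of memory traffic
    // before the kernel becomes compute-bound instead of bandwidth-bound.
    const double flops_per_byte = (peak_fp32_tflops * 1e12) / (mem_bandwidth_tbs * 1e12);

    // A typical bandwidth-bound stencil/LBM-style kernel performs only a few
    // FLOPs per byte; the remainder of the budget is "spare" arithmetic that
    // can pay for packing/unpacking compressed data in registers.
    const double kernel_flops_per_byte = 2.0;  // assumed example workload
    printf("ridge point:      %.0f FLOPs/byte\n", flops_per_byte);
    printf("spare arithmetic: %.0f FLOPs/byte\n", flops_per_byte - kernel_flops_per_byte);
    return 0;
}
```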
Twitter
Moritz Lehmann
Make the lattice Boltzmann method (#LBM) use only ½ the memory and run 80% faster on #GPUs? #FP32/16-bit mixed precision makes it possible. The implementation takes 20 lines in existing code, and accuracy is often as good as #FP64.🖖🧐📃 Read my preprint: arxiv.or…
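A minimal sketch of the pattern the preprint describes, arithmetic in FP32 registers with the distribution functions stored in 16 bits in global memory, written as a toy CUDA kernel. The kernel name, the relaxation toward a constant equilibrium value, and the launch configuration are illustrative assumptions, not Lehmann's FluidX3D implementation; the point is that only the loads and stores change relative to a pure-FP32 kernel, which is why the conversion fits in a handful of lines of existing code.

```cpp
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <cstdio>

// Distribution functions (DDFs) are *stored* as 16-bit halves in global memory,
// halving the footprint and the bytes moved per time step, while all arithmetic
// is carried out in 32-bit registers.
__global__ void relax_fp16_storage(__half* f, int n, float omega, float feq) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float fi = __half2float(f[i]);           // decompress: FP16 load -> FP32 register
    fi = (1.0f - omega) * fi + omega * feq;  // toy BGK-style relaxation, all in FP32
    f[i] = __float2half(fi);                 // compress:   FP32 register -> FP16 store
}

int main() {
    const int n = 1 << 20;
    __half* f = nullptr;
    cudaMalloc((void**)&f, n * sizeof(__half));
    cudaMemset(f, 0, n * sizeof(__half));
    relax_fp16_storage<<<(n + 255) / 256, 256>>>(f, n, 1.8f, 0.1f);
    cudaDeviceSynchronize();
    cudaFree(f);
    printf("done\n");
    return 0;
}
```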
HPC Guru (Twitter)
RT @ProjectPhysX: How much floating-point precision do you need for lattice Boltzmann #CFD? #FP64 is overkill in most cases. #FP32/#FP16 mixed-precision works with almost equal accuracy at ¼ the memory demand and is 4x-10x faster on #GPU.
🧵1/7
Big @SFB1357 #PhD paper👉 https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats https://twitter.com/ProjectPhysX/status/1552225695044190212/photo/1
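The customized 16-bit formats studied in the paper go beyond plain IEEE FP16; as I understand this line of work, one ingredient is shifting each stored population by its lattice weight so that the 16-bit format only has to represent the small deviation from the rest state, where absolute rounding error matters least. Below is a hedged sketch of such a load/store pair, meant as a drop-in replacement for the bare conversions in the kernel sketch above; the helper names and signatures are mine, not taken from the paper or FluidX3D.

```cpp
#include <cuda_fp16.h>

// DDF shifting: store f_i - w_i instead of f_i, so the 16-bit value is a small
// number centered around 0 and the rounding error scales with the physically
// relevant non-equilibrium deviation rather than with the O(1) rest value.
// Helper names are illustrative; they stand in for the plain __half2float /
// __float2half calls in the kernel sketch above.

__device__ __forceinline__ float load_ddf(const __half* f, int idx, float w_i) {
    return __half2float(f[idx]) + w_i;   // decompress, then undo the shift
}

__device__ __forceinline__ void store_ddf(__half* f, int idx, float fi, float w_i) {
    f[idx] = __float2half(fi - w_i);     // shift toward zero, then compress
}
```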
HPC Guru (Twitter)
If confirmed, that’s ~100x the cost of the largest supercomputers used for traditional #HPC (Simulation etc.)
Drop the 6 from #FP64, FP4 is where all the💰is 😜
#AI #GenAI @OpenAI @Microsoft https://twitter.com/spectatorindex/status/1773773660198969573#m
HPC Guru (Twitter)
RT @HatemLtaief: @nicholasmalaya Besides the economic reality of #AI, the question is not whether apps need #FP64 at all, but rather in what proportion they need it compared to lower precisions. Most scientific applications are over-computing; you can get similar science quality w/ #MxP. #HPC https://twitter.com/nicholasmalaya/status/1877201356244570307#m