AI Hardware

Domain Specific Computing for machine learning


Apple M1, In-Depth Review

🌽 CPU: 8 ARM cores (4 high-performance + 4 low-power), 5 nm, TSMC

GPU: comparable with the GTX 1650

DRAM: in-package unified memory (LPDDR4X rather than 3D-stacked HBM), for lower latency and power consumption

👉 Read more in Notion
🎒Quantum Annealing Simulation and FPGAs

While pure-play quantum computing (QC) gets most of the QC-related attention, there’s also been steady progress adapting quantum methods for select use on classical computers.
Worldwide interest in quantum computing is also fueling interest in quantum-inspired algorithms, among them quantum annealing simulation (QA).

QA simulation needs neither qubits nor cryocoolers, but it offers a fast optimization method for complex yet structured non-convex landscapes.

Before going further, we recommend reading about simulated annealing (SA) first, because QA is a kind of extension of classical SA. Read here and here.
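For a quick feel of classical SA before following the links, here is a minimal Python sketch on a small Ising problem. The function names, the geometric cooling schedule and all parameter values are our own illustrative assumptions, not taken from the linked articles.

```python
import math
import random


def ising_energy(spins, J):
    """E = -sum_{i<j} J[i][j] * s_i * s_j, with spins in {-1, +1}."""
    n = len(spins)
    return -sum(J[i][j] * spins[i] * spins[j]
                for i in range(n) for j in range(i + 1, n))


def simulated_annealing(J, steps=20000, t_start=5.0, t_end=0.05, seed=0):
    """Single-spin-flip SA; J is assumed symmetric with a zero diagonal."""
    rng = random.Random(seed)
    n = len(J)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    energy = ising_energy(spins, J)
    best_spins, best_energy = spins[:], energy
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(n)
        # Energy change caused by flipping spin i.
        local_field = sum(J[i][j] * spins[j] for j in range(n) if j != i)
        delta = 2.0 * spins[i] * local_field
        # Metropolis rule: always accept downhill moves, and accept uphill
        # moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            spins[i] = -spins[i]
            energy += delta
            if energy < best_energy:
                best_spins, best_energy = spins[:], energy
    return best_spins, best_energy


# Toy problem: 6 spins, every pair coupled ferromagnetically (J = 1).
n = 6
J = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
best_spins, best_energy = simulated_annealing(J)
print(best_spins, best_energy)
```

The temperature plays the role that the transverse field plays in QA: it is high at the start, so the search can escape local minima, and it is lowered until the state freezes.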

Analytical and numerical evidence suggests that quantum annealing outperforms simulated annealing under certain conditions. See this short and clear introduction to quantum-inspired optimization.

QA can be simulated on a classical computer using quantum Monte Carlo (QMC), but the computational cost grows too fast with problem size. That's where application-specific hardware comes onto the scene.
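To make the cost concrete, here is a rough Python sketch of one textbook way to simulate QA with QMC: path-integral Monte Carlo over several coupled replicas (Trotter slices) of the spin system. It is an illustration under our own assumptions (function names, linear transverse-field schedule, parameter values), not the implementation from any paper mentioned here.

```python
import math
import random


def classical_energy(spins, J):
    """E = -sum_{i<j} J[i][j] * s_i * s_j, with spins in {-1, +1}."""
    n = len(spins)
    return -sum(J[i][j] * spins[i] * spins[j]
                for i in range(n) for j in range(i + 1, n))


def simulated_quantum_annealing(J, replicas=8, sweeps=300, temperature=0.3,
                                gamma_start=3.0, gamma_end=1e-3, seed=0):
    """Path-integral Monte Carlo sketch; J symmetric with a zero diagonal."""
    rng = random.Random(seed)
    n, P = len(J), replicas
    s = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(P)]
    for sweep in range(sweeps):
        # Transverse field lowered linearly (an assumed schedule).
        gamma = gamma_start + (gamma_end - gamma_start) * sweep / (sweeps - 1)
        # Coupling between neighbouring replicas (Suzuki-Trotter mapping);
        # it grows as gamma shrinks, forcing the replicas to agree.
        j_perp = -0.5 * temperature * math.log(
            math.tanh(gamma / (P * temperature)))
        for _ in range(P * n):
            k, i = rng.randrange(P), rng.randrange(n)
            # On a complete graph this sum touches every other spin.
            field = sum(J[i][j] * s[k][j] for j in range(n) if j != i)
            neighbours = s[(k - 1) % P][i] + s[(k + 1) % P][i]
            delta = 2.0 * s[k][i] * (field / P + j_perp * neighbours)
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                s[k][i] = -s[k][i]
    best = min(s, key=lambda replica: classical_energy(replica, J))
    return best, classical_energy(best, J)


# Toy problem: 6 spins, every pair coupled ferromagnetically (J = 1).
n = 6
J = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
spins, energy = simulated_quantum_annealing(J)
print(spins, energy)
```

In this sketch one sweep over a complete graph costs on the order of P * n * n operations (P replicas, n spins, n couplings per spin), which is the kind of growth that motivates dedicated hardware.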

OpenCL‑based design of an FPGA accelerator for quantum annealing simulation
The FPGA accelerator for QA simulation was designed using Intel OpenCL HLS and achieves 6 times the performance of the multicore CPU implementation.

🦨Why not GPU?
Existing GPU-based accelerators are not suitable for complete graphs, where every node interacts with all the other nodes. It is very difficult to accelerate the QMC algorithm for complete graphs on GPUs because the workload offers few SIMD-friendly operations and has high data dependency.
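One way to picture the data dependency (our own illustration, not an argument taken from the paper): two spins can safely be updated in the same parallel step only if they do not interact, so the number of colors in a graph coloring tells you how many sequential groups a sweep needs. The Python sketch below uses hypothetical helper names and a simple greedy coloring.

```python
def greedy_coloring(adjacency):
    """Give each node the smallest color not used by its colored neighbors."""
    colors = {}
    for node in sorted(adjacency):
        used = {colors[nb] for nb in adjacency[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def sequential_groups(adjacency):
    """Number of color classes, i.e. update groups that must run in turn."""
    return max(greedy_coloring(adjacency).values()) + 1


def ring_graph(n):
    """Sparse graph: each node interacts only with its two neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def complete_graph(n):
    """Dense graph: each node interacts with every other node."""
    return {i: set(range(n)) - {i} for i in range(n)}


print(sequential_groups(ring_graph(8)))      # 2 groups of 4 spins each
print(sequential_groups(complete_graph(8)))  # 8 groups of 1 spin each
```

On the ring, half of the spins can be updated at once; on the complete graph every group holds a single spin, so there is nothing left to spread across SIMD lanes at the level of spin updates.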

Further reading:

🔗 D-Wave Two: a commercially available quantum annealing computer

📋 Quantum-inspired algorithms in practice

⚙️ Microsoft announced that the Toshiba Bifurcation Machine will be available through the Azure Quantum platform.

Tenstorrent, a hardware start-up developing next generation computers, announces the addition of industry veteran Jim Keller as President, CTO, and board member.
🚂 FPGAs come back. Titanium FPGAs from Efinix target edge applications and promise unbeatable performance per watt

⚙️ [pdf] Hardware accelerator for CNNs by Yann LeCun, a father of the deep learning revolution

📺 [youTube] A 20-minute introductory video from Intel on what FPGAs are and what sorts of applications you can use them for

🎱 [plumerAI] Yet another ML hardware startup keeps growing and explains why binarized neural networks do the job with fewer resources

🍰 [youTube] Bonus! Yesterday's lecture by Professor Onur Mutlu (ETH) on GPU architecture.

This channel is back from hibernation and more reviews will come soon :)
The latest paper by David Patterson and the Google TPU team reveals details of the world's most efficient and one of the most powerful supercomputers for DNN acceleration: TPU v3, the one that was used to train BERT. We recommend that you definitely read the full…
🏋🏼 Google finally released TPU v4; it will be available to customers later this year.
🥴 The previous version, v3, was unveiled in 2018, and v4 is claimed to be twice as fast.
🌽 TPU v4 chips combine into a 4096-chip supercomputer that reaches 1 exaFLOPS (10**18 FLOPS) of performance
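
As a back-of-the-envelope check on those numbers, the Python snippet below divides the quoted 1 exaFLOPS evenly across the 4096 chips. It assumes the figure is an aggregate peak that scales linearly with chip count; the numeric precision behind the figure is not stated here.

```python
POD_FLOPS = 10**18     # 1 exaFLOPS, as quoted for the full supercomputer
CHIPS_PER_POD = 4096   # chips in one supercomputer, as quoted

# Assumption: the aggregate figure is split evenly across all chips.
per_chip_tflops = POD_FLOPS / CHIPS_PER_POD / 1e12
print(f"{per_chip_tflops:.0f} TFLOPS per chip")  # prints "244 TFLOPS per chip"
```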

Read more on [hpcwire] and watch the video from Google I/O '21