PDP-11🚀
AI Hardware & Domain Specific Computing

#FPGA #ASIC #HPC #DNN

@vconst89
NVIDIA Ampere Architecture In-Depth
#notml
A bit of space news, but not about Crew Dragon as you may expect :)
It was 14 years ago that Xilinx released its previous generation of radiation-tolerant (RT) FPGAs suitable for space applications - the Virtex-5 series. And finally, a new successor is coming - the radiation-tolerant Kintex UltraScale (XQRKU060).
https://www.xilinx.com/support/documentation/white_papers/wp523-xqrku060.pdf
Stratix-10-NX-Tehnology-Brief.pdf
763.2 KB
AI-Optimized FPGA for High-Bandwidth, Low-Latency AI Acceleration
The Intel® Stratix® 10 NX FPGA delivers a unique combination of capabilities needed to implement customized hardware with integrated high-performance artificial intelligence (AI). These capabilities include:

High-Performance AI Tensor Blocks
- Up to 15X more INT8 throughput than Intel Stratix 10 FPGA digital signal processing (DSP) block for AI workloads
- Hardware programmable for AI with customized workloads

Abundant Near-Compute Memory
- Embedded memory hierarchy for model persistence
- Integrated high-bandwidth memory (HBM)

High-Bandwidth Networking
- Up to 57.8 Gbps PAM4 transceivers and hard Ethernet blocks for high efficiency
- Flexible and customizable interconnect to scale across multiple nodes
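To make the INT8 claim above concrete, here is a minimal, hypothetical Python sketch (not from the Intel brief) of symmetric INT8 quantization with wide-accumulator arithmetic, the pattern such AI tensor blocks accelerate:

```python
# Hypothetical sketch (not from the Intel brief): symmetric INT8 quantization.
def quantize_int8(xs):
    scale = max(abs(x) for x in xs) / 127.0  # largest magnitude maps to 127
    q = [max(-127, min(127, round(x / scale))) for x in xs]
    return q, scale

a = [0.5, -1.2, 3.0, 0.25]
b = [1.0, 0.25, -0.5, 2.0]
qa, sa = quantize_int8(a)
qb, sb = quantize_int8(b)

# Cheap INT8 multiplies accumulate into a wide integer; one floating-point
# rescale at the end recovers the real-valued result.
acc = sum(x * y for x, y in zip(qa, qb))
approx = acc * sa * sb
exact = sum(x * y for x, y in zip(a, b))
```

The narrow INT8 multiplies lose little accuracy here because accumulation stays in a wide integer and only a single rescale happens at the end - which is why trading FP32 DSP multipliers for many small INT8 ones pays off for DNN inference.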
https://www.economist.com/technology-quarterly/2020/06/11/the-cost-of-training-machines-is-becoming-a-problem

The growing demand for computing power has fuelled a boom in chip design and specialised devices that can perform the calculations used in AI efficiently. The first wave of specialist chips were graphics processing units (GPUs), designed in the 1990s to boost video-game graphics. As luck would have it, GPUs are also fairly well-suited to the sort of mathematics found in AI.

Further specialisation is possible, and companies are piling in to provide it. In December, Intel, a giant chipmaker,
bought Habana Labs, an Israeli firm, for $2bn. Graphcore, a British firm founded in 2016, was valued at $2bn in 2019. Incumbents such as Nvidia, the biggest GPU-maker, have reworked their designs to accommodate AI. Google has designed its own “tensor-processing unit” (TPU) chips in-house. Baidu, a Chinese tech giant, has done the same with its own “Kunlun” chips. Alfonso Marone at KPMG reckons the market for specialised AI chips is already worth around $10bn, and could reach $80bn by 2025.

“Computer architectures need to follow the structure of the data they’re processing,” says Nigel Toon, one of Graphcore’s co-founders. The most basic feature of AI workloads is that they are “embarrassingly parallel”, which means they can be cut into thousands of chunks which can all be worked on at the same time. Graphcore’s chips, for instance, have more than 1,200 individual number-crunching “cores”, and can be linked together to provide still more power. Cerebras, a Californian startup, has taken an extreme approach. Chips are usually made in batches, with dozens or hundreds etched onto standard silicon wafers 300mm in diameter. Each of Cerebras’s chips takes up an entire wafer by itself. That lets the firm cram 400,000 cores onto each.
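The "embarrassingly parallel" idea above can be sketched in a few lines of Python (an illustrative toy, not any vendor's actual scheme): a dot product splits into chunks that need no communication with each other until one final sum:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy example of an "embarrassingly parallel" workload: the chunks are
# fully independent, so they can all be worked on at the same time.
def dot_chunk(chunk):
    return sum(x * y for x, y in chunk)

a = list(range(10_000))
b = list(range(10_000))
pairs = list(zip(a, b))
chunks = [pairs[i:i + 1000] for i in range(0, len(pairs), 1000)]  # 10 chunks

with ThreadPoolExecutor() as pool:
    partials = list(pool.map(dot_chunk, chunks))  # each chunk runs independently

result = sum(partials)  # the only step that needs all chunks
```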

Other optimisations are important, too. Andrew Feldman, one of Cerebras’s founders, points out that AI models spend a lot of their time multiplying numbers by zero. Since those calculations always yield zero, each one is unnecessary, and Cerebras’s chips are designed to avoid performing them. Unlike many tasks, says Mr Toon at Graphcore, ultra-precise calculations are not needed in AI. That means chip designers can save energy by reducing the fidelity of the numbers their creations are juggling. (Exactly how fuzzy the calculations can get remains an open question.)
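A toy Python sketch of the zero-skipping idea (purely illustrative, not Cerebras's actual hardware mechanism):

```python
# Toy sketch (illustrative only): skip multiplies whose result must be zero,
# and count how much work the sparsity saves.
def sparse_dot(weights, activations):
    total = 0.0
    mults = 0
    for w, x in zip(weights, activations):
        if w == 0.0 or x == 0.0:
            continue  # multiplying by zero always yields zero: skip it
        total += w * x
        mults += 1
    return total, mults

# A weight vector that is mostly zeros, as pruned DNN layers often are
weights = [0.0, 0.5, 0.0, 0.0, -1.0, 0.0, 0.0, 2.0]
acts = [1.0] * 8
result, mults = sparse_dot(weights, acts)
# only 3 of the 8 multiplies are actually performed
```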

All that can add up to big gains. Mr Toon reckons that Graphcore’s current chips are anywhere between ten and 50 times more efficient than GPUs. They have already found their way into specialised computers sold by Dell, as well as into Azure, Microsoft’s cloud-computing service. Cerebras has delivered equipment to two big American government laboratories.
Apple has announced the biggest change heading to its Mac computers in 14 years: the dumping of Intel Inside.
The company is ditching Intel's traditional so-called x86 desktop chips for Apple's own processors based on ARM designs - those used in smartphones and tablets, including the iPhone and iPad.
The Guardian
The chapter from the upcoming Vivienne Sze book "Efficient Processing of Deep Neural Networks" http://eyeriss.mit.edu/2020_efficient_dnn_excerpt.pdf
* Processing Near Memory
* Processing in Memory
* Processing in the Optical Domain
* Processing in Sensor
efficient_proceeding_of_dnn.pdf
20.4 MB
The fantastic book is finally generally available now!

Efficient Processing of Deep Neural Networks
The book covers all aspects of model software and hardware design related to the topic. It explains the key concepts of weight/output/input/row stationarity and dataflow, power-budget trade-offs, and hardware-software co-design.
Efficient Processing of Deep Neural Networks, Contents
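As a rough illustration of one of those concepts, here is a hypothetical Python sketch (not code from the book) of a weight-stationary dataflow: the loop nest is ordered so each weight is fetched once and reused across the whole batch:

```python
# Hypothetical sketch of a weight-stationary dataflow for a tiny fully
# connected layer: each weight is fetched once and reused across the whole
# input batch, which is exactly the reuse pattern "weight stationary" names.
def fc_weight_stationary(weights, inputs):
    # weights: [out][in] matrix; inputs: list of input vectors (the batch)
    n_out, n_in = len(weights), len(weights[0])
    outputs = [[0.0] * n_out for _ in inputs]
    weight_fetches = 0
    for o in range(n_out):
        for i in range(n_in):
            w = weights[o][i]              # fetched once...
            weight_fetches += 1
            for b, x in enumerate(inputs):
                outputs[b][o] += w * x[i]  # ...reused for every batch item
    return outputs, weight_fetches

W = [[1.0, 2.0], [3.0, 4.0]]
batch = [[1.0, 1.0], [2.0, 0.0]]
outs, fetches = fc_weight_stationary(W, batch)
# fetches == 4: each of the 4 weights is loaded exactly once, regardless
# of batch size
```

Output-, input-, and row-stationary dataflows reorder the same loop nest to keep a different operand in the cheap local storage; which one wins depends on the layer shape and the memory hierarchy, which is the trade-off the book analyzes.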
Sorry guys, this channel is turning into a link-collection feed, but I promise to get back on track soon with brief summaries :)

https://www.electronicdesign.com/industrial-automation/article/21136402/smartnic-architectures-a-shift-to-accelerators-and-why-fpgas-are-poised-to-dominate
Bluespec Haskell is an open-source framework: yet another high-level hardware description language, but this one based on Haskell

Jonathan Ross, founder of the hardware AI startup Groq and an ex-Google TPU developer, claims that it was used in the initial stages of the TPU design. It looks like Groq is also actively using it
https://www.linkedin.com/in/jonathan-ross-12a95156/

Bluespec research note
https://arxiv.org/pdf/1905.03746.pdf

The latest version of the Bluespec compiler can be found here
https://github.com/B-Lang-org/bsc

And here's the tutorial
https://github.com/rsnikhil/Bluespec_BSV_Tutorial/tree/master/Reference