Spark in me
2.2K subscribers
822 photos
48 videos
116 files
2.68K links
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
Download Telegram
Если вы ищете мощные GPU-accelerated сервера в облаке, то вот вариант
- https://www.servers.com/prisma_cloud

Я сам пробовал Floydhub, но мне очень не понравилось. С другой стороны тут ценник гораздо более кусучий чем на Амазоне.

#hardware
Если вы сейчас собираете себе или компании железо для нейросеток, то не только статья про GPU Limbo, но и эта статься про ответ Intel на новые линейки от AMD вам будет интересна
- https://3dnews.ru/954174
- http://timdettmers.com/2017/12/21/deep-learning-hardware-limbo/#more-627

Понятно, что процессор это не боттлнек, но все равно интересно как конкуренция влияет на рынок.

#hardware
Poor man's computing cluster

So, when I last checked, Amazon's p3.4xlarge instances cost around US$12 per hour (unless you reserve them for a year). A tower supercomputer from Nvidia costs probably US$40-50k or more (it was announced at around US$69k).

It is not difficult to crunch the numbers and see, that 1 month of renting such a machine would cost at least US$8-10k. Also there will the additional cost / problem of actually storing your large datasets. When I last used Amazon - their cheap storage was sloooooow, and fast storage was prohibitively expensive.


So, why I am saying this?


Let's assume (according to my miner friends' experience) - that consumer Nvidia GPUs can work 2-3 years non-stop given proper cooling and care (test before buying!). Also let's assume that 4xTesla V100 is roughly the same as 7-8 * 1080Ti.

Yeah, I know that you will point out at least one reason why this does not hold, but for practical purposes this is fine (yes, I know that Teslas have some cool features like Nvlink).

Now let me drop the ball - modern professional motherboards often boast 2-3 Ethernet ports. And sometimes you can even get 2x10Gbit/s ports (!!!).

It means, that you actually can connect at least 2 (or maybe you can daisy chain them?) machines into a computing cluster.

Now let's crunch the numbers

According to quotes I collected through the years, you can build a cluster roughly equivalent to Amazon's p3.4xlarge for US$10k (but with storage!) with used GPUs (miners sell them like crazy now). If you buy second market drives, motherboards, CPUs and processors you can lower the cost to US$5k or less.

So, a cluster, that would serve you at least one year (if you test everything properly and take care of it) costing US$10k is roughly equivalent to:
- 20-25% of DGX desktop;
- 1 month of renting on Amazon;

Assuming that all the hardware will just break in a year:
- It is 4-5x cheaper than buying from Nvidia;
- It is 10x cheaper than renting;

If you buy everything used, then it is 10x and 20x cheaper!

I would buy that for a dollar!
Ofc you have to invest your free time.

See my calculations here:
http://bit.ly/spark00001

#deep_learning
#hardware
Logging your hardware, with logs, charts and alers - in style

TLDR - we have been looking for THE software to do this easily, with charts / alerts / easy install.

We found prometheus. Configuring alerts was a bit of a problem, but enjoy:
- https://github.com/snakers4/gpu-box-setup#prometheus

#deep_learning
#hardware
Assembling a NAS for less than US$50

So ... you want a NAS for emergency backups that only you know about.

You have spent money on GPUs, drives, devboxes and you would like to get your NAS for free.
Ofc, if you are a clever boi, you will have RAID arrays on your devbox, offsite backups, etc etc

If you feel particularly S&M, you might even use AWS Glacier or smth similar.
Or you may buy a NAS (decent devices start from US$500-1000 w/o drives! rip-off!)


But you see, all of the above variants cost money.
Or you cannot easily throw such a backup out of the window / encryption creates overhead.

So you can create a NAS on the cheap in style:
- Buy any raspberry pi (US$5 - US$20, you can find one used even cheaper);
- Buy a USB HDD enclosure (US$5 - US$40);
- Find some garbage drives for free;
- Copy your files, put HDD under your pillow;
- Profit;

Added bonuses:
- If you live in a police state - you can use RAID 0 (just hide the second drive) => in essence this is like have a perfect one-time pad encryption;
- Easily use RAID 1 or RAID 10 with 4 drives;
- Very high portability, if you use 2.5'' drives;
- Mdadm arrays are easily transferrable;
- Cyber punk vibe;

#hardware
The current state of "DIY" ML hardware

(i.e. that you can actually assemble and maintain and use in a small team)

Wanted to write a large post, but decided to just a TLDR.
In case you need a super-computer / cluster / devbox with 4 - 16 GPUs.

The bad
- Nvidia DGX and similar - 3-5x overpriced (sic!)
- Cloud providers (Amazon) - 2-3x overpriced

The ugly
- Supermicro GPU server solutions. This server hardware is a bit overpriced, but its biggest problem is old processor sockets
- Custom shop buit machines (with water) - very nice, but (except for water) you just pay US$5 - 10 - 15k for work you can do yourself in one day
- 2 CPU professional level motherboards - very cool, but powerful Intel Xeons are also very overpriced

The good
- Powerful AMD processor with 12-32 cores + top tier motherboard. This will support 4 GPUs on x8 speed and have a 10 Gb/s ethernet port
- Just add more servers with 10 Gb/s connection and probably later connect them into a ring ... cheap / powerful / easy to maintain

More democratization soon?

Probably the following technologies will untie our hands

- Single slot GPUs - Zotac clearly thought about it, maybe it will become mainstream in the professional market
- PCIE 4.0 => enough speed for ML even on cheaper motherboards
- New motherboards for AMD processors => maybe more PCIE slots will become normal
- Intel optane persistent memory => slow and expensive now, maybe RAM / SSD will merge (imagine having 2 TB of cheap RAM on your box)

Good chat in ODS on same topic.

#hardware
Amazing hardware YouTube channel (RU)

Link.

Smart, in-depth, highly analytical, no bs / ads / cringe / sensationalism. Not your typical Russian channel, not Linus Tech Tips or similar.

Example videos:

- What hardware companies could do
- Choosing a PSU

#hardware
New embedded computing platform?

Looks like Intel has a new computing platform for ultra compact PCs. This may fit some of the semi-embedded ML applications!

Intel NUC 9 Extreme Compute Element
- https://www.theverge.com/2020/1/7/21051879/intel-pc-nuc-9-extreme-ghost-canyon-element-hands-on-teardown-ces-2020
- https://pc-01.tech/razer-tomahawk/

#hardware
Speculations about x86 ARM Future

https://pc-01.tech/arm/

I just love this channel. Nice speculations. I am pretty sure that MS move towards WLS was a calculated move, but how does this fit into this picture? How will they port old Win32 apps to ARM? Is there some hidden power play we do not understand?

On which ecosystem to bet? My personal choice is Linux. I am too lazy to migrate my laptop to Ubuntu yet, but looks like 2 options can happen (I am just an idiot, I have no proofs):

- Every consumer device becomes ARM;
- There will be a large turmoil in the market, large players will start adopting Linux as a safe haven;

Interesting)

#hardware
RTX 3090 + Multi-Instance-GPU

So, ~2x faster than 2080 Ti, which is 30% faster than 1080Ti.
2x VRAM.

The only real question for me is, will it support Multi-Instance-GPU?

Let me explain why this is important. Now usually when you train a network, you increase your batch-size to fit the VRAM and monitor your IO and GPU load to ensure saturation.

But if a GPU has 2x VRAM and is 2-3x faster than 1080Ti, then maybe you can have multiple instances of your model on you GPU (that matters only for models that do not scale with large batch-sizes easily).

The only problem is that:

- You cannot use DDP in PyTorch (usually it is faster than DP for 4+ devices), because:

 DDP processes can be placed on the same machine or across machines, but GPU devices cannot be shared across processes.


- So you will have to invent something / change your code / or maybe even use their bleeding edge RPC functions;

If this function is available on 3090 ... then you could turn your GPU into 2-3 virtual GPUs and use it accordingly? That would be truly epic, especially for production use-cases (yeah I know about their SLA)! Also would be great for teamworking.

#hardware