Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
Interestingly, after 100+ votes almost no one uses glances. Essentially it is a utility that combines process / CPU / GPU / network / disk metrics in one view. Check it out!
2020 DS / ML Digest 9

Highlights:

- The Tesla LIDAR fallacy
- Russian paraphrasing dataset
- A review of what works in tracking / re-id tasks
- 600B params NMT by Google

Please like / share / repost!

https://spark-in.me/post/2020_ds_ml_digest_09

#digest
A Small Social Experiment

A small social experiment. Totally forgot about my account with TDS when publishing my first post about our research in STT.

Since both TDS and HackerNoon put a lot of emphasis on authors actually owning their posts, and both support canonical URLs, I decided to re-post this article to both and see what happens.

- A friend link to a post on TDS
- A link to a post on HackerNoon - no reply from them yet ...

As for traffic - despite being featured on TDS, it is still zero. Looks like their audience only likes top 10 articles and rags to riches stories.

It will be interesting to compare 3 platforms - The Gradient, TDS, HackerNoon.

#deep_learning
This is just funny =)
Are you an Arctic Code Vault Contributor?
Anonymous Poll
37%
Yes
31%
No
32%
What is it?
A Small Social Experiment - Update 1

Looks like HackerNoon does not respect its own response deadlines - I uploaded the post 5 days ago and there is no response yet.

As for TDS - it is kind of predictable, but sad anyway - despite being featured, the piece gets little to zero native traffic from there. You should write top-10 articles or rags to riches stories there I guess =)

Or maybe, cynically, once you fill in the canonical URL, all of these guys automatically lose interest?
Pandas Official Guide

Pandas now has a human readable best practices guide!

https://pandas.pydata.org/pandas-docs/stable/user_guide/

#data_science
Building Hyper Professional Looking PDFs in One Shell Command

You know, there are 2 types of people - those who value form over substance and those who value substance over form.

I really like writing my documents in markdown and storing them in version control, but many people do not understand this.

Enter Pandoc

You can build very professional-looking, almost whitepaper-quality PDF documents with a single shell command using pandoc.

Its original template kind of sucks and shows its age (and do not get me started on LaTeX and its witnesses). But I found a perfect solution - the Eisvogel pandoc template.

It takes some fiddling with pandoc params, but in the end it is worth the effort.

- https://github.com/Wandmalfarbe/pandoc-latex-template
- https://pandoc.org/MANUAL.html

With this, your command may look like this:

pandoc \
meeting.md -o \
meeting.pdf \
--from markdown \
--template eisvogel \
--pdf-engine=xelatex \
--highlight-style pygments


And voilà, you have a perfect investment-bank-looking document.

Enjoy!

#data_science
Stressing your Headless Server GPUs ... in Style

You know, there are very cool tools to stress your GPU. But the problem is ... that they are either Windows-only or GUI-based (Unigine, FurMark, etc).

But going through the hassle of installing a desktop environment just for testing? And then deleting it?

Of course you could set up your favourite DL framework, pull some repo and run it on some dataset. But at least in my case it is a bit too much hassle.

So, you can use stress for CPU and there is a tool called gpu-burn for GPU that just multiplies matrices.

- https://lambdalabs.com/blog/perform-gpu-and-cpu-stress-testing-on-linux/
- https://github.com/wilicc/gpu-burn/issues
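
A minimal sketch of how the two tools might be invoked on a headless box. The 60 second duration, the worker count and the ./gpu-burn checkout path are my assumptions, not something taken from the links above, and gpu-burn is assumed to be already built. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: stress CPU and GPU on a headless server.
# Assumptions: `stress` is installed and gpu-burn is already built
# in GPU_BURN_DIR (hypothetical path).
set -eu

DURATION="${DURATION:-60}"                                   # seconds per test (assumed)
CPU_WORKERS="${CPU_WORKERS:-$(nproc 2>/dev/null || echo 4)}" # one worker per core
GPU_BURN_DIR="${GPU_BURN_DIR:-./gpu-burn}"                   # hypothetical checkout path
DRY_RUN="${DRY_RUN:-1}"                                      # 1 = only print, 0 = execute

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# CPU: spin busy-loop workers for DURATION seconds.
run stress --cpu "$CPU_WORKERS" --timeout "$DURATION"

# GPU: gpu_burn takes the run time in seconds as its argument
# and is started from its own build directory.
if [ "$DRY_RUN" = "0" ]; then
  cd "$GPU_BURN_DIR"
fi
run ./gpu_burn "$DURATION"
```

Run it with DRY_RUN=0 to actually start the tests, and keep nvidia-smi (or glances) open in another terminal to watch temperatures and clocks.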

#deep_learning
Daisy Chaining Your Servers with 10Gbit/s Ethernet for Additional US$100?

Many prosumer motherboards now have one 10 Gbit/s Ethernet port. That is fine when you need to connect servers in pairs, but what if you need to connect 3 or 4 of them?

The cheapest switches with 10 Gbit/s Ethernet start from US$300 - 500 and these models are usually not in stock.

There is a hack - you can just buy a US$100 PCIe network card (1-2 ports), put it into the server in the middle of the chain and use a network bridge.

How well will it work? There is only one way to find out.
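
A rough sketch of what the bridge on the middle server could look like with iproute2. The interface names (enp5s0f0, enp5s0f1), the bridge name and the address are hypothetical placeholders, and this is an untested idea rather than a verified setup. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: bridge two 10 Gbit/s ports on the middle server so that
# its two neighbours can talk to each other through it.
# Assumptions: iproute2 is installed; interface names and the
# address below are hypothetical placeholders.
set -eu

BRIDGE="${BRIDGE:-br0}"
PORTS="${PORTS:-enp5s0f0 enp5s0f1}"
ADDR="${ADDR:-10.0.0.2/24}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print, 0 = execute (needs root)

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

setup_bridge() {
  # Create the software bridge.
  run ip link add name "$BRIDGE" type bridge
  # Enslave every port to the bridge and bring it up.
  for port in $PORTS; do
    run ip link set dev "$port" master "$BRIDGE"
    run ip link set dev "$port" up
  done
  # The bridge itself carries this server's address.
  run ip addr add "$ADDR" dev "$BRIDGE"
  run ip link set dev "$BRIDGE" up
}

setup_bridge
```

This is a plain software bridge, so forwarded traffic goes through the middle server's CPU. Something like iperf3 between the two outer servers would show how close it gets to line rate.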

#deep_learning
Small nice things

Now when you install Ubuntu Server, the installer can pull your SSH key from GitHub, install Docker and offer a lot of standard packages like the AWS CLI or Postgres.

The previous installer offered an email server, PHP, MySQL or Apache)

A small thing, but so nice.
Speculations about x86 ARM Future

https://pc-01.tech/arm/

I just love this channel. Nice speculations. I am pretty sure that the MS move towards WSL was a calculated one, but how does it fit into this picture? How will they port old Win32 apps to ARM? Is there some hidden power play we do not understand?

Which ecosystem to bet on? My personal choice is Linux. I am too lazy to migrate my laptop to Ubuntu yet, but it looks like 2 things can happen (I am just an idiot, I have no proof):

- Every consumer device becomes ARM;
- There will be a large turmoil in the market, large players will start adopting Linux as a safe haven;

Interesting)

#hardware
GitHub Sponsors is out of beta in 32 regions.
Except for the CIS of course.
Very hard to extract usable insights, but it looks like:

- A100 is 2x faster than V100, ceteris paribus
- One TPUv4 is about as fast as A100

Does it mean that Ampere consumer GPUs will be not 20-30% faster, but 50% faster than the last generation? Interesting =)