Am Neumarkt 😱
Machine learning and other gibberish
Archives: https://datumorphism.leima.is/amneumarkt/
#中文 #visualization

I saw data stitches recommended by the TMS channel:
https://datastitches.substack.com/
I have followed it for a few issues; the quality is very good, and it often features great work.

Also recommending the TMS channel itself:
https://t.me/tms_ur_way/1031
It is about time management, productivity, and life.
#ML #self-supervised #representation

Contrastive loss is widely used in representation learning. However, the mechanism behind it is not as straightforward as it seems.

Wang & Isola showed that the contrastive loss can be rewritten into two components: alignment and uniformity. Samples in the feature space are normalized to unit vectors, which places them on a hypersphere. The two components of the contrastive loss are

- alignment, which forces the positive samples to be aligned on the hypersphere, and
- uniformity, which distributes the samples uniformly on the hypersphere.

By optimizing these two objectives, the samples are distributed over the hypersphere, with similar samples clustered together, i.e., pointing in similar directions. Uniformity makes sure the samples use the whole hypersphere so that no "space" is wasted.
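If I recall correctly, the paper itself summarizes the two terms in a few lines of PyTorch; here is a sketch along those lines, assuming x and y are batches of L2-normalized features in which row i of x and row i of y form a positive pair:

```python
import torch

def align_loss(x, y, alpha=2):
    # Mean distance between the features of positive pairs (both L2-normalized).
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniform_loss(x, t=2):
    # Log of the average pairwise Gaussian potential over the batch;
    # lower values mean the features cover the hypersphere more uniformly.
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```

The full objective is then a weighted sum of the two terms.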


References:

Wang T, Isola P. Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere. arXiv [cs.LG]. 2020. Available: http://arxiv.org/abs/2005.10242
#ML

The authors investigate the geometry formed by the responses of neurons to certain stimuli (the tuning curve). Using the stimulus as the hidden variable, we can construct a geometry of neuron responses. The authors clarify the relations between this geometry and other measures such as mutual information.

The story itself may not be interesting to machine learning practitioners, but the method of using the geometry of neuron responses to probe the brain is intriguing. We may borrow this method to investigate the internal mechanisms of neural networks.
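As a toy illustration of the idea (my own sketch, not from the paper): one common way to summarize such a geometry is a representational dissimilarity matrix, i.e., the pairwise distances between the population responses to different stimuli. The response matrix below is hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical response matrix: rows are stimuli, columns are neurons
# (or units of an artificial network); entries are firing rates / activations.
responses = np.random.rand(10, 50)

# Representational dissimilarity matrix: pairwise distances between the
# population response vectors; its structure is the "geometry" of the code.
rdm = squareform(pdist(responses, metric="correlation"))
print(rdm.shape)  # (10, 10)
```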

Kriegeskorte, Nikolaus, and Xue-Xin Wei. 2021. “Neural Tuning and Representational Geometry.” Nature Reviews. Neuroscience, September. https://doi.org/10.1038/s41583-021-00502-3.
#ML #fun

I read about the story of using TensorFlow in Google Translate [^Pointer2019].

> … Google Translate. Originally, the code that handled translation was a weighty 500,000 lines of code. The new, TensorFlow-based system has approximately 500, and it performs better than the old method.

This is crazy. Think about maintaining that code: a single person can easily maintain 500 lines of code, but 500,000 lines? No way.

Reference:
[^Pointer2019]: Pointer I. Programming PyTorch for Deep Learning: Creating and Deploying Deep Learning Applications. O’Reilly Media; 2019.
#ML

scikit-learn reached 1.0. Nothing particularly exciting among the new features, but the major release itself probably means something.


Release Highlights for scikit-learn 1.0 — scikit-learn 1.0 documentation
http://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_1_0_0.html
#visualization

I like this. I was testing visualizations with AntV's G6; it is not great for data analysis, as generating visualizations with it is quite tedious.

Observable's Plot is a much easier, more fluent package for data analysis.

https://github.com/observablehq/plot
#academia

This is not only Julia for biologists. It is for everyone who is not using Julia.

Roesch, Elisabeth, Joe G. Greener, Adam L. MacLean, Huda Nassar, Christopher Rackauckas, Timothy E. Holy, and Michael P. H. Stumpf. 2021. “Julia for Biologists.” ArXiv [q-Bio.QM]. arXiv. http://arxiv.org/abs/2109.09973.
#visualization #art #fun

More like a blog post…
But the visualization is cool. I posted it as a comment.

[2109.15079] Asimov's Foundation -- turning a data story into an NFT artwork
https://arxiv.org/abs/2109.15079
#ML

Duan T, Avati A, Ding DY, Thai KK, Basu S, Ng AY, et al. NGBoost: Natural Gradient Boosting for probabilistic prediction. arXiv [cs.LG]. 2019. Available: http://arxiv.org/abs/1910.03225

(I had it on my reading list for a long time. However, I didn't read it until today because the title and abstract are not attractive at all.)
But this is a good paper. It digs deep into the fundamental reasons why some methods work and others don't.

When inferring probability distributions, it is straightforward to come up with methods based on parametrized distributions (statistical manifolds). Then, by tuning the parameters, we adjust the distribution to fit our dataset as well as possible.
The problem is the choice of the objective function and of the optimization method. This paper describes a very generic objective function and a framework that optimizes the model along the natural gradient instead of the plain gradient with respect to the parameters.
Different parametrizations of the objective are like coordinate transformations, and the plain chain rule treats the parameter space as if it were "flat", but such a "flat" space is not necessarily a good choice for a high-dimensional problem. For a space that is only approximately flat in small regions, we can define distances the way we do in differential geometry[^1]. Meanwhile, just like "covariant derivatives" in differential geometry, an analogous covariant derivative can be defined on statistical manifolds, and it gives rise to the natural gradient.
Descending along the natural gradient navigates the landscape more efficiently.



[^1]: That is, a Riemannian space.
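To make "natural gradient" concrete, here is a minimal sketch (my own illustration, not NGBoost's implementation) for a univariate Gaussian N(mu, sigma^2) parametrized by theta = (mu, log sigma): the natural gradient is the ordinary gradient of the negative log-likelihood preconditioned by the inverse Fisher information matrix.

```python
import numpy as np

def nll_grad(y, mu, log_sigma):
    # Ordinary gradient of -log p(y | mu, sigma) w.r.t. (mu, log sigma).
    sigma = np.exp(log_sigma)
    return np.array([(mu - y) / sigma**2, 1.0 - (y - mu) ** 2 / sigma**2])

def fisher(log_sigma):
    # Fisher information of N(mu, sigma^2) in the (mu, log sigma) coordinates.
    sigma = np.exp(log_sigma)
    return np.array([[1.0 / sigma**2, 0.0], [0.0, 2.0]])

def natural_grad(y, mu, log_sigma):
    # Natural gradient = inverse Fisher matrix times the ordinary gradient.
    return np.linalg.solve(fisher(log_sigma), nll_grad(y, mu, log_sigma))

print(natural_grad(y=1.0, mu=0.0, log_sigma=0.0))
```

The plain gradient with respect to mu blows up as sigma shrinks; preconditioning by the Fisher information removes this dependence on the parametrization, which is the point of the natural gradient.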
#visualization

"Fail"
When visualizing data, the units being used have to be specified for any values shown.

But the style of the charts is attractive. :)

By chungischef
Available at:
https://www.reddit.com/r/dataisbeautiful/comments/q958if/recreation_of_a_classic_population_density_map/
#ML

(I am experimenting with a new platform. This post is also available at: https://community.kausalflow.com/c/ml-journal-club/how-do-neural-network-generalize )

There are some things about deep neural networks that are quite hard to understand. One of them is how the network generalizes.

[Zhang2016] presents experiments showing the amazing ability of neural networks to fit even completely random datasets. Of course, such networks cannot generalize, since the data is random. So how should we understand generalization? The authors discuss theories like the VC dimension, Rademacher complexity, and uniform stability, but none of them is good enough.

Recently, I found the work of Simon et al. [Simon2021]. The authors also wrote a blog post about this paper [Simon2021Blog].

The idea is to simplify the problem of generalization by looking at how a neural network approximates a target function f. This amounts to approximating vectors in a Hilbert space: we compare the vector f with its neural network approximation f'. The similarity of these two vectors is related to the eigenvalues of the so-called "neural tangent kernel" (NTK).
Using the NTK, they derive an amazingly simple quantity, learnability, which measures how well Hilbert space vectors align with each other, that is, how good the neural network approximation is.
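For readers who have not seen the NTK before, here is a minimal sketch of the empirical kernel for a toy network (my own illustration, not the authors' code; the architecture is arbitrary): each input is mapped to the gradient of the network output with respect to the parameters, and the kernel is the inner product of those gradients.

```python
import torch
import torch.nn as nn

# A toy network; the architecture is arbitrary and only for illustration.
net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
params = list(net.parameters())

def grad_vector(x):
    # Flattened gradient of the scalar network output w.r.t. all parameters.
    out = net(x).squeeze()
    grads = torch.autograd.grad(out, params)
    return torch.cat([g.reshape(-1) for g in grads])

def empirical_ntk(x1, x2):
    # Empirical NTK entry: K(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>.
    return torch.dot(grad_vector(x1), grad_vector(x2))

x1, x2 = torch.randn(1, 2), torch.randn(1, 2)
print(empirical_ntk(x1, x2).item())
```

The eigenvalues mentioned above are those of this kernel evaluated over the training inputs.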



[Zhang2016]: Zhang C, Bengio S, Hardt M, Recht B, Vinyals O. Understanding deep learning requires rethinking generalization. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1611.03530

[Simon2021Blog]: Simon J. A First-Principles Theory of Neural Network Generalization. In: The Berkeley Artificial Intelligence Research Blog [Internet]. [cited 26 Oct 2021]. Available: https://bair.berkeley.edu/blog/2021/10/25/eigenlearning/

[Simon2021]: Simon JB, Dickens M, DeWeese MR. Neural Tangent Kernel Eigenvalues Accurately Predict Generalization. arXiv [cs.LG]. 2021. Available: http://arxiv.org/abs/2110.03922
#ML

( I am experimenting with a new platform. This post is also available at: https://community.kausalflow.com/c/ml-journal-club/probably-approximately-correct-pac-learning-and-bayesian-view )

The first time I read about PAC was in the book The Nature of Statistical Learning Theory by Vapnik [^Vapnik2000].

PAC is a systematic theory of why learning from data is feasible at all [^Valiant1984]. The idea is to quantify the error when learning from data, and it turns out that the error can be made arbitrarily small under certain conditions, e.g., with large datasets. Quoting Guedj [^Guedj2019]:

> A PAC inequality states that with an arbitrarily high probability (hence "probably"), the performance (as provided by a loss function) of a learning algorithm is upper-bounded by a term decaying to an optimal value as more data is collected (hence "approximately correct").

Bayesian learning is a very important topic in machine learning: we build Bayes' rule into the components of learning, e.g., a posterior in the loss function. There also exists a PAC theory for Bayesian learning that explains why Bayesian algorithms work. Guedj wrote a primer on this topic [^Guedj2019].
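To give a flavor of what such a bound looks like, here is a McAllester-type PAC-Bayesian bound written from memory (the constants may differ from the exact statement in the primer): with a prior \pi, empirical risk r, true risk R, n samples, and confidence level 1 - \delta, with probability at least 1 - \delta, for every posterior \rho,

```latex
\mathbb{E}_{h \sim \rho}\big[R(h)\big] \;\le\; \mathbb{E}_{h \sim \rho}\big[r(h)\big]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

The KL term penalizes posteriors that move far away from the prior, which is exactly the Bayesian flavor of the bound.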


[^Vapnik2000]: Vladimir N. Vapnik. The Nature of Statistical Learning Theory. 2000. doi:10.1007/978-1-4757-3264-1
[^Valiant1984]: Valiant LG. A theory of the learnable. Commun ACM. 1984;27: 1134–1142. doi:10.1145/1968.1972
[^Guedj2019]: Guedj B. A Primer on PAC-Bayesian Learning. arXiv [stat.ML]. 2019. Available: http://arxiv.org/abs/1901.05353
[^Bernstein2021]: Bernstein J. Machine learning is just statistics + quantifier reversal. In: jeremybernste [Internet]. [cited 1 Nov 2021]. Available: https://jeremybernste.in/writing/ml-is-just-statistics