Am Neumarkt 😱
286 subscribers
88 photos
3 videos
17 files
513 links
Machine learning and other gibberish
Archives: https://datumorphism.leima.is/amneumarkt/
#visualization

Beautiful, elegant, and informative. It reminds me of the chromatic storytelling visualizations made from Netflix movies.

Full image:
https://zenodo.org/record/5828349

Other discussions:
https://www.reddit.com/r/dataisbeautiful/comments/s6vh8k/dutch_astronomer_cees_bassa_took_a_photo_of_the/
#visualization

Seaborn is getting a new interface.

It would be great if the author defined the dunder method __add__() instead of using an .add() method. With __add__, we could simply use + to compose layers.

Either way, we can all move away from plotnine once the migration is done.
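A minimal sketch of what that could look like (the Layer and Plot classes below are hypothetical illustrations, not seaborn's actual API):

```python
class Layer:
    """A hypothetical plot layer, for illustration only."""
    def __init__(self, name):
        self.name = name

class Plot:
    """Sketch of a plot object supporting both .add() and +."""
    def __init__(self, layers=None):
        self.layers = list(layers or [])

    def add(self, layer):
        # Method-call style, like the .add() in seaborn's new interface.
        return Plot(self.layers + [layer])

    def __add__(self, layer):
        # Operator style, as in ggplot2/plotnine: plot + layer.
        return self.add(layer)

p = Plot() + Layer("dots") + Layer("line")
print([layer.name for layer in p.layers])  # -> ['dots', 'line']
```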

https://seaborn.pydata.org/nextgen/
Forwarded from DPS Main
There are indeed many of these. For example, I replaced grep with ack, and it is noticeably faster.

https://www.ruanyifeng.com/blog/2022/01/cli-alternative-tools.html
graph-basics.pdf
3.3 MB
#ML

I made some slides to bootstrap a community in my company to share papers on graph related methods (spectral, graph neural networks, etc).
These slides are mostly based on the first two chapters of the book by William Hamilton. I added some intuitive interpretations of a few key ideas. Some of these ideas are frequently used in graph neural networks and even transformers. Building intuition helps us unbox these neural networks. But the slides are only skeleton notes, so I will probably have to expand them at some point.

I am thinking about drawing more from the book and writing more on this topic, maybe even making some short videos with these slides. Let's see how far I can go. I am way too busy now. (<- no excuse)
#tool

I have been using Hugo for my public notes. I built a theme called connectome a while ago. This theme has been serving as my note-taking theme.

While building my data science notes website, I noticed many problems with the connectome theme. Today, I fixed most of them. The connectome theme deserves some visibility now.

If you are using Hugo and would like to build a website of connected notes, like the one I have at https://datumorphism.leima.is/ , the Hugo connectome theme can help a bit.

The Connectome Theme: https://github.com/kausalflow/connectome
A template one could use to bootstrap a new website: https://github.com/kausalflow/hugo-connectome-theme-demo
Tutorials: https://hugo-connectome.kausalflow.com/projects/tutorials/
Real-world example: https://datumorphism.leima.is/


If you would like to know more about how it was done, the idea is quite simple. But before we move on, one FAQ I get is: why Hugo? The answer is simple: speed.

The key components of the connectome theme are:

- automated backlinks, and
- a graph visualization of the whole notebook.

Behind the scenes, the heart of the theme is a metadata file that describes the connections between the notes.

For each note, we use this metadata to find all the notes that link to the current note, and build the backlinks from it.
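As an illustrative sketch (the metadata schema below is hypothetical, not the theme's actual format), building backlinks amounts to inverting a map of outgoing links:

```python
# Hypothetical metadata: each note mapped to the notes it links to.
from collections import defaultdict

links = {
    "note-a": ["note-b", "note-c"],
    "note-b": ["note-c"],
    "note-c": [],
}

# Invert the outgoing-link map to get backlinks per note.
backlinks = defaultdict(list)
for source, targets in links.items():
    for target in targets:
        backlinks[target].append(source)

print(dict(backlinks))  # note-c is linked from both note-a and note-b
```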
#python

I find poetry a great tool to manage Python requirements.

I used to manage Python requirements with requirements.txt (or environment.yaml) and install them using pip (or conda). The thing is, in this stack, we have to pin the version ranges manually. It is quite tedious, and we easily run into version conflicts in a large project.

Poetry is the savior here. When developing a package, we add some initial dependencies to pyproject.toml, a PEP-standardized file. Whenever a new package is needed, we run poetry add package-name. Poetry tries to figure out compatible versions, and a lock file with pinned versions is created or updated. To recreate an identical Python environment, we only need to run poetry install.
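For illustration, a minimal pyproject.toml might look like this (the package names and versions are only examples):

```toml
[tool.poetry]
name = "my-project"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
pandas = "^1.4"   # added via: poetry add pandas

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```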

There is one drawback, and it may be quite painful at some point: regenerating the lock file becomes extremely slow as the dependency tree grows in complexity. But this is not a problem of Poetry itself; it is rather a constraint coming from PyPI. One way to mitigate it is to use a cache.

https://python-poetry.org/
#ml

I share similar thoughts with the top comment by theXYZT.

If I may add to her comment, I would say:
Embrace the new approach even if it shatters our philosophy.
But it's not only about what happened in the history of physics. It's about what we believe in science.
In some sense, the purpose of interpretability and parsimony is for humans to come up with better ideas, and to make us happy. If a universal model already works well enough and can be improved gradually, interpretability is not as important as predictability.
This is more or less the first principle of science, if I may say so.

https://www.reddit.com/r/MachineLearning/comments/t8fn7m/d_are_we_at_the_end_of_an_era_where_ml_could_be/
#visualization

Please click on the link and watch the animation. It's 3D.

------

"The clever people at @NASA have created this deceptively simple yet highly effective data visualisation showing monthly global temperatures between 1880-2021".: nextfuckinglevel
https://www.reddit.com/r/nextfuckinglevel/comments/tejc0l/the_clever_people_at_nasa_have_created_this/?utm_source=share&utm_medium=ios_app&utm_name=iossmf
#ml

It’s a lengthy article, but also a well-written one.

A few comments:

- The author wrote a paper on “The Next Decade in AI”: https://arxiv.org/abs/2002.06177
- Make things work in their own domain. If we try to come up with a “theory of everything” for computing or intelligence, we will hit a “mesoscopic” wall, where the bottom-up theories and the top-down approaches meet but cannot quite connect. In the case of intelligence, the wall is determined by complexity (maybe MDL?). Symbols can work at high complexities, but not always. A similar thing happens with neural networks.
- The neural-symbolic approach sounds good, but it is almost like patching bike wheels onto a train.


https://nautil.us/deep-learning-is-hitting-a-wall-14467/
#ml

(WARNING: Promoting of my notes. This is a test.)

I learned something very interesting today: CRPS, the continuous ranked probability score.

Suppose we would like to approximate the quantile function of some data points.
If we assume a parametric model of the quantile function, e.g., Q(x|theta), how do we find the parameters using the given dataset?
Naturally, we need a loss function to compare our quantile function to the data points. CRPS is a robust choice; I have seen it used in several papers on time series forecasting.
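As a hedged sketch of the idea (not the exact method from the note), CRPS can be approximated for a parametric quantile function via the pinball-loss identity CRPS = 2 ∫₀¹ ρ_τ dτ, discretized on a grid of quantile levels:

```python
# Illustrative sketch: approximate CRPS for a quantile function Q(tau; theta)
# against a single observation y, via the pinball-loss identity.
import numpy as np

def pinball_loss(y, q, tau):
    # Quantile (pinball) loss of the predicted quantile q at level tau.
    return np.where(y >= q, tau * (y - q), (1.0 - tau) * (q - y))

def crps_from_quantiles(y, quantile_fn, n_grid=99):
    # Average the pinball loss over a grid of quantile levels,
    # approximating 2 * integral_0^1 pinball_loss(tau) d tau.
    taus = np.linspace(0.01, 0.99, n_grid)
    return 2.0 * pinball_loss(y, quantile_fn(taus), taus).mean()

# Example: Q(tau) = tau is the quantile function of Uniform(0, 1);
# for y = 0.5 the exact CRPS is 1/12.
print(crps_from_quantiles(0.5, lambda tau: tau))
```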

You can find more details here:
https://datumorphism.leima.is/cards/time-series/crps/
#tool

I drafted a new release of the Hugo Connectome theme.

I like the command palette in VSCode. It is fast and accurate. So I added a command palette to the Hugo Connectome theme to help us navigate the notes and links.

Now we can use the command palette to navigate to backlinks, out links, references, and more.

See it in action:
https://datumorphism.leima.is/wiki/time-series/state-space-models/
Use Command+K or Windows+K to activate the command palette.

- Type in search to search for notes.
- Type in Note ID to copy the current note id to the clipboard.
- Type in graph to see the graph view of all the notes.
- Type in references to go to references.
- Type in backlinks to select from backlinks to navigate to.
- Type in links to select from all outgoing links to navigate to.

Release:
https://github.com/kausalflow/connectome/releases/tag/0.1.1
#ml

A beautiful and systematic derivation showing how and why negative sampling works.

Negative sampling is a great technique to estimate the softmax, especially when the calculation of the partition function is intractable. It is used in word2vec and in many other models, such as node2vec.
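The skip-gram negative-sampling objective derived in the paper, maximizing log σ(v_c · v_w) + Σ_k log σ(−v_nk · v_w) over k sampled negative contexts, can be sketched in a few lines (the vectors below are random placeholders, not trained embeddings):

```python
# Sketch of the skip-gram negative-sampling (SGNS) loss for one
# (word, context) pair with k negative samples.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_loss(word_vec, context_vec, negative_vecs):
    # Negative log-likelihood: the true context should score high,
    # the sampled negatives should score low.
    positive = np.log(sigmoid(word_vec @ context_vec))
    negatives = np.sum(np.log(sigmoid(-negative_vecs @ word_vec)))
    return -(positive + negatives)

rng = np.random.default_rng(0)
w = rng.normal(size=8)        # word embedding
c = rng.normal(size=8)        # context embedding
negs = rng.normal(size=(5, 8))  # 5 negative-sample embeddings
print(sgns_loss(w, c, negs))  # a positive scalar to be minimized
```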



Goldberg Y, Levy O. word2vec Explained: deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv [cs.CL]. 2014. Available: http://arxiv.org/abs/1402.3722