Am Neumarkt 😱

#DS

Just in case you are also struggling with Python packages on Apple M1 Macs

I am using the third option: anaconda + miniforge.

https://www.anaconda.com/blog/apple-silicon-transition

Anaconda

Anaconda | A Python Data Scientist’s Guide to the Apple Silicon…

Even if you are not a Mac user, you have likely heard Apple is switching from Intel CPUs to their own custom CPUs, which they refer to collectively as "Apple Silicon." The last time Apple changed its computer architecture this dramatically was 15 years ago…

269 views · Markt Mai, edited 10:36

Am Neumarkt 😱

#visualization

Hmmm, my plate is way off the planetary health diet recommendation.

Source:
https://www.nature.com/articles/d41586-021-03612-1

274 views · Markt Mai, edited 09:52

Am Neumarkt 😱

#ml #rl

How to Train your Decision-Making AIs
https://thegradient.pub/how-to-train-your-decision-making-ais/

The author reviewed "five types of human guidance to train AIs: evaluation, preference, goals, attention, and demonstrations without action labels".

The last one reminds me of the movie Finch. In the movie, Finch was teaching the robot to walk by demonstrating walking but without "labels".

The Gradient

How to Train your Decision-Making AIs

How do humans transfer their knowledge and skills to artificial decision-making agents more efficiently? What kind of knowledge and skills should humans provide and in what format?

268 views · Markt Mai, 10:19

Am Neumarkt 😱

#DS #visualization

https://percival.ink/

A new lightweight language for data analysis and visualization. It looks promising.

I hate Jupyter notebooks and don't use them on most of my projects. One of the reasons is their low reproducibility, which comes from their non-reactive nature: if you change an earlier cell and forget to rerun a cell below it, you may read stale results.
This new language is reactive: when an earlier cell changes, the results that depend on it are updated as well.
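
To make the reactive idea concrete, here is a minimal Python sketch. It is only a toy model and not how Percival is implemented; the ReactiveNotebook class and its methods are hypothetical names made up for this illustration.

```python
class ReactiveNotebook:
    """Toy model of reactive cells: each cell names the cells it depends on."""

    def __init__(self):
        self._cells = {}   # name -> (function, dependency names)
        self._values = {}  # name -> last computed value

    def define(self, name, func, deps=()):
        """Create or redefine a cell, then refresh it and everything downstream."""
        self._cells[name] = (func, tuple(deps))
        self._recompute(name)

    def _recompute(self, name):
        func, deps = self._cells[name]
        self._values[name] = func(*(self._values[d] for d in deps))
        # Re-run every cell that reads this one, so no result goes stale.
        for other, (_, other_deps) in self._cells.items():
            if name in other_deps:
                self._recompute(other)

    def value(self, name):
        return self._values[name]


nb = ReactiveNotebook()
nb.define("raw", lambda: [1, 2, 3])
nb.define("total", lambda raw: sum(raw), deps=["raw"])
nb.define("raw", lambda: [10, 20, 30])  # edit an "old" cell
print(nb.value("total"))  # 60, updated without rerunning "total" by hand
```

In a Jupyter notebook the last step would leave "total" at 6 until you rerun its cell manually.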

percival.ink

Percival • Web-based, reactive Datalog notebooks

Percival is a declarative data query and visualization language for exploring complex datasets, producing interactive graphics, and sharing results.

293 views · Markt Mai, edited 07:40

Am Neumarkt 😱

#ML #Transformers

Alammar J. The Illustrated Transformer. [cited 14 Dec 2021]. Available: http://jalammar.github.io/illustrated-transformer/

So good.

jalammar.github.io

The Illustrated Transformer

Discussions:
Hacker News (65 points, 4 comments), Reddit r/MachineLearning (29 points, 3 comments)

Translations: Arabic, Chinese (Simplified) 1, Chinese (Simplified) 2, French 1, French 2, Italian, Japanese, Korean, Persian, Russian, Spanish 1, Spanish…

295 views · Markt Mai, 20:40

Am Neumarkt 😱

#visualization #fun

https://www.githubwrapped.com/

319 views · Markt Mai, edited 22:09

Am Neumarkt 😱

#ml #science

I remember that several years ago, when I was still doing my PhD, there was a contest on predicting protein structures, and none of the entries worked well. At that time, I would never have thought we could have anything like AlphaFold within a few years.

https://www.science.org/content/article/breakthrough-2021

Science

Science’s 2021 Breakthrough of the Year: AI brings protein structures to all

Bounty of new structures will forever change biology and medicine

333 views · Markt Mai, edited 08:47

Am Neumarkt 😱

#visualization

Pu X, Kay M. A probabilistic grammar of graphics. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: ACM; 2020. doi:10.1145/3313831.3376466
Available at: https://dl.acm.org/doi/10.1145/3313831.3376466

A very good read if you are visualizing probability densities a lot.
The paper begins with a common mistake people make when visualizing densities. The authors then propose a systematic grammar of graphics for probabilities. They also provide a package (quite preliminary, see https://github.com/MUCollective/pgog ).

315 views · Markt Mai, edited 09:26

Am Neumarkt 😱

#data #ds

Disclaimer: I'm no expert in state diagrams or statecharts.

It might be trivial, but I find this useful: combined with some techniques from statecharts (something frontend people like a lot), a state diagram is a great way to document what our data goes through during data (pre)processing.

For complicated data transformations, we can draw the corresponding state diagram and follow our code along it to make sure it works as expected. The only difference is that we focus on the state of the data rather than the state of any other system.

We can use some techniques from statecharts, such as hierarchical and parallel states.

A state diagram is better than a flowchart in this scenario because we are more interested in the different states of the data. A state diagram highlights the states by construction, so we can easily spot the relevant part of the diagram without having to start from the beginning.

I have already documented some data transformations using state diagrams. I haven't tried it yet, but it might also help us document our ML models.
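
As a rough sketch of the idea, the states of the data and the allowed arrows between them can also be written down in Python, so the code can be checked against the diagram. The state names, the TRANSITIONS table, and the TrackedData class are all hypothetical and only serve as an illustration.

```python
from enum import Enum, auto


class DataState(Enum):
    RAW = auto()
    DEDUPLICATED = auto()
    IMPUTED = auto()
    FEATURIZED = auto()


# Allowed transitions, i.e. the arrows of the state diagram.
TRANSITIONS = {
    DataState.RAW: {DataState.DEDUPLICATED},
    DataState.DEDUPLICATED: {DataState.IMPUTED},
    DataState.IMPUTED: {DataState.FEATURIZED},
    DataState.FEATURIZED: set(),
}


class TrackedData:
    """Carries the rows together with the state they are currently in."""

    def __init__(self, rows):
        self.rows = rows
        self.state = DataState.RAW

    def move_to(self, new_state, transform):
        # Refuse any step that has no arrow in the diagram.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.state.name} -> {new_state.name} is not in the diagram"
            )
        self.rows = transform(self.rows)
        self.state = new_state
        return self


data = TrackedData([1, 1, None, 3])
data.move_to(DataState.DEDUPLICATED, lambda rows: list(dict.fromkeys(rows)))
data.move_to(DataState.IMPUTED, lambda rows: [0 if r is None else r for r in rows])
print(data.state.name, data.rows)  # IMPUTED [1, 0, 3]
```

Skipping a step, for example going from RAW straight to FEATURIZED, raises a ValueError, which is exactly the kind of mistake the diagram is meant to expose.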

References:
1. https://statecharts.dev
2. https://en.wikipedia.org/wiki/State_diagram

statecharts.dev

Welcome to the world of Statecharts

The world of statecharts describes what statecharts are, their benefits and drawbacks, how they differ from state machines, and practical examples on how to use them.

332 views · Markt Mai, edited 21:26

Am Neumarkt 😱

#ds

https://2022.pycon.de/blog/pyconde-pydata-berlin-tickets/

2022.pycon.de

PyConDE & PyData Berlin 2022 Tickets

Tickets for PyConDE & PyData Berlin 2022

278 views · Markt Mai, edited 23:49

Am Neumarkt 😱

#python

At first I thought it was a trivial talk.
But I quickly realized that, although I may know every piece of the code mentioned in the video, the philosophy is what makes it exciting.

He talked about some fundamental ideas of Python, e.g., protocols.

After watching this video, an idea came to me: PyTorch Lightning has built in a lot of hooks in a very Pythonic way, and this is what makes it easy to use. (So if you run a lot of machine learning experiments, PyTorch Lightning is worth a try.)
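
Here is a minimal sketch of what I mean by hooks. It is plain Python, not PyTorch Lightning's actual implementation; MiniTrainer, SumModel, and the hook names on_epoch_start and on_epoch_end are hypothetical and chosen only for this example.

```python
class MiniTrainer:
    """Hypothetical training loop that calls hooks if the model defines them."""

    def fit(self, model, batches, epochs=1):
        for epoch in range(epochs):
            self._call(model, "on_epoch_start", epoch)
            for batch in batches:
                model.training_step(batch)
            self._call(model, "on_epoch_end", epoch)

    @staticmethod
    def _call(model, hook_name, *args):
        # A hook is optional: call it only when the model provides it.
        hook = getattr(model, hook_name, None)
        if callable(hook):
            hook(*args)


class SumModel:
    def __init__(self):
        self.total = 0
        self.log = []

    def training_step(self, batch):
        self.total += sum(batch)

    def on_epoch_end(self, epoch):
        self.log.append((epoch, self.total))


model = SumModel()
MiniTrainer().fit(model, batches=[[1, 2], [3]], epochs=2)
print(model.log)  # [(0, 6), (1, 12)]
```

The user only writes the methods they care about, and the loop decides when to call them. That is the protocol idea from the talk in a nutshell.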

https://youtu.be/cKPlPJyQrt4

YouTube

James Powell: So you want to be a Python expert? | PyData Seattle 2017

www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each…

228 views · Markt Mai, edited 21:15

Am Neumarkt 😱

#visualization

Beautiful, elegant, and informative. It reminds me of the Netflix movie chromatic storytelling visualization.

Full image:
https://zenodo.org/record/5828349

Other discussions:
https://www.reddit.com/r/dataisbeautiful/comments/s6vh8k/dutch_astronomer_cees_bassa_took_a_photo_of_the/

208 views · Markt Mai, edited 07:01

Am Neumarkt 😱

#ds

Deepnote supports Great Expectations (GE) now.

I ran their template notebook:

https://deepnote.com/project/Reduce-Pipeline-Debt-With-Great-Expectations-mLT9DFCQSpW4kUBAzzdhBw/%2Fnotebook.ipynb/#00000-e170fae0-7e06-4a7a-85f3-343584ec4b94

277 views · Markt Mai, 07:39

Am Neumarkt 😱

#visualization

Seaborn is getting a new interface.

It would be great if the author defined a dunder method __add__() instead of using an .add() method. With __add__(), we could simply use + on layers.
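
A small sketch of what that could look like. The Plot and Layer classes below are hypothetical stand-ins for illustration, not seaborn's actual classes.

```python
class Layer:
    def __init__(self, name):
        self.name = name


class Plot:
    """Hypothetical plot object; not seaborn's actual class."""

    def __init__(self, layers=()):
        self.layers = tuple(layers)

    def add(self, layer):
        # Return a new Plot so existing specifications are not mutated.
        return Plot(self.layers + (layer,))

    def __add__(self, layer):
        # Let Python try other options if the right operand is not a Layer.
        if not isinstance(layer, Layer):
            return NotImplemented
        return self.add(layer)


p = Plot() + Layer("scatter") + Layer("regression line")
print([layer.name for layer in p.layers])  # ['scatter', 'regression line']
```

Since __add__() simply delegates to .add(), both styles can coexist.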

Nevertheless, we can all move away from plotnine when the migration is done.

https://seaborn.pydata.org/nextgen/

327 views · Markt Mai, edited 07:13

Am Neumarkt 😱

#ml

https://ruder.io/ml-highlights-2021/

ruder.io

ML and NLP Research Highlights of 2021

This post summarizes progress across multiple impactful areas in ML and NLP in 2021.

298 views · Markt Mai, 07:39

Am Neumarkt 😱

Forwarded from DPS Main

There really are a lot. For example, I replaced grep with ack, and it is quite a bit faster.

https://www.ruanyifeng.com/blog/2022/01/cli-alternative-tools.html

256 views · Markt Mai, 08:52

Am Neumarkt 😱

#ml

Lol, DeepMind and OpenAI:

https://deepmind.com/blog/article/Competitive-programming-with-AlphaCode

vs

https://openai.com/blog/formal-math/

Google DeepMind

Competitive programming with AlphaCode

Solving novel problems and setting a new milestone in competitive programming.

275 views · Markt Mai, edited 06:47

Am Neumarkt 😱

graph-basics.pdf

3.3 MB

#ML

I made some slides to bootstrap a community at my company for sharing papers on graph-related methods (spectral methods, graph neural networks, etc.).
These slides are mostly based on the first two chapters of the book by William Hamilton. I added intuitive interpretations of some key ideas. Some of these ideas are frequently used in graph neural networks and even in transformers. Building intuition helps us unbox these neural networks. The slides are only skeleton notes, though, so I will probably have to expand them at some point.

I am thinking about drawing more about the book and on this topic. Maybe even making some short videos using these slides. Let's see how far I can go. ~~I am way too busy now. (<-no excuse)~~

260 views · Markt Mai, edited 07:20

Am Neumarkt 😱

https://uxdesign.cc/why-do-we-round-corners-5145a90da6ed

211 views · Markt Mai, 06:43

Am Neumarkt 😱

#ML #RL #DeepMind

Magnetic control of tokamak plasmas through deep reinforcement learning | Nature
https://www.nature.com/articles/s41586-021-04301-9

Nature

Magnetic control of tokamak plasmas through deep reinforcement learning

Nature - A newly designed control architecture uses deep reinforcement learning to learn to command the coils of a tokamak, and successfully stabilizes a wide variety of fusion plasma configurations.

219 views · Markt Mai, edited 09:32
