[Attached figure: CleanShot 2021-03-05 at 13.37.20.png]
#ML #Physics
The easiest way to apply constraints to a dynamical system is through Lagrange multipliers, i.e., penalties in statistical learning. Penalties don't guarantee any conservation laws, since they are merely soft penalties, unless the multipliers carry some physical meaning, as they do in Boltzmann statistics.
This paper explains a simple method to hardcode conservation laws into a neural network architecture.
Paper:
https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.126.098302
TLDR:
See the attached figure. The hardcoded conservation is realized by appending additional layers after the usual neural-network predictions.
A quick bite of the paper: https://physics.aps.org/articles/v14/s25
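To make the TLDR concrete, here is a minimal sketch of the general idea of a constraint-enforcing output layer. It is not the paper's exact architecture: I simply project the raw predictions onto the affine subspace defined by a hypothetical linear conservation law A y = c, so the constraint holds exactly by construction. The matrix A, the target c, and the toy MLP are placeholders of my own.

```python
# Sketch only: append a projection layer after a plain MLP so that a linear
# conservation law A y = c holds exactly for every prediction.
import torch
import torch.nn as nn


class ConservationProjection(nn.Module):
    def __init__(self, A: torch.Tensor, c: torch.Tensor):
        super().__init__()
        self.register_buffer("A", A)        # (k, d): k linear constraints on d outputs
        self.register_buffer("c", c)        # (k,): conserved totals
        self.register_buffer("AAt_inv", torch.inverse(A @ A.T))

    def forward(self, y_raw: torch.Tensor) -> torch.Tensor:
        # Orthogonal projection: y = y_raw - A^T (A A^T)^{-1} (A y_raw - c)
        residual = y_raw @ self.A.T - self.c            # (batch, k)
        correction = residual @ self.AAt_inv @ self.A   # (batch, d)
        return y_raw - correction


# Toy example: 3 outputs whose sum must stay equal to 1 ("mass conservation").
A = torch.ones(1, 3)
c = torch.tensor([1.0])
model = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 3),
                      ConservationProjection(A, c))
y = model(torch.randn(8, 4))
print(torch.allclose(y.sum(dim=-1), torch.ones(8)))  # True: constraint holds exactly
```

By contrast, a Lagrange-multiplier/penalty term would only push A y toward c during training, with no guarantee at inference time.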
Some thoughts:
I like this paper. When physicists work on a problem, they like to make it dimensionless, and this paper follows that convention. This is extremely important in numerical work: one should always nondimensionalize the equations before implementing them in code.
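As a standard textbook illustration of that point (not taken from the paper), nondimensionalizing the 1D diffusion equation removes all parameters before any code is written:

```latex
% Physical form: u_t = D u_xx on a domain of length L.
% Choosing the scales below, the solver only ever sees the parameter-free
% equation; D and L re-enter only when converting results back to physical units.
\[
  \partial_t u = D\,\partial_x^2 u
  \quad\longrightarrow\quad
  \partial_{\tilde t} u = \partial_{\tilde x}^2 u,
  \qquad \tilde x = \frac{x}{L},\quad \tilde t = \frac{tD}{L^2}.
\]
```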
#fun
> Growth in data science interviews plateaued in 2020. Data science interviews only grew by 10% after previously growing by 80% year over year.
> Data engineering specific interviews increased by 40% in the past year.
https://www.interviewquery.com/blog-data-science-interview-report
#ML
I just found an elegant decision tree visualization package for sklearn.
I have been trying to explain decision tree results to many business people, and it is very hard. This package makes it much easier to explain the results to a non-technical person.
https://github.com/parrt/dtreeviz
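A minimal usage sketch on the iris dataset, following the project's README; the call signature below matches the 1.x-era API and may differ in newer releases:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from dtreeviz.trees import dtreeviz  # 1.x-era entry point

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3).fit(iris.data, iris.target)

# Render the fitted tree with per-node feature distributions and class colors.
viz = dtreeviz(clf, iris.data, iris.target,
               target_name="species",
               feature_names=iris.feature_names,
               class_names=list(iris.target_names))
viz.save("iris_tree.svg")  # or viz.view() to open the rendered tree
```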
#ML
Simple algorithm, powerful results
https://avinayak.github.io/algorithms/programming/2021/02/19/finding-mona-lisa-in-the-game-of-life.html
#fun
India is growing so fast
Oh Germany...
Global AI Vibrancy Tool
Who’s leading the global AI race?
https://aiindex.stanford.edu/vibrancy/
#ML
How do we interpret the capacity of a neural net? Naively, we would measure capacity by the number of parameters. Even for the Hopfield network, Hopfield introduced the concept of capacity via entropy, which in turn is related to the number of parameters.
But adding layers to a neural net also introduces regularization. This is probably related to capacity, but the connection is not clear.
This paper introduces a new perspective based on sparse approximation theory, which represents data while encouraging parsimony. More parameters let the model represent the training data more accurately, but this hurts generalization, as similar data points in the test data may be pushed apart [^Murdock2021].
By mapping neural nets to shallow "overcomplete frames", the capacity becomes easier to interpret.
[^Murdock2021]: Murdock C, Lucey S. Reframing Neural Networks: Deep Structure in Overcomplete Representations. arXiv [cs.LG]. 2021. Available: http://arxiv.org/abs/2103.05804
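To illustrate what "encouraging parsimony" means here, a toy sparse-approximation example of my own (not the paper's construction): ISTA solves min_a 0.5||x - Da||^2 + lam*||a||_1 over an overcomplete dictionary D, and the l1 term drives most coefficients to exactly zero.

```python
# Toy sketch: sparse approximation in an overcomplete dictionary via ISTA.
import numpy as np

rng = np.random.default_rng(0)
d, n_atoms = 20, 100                    # 100 atoms for a 20-dim signal: overcomplete
D = rng.normal(size=(d, n_atoms))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms

a_true = np.zeros(n_atoms)
a_true[rng.choice(n_atoms, 3, replace=False)] = rng.normal(size=3)
x = D @ a_true                          # signal built from only 3 atoms

lam = 0.05
step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1/L, L = largest eigenvalue of D^T D
a = np.zeros(n_atoms)
for _ in range(500):                    # ISTA: gradient step + soft threshold
    grad = D.T @ (D @ a - x)
    a = a - step * grad
    a = np.sign(a) * np.maximum(np.abs(a) - step * lam, 0.0)

# Typically only a few coefficients survive the thresholding.
print("nonzero coefficients:", np.count_nonzero(np.abs(a) > 1e-6))
print("reconstruction error:", np.linalg.norm(D @ a - x))
```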
#DataScience
(Please refer to this post https://t.me/amneumarkt/199 for more background.)
I read the book "Everyday Data Science". It is not as good as I expected.
The book doesn't explain things clearly at all. Besides, I was expecting something that starts from everyday life and extrapolates to something more scientific.
I also mentioned previously that I would like to write a similar book. Attached is something I created recently that is quite close to the idea of my ideal book for everyday data science.
Cross Referencing Post:
https://t.me/amneumarkt/199
#TIL
How the pandemic changed the way people collaborate.
1. Siloing: From April 2019 to April 2020, modularity, a measure of workgroup siloing, rose around the world.
https://www.microsoft.com/en-us/research/blog/advancing-organizational-science-using-network-machine-learning-to-measure-innovation-in-the-workplace/
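A toy illustration of the modularity statistic itself (my own example, not the study's pipeline): modularity is high when the edges of a collaboration graph concentrate inside workgroups and drops when cross-group links are common.

```python
# Two three-person teams; modularity of the team partition measures siloing.
import networkx as nx
from networkx.algorithms.community import modularity

G = nx.Graph()
team_a = ["a1", "a2", "a3"]
team_b = ["b1", "b2", "b3"]
G.add_edges_from([(u, v) for i, u in enumerate(team_a) for v in team_a[i + 1:]])
G.add_edges_from([(u, v) for i, u in enumerate(team_b) for v in team_b[i + 1:]])
G.add_edge("a1", "b1")                               # a single cross-team link

print(modularity(G, [set(team_a), set(team_b)]))     # ~0.36: strongly siloed

G.add_edges_from([("a2", "b2"), ("a3", "b3")])       # more cross-team collaboration
print(modularity(G, [set(team_a), set(team_b)]))     # ~0.17: less siloed
```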
AI researchers allege that machine learning is alchemy | Science | AAAS
https://www.sciencemag.org/news/2018/05/ai-researchers-allege-machine-learning-alchemy
#ML
Silla CN, Freitas AA. A survey of hierarchical classification across different application domains. Data Min Knowl Discov. 2011;22: 31–72. doi:10.1007/s10618-010-0175-9
A survey paper on hierarchical classification. It is a bit dated, as it does not cover classifier chains, but it summarizes most of the ideas in hierarchical classification.
The authors also propose a framework for categorizing such problems along two different dimensions (ranks).
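As a concrete example of one family the survey catalogs, the "local classifier per parent node" setup: a root classifier predicts the top level, and a separate classifier per parent predicts the subclass. The two-level labels and data below are made up for illustration.

```python
# Toy sketch of a local-classifier-per-parent-node hierarchy with sklearn.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
top = np.where(X[:, 0] > 0, "animal", "plant")                 # level-1 labels
sub = np.where(top == "animal",
               np.where(X[:, 1] > 0, "dog", "cat"),
               np.where(X[:, 1] > 0, "tree", "flower"))        # level-2 labels

root_clf = LogisticRegression(max_iter=1000).fit(X, top)       # classifier at the root
child_clf = {p: LogisticRegression(max_iter=1000).fit(X[top == p], sub[top == p])
             for p in ("animal", "plant")}                     # one classifier per parent node

def predict(x):
    parent = root_clf.predict(x.reshape(1, -1))[0]
    return parent, child_clf[parent].predict(x.reshape(1, -1))[0]

print(predict(X[0]))    # e.g. ('animal', 'dog')
```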
#ML
Voss, et al., "Branch Specialization", Distill, 2021. https://distill.pub/2020/circuits/branch-specialization/
TLDR:
- Branch: neuron clusters that are roughly segregated locally, e.g., the AlexNet branches by design (see the sketch after this list).
- Branch specialization: branches specialize in specific tasks, e.g., the two AlexNet branches specialize in different detectors (color detector or black-white filter).
- Is it a coincidence? No. Branch specialization repeatedly occurs in different trainings and different models.
- Do we find the same branch specializations in different models and tasks? Yes.
- Why? The authors propose that a positive feedback loop is established between adjacent layers, and this loop reinforces what the branch does.
- Our brains have specialized regions too. Are there any connections?
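Here is the sketch mentioned above: a minimal, hypothetical example of my own (not the Distill article's code) of what a "branch" looks like structurally, using grouped convolutions as in AlexNet's two-GPU split.

```python
# Two parallel convolution stacks that only see their own half of the channels.
import torch
import torch.nn as nn

branched = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),   # shared stem
    nn.ReLU(),
    # groups=2: channels 0-31 and 32-63 are processed by disjoint filter sets,
    # so the two halves can specialize independently (e.g., color vs. black-white).
    nn.Conv2d(64, 128, kernel_size=3, padding=1, groups=2),
    nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=2),
)
x = torch.randn(1, 3, 64, 64)
print(branched(x).shape)   # torch.Size([1, 128, 32, 32])
```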
I would like to say thank you for following this channel.
I use this channel as a notebook. Sometimes, I wonder if we could have more interactions. Maybe we could start with this question:
Which of the following do you read the most? (Multiple choice)
Anonymous poll results (multiple choice):
- Data science (career related): 47%
- Data science (technical): 63%
- Machine learning (theoretical): 47%
- Machine learning (applications, libraries): 37%
- Something else (I would appreciate it if you leave a comment): 21%
#DS
Wing JM. Ten research challenge areas in data science. Harvard Data Science Review. 2020;114: 1574–1596. doi:10.1162/99608f92.c6577b1f
https://hdsr.mitpress.mit.edu/pub/d9j96ne4/release/2