#ml
In his MinT paper, Hyndman said he confused these two quantities in his previous paper. 😂
MinT is a simple method for making forecasts over a hierarchical structure coherent. Here, coherent means the lower-level forecasts sum to the higher-level forecasts.
For example, suppose our time series have a structure like sales of coca cola + sales of spirit = sales of beverages. If this relation holds for our forecasts, the forecasts are coherent.
This may sound trivial, but the problem is in fact hard. There are naive approaches, such as forecasting only the lower levels (coca cola, spirit) and using their sum as the higher level (sales of beverages), but these are usually too simplistic to be effective.
MinT is a reconciliation method that combines the higher-level and lower-level forecasts to find an optimal combination/reconciliation.
https://robjhyndman.com/papers/MinT.pdf
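A minimal numpy sketch of the idea on a toy two-series hierarchy (the S matrix, the base forecasts, and the identity error covariance W are all made up for illustration; MinT proper estimates W from in-sample residuals):
```python
import numpy as np

# Toy hierarchy: beverages = coca cola + spirit.
# S maps the 2 bottom-level series to all 3 series (total first).
S = np.array([[1, 1],
              [1, 0],
              [0, 1]])

# Incoherent base forecasts for [beverages, coca cola, spirit]: 60 + 40 != 105.
y_hat = np.array([105.0, 60.0, 40.0])

# W: covariance of the base forecast errors. Identity is used here only for
# illustration (this special case reduces to OLS reconciliation).
W = np.eye(3)

# MinT: y_tilde = S (S' W^{-1} S)^{-1} S' W^{-1} y_hat
W_inv = np.linalg.inv(W)
P = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
y_tilde = S @ (P @ y_hat)

print(y_tilde)                              # coherent reconciled forecasts
print(y_tilde[0], y_tilde[1] + y_tilde[2])  # top level equals the sum of the bottom
```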
#ml
google-research/tuning_playbook: A playbook for systematically maximizing the performance of deep learning models.
https://github.com/google-research/tuning_playbook
#ml
https://mlcontests.com/state-of-competitive-machine-learning-2022/
Quote from the report:
Successful competitors have mostly converged on a common set of tools — Python, PyData, PyTorch, and gradient-boosted decision trees.
Deep learning still has not replaced gradient-boosted decision trees when it comes to tabular data, though it does often seem to add value when ensembled with boosting methods.
Transformers continue to dominate in NLP, and start to compete with convolutional neural nets in computer vision.
Competitions cover a broad range of research areas including computer vision, NLP, tabular data, robotics, time-series analysis, and many others.
Large ensembles remain common among winners, though single-model solutions do win too.
There are several active machine learning competition platforms, as well as dozens of purpose-built websites for individual competitions.
Competitive machine learning continues to grow in popularity, including in academia.
Around 50% of winners are solo winners; 50% of winners are first-time winners; 30% have won more than once before.
Some competitors are able to invest significantly into hardware used to train their solutions, though others who use free hardware like Google Colab are also still able to win competitions.
#ml
Pérez J, Barceló P, Marinkovic J. Attention is Turing-Complete. J Mach Learn Res. 2021;22: 1–35. Available: https://jmlr.org/papers/v22/20-302.html
#ml
Yeh, Catherine, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, and Martin Wattenberg. 2023. “AttentionViz: A Global View of Transformer Attention.” ArXiv [Cs.HC]. arXiv. http://arxiv.org/abs/2305.03210.
#ml
Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer)
https://huggingface.co/blog/autoformer
#ml
A family tree shows how transformers are evolving.
(HTML is probably the worst name for a model.)
https://arxiv.org/abs/2302.07730
#ml
Hand-Crafted Transformers
HandCrafted.ipynb - Colaboratory
https://colab.research.google.com/github/newhouseb/handcrafted/blob/main/HandCrafted.ipynb
#ml
Interesting idea to use Hydra in ML experiments.
https://github.com/ashleve/lightning-hydra-template
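A minimal sketch of the pattern (the conf/ layout and field names here are hypothetical, not taken from the template): every run is driven by YAML configs composed by Hydra, and hyperparameters become command-line overrides.
```python
# train.py -- minimal Hydra entry point (Hydra >= 1.2).
# Assumes a hypothetical conf/config.yaml such as:
#   model:
#     lr: 0.001
#     hidden_size: 128
#   trainer:
#     max_epochs: 10
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="conf", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # The fully resolved config for this run; Hydra also saves it to the run's output dir.
    print(OmegaConf.to_yaml(cfg))
    # Build the model / trainer from cfg here, e.g. lr = cfg.model.lr

if __name__ == "__main__":
    main()
```
Overrides then come straight from the command line, e.g. `python train.py model.lr=0.01 trainer.max_epochs=20`, which keeps experiment sweeps tidy.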
#ml
Jelassi S, Brandfonbrener D, Kakade SM, Malach E. Repeat after me: Transformers are better than state space models at copying. arXiv [cs.LG]. 2024. Available: http://arxiv.org/abs/2402.01032
Not surprising at all when you have direct access to a long context. But hey, look at this title.
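A toy numpy illustration of that intuition (not the paper's setup): a hard attention pattern can read any cached token back verbatim, while a fixed-size state has to compress the whole sequence no matter how long it is.
```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 16, 8
tokens = rng.normal(size=(L, d))      # the sequence we want to copy

# Hard "look one step back" attention (ones on the subdiagonal):
# position i reads position i-1 back exactly from the cache.
attn = np.eye(L, k=-1)
copied = attn @ tokens
print(np.allclose(copied[1:], tokens[:-1]))   # True: exact copy of the context

# A fixed-size recurrent state: one d-vector regardless of sequence length,
# so individual tokens generally cannot be recovered exactly.
state = tokens.mean(axis=0)
print(state.shape)                    # (8,), independent of L
```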
#ml
I got interested in satellite data last year and played with it a bit. It's fantastic. The spatiotemporal nature of it brings up a lot of interesting questions.
Then I saw this paper today:
Rolf, Esther, Konstantin Klemmer, Caleb Robinson, and Hannah Kerner. 2024. “Mission Critical -- Satellite Data Is a Distinct Modality in Machine Learning.” arXiv [Cs.LG], February. http://arxiv.org/abs/2402.01444.
#ml
Like a dictionary
Kunc, Vladimír, and Jiří Kléma. 2024. “Three Decades of Activations: A Comprehensive Survey of 400 Activation Functions for Neural Networks.” arXiv [Cs.LG], February. http://arxiv.org/abs/2402.09092.