Aspiring Data Science
#hpo #hpt #metalearning #books After rewatching the talk, I went looking for the book it mentioned, "Automated Machine Learning - Methods, Systems, Challenges". It turns out to be freely available and looks good, so I'm studying it. I also browsed the site at random for other interesting literature…
#ensembling #cascading #delegating #arbitrating
Читаю "Metalearning - Applications to Automated Machine Learning and Data Mining". В главе про ансамбли с удивлением обнаружил весьма интересные алгоритмы.
1. Cascade Generalization
Cascade generalization is a form of stacked ensembling in which models are arranged sequentially in layers, and predictions from earlier layers become additional features for models in later layers. Unlike traditional stacking, where the meta-model typically sees only the base models' outputs, cascade generalization carries the original features through every layer alongside the new prediction features.
Key idea: Each layer’s models pass their predictions as additional inputs to models in the next layer.
Advantage: Can iteratively refine weak models, allowing later models to correct errors made earlier.
Example: A first-layer model predicts probabilities, and a second-layer model uses those probabilities along with the original features to make the final decision.
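A minimal sketch of the cascade step with scikit-learn (the dataset, the two-layer model choice, and the use of in-sample layer-1 probabilities are my illustrative assumptions, not the book's recipe; a careful implementation would produce the layer-1 training predictions with cross_val_predict to limit leakage):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Layer 1: a base model that outputs class probabilities.
layer1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Cascade step: append layer-1 probabilities to the ORIGINAL features,
# so the next layer sees both. Carrying the raw features forward is
# what distinguishes the cascade from plain stacking.
X_tr_aug = np.hstack([X_tr, layer1.predict_proba(X_tr)])
X_te_aug = np.hstack([X_te, layer1.predict_proba(X_te)])

# Layer 2: trained on the augmented representation.
layer2 = RandomForestClassifier(random_state=0).fit(X_tr_aug, y_tr)
print("cascade accuracy:", layer2.score(X_te_aug, y_te))
```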
2. Cascading
Cascading refers to a progressive model selection strategy where simpler (cheaper) models are used first, and only ambiguous cases are passed to more complex (expensive) models.
Key idea: Reduce computational cost by filtering easy cases early.
Example: A decision tree quickly filters obvious negative cases, and only uncertain cases are sent to a more sophisticated deep learning model.
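A minimal sketch of confidence-based cascading, assuming a max-probability threshold as the ambiguity test (the 0.9 cutoff and the cheap/expensive model pair are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

cheap = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
expensive = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

THRESHOLD = 0.9  # below this max probability a case counts as ambiguous

confident = cheap.predict_proba(X_te).max(axis=1) >= THRESHOLD

y_pred = np.empty(len(X_te), dtype=y_te.dtype)
y_pred[confident] = cheap.predict(X_te[confident])        # easy cases stop here
y_pred[~confident] = expensive.predict(X_te[~confident])  # hard cases escalate

print(f"escalated {(~confident).mean():.1%} of cases")
print("cascading accuracy:", (y_pred == y_te).mean())
```

In a real deployment the expensive model would only ever run on the escalated slice, which is where the compute savings come from.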
3. Delegating (or Selective Routing)
Delegating is a framework where multiple models exist, and an intelligent mechanism decides which model should handle each instance. This is also known as an expert selection approach.
Key idea: Different models specialize in different regions of the feature space, and a routing function determines which model should process a given input.
Example: In fraud detection, a rule-based system handles typical transactions, while an anomaly detection model analyzes suspicious ones.
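A minimal sketch of delegation with a hard-coded routing rule (splitting on the sign of feature 0 is purely illustrative; in practice the router is usually itself a learned model):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hypothetical routing function: partition the feature space on feature 0.
route_tr = X_tr[:, 0] >= 0
route_te = X_te[:, 0] >= 0

# Each expert trains (specializes) only on its own region.
expert_a = LogisticRegression(max_iter=1000).fit(X_tr[route_tr], y_tr[route_tr])
expert_b = LogisticRegression(max_iter=1000).fit(X_tr[~route_tr], y_tr[~route_tr])

# At inference, the router alone decides which expert handles each instance.
y_pred = np.empty(len(X_te), dtype=y_te.dtype)
y_pred[route_te] = expert_a.predict(X_te[route_te])
y_pred[~route_te] = expert_b.predict(X_te[~route_te])
print("delegating accuracy:", (y_pred == y_te).mean())
```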
4. Arbitrating
Arbitrating is a meta-learning approach where an additional model (the arbitrator) decides which base model's prediction to trust for each instance. Unlike delegation, where models own fixed regions of the input space, arbitration keeps every model in play and weights or selects their predictions per instance, favoring whichever model is judged most reliable for the case at hand.
Key idea: Instead of picking a single expert, the arbitrator dynamically adjusts confidence in different models.
Example: A reinforcement learning agent (arbitrator) learns which base model performs best in specific scenarios.
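A minimal sketch of arbitration, assuming one arbiter per base model that predicts whether its model will be correct on a given input, with the prediction of the model whose arbiter is most confident winning. The model choices, splits, and logistic arbiters are my assumptions, and the sketch presumes each base model makes at least a few mistakes on the arbitration set so the arbiters see both classes:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_arb, X_te, y_arb, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

bases = [
    LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
    DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr),
]

# One arbiter per base model, trained on a held-out set to predict
# whether that model gets a given instance right.
arbiters = []
for model in bases:
    correct = (model.predict(X_arb) == y_arb).astype(int)
    arbiters.append(LogisticRegression(max_iter=1000).fit(X_arb, correct))

# Trust the base model whose arbiter is most confident it will be right.
trust = np.column_stack([a.predict_proba(X_te)[:, 1] for a in arbiters])
base_preds = np.column_stack([m.predict(X_te) for m in bases])
y_pred = base_preds[np.arange(len(X_te)), trust.argmax(axis=1)]
print("arbitrating accuracy:", (y_pred == y_te).mean())
```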
5. Meta-Decision Trees (MDTs)
Meta-decision trees (MDTs) are decision trees that learn when to trust each base model in an ensemble, instead of directly predicting the target.
Key idea: The decision tree’s leaves represent which model should be used for a given input, rather than the final prediction itself.
Advantage: Unlike traditional stacking, it explicitly learns a strategy for model selection.
Example: An MDT may learn that Model A performs best for low-income customers, while Model B works better for high-income customers.
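A rough approximation of the idea with a standard scikit-learn tree (the meta-decision trees in the literature use properties of the base models' class distributions as meta-attributes and a dedicated induction algorithm; here I just train an ordinary tree on the raw features to predict which of two hypothetical models to use):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_meta, X_te, y_meta, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

models = [
    LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
    RandomForestClassifier(random_state=0).fit(X_tr, y_tr),
]

# Meta-target: which model to use per instance. Prefer model 1 only
# where it is right and model 0 is wrong, otherwise keep model 0.
preds_meta = np.column_stack([m.predict(X_meta) for m in models])
m0_ok = preds_meta[:, 0] == y_meta
m1_ok = preds_meta[:, 1] == y_meta
meta_target = (m1_ok & ~m0_ok).astype(int)

# The meta-tree's leaves select a MODEL, not a class label.
meta_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_meta, meta_target)

chosen = meta_tree.predict(X_te)
preds_te = np.column_stack([m.predict(X_te) for m in models])
y_pred = preds_te[np.arange(len(X_te)), chosen]
print("meta-decision-tree accuracy:", (y_pred == y_te).mean())
```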
Masters also wrote about something similar to Arbitrating, where models can specialize in different regions of the input domain. Offhand, I couldn't find any open-source Python implementations.
Читаю "Metalearning - Applications to Automated Machine Learning and Data Mining". В главе про ансамбли с удивлением обнаружил весьма интересные алгоритмы.
1. Cascade Generalization
Cascade generalization is a type of stacked ensembling where models are arranged in a sequential, layered fashion. Predictions from earlier layers become additional features for models in later layers. Unlike traditional stacking, cascade generalization emphasizes using higher-level models to refine or augment lower-level predictions.
Key idea: Each layer’s models pass their predictions as additional inputs to models in the next layer.
Advantage: Can iteratively refine weak models, allowing later models to correct errors made earlier.
Example: A first-layer model predicts probabilities, and a second-layer model uses those probabilities along with the original features to make the final decision.
2. Cascading
Cascading refers to a progressive model selection strategy where simpler (cheaper) models are used first, and only ambiguous cases are passed to more complex (expensive) models.
Key idea: Reduce computational cost by filtering easy cases early.
Example: A decision tree quickly filters obvious negative cases, and only uncertain cases are sent to a more sophisticated deep learning model.
3. Delegating (or Selective Routing)
Delegating is a framework where multiple models exist, and an intelligent mechanism decides which model should handle each instance. This is also known as an expert selection approach.
Key idea: Different models specialize in different regions of the feature space, and a routing function determines which model should process a given input.
Example: In fraud detection, a rule-based system handles typical transactions, while an anomaly detection model analyzes suspicious ones.
4. Arbitrating
Arbitrating is a meta-learning approach where an additional model (arbitrator) decides which base model’s prediction to trust more. Unlike delegation, where models specialize in different regions, arbitration combines predictions but gives more weight to the most confident model.
Key idea: Instead of picking a single expert, the arbitrator dynamically adjusts confidence in different models.
Example: A reinforcement learning agent (arbitrator) learns which base model performs best in specific scenarios.
5. Meta-Decision Trees (MDTs)
Meta-decision trees (MDTs) are decision trees that learn when to trust each base model in an ensemble, instead of directly predicting the target.
Key idea: The decision tree’s leaves represent which model should be used for a given input, rather than the final prediction itself.
Advantage: Unlike traditional stacking, it explicitly learns a strategy for model selection.
Example: An MDT may learn that Model A performs best for low-income customers, while Model B works better for high-income customers.
Ещё Мастерс писал о чём-то, подобном Arbitrating, когда модели могут специализироваться в разных регионах области определения. Открытых реализаций в Питоне найти не смог навскидку.
🤔1