AI, Python, Cognitive Neuroscience
3.8K subscribers
1.09K photos
46 videos
78 files
891 links
Download Telegram
💡 What is the curse of dimensionality?

The curse of dimensionality refers to problems that occur when we try to use statistical methods in high-dimensional space.

As the number of features (dimensionality) increases, the data becomes relatively more sparse and often exponentially more samples are needed to make statistically significant predictions.

Imagine going from a 10x10 grid to a 10x10x10 grid... if we want one sample in each "1x1 square", then the addition of the third parameter requires us to have 10 times as many samples (1000) as we needed when we had 2 parameters (100).

In short, some models become much less accurate in high-dimensional space and may behave erratically. Examples include: linear models with no feature selection or regularization, kNN, Bayesian models

Models that are less affected by the curse of dimensionality: regularized models, random forest, some neural networks, stochastic models (e.g. monte carlo simulations)

#datascience #dsdj #QandA
#machinelearning

For more free info, sign up here -> https://lnkd.in/g7AYg72

❇️ @AI_Python
🗣 @AI_Python_arXiv
✴️ @AI_Python_EN
💡 What is multi-armed bandit testing and why is it useful?

Multi-armed bandit (or simply "bandit") testing is similar to multivariate and A/B testing, but the sampling distribution for variants change gradually over time as feedback is received.

For example, with traditional A/B tests, we could test 2 email subject lines: A and B. We would initially send out emails to 200 customers, sending 100 A variations and 100 B variations. After some set period of time, say 24 hours, we would observe which email variant was opened by more customers. We would then send that variant to all customers moving forward.

With bandit testing, we would set some learning rate for the distribution of variants to change over time. Perhaps 60 customers opened the A variant emails and only 50 customers opened the B variant emails. We could then shift the distribution from (50% A, 50% B) to (55% A, 45% B) for the next round of emails.

Using this approach, we can continuously monitor the response from our audience and shift our actions accordingly. This is particularly useful in marketing or any industry where people's preferences and opinions may change rapidly since it continuously tests and learns preferences and can adapt very quickly.

#datascience #dsdj #multiarmedbandit
#QandA

✴️ @AI_Python_EN
❇️ @AI_Python
🗣 @AI_Python_arXiv