Data 2 Pattern
Data science isn't about the quantity of data but rather the quality. - Joo Ann Lee
🔍 Understanding the Impact of Feature Selection vs. Feature Extraction in Dimensionality Reduction for Big Data 📊



In the era of big data, working with high-dimensional datasets presents major challenges in processing, visualization, and model performance. A recent study titled "Comparison of Feature Selection and Feature Extraction Role in Dimensionality Reduction of Big Data" (Journal of Techniques, 2023) offers a comprehensive evaluation of Feature Selection (FS) and Feature Extraction (FE) using the ANSUR II dataset, a U.S. Army anthropometric dataset with 109 features and 6068 observations.



📌 Study Goals

To compare FS and FE techniques in terms of:

➡️ Dimensionality reduction

➡️ Predictive performance

➡️ Information retention

βš™οΈ Techniques Explored

🧹 Feature Selection:

πŸ”Έ Highly Correlated Filter – removes features with correlation > 0.88

πŸ”Έ Recursive Feature Elimination (RFE) – eliminates the least important features iteratively
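
To make the two selection ideas concrete, here is a minimal NumPy sketch on synthetic data. It is not the paper's code: the helper names, the toy data, and the use of least-squares coefficients as the "importance" score are all assumptions for illustration. Only the 0.88 threshold comes from the study. In practice, scikit-learn's `RFE` class in `sklearn.feature_selection` is the usual tool for the second step.

```python
import numpy as np

def drop_highly_correlated(X, threshold=0.88):
    """Greedy correlation filter: return the indices of the columns to keep."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not too correlated with a kept column.
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep

def recursive_elimination(X, y, n_keep):
    """Toy RFE: refit least squares, drop the smallest |coefficient|, repeat."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize so coefficients are comparable
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        coef, *_ = np.linalg.lstsq(Xs[:, active], y - y.mean(), rcond=None)
        active.pop(int(np.argmin(np.abs(coef))))
    return active

# Synthetic data (assumption): column 1 nearly duplicates column 0,
# column 3 is pure noise, and the target depends on columns 0 and 2 only.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
noise = rng.normal(size=200)
X = np.column_stack([a, a + 0.01 * rng.normal(size=200), b, noise])
y = 3.0 * a + 2.0 * b + 0.1 * rng.normal(size=200)

kept = drop_highly_correlated(X)                 # filter step
selected = recursive_elimination(X[:, kept], y, n_keep=2)
final_columns = [kept[i] for i in selected]      # map back to original column indices
print("after correlation filter:", kept)
print("after recursive elimination:", final_columns)
```

Both steps return original column indices, which is the interpretability advantage of selection: the surviving features are still the measurements you started with.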

🔄 Feature Extraction:

🔹 Principal Component Analysis (PCA): transforms original features into orthogonal components
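
A rough sketch of what PCA does, written with plain NumPy via the singular value decomposition. The `pca` helper, the synthetic data, and the choice to center without scaling are assumptions for illustration, not the study's setup.

```python
import numpy as np

def pca(X, n_components):
    """Project centered data onto its top principal directions using SVD."""
    Xc = X - X.mean(axis=0)                      # PCA needs centered data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # rows are orthonormal directions
    scores = Xc @ components.T                   # the new, extracted features
    explained = (S ** 2) / np.sum(S ** 2)        # share of variance per component
    return scores, components, explained[:n_components]

# Synthetic data (assumption): 6 observed features driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 6))

scores, components, ratio = pca(X, n_components=2)
print("reduced shape:", scores.shape)
print("variance explained by 2 components:", ratio.sum())
```

Each extracted feature is a weighted mix of all original columns, which is why PCA compresses redundant data well but costs some interpretability.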

🧪 Methodology

🧼 Data preprocessing using Missing Value Ratio

🧠 Classification using ML models:

✅ K-Nearest Neighbors (KNN)

✅ Decision Tree

✅ Support Vector Machine (SVM)

✅ Neural Network

✅ Random Forest

🔁 Post-reduction classification using the same models
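
The workflow above can be sketched end to end on toy data: drop columns with too many missing values, score a classifier on the full feature set, then score the same classifier on a reduced set. Everything here is an assumption for illustration (the 0.5 missing-ratio cutoff, the synthetic data, the tiny NumPy KNN, and keeping only the first column as the "reduction"); the paper's actual settings are not reproduced.

```python
import numpy as np

def missing_value_ratio_filter(X, max_ratio=0.5):
    """Return the indices of columns whose share of NaN values is at most max_ratio."""
    ratio = np.isnan(X).mean(axis=0)
    return np.flatnonzero(ratio <= max_ratio)

def knn_accuracy(X_train, y_train, X_test, y_test, k=5):
    """Accuracy of a plain k-nearest-neighbors majority vote (Euclidean distance)."""
    dist = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    pred = np.array([np.bincount(row).argmax() for row in y_train[nearest]])
    return float((pred == y_test).mean())

# Synthetic data (assumption): two informative columns and one mostly-missing column.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
informative = rng.normal(size=(200, 2)) + 4.0 * y[:, None]
sparse = rng.normal(size=200)
sparse[rng.random(200) < 0.8] = np.nan
X = np.column_stack([informative, sparse])

kept = missing_value_ratio_filter(X)             # preprocessing step
X_clean = X[:, kept]

order = rng.permutation(200)
train, test = order[:150], order[150:]

# Same model before and after reduction (here: keeping only the first column).
acc_before = knn_accuracy(X_clean[train], y[train], X_clean[test], y[test])
acc_after = knn_accuracy(X_clean[train][:, :1], y[train], X_clean[test][:, :1], y[test])
print("columns kept after missing-value filter:", kept.tolist())
print("accuracy before reduction:", acc_before)
print("accuracy after reduction:", acc_after)
```

Holding the model and the train/test split fixed is what makes the before/after numbers comparable.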

📈 Key Results

🏆 KNN consistently performed best, maintaining 83% accuracy pre- and post-reduction

🧠 RFE showed the highest accuracy among reduction techniques with 66% post-reduction accuracy

🧩 PCA effectively reduced features but slightly decreased accuracy and interpretability

💡 Takeaways

✅ Use Feature Selection when interpretability and maintaining original structure are important

✅ Use Feature Extraction for noisy or highly redundant datasets

🎯 The choice depends on your data and modeling objectives

📖 Read the full paper here: DOI: 10.51173/jt.v5i1.1027



This is an excellent reference for anyone navigating the complexities of dimensionality reduction in ML pipelines. Whether you're optimizing models or just curious about FS vs. FE, this study is gold! 🧠✨

#MachineLearning #DataScience #FeatureEngineering #DimensionalityReduction #BigData #AI #KNN #PCA #RFE #MLResearch #DataAnalytics
🚀 From One Junior Data Scientist to Another: Free Resources to Kickstart Your Journey!

As a junior data scientist myself, I know how tough it can feel to break into this field, from finding the right learning path to connecting with a supportive community. The good news? You don't have to do it alone, and you don't need to spend a fortune.

Here are two amazing (and FREE) resources that have been super valuable:

🎓 WorldQuant University

👉 Offers 100% free online programs in Data Science, AI, and quantitative fields.
👉 Project-based learning with an Applied Data Science Lab.

A great place to build strong foundations and hands-on experience.

🌍 Zindi Africa

👉 A community and competition platform for data science & ML.
👉 Work on real-world problems, build a portfolio, and grow with peers.
👉 Amazing for networking and learning through collaboration.

✅ If you're just starting out like me, don't wait! These resources can help you learn, practice, and connect with others on the same path.

Let's grow together in data 🚀📊

#DataScience #JuniorData #MachineLearning #FreeLearning #WorldQuantUniversity #ZindiAfrica #Community
🚀 Join the Ethiopian Data Science & Machine Learning Community! 🇪🇹

Are you passionate about Data Science, Machine Learning, and AI?
Do you want to learn, share knowledge, and grow together with like-minded Ethiopians?

📢 Channel (Updates & Opportunities):
👉 https://t.me/Ethiopian_ds_ml

💬 Group (Discussions & Networking):
👉 https://t.me/Ethiopian_ds_ml_community

What you'll find:
✅ Events and workshops
✅ Challenges & hackathons 🏆
✅ Networking with fellow enthusiasts 🌐

Let's build Ethiopia's future in AI & Data Science together! 💡


@data_to_pattern
#DataScience #MachineLearning #AI #Ethiopia #Hackathon #Community