Google launches new search engine to help scientists find the datasets they need.
#DataSet
Link Review
@AI_Python_EN
@AI_Python_Arxiv
@AI_Python
The 10 biggest datasets of 2018
0) Open Images V4 from Google AI, April 30: 15.4M bounding boxes for 600 categories on 1.9M images.
Paper: https://lnkd.in/fm4xiUm
1) MURA from the Stanford University ML Group, May 24: radiographic image dataset.
Paper: https://lnkd.in/fBy5szB
2) BDD100K from BAIR, Georgia Tech, Peking University, and Uber AI, May 30: self-driving car dataset.
Paper: https://lnkd.in/f-sYj9k
3) SQuAD 2.0 from Stanford, June 11: QA dataset.
Paper: https://lnkd.in/fYc6c5W
4) CoQA from Stanford, August 21: QA dataset.
Paper: https://lnkd.in/fKvuTvE
5) Spider 1.0 from Yale University, September 24: cross-domain semantic parsing and text-to-SQL dataset.
Paper: https://lnkd.in/fWyR2x8
6) HotpotQA from Carnegie Mellon, Stanford, and Montreal, September 25: QA dataset on Wikipedia.
Paper: https://lnkd.in/fTtTgZt
7) Tencent ML-Images from Tencent AI Lab, October 18: largest open-source multi-label image dataset.
Paper: https://lnkd.in/ffV6VD5
8) Tencent AI Lab Embedding Corpus for Chinese Words and Phrases, October 19: embeddings dataset.
Paper: https://lnkd.in/ffV6VD5
9) fastMRI from NYU and Facebook AI, November 26: knee MRI image dataset.
Paper: https://lnkd.in/fQuUDNk
Read: https://lnkd.in/fXU9Kr6
#dataset #datasets
Massive speech dataset: 19,000 hours of Apollo 11 recordings
TASK#1: Speech Activity Detection: SAD
TASK#2: Speaker Diarization: SD
TASK#3: Speaker Identification: SID
TASK#4: Automatic Speech Recognition: ASR
TASK#5: Sentiment Detection: SENTIMENT
http://fearlesssteps.exploreapollo.org/
#NASA #speech #sentiment #dataset
Time series forecasting and modeling play an important role in data analysis. The best way to learn #TimeSeries techniques is by applying them to time series data!
Here's a really cool #dataset where you need to #forecast the traffic for a startup's product. Close to 10,500 data scientists have taken this challenge. Can you climb up the leaderboard? https://lnkd.in/fyZiCJt
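Before trying anything elaborate on a traffic forecasting problem like this one, it can help to have a simple baseline to beat. The sketch below is a seasonal naive forecast: it repeats the last full season of observations. The daily counts and the weekly season length are made up for illustration and are not taken from the challenge data.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast by repeating the last full season of observations."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily traffic counts with a weekly pattern (two weeks of data)
traffic = [120, 135, 150, 160, 170, 90, 80,
           125, 140, 155, 165, 175, 95, 85]

train, test = traffic[:7], traffic[7:]          # first week in, second week held out
forecast = seasonal_naive(train, season_length=7, horizon=7)

print("forecast:", forecast)
print("MAE:", mean_absolute_error(test, forecast))
```

Any model worth submitting should score a lower error on held-out data than this baseline does.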
Introducing TensorFlow Datasets
By TensorFlow: https://lnkd.in/d2yEjSr
#MachineLearning #Data #Dataset #TensorFlow
Discover Computer Vision Datasets with this search engine
#dataset #image #visual #search #engine #vision
https://www.visualdata.io
What happens when a dataset has too many #variables? Here are a few situations you might come across:
• You find that most of the variables are correlated.
• You lose patience and run a model on the whole dataset, which returns poor accuracy.
• You become indecisive about what to do.
• You start looking for a strategic method to find a few important variables.
But dealing with such situations isn't as difficult as it sounds. Statistical techniques such as factor analysis and principal component analysis help overcome these difficulties. Here's a detailed guide on Principal Component Analysis, a method for extracting important variables from a large set of variables available in a #dataset. Read the full article here: https://lnkd.in/fbKgbrh
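As a rough illustration of the idea, here is a minimal PCA sketch using NumPy's SVD. It is not the linked article's code; the toy data, the `pca` helper and its parameter names are assumptions made for this example. Note that each principal component is a linear combination of the original variables rather than a subset of them.

```python
import numpy as np


def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured along each direction, as a share of the total
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    scores = X_centered @ components.T
    return scores, components, ratio


# Hypothetical data: three variables, two of which are strongly correlated
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([
    a,
    2 * a + 0.1 * rng.normal(size=200),  # nearly a multiple of the first column
    rng.normal(size=200),                # independent variable
])

scores, components, ratio = pca(X, n_components=2)
print("explained variance ratio:", ratio)
```

Because two of the three columns are almost redundant, two components should capture nearly all of the variance in this toy example.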
#Dataset list: a list of the biggest datasets for machine learning
Dataset list
Stanford ML Group just released a knee injury dataset they're calling MRNet.
Paper: https://lnkd.in/dwik_zz
Dataset: https://lnkd.in/dwS96AD
#ml #knee #injury #stanford #dataset #deeplearning
https://lnkd.in/dDpD38u
Real numbers, data science and chaos: How to fit any dataset with a single parameter
Paper:
https://arxiv.org/abs/1904.12320
Code:
https://github.com/Ranlot/single-parameter-fit/
#artificialintelligence #datascience #dataset #machinelearning
