So a couple of things - https://goo.gl/CBk7Um
- A Movidius USB stick is enough to run real-time object detection, which is interesting to know
- It has shitty driver and library support (Caffe was mentioned)
- Installing everything is FAR from trivial (no idea why VirtualBox was used, but whatever)
- This guide uses VirtualBox instead of Docker, which says a lot
Also, PyImageSearch is a sellout - this post most likely contains advertiser-friendly featured content. Looks like the Movidius stick Topcoder event did not gain enough traction...
So - use Nvidia Jetsons for embedded solutions and do not bother with this. Still, it is good that new products emerge.
#deep_learning
PyImageSearch
Real-time object detection on the Raspberry Pi with the Movidius NCS - PyImageSearch
In this tutorial I'll demonstrate how you an achieve real-time object detection on the Raspberry Pi using deep learning and Intel's Movidius NCS.
Internet Digest
- Ben Evans - https://goo.gl/XsBqHN
- Flipboard (orly) launches ads - https://goo.gl/2muoiT
- Google sold 3.9 million Pixel phones in 2017 - https://goo.gl/6eUiXw
- Looks like smartbuses may be cool. App => bus route information => route gap => launch cosy bus with music and social features - https://goo.gl/TjKndB (I doubt this is a business though)
- About the importance of decentralization - next Internet will be a set of cryptonetwork protocols - https://goo.gl/c2aB4n
- How London is responding to technological innovation - https://goo.gl/Dh6NgD
(1) Connected and autonomous vehicles (CAVs), or driverless cars, won't be on the road until the 2030s at least and could add to congestion
(2) Dockless cycle schemes need to be able to operate across London to be effective
(3) There is no control system in place for drones and droids
(4) TfL is monitoring technological developments but this needs to be embedded across the whole organisation
- Nice infographics about city dwellers' daily routes on pages 7-10 - https://goo.gl/vV71DR
#internet
#digest
So ofc I tried the new Jupyter lab.
And it is really cool that something so simple / cool / useful is completely free / no strings attached (yet). But I will not use it professionally.
Use my Dockerfile if you want to check it out with my DL environment:
(1) https://goo.gl/Y7VMTa
But in a nutshell it worked with jpn params inside the container
CMD jupyter lab --port=8888 --ip=0.0.0.0 --no-browser
And installation is as easy as
conda install -c conda-forge jupyterlab
Docs are a bit sparse for now
(1) https://goo.gl/1UQBnS
But here is a list of reasons why you might consider sticking to ssh pass-through for auto-complete / terminal and jupyter notebook with extensions:
(0) It is still in beta, so unless your professional path is connected with node-js / web - you had better pass for now
(1) The existence of amazing extensions for Jupyter notebook that do 95% of what you might need - https://goo.gl/K86gjp
(2) The built-in terminal is much better than before, but it pales in comparison with PuTTY or even a standard linux shell (autocomplete?)
(3) Some of the built-in extensions, like the image viewer, are really useful, but overall the product is a bit beta (which they openly say it is)
And here is why turning Jupyter notebook into a real environment is really cool:
(1) Building everything based on extensions IS REALLY COOL - and in the long run it will encourage people to port jupyter extensions and build a really powerful tool. Also this implies diversity and freedom, unlike shitty tools like Zeppelin
(2) After some effort, it may really replace terminal, IDE, desktop environment and notebooks for data-oriented people (I guess 6-12 months)
(3) Structuring extensions as npm packages lures the fastest-developing community (web developers) to support the project and provides transparency and clarity
#data_science
Gist
Dockerfile update
Dockerfile update. GitHub Gist: instantly share code, notes, and snippets.
Was looking for a CLAHE abstraction for my image pre-processing pipeline and found one on the Internet
import cv2

class CLAHE:
    """Callable that applies CLAHE to the luma channel of a BGR image."""
    def __init__(self, clipLimit=2.0, tileGridSize=(8, 8)):
        self.clipLimit = clipLimit
        self.tileGridSize = tileGridSize

    def __call__(self, im):
        # Work in YUV so that only brightness (Y) is equalized, not color
        img_yuv = cv2.cvtColor(im, cv2.COLOR_BGR2YUV)
        clahe = cv2.createCLAHE(clipLimit=self.clipLimit, tileGridSize=self.tileGridSize)
        img_yuv[:, :, 0] = clahe.apply(img_yuv[:, :, 0])
        # Convert back to BGR for the rest of the pipeline
        img_output = cv2.cvtColor(img_yuv, cv2.COLOR_YUV2BGR)
        return img_output
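For intuition about what clipLimit does, here is a simplified pure-Python sketch of clip-limited equalization on a single tile. It is not OpenCV's implementation: the function name clipped_equalize is made up, the clip limit is assumed to be a multiple of the mean bin count, and there is no tiling or interpolation between tiles.

```python
def clipped_equalize(pixels, clip_limit=2.0, bins=256):
    """Equalize one tile of 8-bit values using a clipped histogram."""
    n = len(pixels)
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1

    # Clip each bin at clip_limit times the mean bin count (assumed convention)
    limit = max(1, int(clip_limit * n / bins))
    excess = 0
    for i in range(bins):
        if hist[i] > limit:
            excess += hist[i] - limit
            hist[i] = limit

    # Spread the clipped mass evenly so the total count stays n
    bonus, remainder = divmod(excess, bins)
    hist = [h + bonus + (1 if i < remainder else 0) for i, h in enumerate(hist)]

    # The cumulative histogram, scaled to 0..bins-1, is the lookup table
    lut, running = [], 0
    for h in hist:
        running += h
        lut.append(round((bins - 1) * running / n))
    return [lut[p] for p in pixels]


# Low-contrast toy tile of 64 pixels
tile = [100] * 60 + [101] * 3 + [102] * 1
# With clip_limit=256 the limit equals the tile size, so nothing is clipped
plain = clipped_equalize(tile, clip_limit=256.0)
clipped = clipped_equalize(tile, clip_limit=2.0)
print(max(plain) - min(plain), max(clipped) - min(clipped))
```

On this toy tile the clipped version stretches the values less than plain equalization, which is the point of the clip limit: it caps how much small variations in flat regions get amplified.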
#deep_learning
Forwarded from Just links
Adversarial Examples that Fool both Human and Computer Vision https://arxiv.org/abs/1802.08195
2017 DS/ML digest 5
Fun stuff
(1) Hardcore metal + CNNs + style transfer - https://goo.gl/VHYfHe
SpaceNet challenge
(1) Post by Nvidia https://goo.gl/6Mw4CB
(2) Some links to sota semseg articles
(3) Useful tools for CV - floodfill and grabcut, but guys from Nvidia did not notice ... that road width was in geojson data...
(4) Looks like they replicated the results just for PR, but their masks do not look appealing
Research / papers / libraries
(1) Neural Voice Cloning with a Few Samples - https://goo.gl/LwmzRf (demos: audiodemos.github.io)
(2) A library for CRFs in Python - https://goo.gl/cQc8hA
(3) 1000x faster CNN architecture search - still on CIFAR - https://arxiv.org/pdf/1802.03268.pdf (PyTorch https://goo.gl/BZ9Vrh)
(4) URLs + CNN - malicious link detection - https://arxiv.org/abs/1802.03162
Datasets
(1) 3m anime image dataset - https://www.gwern.net/Danbooru2017
(2) Google HDR dataset - https://goo.gl/XEL1Fm
Market
(1) Idea - AMT + blockchain - https://goo.gl/JfzEPV
(2) ARM to make processors for CNNs? - https://goo.gl/MpdPSB
(3) Google TPU in beta - https://goo.gl/gRzq9t - very expensive. + Note the rumours that Google's own people do not use their TPU quota
(4) One guy managed to deploy a PyTorch model using ONNX - https://goo.gl/QD4DkZ
#digest
#machine_learning
#data_science
YouTube
Hardcore Anal Hydrogen "Jean-Pierre" (2018, Apathia Records)
Order "Hypercut" : http://apathia.link/hah
Bandcamp : https://hardcoreanalhydrogen.bandcamp.com/album/hypercut
« A gigantic piece of art here to mess with what’s left of your brain after an afternoon at the mall with the kids. »
Video created with artificial…
Just found a book on practical Python programming patterns
- http://python-3-patterns-idioms-test.readthedocs.io/en/latest/PythonForProgrammers.html
Looks good
#python
Savva's company (my jungle teammate) made a brief post about the competition
https://www.objectstyle.com/news/savva-kolbachev-computer-vision-contest-win
Objectstyle
ObjectStyler Savva Kolbachev wins 3rd prize in Computer Vision contest by Chimp&See - ObjectStyle.com
Congrats to ObjectStyler Savva Kolbachev and his team for winning the 3rd prize in Computer Vision contest organized by Chimp&See!
A great survey - how to work with imbalanced data
https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
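The accuracy trap and one common tactic, random oversampling of the minority class, can be sketched in plain Python. The function name random_oversample and the toy 90/10 labels are made up for illustration; in practice a dedicated library such as imbalanced-learn would be the more usual choice.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement only as many extra rows as the class is missing
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 90 negatives and 10 positives
labels = [0] * 90 + [1] * 10
rows = list(range(100))

# Always predicting the majority class already scores 90% accuracy
majority_accuracy = Counter(labels).most_common(1)[0][1] / len(labels)

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(majority_accuracy, Counter(balanced_labels))
```

Oversampling should only touch the training split, otherwise duplicated rows leak into validation and inflate the score.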
#data_science
MachineLearningMastery.com
8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset - MachineLearningMastery.com
Has this happened to you? You are working on your dataset. You create a classification model and get 90% accuracy immediately.
Forwarded from Data Science by ODS.ai 🦜
Most common libraries for Natural Language Processing:
CoreNLP from Stanford group:
http://stanfordnlp.github.io/CoreNLP/index.html
NLTK, the most widely-mentioned NLP library for Python:
http://www.nltk.org/
TextBlob, a user-friendly and intuitive NLTK interface:
https://textblob.readthedocs.io/en/dev/index.html
Gensim, a library for document similarity analysis:
https://radimrehurek.com/gensim/
SpaCy, an industrial-strength NLP library built for performance:
https://spacy.io/docs/
Source: https://itsvit.com/blog/5-heroic-tools-natural-language-processing/
#nlp #digest #libs
CoreNLP
High-performance human language analysis tools, now with native deep learning modules in Python, available in many human languages.
A framework to deploy and maintain models by Instacart - https://tech.instacart.com/how-to-build-a-deep-learning-model-in-15-minutes-a3684c6f71e - please tell me if anybody has tried it
Medium
How to build a deep learning model in 15 minutes
An open source framework for configuring, building, deploying and maintaining deep learning models in Python.
It is tricky to launch XGB fully on GPU. People report that on the same data CatBoost has inferior quality w/o tweaking (but is faster). LightGBM is reported to be faster and to have the same accuracy.
So I tried adding LightGBM with GPU support to my Dockerfile -
https://github.com/Microsoft/LightGBM/blob/master/docs/GPU-Tutorial.rst - but I encountered some Docker driver issues.
One caveat, as far as I understood - it supports only older Nvidia drivers, up to 384.
Luckily, there is a Dockerfile by MS that seems to be working (+ jupyter, but I could not install extensions)
https://github.com/Microsoft/LightGBM/blob/master/docker/gpu/README.md
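For reference, once a GPU-enabled build is installed, switching LightGBM to the GPU is mostly a matter of parameters. A minimal sketch, assuming such a build: the helper name lightgbm_gpu_params and the toy base parameters are made up, while device, gpu_platform_id and gpu_device_id are parameter names from the LightGBM docs.

```python
def lightgbm_gpu_params(base_params=None, platform_id=0, device_id=0):
    """Return a copy of base_params with the GPU-related LightGBM options set."""
    params = dict(base_params or {})
    params.update({
        "device": "gpu",                 # train on the GPU instead of the CPU
        "gpu_platform_id": platform_id,  # OpenCL platform index
        "gpu_device_id": device_id,      # device index within that platform
    })
    return params


# Toy base parameters; a small max_bin such as 63 is what the GPU docs suggest
params = lightgbm_gpu_params({"objective": "binary", "num_leaves": 63, "max_bin": 63})
print(params)

# With a GPU-enabled build, training would then look roughly like:
#   import lightgbm as lgb
#   booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=100)
# where X and y are your own features and labels.
```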
#data_science
GitHub
LightGBM/docs/GPU-Tutorial.rst at master · microsoft/LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning ...
Modern Pandas series about classic time-series algorithms
- https://tomaugspurger.github.io/modern-7-timeseries
Some basic boilerplate and baselines
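The kind of boilerplate in question looks roughly like this in pandas. A minimal sketch on made-up daily data (not taken from the linked post): resampling, a rolling mean, and a naive previous-value baseline.

```python
import pandas as pd

# Ten days of toy data starting on Monday, 2018-01-01
index = pd.date_range("2018-01-01", periods=10, freq="D")
series = pd.Series(range(10), index=index, dtype="float64")

# Downsample to weekly means (weeks end on Sunday by default)
weekly = series.resample("W").mean()

# Smooth with a 3-day rolling mean; the first two values are NaN
rolling = series.rolling(window=3).mean()

# Naive baseline: predict today's value with yesterday's
naive_forecast = series.shift(1)
mae = (series - naive_forecast).abs().mean()

print(weekly)
print(rolling.tail(3))
print(mae)
```

Any fancier model should at least beat the naive baseline's error before it is worth keeping.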
#data_science
#time_series
tomaugspurger.github.io
– Modern Pandas (Part 7): Timeseries
Posts and writings by Tom Augspurger
Amazing article about the most popular warning in Pandas
- https://www.dataquest.io/blog/settingwithcopywarning/
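The usual fix can be shown in a few lines. A minimal sketch with a made-up toy frame (the column names price and flag are just for illustration): assign through a single .loc call, and take an explicit copy when a filtered subset is meant to live on its own.

```python
import pandas as pd

df = pd.DataFrame({"price": [5, 20, 35], "flag": ["", "", ""]})

# Chained indexing such as df[df["price"] > 10]["flag"] = "high" may assign to
# a temporary object instead of df, which is what the warning is about.
# A single .loc call selects rows and column in one step and writes to df itself.
df.loc[df["price"] > 10, "flag"] = "high"

# When a filtered subset is meant to be independent, make the copy explicit
expensive = df[df["price"] > 10].copy()
expensive["price"] = expensive["price"] * 2

print(df)
print(expensive)
```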
#data_science
Dataquest
SettingwithCopyWarning: How to Fix This Warning in Pandas – Dataquest
SettingWithCopyWarning: Everything you need to know about the most common (and most misunderstood) warning in pandas and how to fix it!
Found some starter boilerplate for using hyperopt instead of gridsearch for faster search:
- here - https://goo.gl/ccXkuM
- and here - https://goo.gl/ktblo5
#data_science