Basic Dictionary commands (2):
# Looping
for key in d:
    print(key)
for value in d.values():
    print(value)
for key, value in d.items():
    print(key, value)
# Copying
d_copy = d.copy()
d_copy = dict(d)
# Merging
d = {'name': 'Max', 'age': 28}
d1 = {'name': 'Anna', 'age': 27}
d.update(d1)
d = {**d, **d1}
# Dictionary Comprehension
d = {x: x**2 for x in range(10)}
d = {k: v**2 for k, v in zip(['a', 'b'], range(4))}
d = {k: v for k, v in d.items() if v % 2 == 0}
Copy and deep copy in dictionaries:
# copy and deepcopy
import copy
d = {'x': [1, 2, 3]}
d1 = d.copy()
d2 = copy.deepcopy(d)
d['x'].append(4)
print(d1) # {'x': [1, 2, 3, 4]}
print(d2) # {'x': [1, 2, 3]}
activate "sticky scroll" in vscode through ctrl+shirt+p panel.
#Datashader simplifies creating meaningful visuals from large datasets by breaking down the process into clear steps. It automatically generates accurate visualizations without the need for manual parameter tweaking. Computations are optimized using Python, Numba, Dask, and CUDA, making it efficient even with huge datasets on standard hardware.
https://datashader.org/
Gallery
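As a rough illustration of the typical workflow, here is a minimal sketch; the column names and point count are made up, see the gallery for real examples:
import numpy as np
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf

# hypothetical dataset: one million random points
n = 1_000_000
df = pd.DataFrame({
    "x": np.random.standard_normal(n),
    "y": np.random.standard_normal(n),
})

# aggregate the points onto a fixed-size grid, then shade the counts
canvas = ds.Canvas(plot_width=400, plot_height=400)
agg = canvas.points(df, "x", "y")
img = tf.shade(agg)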
The walrus operator, introduced in Python 3.8, is represented by ":=". It allows the assignment of a value to a variable within an expression.
## f(x) is called 3 times
foo = [f(x), f(x)**2, f(x)**3]
## two lines of code
y = f(x)
foo = [y, y**2, y**3]
## walrus operator
foo = [y := f(x), y**2, y**3]
Walrus operator [1]
# Avoiding inefficient comprehensions
results = []
for x in data:
    res = f(x)
    if res:
        results.append(res)
# f(x) is called twice
results = [f(x) for x in data if f(x)]
# walrus operator
results = [res for x in data if (res := f(x))]

# Unnecessary variables in scope
match = pattern.search(data)
if match:
    do_sth(match)
# walrus operator
if (match := pattern.search(data)):
    do_sth(match)

# Processing streams in chunks
chunk = file.read(8192)
while chunk:
    process(chunk)
    chunk = file.read(8192)
# walrus operator
while chunk := file.read(8192):
    process(chunk)
Datasets for machine learning typically contain a large number of features, but such high-dimensional feature spaces are not always helpful.
In general, not all features are equally important; certain features account for a large share of the variance in the dataset. Dimensionality reduction algorithms aim to shrink the feature space to a fraction of its original number of dimensions while still retaining the high-variance directions, albeit in a transformed feature space. Principal component analysis (PCA) is one of the most popular dimensionality reduction algorithms.
Here's a simple example in Python demonstrating PCA for dimensionality reduction before training a scikit-learn classifier.
GitHub
You may also want to read more about PCA here.
GitHub: workshop_ML/pca/classify_use_pca.ipynb · Ziaeemehr/workshop_ML
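A minimal sketch of the idea, using the digits dataset and a logistic-regression classifier as stand-ins (the linked notebook may use a different dataset or model):
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)  # 64-dimensional pixel features
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# keep enough principal components to explain 95% of the variance
clf = make_pipeline(PCA(n_components=0.95), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))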
Applications for Students & Teaching Assistants are Open!
1️⃣ 3-week Courses (July 8 - 26, 2024):
- Computational Neuroscience: Explore the intricacies of the brain's computational processes and join an engaging community of learners.
- Deep Learning: Delve into the world of machine learning, uncovering the principles and applications of deep learning.
2️⃣ 2-week Courses (July 15 - 26, 2024):
- Computational Tools for Climate Science: Uncover the tools and techniques driving climate science research in this dynamic two-week course.
- NeuroAI - Inaugural Year!: Be part of history as we launch our first-ever NeuroAI course, designed to explore the intersection of neuroscience and artificial intelligence.
https://neuromatch.io/courses/
Incremental principal component analysis (IPCA) is typically used as a replacement for principal component analysis (PCA) when the dataset to be decomposed is too large to fit in memory.
IPCA builds a low-rank approximation for the input data using an amount of memory which is independent of the number of input data samples. It is still dependent on the input data features, but changing the batch size allows for control of memory usage.
I have made some changes to the example from the sklearn documentation so that one does not need to load the whole dataset into memory.
GitHub
import numpy as np
from sklearn.decomposition import IncrementalPCA

# assuming X is the data matrix (the sklearn example uses iris: 150 samples, three batches of 50)
ipca = IncrementalPCA(n_components=n_components, batch_size=50)
X_ipca = np.zeros((X.shape[0], n_components))
# fit and then transform in batches, so the full dataset never has to sit in memory
for i in range(3):
    ipca.partial_fit(X[i*50:(i+1)*50])
for i in range(3):
    X_ipca[i*50:(i+1)*50] = ipca.transform(X[i*50:(i+1)*50])
GitHub: workshop_ML/pca/increamental_pca.ipynb · Ziaeemehr/workshop_ML
How can you create an audiobook with a natural human voice and a customized accent? Let's say you have an EPUB file and you're tired of the robotic voice generated by common text-to-speech (TTS) systems. One of the most advanced TTS technologies available today is provided by OpenVoice. You can find more information about it here.
It performs optimally with a GPU, but it's also compatible with CPU. To use it on your own machine, simply set up a virtual environment and install the package. You'll also need to download a few additional files. I'm currently using the basic setup with the default voice, but the ability to clone any voice is an incredibly exciting feature.
Follow the demo1 notebook: extract the text from the EPUB and replace the sample text with your favourite book.
You may need to split the book into several chapters so it fits into GPU memory and the job doesn't get killed.
It took me about 10 minutes to make an audiobook from Shogun, a novel of about 500 pages.
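For the EPUB-to-text step, a minimal sketch (assuming ebooklib and BeautifulSoup are installed; the file name is hypothetical, and the resulting chapter strings are what you would feed to the demo notebook):
import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup

book = epub.read_epub("shogun.epub")  # hypothetical file name
chapters = []
for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
    # each document item is an HTML chapter; strip the markup and keep plain text
    soup = BeautifulSoup(item.get_content(), "html.parser")
    text = soup.get_text(separator=" ", strip=True)
    if text:
        chapters.append(text)
# feed the chapters to the TTS one by one to keep GPU memory usage low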
How to Use ZSH Auto-suggestions?
ZSH is a popular Unix shell that extends the Bourne shell and comes packed with features and improvements over Bash.
If you already have zsh as your default shell, just use:
# Linux
git clone https://github.com/zsh-users/zsh-autosuggestions ~/.zsh/zsh-autosuggestions
# add to .zshrc
source ~/.zsh/zsh-autosuggestions/zsh-autosuggestions.zsh
# Mac
brew install zsh-autosuggestions
# add to .zshrc
source $(brew --prefix)/share/zsh-autosuggestions/zsh-autosuggestions.zsh
Read more here.
GitHub: zsh-users/zsh-autosuggestions (Fish-like autosuggestions for zsh)
JAX is an open-source Python library developed by Google for high-performance numerical computing, especially suited for machine learning and scientific computing. It provides a combination of automatic differentiation, just-in-time compilation, and support for GPU/TPU acceleration, making it particularly well-suited for scalable and efficient computation on large datasets. JAX is built on top of the XLA (Accelerated Linear Algebra) compiler and is heavily inspired by NumPy, making it easy for users familiar with NumPy to transition to JAX.
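A toy example of the NumPy-like API together with autodiff and JIT (a minimal sketch; the function and data here are made up):
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # mean squared error of a linear model, written with the NumPy-like jax.numpy API
    return jnp.mean((x @ w - y) ** 2)

grad_loss = jax.grad(loss)      # automatic differentiation w.r.t. w
fast_grad = jax.jit(grad_loss)  # just-in-time compiled via XLA

x = jnp.ones((8, 3))
y = jnp.zeros(8)
w = jnp.array([0.5, -1.0, 2.0])
print(fast_grad(w, x, y))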
Let's practice some JAX:
I recommend starting with this repo and this series of videos.
Videos
GitHub
Then you can move on to the deep learning book.
Deep Learning with JAX
Workshop JAX
YouTube: Machine Learning with JAX - From Zero to Hero | Tutorial #1
I found this to be quite useful, and it might be beneficial for you as well. There's a YouTube video course available, along with a GitHub page focusing on R and Python.
The PDF can be found here: @reza_jafari_ai
Vectorizing in JAX

1️⃣ Naively vectorizing
def dot(v1, v2):
    return jax.numpy.dot(v1, v2)

dot_naive = [dot(v1, v2) for v1, v2 in zip(v1s, v2s)]

2️⃣ Manual vectorizing
def dot_vectorized(v1s, v2s):
    return jnp.einsum("ij,ij->i", v1s, v2s)

3️⃣ Automatic vectorizing
dot_vmapped = jax.vmap(dot)

⏰ Timing
%timeit [dot(v1, v2) for v1, v2 in zip(v1s, v2s)]
# 5.15 ms ± 54.3 µs per loop
%timeit dot_vectorized(v1s, v2s).block_until_ready()
# 135 µs ± 171 ns per loop
%timeit dot_vmapped(v1s, v2s).block_until_ready()
# 543 µs ± 1.38 µs per loop

Adding JIT
dot_vectorized_jitted = jax.jit(dot_vectorized)
dot_vmapped_jitted = jax.jit(dot_vmapped)
# 6.5 µs ± 12.9 ns per loop
# 6.39 µs ± 13.4 ns per loop

Notebook
Here are some of the most important and frequently used commands in scikit-learn (sklearn):
1. Model Selection:
- train_test_split(): Split arrays or matrices into random train and test subsets.
- cross_val_score(): Evaluate a score by cross-validation.
- GridSearchCV(): Exhaustive search over specified parameter values for an estimator.
- StratifiedKFold(): Provides train/test indices to split data into train/test sets while maintaining class distribution.
2. Preprocessing:
- StandardScaler(): Standardize features by removing the mean and scaling to unit variance.
- MinMaxScaler(): Transform features by scaling each feature to a given range.
- OneHotEncoder(): Encode categorical integer features as one-hot numeric arrays.
3. Model Building:
- LinearRegression(): Ordinary least squares Linear Regression.
- LogisticRegression(): Logistic Regression (for classification tasks).
- RandomForestClassifier(): Random Forest Classifier.
- RandomForestRegressor(): Random Forest Regressor.
- GradientBoostingClassifier(): Gradient Boosting Classifier.
- GradientBoostingRegressor(): Gradient Boosting Regressor.
- DecisionTreeClassifier(): Decision Tree Classifier.
4. Model Evaluation:
- accuracy_score(): Accuracy classification score.
- precision_score(), recall_score(), f1_score(): Compute precision, recall, F-measure, and support for classification.
- mean_squared_error(): Mean squared error regression loss.
- r2_score(): R^2 (coefficient of determination) regression score function.
5. Pipeline and Feature Union:
- Pipeline(): Chain multiple estimators into one.
- FeatureUnion(): Combine several transformer objects into a new transformer.
6. Dimensionality Reduction:
- PCA(): Principal Component Analysis.
- TruncatedSVD(): Dimensionality reduction using truncated singular value decomposition.
7. Clustering:
- KMeans(): K-Means clustering.
- AgglomerativeClustering(): Agglomerative hierarchical clustering.
These are just a few of the many functionalities provided by scikit-learn for machine learning tasks.