Machine Learning

📌 The Pearson Correlation Coefficient, Explained Simply

🗂 Category: STATISTICS

🕒 Date: 2025-11-01 | ⏱️ Read time: 7 min read

A simple explanation of the Pearson correlation coefficient with examples

📖 Read and Learn

🧪 Explore Data Science

📌 Graph RAG vs SQL RAG

🗂 Category: LARGE LANGUAGE MODELS

🕒 Date: 2025-11-01 | ⏱️ Read time: 7 min read

Evaluating RAGs on graph and SQL databases

📌 Understanding the Two Faces of Shiny for Python: Core and Express

🗂 Category: DATA SCIENCE

🕒 Date: 2024-05-29 | ⏱️ Read time: 7 min read

Exploring the Differences and Use Cases of Shiny Core and Shiny Express for Python

📌 Do You Need a Degree to Be a Data Scientist?

🗂 Category: DATA SCIENCE

🕒 Date: 2024-05-29 | ⏱️ Read time: 8 min read

No, but it certainly helps.

🤖🧠 HunyuanWorld-Mirror: Tencent’s Breakthrough in Universal 3D Reconstruction

🗓️ 03 Nov 2025
📚 AI News & Trends

The race toward achieving universal 3D understanding has reached a significant milestone with Tencent’s HunyuanWorld-Mirror, a cutting-edge open-source model designed to revolutionize 3D reconstruction. In an era dominated by visual intelligence and immersive digital experiences, this new model stands out by offering a feed-forward, geometry-aware framework that can predict multiple 3D outputs in a single ...

#HunyuanWorld #Tencent #3DReconstruction #UniversalAI #GeometryAware #OpenSourceAI

📌 Data Scientists Work in the Cloud. Here’s How to Practice This as a Student (Part 2: Python)

🗂 Category: DATA SCIENCE

🕒 Date: 2024-05-29 | ⏱️ Read time: 9 min read

Because data scientists don’t write production code in the Udemy code editor

💡 Top 50 Operations for Signal Processing in Python

Note: Most examples use numpy, scipy.signal, and matplotlib.pyplot. Assume they are imported as:
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

I. Signal Generation

• Create a time vector.

fs = 1000  # Sampling frequency
t = np.linspace(0, 1, fs, endpoint=False)

• Generate a sine wave.

freq = 50 # Hz
sine_wave = np.sin(2 * np.pi * freq * t)

• Generate a square wave.

square_wave = signal.square(2 * np.pi * freq * t)

• Generate a sawtooth wave.

sawtooth_wave = signal.sawtooth(2 * np.pi * freq * t)

• Generate Gaussian white noise.

noise = np.random.normal(0, 1, len(t))

• Generate a frequency-swept cosine (chirp).

chirp_signal = signal.chirp(t, f0=1, f1=100, t1=1, method='linear')

• Generate an impulse signal (unit impulse).

impulse = signal.unit_impulse(100, 'mid') # at index 50 of 100

• Generate a Gaussian pulse.

gaus_pulse = signal.gausspulse(t, fc=5, bw=0.5)

II. Signal Visualization & Properties

• Plot a signal.

plt.plot(t, sine_wave)
plt.xlabel("Time [s]")
plt.ylabel("Amplitude")
plt.show()

• Calculate the mean value.

mean_val = np.mean(sine_wave)

• Calculate the Root Mean Square (RMS).

rms_val = np.sqrt(np.mean(sine_wave**2))

• Calculate the standard deviation.

std_dev = np.std(sine_wave)

• Find the maximum value and its index.

max_val = np.max(sine_wave)
max_idx = np.argmax(sine_wave)

III. Frequency Domain Analysis (FFT)

• Compute the Fast Fourier Transform (FFT).

from scipy.fft import fft, fftfreq
yf = fft(sine_wave)

• Get the frequency bins for the FFT.

N = len(sine_wave)
xf = fftfreq(N, 1 / fs)[:N//2]

• Plot the magnitude spectrum.

plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

• Compute the Inverse FFT (IFFT).

from scipy.fft import ifft
original_signal = ifft(yf).real  # ifft returns complex values; the imaginary part is ~0 for a real input

• Compute the Power Spectral Density (PSD) using Welch's method.

f, Pxx_den = signal.welch(sine_wave, fs, nperseg=256)  # nperseg=1024 would exceed the 1000-sample signal and trigger a warning
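
A minimal sketch, assuming the same 50 Hz test tone and fs as above, of how the FFT output can be used to read off the dominant frequency. rfft and rfftfreq are the real-input variants of fft and fftfreq; the variable names are illustrative only.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq

fs = 1000                                   # sampling frequency in Hz
t = np.linspace(0, 1, fs, endpoint=False)   # 1 second of samples
x = np.sin(2 * np.pi * 50 * t)              # 50 Hz test tone

spectrum = np.abs(rfft(x))                  # magnitude of the one-sided spectrum
freqs = rfftfreq(len(x), 1 / fs)            # matching frequency bins in Hz
dominant_freq = freqs[np.argmax(spectrum)]  # bin with the largest magnitude
print(dominant_freq)
```

With 1000 samples at 1000 Hz the bins are 1 Hz apart, so the tone falls exactly on one bin. For tones between bins, a window (see section V) usually helps reduce leakage.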

IV. Digital Filtering

• Design a Butterworth low-pass filter.

b, a = signal.butter(4, 100, 'low', analog=False, fs=fs)

• Apply a filter to a signal (zero-phase filtering).

noisy_signal = sine_wave + noise
filtered_signal = signal.filtfilt(b, a, noisy_signal)

• Design a Chebyshev Type I high-pass filter.

b, a = signal.cheby1(4, 5, 100, 'high', fs=fs) # 5dB ripple

• Design a Bessel band-pass filter.

b, a = signal.bessel(4, [50, 150], 'band', fs=fs)

• Design an FIR filter using a window method.

numtaps = 101
fir_coeffs = signal.firwin(numtaps, cutoff=100, fs=fs)

• Plot the frequency response of a filter.

w, h = signal.freqz(b, a, fs=fs)
plt.plot(w, 20 * np.log10(abs(h)))

• Apply a median filter (good for salt-and-pepper noise).

median_filtered = signal.medfilt(noisy_signal, kernel_size=3)

• Apply a Wiener filter for noise reduction.

wiener_filtered = signal.wiener(noisy_signal)
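
A short sketch, under the same assumptions as the examples above (50 Hz sine, unit-variance white noise, 100 Hz Butterworth low-pass), of one way to check that filtering actually reduced the error. The rms helper and the fixed seed are illustrative additions, not part of the original list.

```python
import numpy as np
from scipy import signal

fs = 1000
t = np.linspace(0, 1, fs, endpoint=False)
clean = np.sin(2 * np.pi * 50 * t)
rng = np.random.default_rng(0)               # fixed seed so the run is repeatable
noisy = clean + rng.normal(0, 1, len(t))

# 4th-order low-pass at 100 Hz, applied forward and backward (zero phase)
b, a = signal.butter(4, 100, 'low', fs=fs)
filtered = signal.filtfilt(b, a, noisy)

def rms(x):
    """Root mean square of an array."""
    return np.sqrt(np.mean(x ** 2))

rms_before = rms(noisy - clean)      # error of the raw noisy signal
rms_after = rms(filtered - clean)    # error after low-pass filtering
print(rms_before, rms_after)
```

The error shrinks because most of the white-noise power lies above the 100 Hz cutoff, while the 50 Hz tone sits in the passband.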

V. Resampling & Windowing

• Resample a signal to a new length.

resampled = signal.resample(sine_wave, num=500) # Resample to 500 points

• Decimate a signal (downsample by a factor).

decimated = signal.decimate(sine_wave, q=4) # Downsample by 4

• Create a Hamming window.

window = signal.windows.hamming(51)

• Apply a window to a signal segment.

segment = sine_wave[0:51]
windowed_segment = segment * window

VI. Convolution & Correlation

• Perform linear convolution.

sig1 = np.repeat([0., 1., 0.], 100)
sig2 = np.repeat([0., 1., 1., 0.], 100)
convolved = signal.convolve(sig1, sig2, mode='same')

• Compute cross-correlation.

# Useful for finding delays between signals
correlation = signal.correlate(sig1, sig2, mode='full')

• Compute auto-correlation.

# Useful for finding periodicities in a signal
autocorr = signal.correlate(sine_wave, sine_wave, mode='full')
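
A minimal sketch of the delay-finding use mentioned above, assuming a synthetic noise signal and a copy of it shifted by 25 samples (both made up for illustration). signal.correlation_lags returns the lag that belongs to each element of the 'full' correlation output.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
x = rng.standard_normal(500)                 # reference signal
true_delay = 25                              # shift in samples (illustrative)
y = np.concatenate([np.zeros(true_delay), x[:-true_delay]])  # x delayed by 25 samples

corr = signal.correlate(y, x, mode='full')
lags = signal.correlation_lags(len(y), len(x), mode='full')
estimated_delay = lags[np.argmax(corr)]      # lag where the two signals line up best
print(estimated_delay)
```

Dividing the estimated lag by the sampling frequency converts it to seconds.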

VII. Time-Frequency Analysis

• Compute and plot a spectrogram.

f, t_spec, Sxx = signal.spectrogram(chirp_signal, fs)
plt.pcolormesh(t_spec, f, Sxx, shading='gouraud')
plt.show()

• Perform Continuous Wavelet Transform (CWT).

widths = np.arange(1, 31)
# Requires SciPy < 1.15: signal.cwt and signal.ricker were deprecated in 1.12 and later removed
cwt_matrix = signal.cwt(chirp_signal, signal.ricker, widths)

• Perform Hilbert transform to get the analytic signal.

analytic_signal = signal.hilbert(sine_wave)

• Calculate instantaneous frequency.

instant_phase = np.unwrap(np.angle(analytic_signal))
instant_freq = (np.diff(instant_phase) / (2.0*np.pi) * fs)
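
A quick sanity check of the two steps above, assuming the 50 Hz sine from section I: the envelope of the analytic signal should sit near 1 and the instantaneous frequency near 50 Hz. The median is used here only as a robust summary; that choice is an assumption of this sketch.

```python
import numpy as np
from scipy import signal

fs = 1000
t = np.linspace(0, 1, fs, endpoint=False)
sine_wave = np.sin(2 * np.pi * 50 * t)

analytic_signal = signal.hilbert(sine_wave)
envelope = np.abs(analytic_signal)                       # instantaneous amplitude
instant_phase = np.unwrap(np.angle(analytic_signal))     # phase without 2*pi jumps
instant_freq = np.diff(instant_phase) / (2.0 * np.pi) * fs

median_freq = np.median(instant_freq)
median_envelope = np.median(envelope)
print(median_freq, median_envelope)
```

Note that np.diff shortens the array by one sample, so instant_freq has len(sine_wave) - 1 entries.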

VIII. Feature Extraction

• Find peaks in a signal.

peaks, _ = signal.find_peaks(sine_wave, height=0.5)

• Find peaks with prominence criteria.

peaks_prom, _ = signal.find_peaks(noisy_signal, prominence=1)

• Differentiate a signal (e.g., to find velocity from position).

derivative = np.gradient(sine_wave, t)  # d/dt in units per second; np.diff alone gives sample-to-sample differences

• Integrate a signal.

from scipy.integrate import cumulative_trapezoid
integral = cumulative_trapezoid(sine_wave, t, initial=0)

• Detrend a signal to remove a linear trend.

trend = np.linspace(0, 1, fs)
trended_signal = sine_wave + trend
detrended = signal.detrend(trended_signal)

IX. System Analysis

• Define a system via a transfer function (numerator, denominator).

# Example: 2nd order low-pass filter
system = signal.TransferFunction([1], [1, 1, 1])

• Compute the step response of a system.

t_step, y_step = signal.step(system)

• Compute the impulse response of a system.

t_impulse, y_impulse = signal.impulse(system)

• Compute the Bode plot of a system's frequency response.

w, mag, phase = signal.bode(system)
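
A small sketch, using the same example transfer function, of how the step response can be checked against what the system should do: the DC gain of 1 / (s^2 + s + 1) is 1, so the response should settle at 1. The explicit time vector T is an assumption added here so the response has time to settle.

```python
import numpy as np
from scipy import signal

system = signal.TransferFunction([1], [1, 1, 1])

T = np.linspace(0, 20, 500)            # explicit time grid in seconds (illustrative)
t_step, y_step = signal.step(system, T=T)

final_value = y_step[-1]               # should approach the DC gain of 1
overshoot = y_step.max() - 1.0         # about 0.16 for a damping ratio of 0.5
print(final_value, overshoot)
```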

X. Signal Generation from Data

• Generate a signal from a function.

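t_custom = np.linspace(0, 1, 500)  # separate name so the earlier time vector t is not overwritten
custom_signal = np.sinc(8 * t_custom)  # np.sinc(x) is sin(pi*x)/(pi*x), so pi is already built in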

• Convert a list of values to a signal array.

my_data = [0, 1, 2, 3, 2, 1, 0, -1, -2, -1, 0]
data_signal = np.array(my_data)

• Read signal data from a WAV file.

from scipy.io import wavfile
samplerate, data = wavfile.read('audio.wav')

• Create a pulse train signal.

pulse_train = np.zeros(fs)
pulse_train[::100] = 1 # Impulse every 100 samples

#Python #SignalProcessing #SciPy #NumPy #DSP

━━━━━━━━━━━━━━━
By: @DataScienceM ✨

💡 Top 50 Matplotlib Commands in Python

Note: Examples assume the following imports:
import matplotlib.pyplot as plt
import numpy as np

I. Figure & Basic Plots

• Create a figure.

fig = plt.figure(figsize=(8, 6))

• Create a basic line plot.

x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x))

• Show/display the plot.

plt.show()

• Save a figure to a file.

plt.savefig("my_plot.png", dpi=300)

• Create a scatter plot.

plt.scatter(x, np.cos(x))

• Create a bar chart.

categories = ['A', 'B', 'C']
values = [3, 7, 2]
plt.bar(categories, values)

• Create a horizontal bar chart.

plt.barh(categories, values)

• Create a histogram.

data = np.random.randn(1000)
plt.hist(data, bins=30)

• Create a pie chart.

plt.pie(values, labels=categories, autopct='%1.1f%%')

• Create a box plot.

plt.boxplot([data, data*2])

• Display a 2D array or image.

matrix = np.random.rand(10, 10)
plt.imshow(matrix, cmap='viridis')

• Clear the current figure.

plt.clf()

II. Labels, Titles & Legends

• Add a title to the plot.

plt.title("Sine Wave")

• Add a label to the x-axis.

plt.xlabel("Time (s)")

• Add a label to the y-axis.

plt.ylabel("Amplitude")

• Add a legend.

plt.plot(x, np.sin(x), label='Sine')
plt.plot(x, np.cos(x), label='Cosine')
plt.legend()

• Add a grid.

plt.grid(True)

• Add text to the plot at specific coordinates.

plt.text(2, 0.5, 'An important point')

• Add an annotation with an arrow.

plt.annotate('Peak', xy=(np.pi/2, 1), xytext=(3, 1.5),
             arrowprops=dict(facecolor='black', shrink=0.05))

III. Axes & Ticks

• Set the x-axis limits.

plt.xlim(0, 5)

• Set the y-axis limits.

plt.ylim(-1.5, 1.5)

• Set the x-axis ticks and labels.

plt.xticks([0, np.pi, 2*np.pi], ['0', r'$\pi$', r'$2\pi$'])  # raw strings avoid the invalid "\p" escape warning

• Set the y-axis ticks and labels.

plt.yticks([-1, 0, 1])

• Set a logarithmic scale on an axis.

plt.yscale('log')

• Set the aspect ratio of the plot.

plt.axis('equal') # Other options: 'tight', 'off'

IV. Plot Customization

• Set the color of a plot.

plt.plot(x, np.sin(x), color='red')

• Set the line style.

plt.plot(x, np.sin(x), linestyle='--')

• Set the line width.

plt.plot(x, np.sin(x), linewidth=3)

• Set the marker style for points.

plt.plot(x, np.sin(x), marker='o')

• Set the transparency (alpha).

plt.hist(data, alpha=0.5)

• Use a predefined style.

plt.style.use('ggplot')

• Fill the area between two curves.

plt.fill_between(x, np.sin(x), np.cos(x), alpha=0.2)

• Create an error bar plot.

y_err = 0.2 * np.ones_like(x)
plt.errorbar(x, np.sin(x), yerr=y_err)

• Add a horizontal line.

plt.axhline(y=0, color='k', linestyle='-')

• Add a vertical line.

plt.axvline(x=np.pi, color='k', linestyle='-')

• Add a colorbar for plots like imshow or scatter.

im = plt.imshow(matrix, cmap='viridis')  # colorbar needs a mappable to describe
plt.colorbar(im, label='Magnitude')

V. Subplots (Object-Oriented Approach)

• Create a figure and a grid of subplots (preferred method).

fig, ax = plt.subplots() # Single subplot
fig, axes = plt.subplots(2, 2) # 2x2 grid of subplots

• Plot on a specific subplot (Axes object).

axes[0, 0].plot(x, np.sin(x))

• Set the title for a specific subplot.

axes[0, 0].set_title('Subplot 1')

• Set labels for a specific subplot.

axes[0, 0].set_xlabel('X-axis')
axes[0, 0].set_ylabel('Y-axis')

• Add a legend to a specific subplot.

axes[0, 0].legend(['Sine'])

• Add a main title for the entire figure.

fig.suptitle('Main Figure Title')

• Automatically adjust subplot parameters for a tight layout.

plt.tight_layout()

• Share x or y axes between subplots.

fig, axes = plt.subplots(2, 1, sharex=True)

• Get the current Axes instance.

ax = plt.gca()

• Create a second y-axis that shares the x-axis.

ax2 = ax.twinx()
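
A minimal sketch that combines several of the object-oriented calls above into one figure. The Agg backend line is an assumption added so the code runs without a display; drop it in a notebook or interactive session.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Two stacked subplots that share the x-axis
fig, axes = plt.subplots(2, 1, sharex=True, figsize=(8, 6))

axes[0].plot(x, np.sin(x), label='Sine')
axes[0].set_ylabel('Amplitude')
axes[0].legend()

axes[1].plot(x, np.cos(x), color='red', linestyle='--', label='Cosine')
axes[1].set_xlabel('Time (s)')
axes[1].legend()

fig.suptitle('Main Figure Title')
fig.tight_layout()
```

Calling methods on each Axes object keeps it explicit which subplot is being changed, which is harder to track with the plt.* state-based calls.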

VI. Specialized Plots

• Create a contour plot.

X, Y = np.meshgrid(x, x)
Z = np.sin(X) * np.cos(Y)
plt.contour(X, Y, Z, levels=10)

• Create a filled contour plot.

plt.contourf(X, Y, Z)

• Create a stream plot for vector fields.

U, V = np.cos(X), np.sin(Y)
plt.streamplot(X, Y, U, V)

• Create a 3D surface plot.

from mpl_toolkits.mplot3d import Axes3D  # only needed on Matplotlib older than 3.2
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z)

#Python #Matplotlib #DataVisualization #DataScience #Plotting

━━━━━━━━━━━━━━━
By: @DataScienceM ✨

📌 SQL Explained: Normal Forms

🗂 Category: DATA ENGINEERING

🕒 Date: 2024-05-29 | ⏱️ Read time: 9 min read

Applying 1st, 2nd and 3rd normal forms to a database

📌 Simple Ways to Speed Up Your PyTorch Model Training

🗂 Category: MACHINE LEARNING

🕒 Date: 2024-05-28 | ⏱️ Read time: 12 min read

If all machine learning engineers want one thing, it’s faster model training - maybe after good test…

📌 Fine-Tune Smaller Transformer Models: Text Classification

🗂 Category: MACHINE LEARNING

🕒 Date: 2024-05-28 | ⏱️ Read time: 22 min read

Using Microsoft’s Phi-3 to generate synthetic data

📌 How I Assess the Memory Consumption of My Python Code

🗂 Category: ARTIFICIAL INTELLIGENCE

🕒 Date: 2024-05-28 | ⏱️ Read time: 6 min read

Different approaches to measure the memory consumption of a variable or a function

📌 Scaling Monosemanticity: Anthropic’s One Step Towards Interpretable & Manipulable LLMs

🗂 Category:

🕒 Date: 2024-05-28 | ⏱️ Read time: 13 min read

From prompt engineering to activation engineering for more controllable and safer LLMs

🤖🧠 LongCat-Video: Meituan’s Groundbreaking Step Toward Efficient Long Video Generation with AI

🗓️ 04 Nov 2025
📚 AI News & Trends

In the rapidly advancing field of generative AI, the ability to create realistic, coherent, and high-quality videos from text or images has become one of the most sought-after goals. Meituan, one of the leading technology innovators in China, has made a remarkable stride in this domain with its latest open-source model — LongCat-Video. Designed as ...

#LongCatVideo #Meituan #GenerativeAI #VideoGeneration #AIInnovation #OpenSource

📌 Introduction to Domain Adaptation- Motivation, Options, Tradeoffs

🗂 Category:

🕒 Date: 2024-05-28 | ⏱️ Read time: 15 min read

Stepping out of the “comfort zone” – part 1/3 of a deep-dive into domain adaptation…

💡 Top 50 Pandas Operations in Python

Note: Examples assume the following imports:
import pandas as pd
import numpy as np

I. Series & DataFrame Creation

• Create a pandas Series from a list.

s = pd.Series([1, 3, 5, np.nan, 6, 8])

• Create a DataFrame from a dictionary of lists.

data = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
df = pd.DataFrame(data)

• Create a DataFrame from a list of dictionaries.

data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)

• Read data from a CSV file.

df = pd.read_csv('my_file.csv')

• Create a date range.

dates = pd.date_range('20230101', periods=6)

II. Data Inspection & Selection

• View the first 5 rows.

df.head()

• View the last 5 rows.

df.tail()

• Get a concise summary of the DataFrame.

df.info()

• Get descriptive statistics for numerical columns.

df.describe()

• Get the dimensions of the DataFrame (rows, columns).

df.shape

• Get the column labels.

df.columns

• Get the index (row labels).

df.index

• Select a single column.

df['col1'] # or df.col1

• Select multiple columns.

df[['col1', 'col2']]

• Select rows by label/index name using .loc.

df.loc[0:2, ['col1']] # Select rows 0,1,2 and column 'col1'

• Select rows by integer position using .iloc.

df.iloc[0:3, 0:1] # Select first 3 rows and first column

• Perform boolean/conditional selection.

df[df['col1'] > 2]

• Filter rows using .isin().

df[df['col1'].isin([1, 3])]

III. Data Cleaning

• Check for missing/null values.

df.isnull().sum() # Returns a Series with counts of nulls per column

• Drop rows with any missing values.

df.dropna()

• Fill missing values with a specific value.

df.fillna(value=0)

• Check for duplicated rows.

df.duplicated()

• Drop duplicated rows.

df.drop_duplicates(inplace=True)
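
A small sketch of the cleaning calls above on a made-up DataFrame (the column values are illustrative). It also shows that dropna, fillna and drop_duplicates return new objects by default and leave df unchanged unless inplace=True is passed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 1, np.nan, 4],
    'col2': [10, 10, 30, np.nan],
})

null_counts = df.isnull().sum()   # missing values per column
dropped = df.dropna()             # rows with any NaN removed
filled = df.fillna(value=0)       # NaN replaced by 0
deduped = df.drop_duplicates()    # second copy of the (1, 10) row removed

print(null_counts)
print(len(df), len(dropped), len(deduped))
```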

IV. Data Manipulation & Operations

• Drop specified labels (columns or rows).

df.drop('col1', axis=1) # Drop a column

• Rename columns.

df.rename(columns={'col1': 'new_col1_name'})

• Set a column as the index.

df.set_index('col1')

• Reset the index.

df.reset_index(drop=True)

• Apply a function along an axis (e.g., per column).

df.apply(np.cumsum)

• Apply a function element-wise to a Series.

df['col1'].map(lambda x: x*100)

• Sort by values in a column.

df.sort_values(by='col1', ascending=False)

• Sort by index.

df.sort_index(ascending=False)  # sorts by row labels; pass axis=1 to sort the column labels instead

• Change the data type of a column.

df['col1'] = df['col1'].astype('float')  # astype returns a new Series, so assign it back

• Create a new column based on a calculation.

df['new_col'] = df['col1'] * 2

V. Grouping & Aggregation

• Group data by a column.

df.groupby('col1')

• Group by a column and get the sum.

df.groupby('col1').sum()

• Apply multiple aggregation functions at once.

df.groupby('col1').agg(['mean', 'count'])

• Get the size of each group.

df.groupby('col1').size()

• Get the frequency counts of unique values in a Series.

df['col1'].value_counts()

• Create a pivot table.

pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
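
A minimal sketch of grouping and pivoting on a made-up sales table (the column names region, product and units are hypothetical). It shows that a pivot table is essentially a group-by over two keys laid out as a grid.

```python
import pandas as pd

sales = pd.DataFrame({
    'region': ['North', 'North', 'South', 'South'],
    'product': ['A', 'B', 'A', 'B'],
    'units': [10, 20, 30, 40],
})

# Total units per region
totals = sales.groupby('region')['units'].sum()

# Regions as rows, products as columns, summed units in the cells
pivot = pd.pivot_table(sales, values='units', index='region',
                       columns='product', aggfunc='sum')

print(totals)
print(pivot)
```

pivot_table uses the mean as its default aggregation, so aggfunc='sum' is passed explicitly here.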

VI. Merging, Joining & Concatenating

• Merge two DataFrames (like a SQL join).

pd.merge(left_df, right_df, on='key_column')

• Concatenate (stack) DataFrames along an axis.

pd.concat([df1, df2]) # Stacks rows

• Join DataFrames on their indexes.

left_df.join(right_df, how='outer')
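
A short sketch of how the how= argument changes a merge, using two made-up frames that share key_column (the value columns are hypothetical). The default inner join keeps only keys present in both frames; an outer join keeps every key and fills the gaps with NaN.

```python
import pandas as pd

left_df = pd.DataFrame({'key_column': [1, 2, 3], 'left_val': ['a', 'b', 'c']})
right_df = pd.DataFrame({'key_column': [2, 3, 4], 'right_val': ['x', 'y', 'z']})

inner = pd.merge(left_df, right_df, on='key_column')               # keys 2 and 3
outer = pd.merge(left_df, right_df, on='key_column', how='outer')  # keys 1 to 4

print(inner)
print(outer)
```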

VII. Input & Output

• Write a DataFrame to a CSV file.

df.to_csv('output.csv', index=False)

• Write a DataFrame to an Excel file.

df.to_excel('output.xlsx', sheet_name='Sheet1')

• Read data from an Excel file.

pd.read_excel('input.xlsx', sheet_name='Sheet1')

• Read from a SQL database.

pd.read_sql_query('SELECT * FROM my_table', connection_object)

VIII. Time Series & Special Operations

• Use the string accessor (.str) for Series operations.

s.str.lower()  # s must hold strings here, not the numeric Series created earlier
s.str.contains('pattern')

• Use the datetime accessor (.dt) for Series operations.

s.dt.year  # s must be datetime-like, e.g. pd.Series(dates)
s.dt.day_name()

• Create a rolling window calculation.

df['col1'].rolling(window=3).mean()

• Create a basic plot from a Series or DataFrame.

df['col1'].plot(kind='hist')
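
A minimal sketch of the accessors above on Series of the right type, since .str only works on string data and .dt only on datetime-like data. The sample values are made up for illustration.

```python
import pandas as pd

# .str accessor on a Series of strings
names = pd.Series(['Alice', 'BOB', 'carol'])
lowered = names.str.lower()
has_o = names.str.contains('o')        # case-sensitive match by default

# .dt accessor on a Series of datetimes
dates = pd.Series(pd.to_datetime(['2023-01-01', '2023-01-02']))
years = dates.dt.year
day_names = dates.dt.day_name()

# Rolling mean: the first window - 1 results are NaN
rolled = pd.Series([1, 2, 3, 4]).rolling(window=3).mean()

print(lowered.tolist(), has_o.tolist())
print(years.tolist(), day_names.tolist())
print(rolled.tolist())
```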

#Python #Pandas #DataAnalysis #DataScience #Programming

━━━━━━━━━━━━━━━
By: @DataScienceM ✨
