Data Analytics
Dive into the world of Data Analytics – uncover insights, explore trends, and master data-driven decision making.

Admin: @HusseinSheikho || @Hussein_Sheikho
Pyspark Functions.pdf
4.1 MB
M๐—ผ๐˜€๐˜ ๐—ฒ๐—ป๐—ด๐—ถ๐—ป๐—ฒ๐—ฒ๐—ฟ๐˜€ ๐˜‚๐˜€๐—ฒ #๐—ฃ๐˜†๐—ฆ๐—ฝ๐—ฎ๐—ฟ๐—ธ ๐—ฒ๐˜ƒ๐—ฒ๐—ฟ๐˜† ๐—ฑ๐—ฎ๐˜†โ€ฆ ๐—ฏ๐˜‚๐˜ ๐—ณ๐—ฒ๐˜„ ๐—ธ๐—ป๐—ผ๐˜„ ๐˜„๐—ต๐—ถ๐—ฐ๐—ต ๐—ณ๐˜‚๐—ป๐—ฐ๐˜๐—ถ๐—ผ๐—ป๐˜€ ๐—ฎ๐—ฐ๐˜๐˜‚๐—ฎ๐—น๐—น๐˜† ๐—บ๐—ฎ๐˜…๐—ถ๐—บ๐—ถ๐˜‡๐—ฒ ๐—ฝ๐—ฒ๐—ฟ๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐—ป๐—ฐ๐—ฒ.

Ever written long UDFs, confusing joins, or bulky transformations?
Most of that effort is unnecessary: #Spark already gives you built-ins for almost everything.

๐Š๐ž๐ฒ ๐ˆ๐ง๐ฌ๐ข๐ ๐ก๐ญ๐ฌ (๐Ÿ๐ซ๐จ๐ฆ ๐ญ๐ก๐ž ๐๐ƒ๐…)
• Core Ops: select(), withColumn(), filter(), dropDuplicates()
• Aggregations: groupBy(), countDistinct(), collect_list()
• Strings: concat(), split(), regexp_extract(), trim()
• Window: row_number(), rank(), lead(), lag()
• Date/Time: current_date(), date_add(), last_day(), months_between()
• Arrays/Maps: array(), array_union(), MapType

Just mastering these ~20 functions can simplify 70% of your transformations.

https://t.me/DataAnalyticsX
โค5๐Ÿ‘1
Numpy @CodeProgrammer.pdf
2.4 MB
๐Ÿท Sections of the ยซNumPyยป library
โฌ…๏ธ From introductory to advanced


๐Ÿ‘จ๐Ÿปโ€๐Ÿ’ป This is a long-term project to learn Python and NumPy from scratch. The main task is to handle numerical #data and #arrays in #Python using NumPy, and many other libraries are also used.


โœ๏ธ This section shows a structured and complete path for learning #NumPy; but the code examples and exercises help to practically memorize the concepts.


โญ•๏ธ Introduction to NumPy
๐ŸŸ  NumPy arrays
โญ•๏ธ Introduction to array features
๐ŸŸ  Basic operations on arrays
โญ•๏ธ Functions for statistical and aggregative purposes
๐ŸŸ  And...
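As a taste of the early sections, the basics look like this (a minimal sketch, not taken from the course files):

```python
import numpy as np

# Array creation and reshaping
a = np.arange(6).reshape(2, 3)
print(a.shape)         # (2, 3)

# Elementwise (vectorized) operations
print(a * 2)

# Statistical aggregation along an axis
print(a.mean(axis=0))  # column means
```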

https://t.me/CodeProgrammer
SQL Ultimate Cheat Sheet

Standard #SQL, Queries & Management

https://t.me/DataAnalyticsX
I'm pleased to invite you to join my private Signal group.

All my resources will be free and unrestricted there. My goal is to build a clean community exclusively for serious programmers, and I believe Signal is the best platform for it: in the US it is among the most popular messaging apps after WhatsApp, which makes it a natural fit for us as programmers.

https://signal.group/#CjQKIPcpEqLQow53AG7RHjeVk-4sc1TFxyym3r0gQQzV-OPpEhCPw_-kRmJ8LlC13l0WiEfp
โค2
🚀 #Pandas Cheat Sheet for Everyday Data Work

This covers the essential functions used in day-to-day work: inspecting data, selecting rows and columns, cleaning, manipulating, and doing quick aggregations.
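For instance, the inspect/select/aggregate trio might look like this (an illustrative sketch; the data and column names are made up, not taken from the sheet):

```python
import pandas as pd

df = pd.DataFrame({"city": ["NY", "NY", "LA"], "sales": [10, 20, 30]})

df.info()                                  # inspect structure and dtypes
print(df.loc[df["city"] == "NY"])          # select rows by condition
print(df.groupby("city")["sales"].sum())   # quick aggregation
```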

https://t.me/CodeProgrammer
โ—๏ธLISA HELPS EVERYONE EARN MONEY!$29,000 HE'S GIVING AWAY TODAY!

Everyone can join his channel and make money! He gives away from $200 to $5.000 every day in his channel

https://t.me/+YDWOxSLvMfQ2MGNi

โšก๏ธFREE ONLY FOR THE FIRST 500 SUBSCRIBERS! FURTHER ENTRY IS PAID! ๐Ÿ‘†๐Ÿ‘‡

https://t.me/+YDWOxSLvMfQ2MGNi
โค3
Mastering pandas.pdf
1.6 MB
🌟 A new and comprehensive book, "Mastering pandas"

👨🏻‍💻 If you have ever worked with messy, error-prone data, you know how much time and energy it wastes. Incomplete tables, duplicate records, unorganized data: exactly the kind of thing that makes analysis difficult and frustrating.

⬅️ pandas is the way out: a tool that can make these processes many times faster.

🏷 This book is a comprehensive, organized guide to pandas, so you can start from scratch, gradually master the library, and gain the ability to implement real projects. In this file, you'll learn:

🔹 how to clean and prepare large amounts of data for analysis,

🔹 how to analyze real business data and draw conclusions,

🔹 how to automate repetitive tasks with a few lines of code,

🔹 and how to significantly improve the speed and accuracy of your analyses.

🌐 #DataScience #Pandas #Python

https://t.me/CodeProgrammer
Python Libraries You Should Know ✅

🔹 NumPy: Numerical Computing ⚙️
NumPy is the foundation for numerical operations in Python. It provides fast arrays and math functions.

Example:
import numpy as np

arr = np.array([1, 2, 3])
print(arr * 2) # [2 4 6]


Challenge: Create a 3x3 matrix of random integers from 1โ€“10.
matrix = np.random.randint(1, 11, size=(3, 3))
print(matrix)


โฆ Pandas: Data Analysis ๐Ÿผ
Pandas makes it easy to work with tabular data using DataFrames.

Example:
import pandas as pd

data = {"Name": ["Alice", "Bob"], "Age": [25, 30]}
df = pd.DataFrame(data)
print(df)


Challenge: Load a CSV file and show the top 5 rows.
df = pd.read_csv("data.csv")
print(df.head())


โฆ Matplotlib: Data Visualization ๐Ÿ“Š
Matplotlib helps you create charts and plots.

Example:
import matplotlib.pyplot as plt

x = [1, 2, 3]
y = [2, 4, 1]

plt.plot(x, y)
plt.title("Simple Line Plot")
plt.show()


Challenge: Plot a bar chart of fruit sales.
fruits = ["Apples", "Bananas", "Cherries"]
sales = [30, 45, 25]

plt.bar(fruits, sales)
plt.title("Fruit Sales")
plt.show()


โฆ Seaborn: Statistical Plots ๐ŸŽจ
Seaborn builds on Matplotlib with beautiful, high-level charts.

Example:
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
sns.boxplot(x="day", y="total_bill", data=tips)
plt.show()


Challenge: Create a heatmap of correlation.
corr = tips.corr(numeric_only=True)  # restrict to numeric columns (required in pandas >= 2.0)
sns.heatmap(corr, annot=True, cmap="coolwarm")
plt.show()


โฆ Requests: HTTP for Humans ๐ŸŒ
Requests makes it easy to send HTTP requests.

Example:
import requests

response = requests.get("https://api.github.com")
print(response.status_code)
print(response.json())


Challenge: Fetch and print your IP address.
res = requests.get("https://api.ipify.org?format=json")
print(res.json()["ip"])


โฆ Beautiful Soup: Web Scraping ๐Ÿœ
Beautiful Soup helps you extract data from HTML pages.

Example:
from bs4 import BeautifulSoup
import requests

url = "https://example.com"
html = requests.get(url).text
soup = BeautifulSoup(html, "html.parser")

print(soup.title.text)


Challenge: Extract all links from a webpage.
links = soup.find_all("a")
for link in links:
    print(link.get("href"))


Next Steps:
🔹 Combine these libraries for real-world projects
🔹 Try scraping data and analyzing it with Pandas
🔹 Visualize insights with Seaborn and Matplotlib

🚀 Master Data Science & Programming!

Unlock your potential with this curated list of Telegram channels. Whether you need books, datasets, interview prep, or project ideas, we have the perfect resource for you. Join the community today!


🔰 Machine Learning with Python
Learn Machine Learning with hands-on Python tutorials, real-world code examples, and clear explanations for researchers and developers.
https://t.me/CodeProgrammer

🔖 Machine Learning
Machine learning insights, practical tutorials, and clear explanations for beginners and aspiring data scientists. Follow the channel for models, algorithms, coding guides, and real-world ML applications.
https://t.me/DataScienceM

🧠 Code With Python
This channel delivers clear, practical content for developers, covering Python, Django, data structures, and algorithms (DSA) – perfect for learning, coding, and mastering key programming skills.
https://t.me/DataScience4

🎯 PyData Careers | Quiz
Python Data Science jobs, interview tips, and career insights for aspiring professionals.
https://t.me/DataScienceQ

💾 Kaggle Data Hub
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.
https://t.me/datasets1

🧑‍🎓 Udemy Coupons | Courses
The first channel on Telegram that offers free Udemy coupons.
https://t.me/DataScienceC

😀 ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.
https://t.me/DataScienceT

💬 Data Science Chat
An active community group for discussing data challenges and networking with peers.
https://t.me/DataScience9

🐍 Python Arab | Arabic Python
The largest Arabic-speaking group for Python developers to share knowledge and help each other.
https://t.me/PythonArab

🖊 Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks: insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
https://t.me/DataScienceN

📺 Free Online Courses | Videos
Free online courses covering data science, machine learning, analytics, programming, and essential skills for learners.
https://t.me/DataScienceV

📈 Data Analytics
Dive into the world of Data Analytics – uncover insights, explore trends, and master data-driven decision making.
https://t.me/DataAnalyticsX

🎧 Learn Python Hub
Master Python with step-by-step courses – from basics to advanced projects and practical applications.
https://t.me/Python53

⭐️ Research Papers
Professional academic writing and simulation services.
https://t.me/DataScienceY

━━━━━━━━━━━━━━━━━━
Admin: @HusseinSheikho
These Python commands cover 90% of data cleaning tasks you'll ever need 👇
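The attached image isn't reproduced here, but a typical cleaning pass combines a handful of those commands (a sketch with invented data and column names):

```python
import pandas as pd

df = pd.DataFrame({
    "name": [" Alice ", "bob", "bob", None],
    "age": ["25", "30", "30", "40"],
})

df = df.drop_duplicates()                          # drop exact duplicate rows
df["name"] = df["name"].str.strip().str.title()    # normalize strings
df["age"] = df["age"].astype(int)                  # fix column dtype
df = df.dropna(subset=["name"])                    # drop rows missing a name
df = df.rename(columns={"name": "full_name"})      # clearer column name
print(df)
```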
๐Ÿ‘3
Want to get into Data Analysis?
Here are paid courses with certificates to build real skills:

1๏ธโƒฃ Google Data Analytics Certificate
https://lnkd.in/dqEU-yht

2๏ธโƒฃ IBM Data Science Certificate
https://lnkd.in/dQz58dY6

3๏ธโƒฃ SQL Basics for Data Science
https://lnkd.in/dcFHHm28

4๏ธโƒฃ Google Business Intelligence Certificate
https://lnkd.in/d4gbdF24

5๏ธโƒฃ Microsoft Python Development Certificate
https://lnkd.in/dDXX_AHM

Which data skill are you focusing on now?
โค5
Selecting with Transformations and Conditional Logic

#### Data Setup

#### pandas

import pandas as pd

data = {
    'product_id': [101, 102, 103, 104, 105, 106, 107, 108],
    'product_name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Microphone', 'Speakers', 'Charger'],
    'category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Peripherals', 'Peripherals', 'Audio', 'Accessories'],
    'price': [1200.00, 25.00, 75.00, 300.00, 50.00, 80.00, 150.00, 15.00],
    'stock_quantity': [50, 200, 150, 70, 100, 60, 40, 0]
}
df_pd = pd.DataFrame(data)


#### polars

import polars as pl

data = {
    'product_id': [101, 102, 103, 104, 105, 106, 107, 108],
    'product_name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Microphone', 'Speakers', 'Charger'],
    'category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Peripherals', 'Peripherals', 'Audio', 'Accessories'],
    'price': [1200.00, 25.00, 75.00, 300.00, 50.00, 80.00, 150.00, 15.00],
    'stock_quantity': [50, 200, 150, 70, 100, 60, 40, 0]
}
df_pl = pl.DataFrame(data)


#### SQL (Conceptual Table Structure and Data)

-- CREATE TABLE products (
-- product_id INT PRIMARY KEY,
-- product_name VARCHAR(255),
-- category VARCHAR(255),
-- price DECIMAL(10, 2),
-- stock_quantity INT
-- );

-- INSERT INTO products VALUES
-- (101, 'Laptop', 'Electronics', 1200.00, 50),
-- (102, 'Mouse', 'Electronics', 25.00, 200),
-- (103, 'Keyboard', 'Electronics', 75.00, 150),
-- (104, 'Monitor', 'Electronics', 300.00, 70),
-- (105, 'Webcam', 'Peripherals', 50.00, 100),
-- (106, 'Microphone', 'Peripherals', 80.00, 60),
-- (107, 'Speakers', 'Audio', 150.00, 40),
-- (108, 'Charger', 'Accessories', 15.00, 0);


---

Creating New Columns with Expressions (SELECT col1, col2 + col3 AS new_col)

#### pandas

# Select 'product_name', 'price', and calculate 'total_inventory_value'
result_pd = df_pd.assign(
    total_inventory_value=df_pd['price'] * df_pd['stock_quantity'],
    discounted_price=df_pd['price'] * 0.9
)[['product_name', 'price', 'total_inventory_value', 'discounted_price']]
print(result_pd)


#### polars

# Select 'product_name', 'price', and calculate 'total_inventory_value'
result_pl = df_pl.select(
    'product_name',
    'price',
    (pl.col('price') * pl.col('stock_quantity')).alias('total_inventory_value'),
    (pl.col('price') * 0.9).alias('discounted_price')
)
print(result_pl)


#### SQL

-- Select product_name, price, and calculate total_inventory_value and discounted_price
SELECT
    product_name,
    price,
    price * stock_quantity AS total_inventory_value,
    price * 0.9 AS discounted_price
FROM products;


---

Conditional Column Creation (CASE WHEN equivalent)

#### pandas
# Create 'price_level' based on price and 'stock_status'
def get_price_level(price):
    if price > 200:
        return 'High'
    elif price > 50:
        return 'Medium'
    else:
        return 'Low'

def get_stock_status(stock):
    if stock == 0:
        return 'Out of Stock'
    elif stock < 50:
        return 'Low Stock'
    else:
        return 'In Stock'

result_pd = df_pd.assign(
    price_level=df_pd['price'].apply(get_price_level),
    stock_status=df_pd['stock_quantity'].apply(get_stock_status)
)[['product_name', 'price', 'price_level', 'stock_quantity', 'stock_status']]
print(result_pd)
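On large frames the row-wise apply above gets slow; NumPy's np.select expresses the same CASE WHEN logic in vectorized form (a sketch using the same thresholds on a trimmed-down frame):

```python
import numpy as np
import pandas as pd

df_pd = pd.DataFrame({
    "product_name": ["Laptop", "Mouse", "Charger"],
    "price": [1200.00, 25.00, 15.00],
    "stock_quantity": [50, 200, 0],
})

# Conditions are checked in order, like CASE WHEN branches
result_pd = df_pd.assign(
    price_level=np.select(
        [df_pd["price"] > 200, df_pd["price"] > 50],
        ["High", "Medium"],
        default="Low",
    ),
    stock_status=np.select(
        [df_pd["stock_quantity"] == 0, df_pd["stock_quantity"] < 50],
        ["Out of Stock", "Low Stock"],
        default="In Stock",
    ),
)
print(result_pd)
```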


#### polars

# Create 'price_level' based on price and 'stock_status'
result_pl = df_pl.select(
    'product_name',
    'price',
    pl.when(pl.col('price') > 200).then(pl.lit('High'))
      .when(pl.col('price') > 50).then(pl.lit('Medium'))
      .otherwise(pl.lit('Low'))
      .alias('price_level'),
    'stock_quantity',
    pl.when(pl.col('stock_quantity') == 0).then(pl.lit('Out of Stock'))
      .when(pl.col('stock_quantity') < 50).then(pl.lit('Low Stock'))
      .otherwise(pl.lit('In Stock'))
      .alias('stock_status')
)
print(result_pl)


#### SQL

-- Create price_level and stock_status based on conditions
SELECT
    product_name,
    price,
    CASE
        WHEN price > 200 THEN 'High'
        WHEN price > 50 THEN 'Medium'
        ELSE 'Low'
    END AS price_level,
    stock_quantity,
    CASE
        WHEN stock_quantity = 0 THEN 'Out of Stock'
        WHEN stock_quantity < 50 THEN 'Low Stock'
        ELSE 'In Stock'
    END AS stock_status
FROM products;


---

String Transformations in Select

#### pandas

# Select product_name in uppercase and first 3 characters of category
result_pd = df_pd.assign(
    product_name_upper=df_pd['product_name'].str.upper(),
    category_prefix=df_pd['category'].str.slice(0, 3)
)[['product_name', 'product_name_upper', 'category', 'category_prefix']]
print(result_pd)


#### polars

# Select product_name in uppercase and first 3 characters of category
result_pl = df_pl.select(
    'product_name',
    pl.col('product_name').str.to_uppercase().alias('product_name_upper'),
    'category',
    pl.col('category').str.slice(0, 3).alias('category_prefix')
)
print(result_pl)


#### SQL

-- Select product_name in uppercase and first 3 characters of category
SELECT
    product_name,
    UPPER(product_name) AS product_name_upper,
    category,
    SUBSTRING(category, 1, 3) AS category_prefix -- or LEFT(category, 3) in some SQL dialects
FROM products;


---

Selecting with Advanced Filtering (IN, BETWEEN equivalents)

#### pandas

# Select products in 'Electronics' or 'Audio' categories
print("Products in Electronics or Audio:")
print(df_pd[df_pd['category'].isin(['Electronics', 'Audio'])])

# Select products with price between 50 and 200 (inclusive)
print("\nProducts with price between 50 and 200:")
print(df_pd[df_pd['price'].between(50, 200)])


#### polars
โค1
# Select products in 'Electronics' or 'Audio' categories
print("Products in Electronics or Audio:")
print(df_pl.filter(pl.col('category').is_in(['Electronics', 'Audio'])))

# Select products with price between 50 and 200 (inclusive)
print("\nProducts with price between 50 and 200:")
print(df_pl.filter(pl.col('price').is_between(50, 200)))


#### SQL

-- Select products in 'Electronics' or 'Audio' categories
SELECT *
FROM products
WHERE category IN ('Electronics', 'Audio');

-- Select products with price between 50 and 200 (inclusive)
SELECT *
FROM products
WHERE price BETWEEN 50 AND 200;
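In pandas these filters also compose, using the boolean operators & (and), | (or), and ~ (not), with each condition parenthesized as needed (a sketch with a trimmed-down frame, not from the original post):

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["Electronics", "Audio", "Accessories"],
    "price": [300.0, 150.0, 15.0],
})

# Equivalent of: category IN ('Electronics', 'Audio') AND price BETWEEN 50 AND 200
mask = df["category"].isin(["Electronics", "Audio"]) & df["price"].between(50, 200)
print(df[mask])
```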

https://t.me/DataAnalyticsX
The Python library PandasAI has been released for simplified, AI-assisted data analysis.

You can ask questions about a dataset in plain language directly in the chat dialogue, compare different datasets, and create graphs. It saves a lot of time, especially when first getting acquainted with the data. It supports CSV, SQL, and Parquet sources.

And here's the link:

👉 https://t.me/DataAnalyticsX
This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

✅ https://t.me/addlist/8_rRW2scgfRhOTc0

✅ https://t.me/Codeprogrammer
1. What is the primary purpose of the pandas library?
A. Working with unstructured multimedia data
B. Creating and manipulating structured tabular data
C. Building machine learning models
D. Visualizing neural networks

Correct answer: B.

2. Which pandas object is one-dimensional and enforces a homogeneous data type?
A. DataFrame
B. Index
C. Series
D. Panel

Correct answer: C.

3. How can a pd.Series be best compared to an Excel structure?
A. Entire worksheet
B. Row
C. Column
D. Pivot table

Correct answer: C.

4. Which object in pandas represents labels for rows or columns?
A. Series
B. DataFrame
C. Index
D. ndarray

Correct answer: C.

5. What happens if no index is provided when creating a pd.Series?
A. An error is raised
B. A random index is created
C. A RangeIndex starting at 0 is created
D. Index values must be inferred manually

Correct answer: C.

6. Which argument is used to explicitly set the data type of a pd.Series?
A. type=
B. data_type=
C. dtype=
D. astype=

Correct answer: C.

7. What is the default value of the name attribute of a pd.Series if not provided?
A. Empty string
B. Undefined
C. None
D. "Series"

Correct answer: C.

8. Which structure allows heterogeneous column data types?
A. Series
B. Index
C. ndarray
D. DataFrame

Correct answer: D.

9. When constructing a DataFrame from a dictionary, what do the dictionary keys represent?
A. Row labels
B. Index levels
C. Column labels
D. Data types

Correct answer: C.

10. Which attribute returns the number of rows in a pd.Series?
A. size
B. shape
C. len()
D. index

Correct answer: A. (size returns the element count itself; shape returns a one-element tuple, and len() is a function, not an attribute.)

11. What does the pd.Series.shape attribute return?
A. An integer
B. A list
C. A one-element tuple
D. A two-element tuple

Correct answer: C.

12. Which attribute of a DataFrame returns a Series of column data types?
A. dtype
B. dtypes
C. types
D. schema

Correct answer: B.

13. What does len(df) return for a DataFrame?
A. Number of columns
B. Total number of elements
C. Number of rows
D. Size of memory used

Correct answer: C.

14. In basic DataFrame selection using df["a"], what is returned?
A. A DataFrame
B. A scalar
C. A NumPy array
D. A Series

Correct answer: D.

15. What does df[["a"]] return?
A. A Series
B. A DataFrame
C. A scalar
D. A NumPy array

Correct answer: B.

16. When using [] with a Series that has a non-default integer index, selection is done by:
A. Position
B. Order of insertion
C. Label
D. Data type

Correct answer: C.

17. Which method should be used for explicit position-based selection in a Series?
A. loc
B. at
C. iloc
D. ix

Correct answer: C.

18. What does ser.iloc[1] return?
A. All rows with label 1
B. The value at position 1
C. A slice of the Series
D. A DataFrame

Correct answer: B.

19. How many indexers are required when using DataFrame.iloc?
A. One
B. Two
C. Three
D. Unlimited

Correct answer: B.

20. What does df.iloc[:, 0] return?
A. First row
B. First column as a Series
C. First column as a DataFrame
D. Entire DataFrame

Correct answer: B.

21. Which method performs label-based selection in a Series?
A. iloc
B. at
C. loc
D. take

Correct answer: C.

22. What is a key difference between slicing with loc and iloc?
A. loc excludes the stop value
B. iloc includes labels
C. loc includes the stop label
D. iloc works only with strings

Correct answer: C.

23. Which operation may raise a KeyError when using loc?
A. Slicing with ordered unique labels
B. Selecting existing labels
C. Slicing with non-unique unordered labels
D. Selecting with lists

Correct answer: C.

24. In a DataFrame, df.loc["Jack", :] selects:
A. All rows named Jack
B. All columns named Jack
C. All columns for the row labeled Jack
D. Only numeric columns

Correct answer: C.
โค3
Data Analytics
Photo
25. What is the main advantage of using pd.Index.get_indexer when mixing selection styles?
A. Improved readability
B. Lazy evaluation
C. Better performance by avoiding intermediate objects
D. Automatic type conversion

Correct answer: C.
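Several of these answers are quick to verify in a REPL (a sanity check, not part of the quiz):

```python
import pandas as pd

s = pd.Series([10, 20, 30])
print(s.shape)                    # one-element tuple: (3,)
print(s.size)                     # number of elements: 3

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
print(type(df["a"]).__name__)     # Series
print(type(df[["a"]]).__name__)   # DataFrame
print(df.dtypes)                  # a Series of column dtypes
```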

https://t.me/DataAnalyticsX
1. What is the result of the following code?

import pandas as pd
s = pd.Series([10, 20, 30], index=[1, 2, 3])
print(s[1])


A. 10
B. 20
C. 30
D. KeyError

Correct answer: A.

2. What will this code output?

import pandas as pd
s = pd.Series([10, 20, 30])
print(s.iloc[1])


A. 10
B. 20
C. 30
D. IndexError

Correct answer: B.

3. What does this print?

import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df.shape)


A. (4,)
B. (2, 2)
C. (1, 4)
D. (2,)

Correct answer: B.

4. What is returned by this expression?

df["a"]


A. DataFrame
B. Series
C. list
D. ndarray

Correct answer: B.

5. What does this code output?

import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df[["a"]].shape)


A. (2,)
B. (1, 2)
C. (2, 1)
D. (4, 1)

Correct answer: C.

6. What is the result?

import pandas as pd
s = pd.Series([1, 2, 3])
print(s > 1)


A. [False, True, True]
B. Series of booleans
C. ndarray of booleans
D. True

Correct answer: B.

7. What does this code produce?

import pandas as pd
s = pd.Series([1, 2, 3])
print(s[s > 1])


A. Series [2, 3]
B. Series [False, True, True]
C. [2, 3]
D. IndexError

Correct answer: A.

8. What is the output?

import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df.iloc[0, 1])


A. 1
B. 2
C. 3
D. 4

Correct answer: C.

9. What does this select?

df.loc[:, "a"]


A. First row
B. First column as Series
C. First column as DataFrame
D. Entire DataFrame

Correct answer: B.

10. What will this code output?

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(len(df))


A. 1
B. 2
C. 3
D. Error

Correct answer: C.

11. What is returned?

df.values


A. Series
B. DataFrame
C. NumPy ndarray
D. list

Correct answer: C.

12. What does this code output?

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.index)


A. [0, 1, 2]
B. list
C. RangeIndex
D. ndarray

Correct answer: C.

13. What is the result?

df.columns


A. list
B. Series
C. Index
D. dict

Correct answer: C.

14. What does this return?

df.dtypes


A. dict
B. Series
C. DataFrame
D. ndarray

Correct answer: B.

15. What is printed?

import pandas as pd
s = pd.Series([1, None, 3])
print(s.isna().sum())


A. 0
B. 1
C. 2
D. 3

Correct answer: B.

16. What does this code output?

import pandas as pd
s = pd.Series([1, None, 3])
print(s.dropna().values)


A. [1, None, 3]
B. [None]
C. [1, 3]
D. Error

Correct answer: C.

17. What does this expression return?

df.head(1)


A. First column
B. First row as Series
C. First row as DataFrame
D. Entire DataFrame

Correct answer: C.

18. What is the output?

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.tail(1)["a"].iloc[0])


A. 1
B. 2
C. 3
D. Error

Correct answer: C.

19. What happens here?

df["c"] = df["a"] * 2


A. Raises KeyError
B. Modifies column a
C. Adds new column c
D. No effect

Correct answer: C.

20. What does this code output?

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.sum().iloc[0])


A. 1
B. 3
C. 6
D. Error

Correct answer: C.

21. What does df.mean() return?
A. scalar
B. Series
C. DataFrame
D. ndarray

Correct answer: B.

22. What is the result?

df["a"].dtype


A. int
B. numpy.int64
C. object
D. float

Correct answer: B.

23. What does this code do?

df = df.rename(columns={"a": "x"})


A. Renames index
B. Renames column a to x
C. Deletes column a
D. Copies DataFrame only

Correct answer: B.

24. What does this expression return?

df.loc[df["a"] > 1, :]


A. Boolean Series
B. Filtered DataFrame
C. Filtered Series
D. Error

Correct answer: B.

25. What is printed?

import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.empty)


A. True
B. False
C. None
D. Error

Correct answer: B.
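The label-versus-position distinction behind questions 1 and 2 is worth running once yourself (minimal sketch):

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=[1, 2, 3])
print(s.loc[1])   # label-based selection -> 10
print(s.iloc[1])  # position-based selection -> 20
```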

https://t.me/DataAnalyticsX