             Age
count   5.000000
mean   30.000000
std     6.519202
min    22.000000
25%    26.000000
50%    29.000000
75%    35.000000
max    38.000000
---
10. df.columns
Returns the column labels of the DataFrame.
import pandas as pd
df = pd.DataFrame({'Name': [], 'Age': [], 'City': []})
print(df.columns)
Index(['Name', 'Age', 'City'], dtype='object')
---
11. df.dtypes
Returns the data type of each column.
import pandas as pd
df = pd.DataFrame({'Name': ['Alice'], 'Age': [25], 'Salary': [75000.50]})
print(df.dtypes)
Name object
Age int64
Salary float64
dtype: object
---
12. Selecting a Column
Select a single column, which returns a Pandas Series.
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
ages = df['Age']
print(ages)
0 25
1 30
Name: Age, dtype: int64
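For contrast, a minimal sketch of selecting with a list of column names, which returns a DataFrame rather than a Series, even when the list holds a single name. The City column is added here only for illustration.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['NY', 'LA']}
df = pd.DataFrame(data)

age_series = df['Age']           # single label -> Series
age_frame = df[['Age']]          # list with one label -> DataFrame
name_city = df[['Name', 'City']] # list of labels -> DataFrame with those columns

print(type(age_series).__name__)
print(type(age_frame).__name__)
print(name_city)
```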
---
13. df.loc[]
Accesses a group of rows and columns by label(s) or a boolean array.
import pandas as pd
data = {'Age': [25, 30, 35], 'City': ['NY', 'LA', 'CH']}
df = pd.DataFrame(data, index=['Alice', 'Bob', 'Charlie'])
print(df.loc['Bob'])
Age 30
City LA
Name: Bob, dtype: object
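The example above selects a single row. A short sketch of the other two forms mentioned in the description, using the same data: row and column labels together, and a boolean array. The variable names are illustrative.

```python
import pandas as pd

data = {'Age': [25, 30, 35], 'City': ['NY', 'LA', 'CH']}
df = pd.DataFrame(data, index=['Alice', 'Bob', 'Charlie'])

# Row labels and column labels together.
# A label slice includes both endpoints ('Alice' and 'Bob').
subset = df.loc['Alice':'Bob', ['City']]

# Boolean array: keep only the rows where the condition is True,
# and return just the City column for them.
over_28 = df.loc[df['Age'] > 28, 'City']

print(subset)
print(over_28)
```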
---
14. df.iloc[]
Accesses a group of rows and columns by integer position(s).
import pandas as pd
data = {'Age': [25, 30, 35], 'City': ['NY', 'LA', 'CH']}
df = pd.DataFrame(data, index=['Alice', 'Bob', 'Charlie'])
print(df.iloc[1]) # Get the second row (index 1)
Age 30
City LA
Name: Bob, dtype: object
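Positions can also be slices or row/column pairs. A small sketch on the same data; unlike label slices in df.loc[], a position slice excludes its end.

```python
import pandas as pd

data = {'Age': [25, 30, 35], 'City': ['NY', 'LA', 'CH']}
df = pd.DataFrame(data, index=['Alice', 'Bob', 'Charlie'])

first_two = df.iloc[0:2]    # rows at positions 0 and 1 (position 2 is excluded)
last_city = df.iloc[-1, 1]  # last row, second column -> a single value

print(first_two)
print(last_city)
```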
---
15. df.isnull()
Returns a DataFrame of the same shape with boolean values indicating whether each value is missing (NaN or None).
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan], 'B': [3, 4]})
print(df.isnull())
       A      B
0  False  False
1   True  False
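The boolean result is often summarized rather than printed in full. A minimal sketch, assuming the same DataFrame, that counts missing values per column (True counts as 1 when summed):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, np.nan], 'B': [3, 4]})

missing_per_column = df.isnull().sum()  # number of missing values in each column
any_missing = df.isnull().any().any()   # True if any cell in the frame is missing

print(missing_per_column)
print(any_missing)
```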
---
16. df.dropna()
Removes rows that contain missing values (the default) and returns a new DataFrame.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [4, 5, 6]})
cleaned_df = df.dropna()
print(cleaned_df)
     A  B
0  1.0  4
2  3.0  6
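dropna() can also work on columns or look only at selected columns. A sketch of both options; the extra column C is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [4, 5, 6],
    'C': [np.nan, np.nan, 9],
})

all_complete = df.dropna()           # default: drop every row with any missing value
complete_cols = df.dropna(axis=1)    # drop columns that contain a missing value
a_present = df.dropna(subset=['A'])  # only check column A when deciding which rows to drop

print(all_complete)
print(complete_cols)
print(a_present)
```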
---
17. df.fillna()
Fills missing (NaN) values with a specified value. Forward and backward filling are available through df.ffill() and df.bfill().
import pandas as pd
import numpy as np
df = pd.DataFrame({'Score': [90, 85, np.nan, 92]})
filled_df = df.fillna(0)
print(filled_df)
   Score
0   90.0
1   85.0
2    0.0
3   92.0
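Filling with 0 is not always appropriate for scores. Two common alternatives, sketched on the same data: fill with the column mean, or carry the previous value forward with ffill().

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Score': [90, 85, np.nan, 92]})

# mean() skips NaN by default, so this is the mean of 90, 85 and 92.
mean_filled = df.fillna(df['Score'].mean())

# Forward fill: each missing value takes the last valid value above it.
forward_filled = df.ffill()

print(mean_filled)
print(forward_filled)
```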
---
18. df.drop_duplicates()
Removes duplicate rows from the DataFrame.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice'], 'Age': [25, 30, 25]}
df = pd.DataFrame(data)
unique_df = df.drop_duplicates()
print(unique_df)
    Name  Age
0  Alice   25
1    Bob   30
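By default a row counts as a duplicate only if every column matches. A sketch of the subset and keep parameters, with the second Alice given a different age (an illustrative change) so the rows are no longer identical:

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Alice'], 'Age': [25, 30, 26]}
df = pd.DataFrame(data)

# Compare only the Name column; keep the first occurrence of each name.
by_name_first = df.drop_duplicates(subset=['Name'])

# Same comparison, but keep the last occurrence instead.
by_name_last = df.drop_duplicates(subset=['Name'], keep='last')

print(by_name_first)
print(by_name_last)
```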
---
19. df.rename()
Alters axis labels (e.g., column names) and returns a new DataFrame.
import pandas as pd
df = pd.DataFrame({'A': [1], 'B': [2]})
renamed_df = df.rename(columns={'A': 'Column_A', 'B': 'Column_B'})
print(renamed_df)
   Column_A  Column_B
0         1         2
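rename() also accepts a function in place of a dictionary, and it can relabel the index as well as the columns. A brief sketch; the label 'first' is an arbitrary example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1], 'B': [2]})

lowered = df.rename(columns=str.lower)      # apply a function to every column name
relabeled = df.rename(index={0: 'first'})   # map row label 0 to 'first'

print(lowered)
print(relabeled)
print(df.columns)  # the original DataFrame is unchanged
```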
---
20. series.value_counts()
Returns a Series containing counts of unique values.