1. What is the difference between the RANK() and DENSE_RANK() functions?
The RANK() function in the result set defines the rank of each row within your ordered partition. If both rows have the same rank, the next number in the ranking will be the previous rank plus a number of duplicates. If we have three records at rank 4, for example, the next level indicated is 7. The DENSE_RANK() function assigns a distinct rank to each row within a partition based on the provided column value, with no gaps. If we have three records at rank 4, for example, the next level indicated is 5.
2. Explain One-hot encoding and Label Encoding. How do they affect the dimensionality of the given dataset?
One-hot encoding is the representation of categorical variables as binary vectors. Label Encoding is converting labels/words into numeric form. Using one-hot encoding increases the dimensionality of the data set. Label encoding doesnβt affect the dimensionality of the data set. One-hot encoding creates a new variable for each level in the variable whereas, in Label encoding, the levels of a variable get encoded as 1 and 0.
3. Explain the Difference Between Tableau Worksheet, Dashboard, Story, and Workbook in Tableau?
Tableau uses a workbook and sheet file structure, much like Microsoft Excel.
A workbook contains sheets, which can be a worksheet, dashboard, or a story.
A worksheet contains a single view along with shelves, legends, and the Data pane.
A dashboard is a collection of views from multiple worksheets.
A story contains a sequence of worksheets or dashboards that work together to convey information.
4. How can you split a column into 2 or more columns?
You can split a column into 2 or more columns by following the below steps:
1. Select the cell that you want to split. Then, navigate to the Data tab, after that, select Text to Columns. 2. Select the delimiter. 3. Choose the column data format and select the destination you want to display the split. 4. The final output will look like below where the text is split into multiple columns.
The RANK() function in the result set defines the rank of each row within your ordered partition. If both rows have the same rank, the next number in the ranking will be the previous rank plus a number of duplicates. If we have three records at rank 4, for example, the next level indicated is 7. The DENSE_RANK() function assigns a distinct rank to each row within a partition based on the provided column value, with no gaps. If we have three records at rank 4, for example, the next level indicated is 5.
2. Explain One-hot encoding and Label Encoding. How do they affect the dimensionality of the given dataset?
One-hot encoding is the representation of categorical variables as binary vectors. Label Encoding is converting labels/words into numeric form. Using one-hot encoding increases the dimensionality of the data set. Label encoding doesnβt affect the dimensionality of the data set. One-hot encoding creates a new variable for each level in the variable whereas, in Label encoding, the levels of a variable get encoded as 1 and 0.
3. Explain the Difference Between Tableau Worksheet, Dashboard, Story, and Workbook in Tableau?
Tableau uses a workbook and sheet file structure, much like Microsoft Excel.
A workbook contains sheets, which can be a worksheet, dashboard, or a story.
A worksheet contains a single view along with shelves, legends, and the Data pane.
A dashboard is a collection of views from multiple worksheets.
A story contains a sequence of worksheets or dashboards that work together to convey information.
4. How can you split a column into 2 or more columns?
You can split a column into 2 or more columns by following the below steps:
1. Select the cell that you want to split. Then, navigate to the Data tab, after that, select Text to Columns. 2. Select the delimiter. 3. Choose the column data format and select the destination you want to display the split. 4. The final output will look like below where the text is split into multiple columns.
π41β€4
1. What is a Self-Join?
A self-join is a type of join that can be used to connect two tables. As a result, it is a unary relationship. Each row of the table is attached to itself and all other rows of the same table in a self-join. As a result, a self-join is mostly used to combine and compare rows from the same database table.
2. What is OLTP?
OLTP, or online transactional processing, allows huge groups of people to execute massive amounts of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining term is used when you are combining data from the same source, for example, worksheet in an Excel file or tables in Oracle databaseWhile blending requires two completely defined data sources in your report.
4. How to prevent someone from copying the cell from your worksheet in excel?
If you want to protect your worksheet from being copied, go into Menu bar > Review > Protect sheet > Password.
By entering password you can prevent your worksheet from getting copied.
A self-join is a type of join that can be used to connect two tables. As a result, it is a unary relationship. Each row of the table is attached to itself and all other rows of the same table in a self-join. As a result, a self-join is mostly used to combine and compare rows from the same database table.
2. What is OLTP?
OLTP, or online transactional processing, allows huge groups of people to execute massive amounts of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining term is used when you are combining data from the same source, for example, worksheet in an Excel file or tables in Oracle databaseWhile blending requires two completely defined data sources in your report.
4. How to prevent someone from copying the cell from your worksheet in excel?
If you want to protect your worksheet from being copied, go into Menu bar > Review > Protect sheet > Password.
By entering password you can prevent your worksheet from getting copied.
π37
Which of the following is not a component of Power BI?
Anonymous Quiz
14%
Power Query
16%
Power Pivot
56%
Power Prep
14%
Power View
π30π3
Which of the following is not explicitly used for Data Visualization?
Anonymous Quiz
4%
Tableau
8%
Power BI
30%
Qlik
59%
SQL
π23β€5π₯°1
Which of the following is one of the view in Power BI Desktop?
Anonymous Quiz
27%
Report View
48%
Dashboard View
9%
Chart View
16%
ETL View
π32π4β€3π2
Best way to learn a tool?
Anonymous Poll
10%
Learn entire theory and then practice the tool
4%
Read the books to learn the tool and then practice it
14%
Read the official documentation and simultaneously practice the tool
69%
Learn basics of the tool and simultaneously practice it. Watch videos/ read documentation if stuck
3%
None of the above approach works
π21π4β€2
Which of the following is not a python library?
Anonymous Quiz
3%
Numpy
4%
Pandas
72%
Hackerflow
14%
Tensorflow
7%
Matplotlib
π46π5π4π1
1. What is the difference between SQL and MySQL?
SQL is a standard language for retrieving and manipulating structured databases. On the contrary, MySQL is a relational database management system, like SQL Server, Oracle or IBM DB2, that is used to manage SQL databases.
2. What is a Cross-Join?
Cross join can be defined as a cartesian product of the two tables included in the join. The table after join contains the same number of rows as in the cross-product of the number of rows in the two tables. If a WHERE clause is used in cross join then the query will work like an INNER JOIN.
3. What is a Stored Procedure?
A stored procedure is a subroutine available to applications that access a relational database management system (RDBMS). Such procedures are stored in the database data dictionary. The sole disadvantage of stored procedure is that it can be executed nowhere except in the database and occupies more memory in the database server.
4. What is Pattern Matching in SQL?
SQL pattern matching provides for pattern search in data if you have no clue as to what that word should be. This kind of SQL query uses wildcards to match a string pattern, rather than writing the exact word. The LIKE operator is used in conjunction with SQL Wildcards to fetch the required information.
SQL is a standard language for retrieving and manipulating structured databases. On the contrary, MySQL is a relational database management system, like SQL Server, Oracle or IBM DB2, that is used to manage SQL databases.
2. What is a Cross-Join?
Cross join can be defined as a cartesian product of the two tables included in the join. The table after join contains the same number of rows as in the cross-product of the number of rows in the two tables. If a WHERE clause is used in cross join then the query will work like an INNER JOIN.
3. What is a Stored Procedure?
A stored procedure is a subroutine available to applications that access a relational database management system (RDBMS). Such procedures are stored in the database data dictionary. The sole disadvantage of stored procedure is that it can be executed nowhere except in the database and occupies more memory in the database server.
4. What is Pattern Matching in SQL?
SQL pattern matching provides for pattern search in data if you have no clue as to what that word should be. This kind of SQL query uses wildcards to match a string pattern, rather than writing the exact word. The LIKE operator is used in conjunction with SQL Wildcards to fetch the required information.
π25β€10π₯5
Which of the following chart is used to show the relationship between different variables?
Anonymous Quiz
33%
Scatter plot
12%
Heatmap
8%
Bubble chart
47%
All of the above
π12π₯10π₯°1
SQL Constraints
SQL constraints are used to specify rules for the data in a table.
Constraints are used to limit the type of data that can go into a table. This ensures the accuracy and reliability of the data in the table. If there is any violation between the constraint and the data action, the action is aborted.
Constraints can be column level or table level. Column level constraints apply to a column, and table level constraints apply to the whole table.
SQL constraints are used to specify rules for the data in a table.
Constraints are used to limit the type of data that can go into a table. This ensures the accuracy and reliability of the data in the table. If there is any violation between the constraint and the data action, the action is aborted.
Constraints can be column level or table level. Column level constraints apply to a column, and table level constraints apply to the whole table.
π30β€9
SQL CHEAT SHEETπ©βπ»
SQL is a language used to communicate with databases it stands for Structured Query Language and is used by database administrators and developers alike to write queries that are used to interact with the database. Here is a quick cheat sheet of some of the most essential SQL commands:
SELECT - Retrieves data from a database
UPDATE - Updates existing data in a database
DELETE - Removes data from a database
INSERT - Adds data to a database
CREATE - Creates an object such as a database or table
ALTER - Modifies an existing object in a database
DROP -Deletes an entire table or database
ORDER BY - Sorts the selected data in an ascending or descending order
WHERE β Condition used to filter a specific set of records from the database
GROUP BY - Groups a set of data by a common parameter
HAVING - Allows the use of aggregate functions within the query
JOIN - Joins two or more tables together to retrieve data
INDEX - Creates an index on a table, to speed up search times.
SQL is a language used to communicate with databases it stands for Structured Query Language and is used by database administrators and developers alike to write queries that are used to interact with the database. Here is a quick cheat sheet of some of the most essential SQL commands:
SELECT - Retrieves data from a database
UPDATE - Updates existing data in a database
DELETE - Removes data from a database
INSERT - Adds data to a database
CREATE - Creates an object such as a database or table
ALTER - Modifies an existing object in a database
DROP -Deletes an entire table or database
ORDER BY - Sorts the selected data in an ascending or descending order
WHERE β Condition used to filter a specific set of records from the database
GROUP BY - Groups a set of data by a common parameter
HAVING - Allows the use of aggregate functions within the query
JOIN - Joins two or more tables together to retrieve data
INDEX - Creates an index on a table, to speed up search times.
π60β€33π1
1. What are the ways to detect outliers?
Outliers are detected using two methods:
Box Plot Method: According to this method, the value is considered an outlier if it exceeds or falls below 1.5*IQR (interquartile range), that is, if it lies above the top quartile (Q3) or below the bottom quartile (Q1).
Standard Deviation Method: According to this method, an outlier is defined as a value that is greater or lower than the mean Β± (3*standard deviation).
2. What is a Recursive Stored Procedure?
A stored procedure that calls itself until a boundary condition is reached, is called a recursive stored procedure. This recursive function helps the programmers to deploy the same set of code several times as and when required.
3. What is the shortcut to add a filter to a table in EXCEL?
The filter mechanism is used when you want to display only specific data from the entire dataset. By doing so, there is no change being made to the data. The shortcut to add a filter to a table is Ctrl+Shift+L.
4. What is DAX in Power BI?
DAX stands for Data Analysis Expressions. It's a collection of functions, operators, and constants used in formulas to calculate and return values. In other words, it helps you create new info from data you already have.
Outliers are detected using two methods:
Box Plot Method: According to this method, the value is considered an outlier if it exceeds or falls below 1.5*IQR (interquartile range), that is, if it lies above the top quartile (Q3) or below the bottom quartile (Q1).
Standard Deviation Method: According to this method, an outlier is defined as a value that is greater or lower than the mean Β± (3*standard deviation).
2. What is a Recursive Stored Procedure?
A stored procedure that calls itself until a boundary condition is reached, is called a recursive stored procedure. This recursive function helps the programmers to deploy the same set of code several times as and when required.
3. What is the shortcut to add a filter to a table in EXCEL?
The filter mechanism is used when you want to display only specific data from the entire dataset. By doing so, there is no change being made to the data. The shortcut to add a filter to a table is Ctrl+Shift+L.
4. What is DAX in Power BI?
DAX stands for Data Analysis Expressions. It's a collection of functions, operators, and constants used in formulas to calculate and return values. In other words, it helps you create new info from data you already have.
π32β€10π₯°2
What's the fullform of ETL in context of data analysis?
Anonymous Quiz
8%
Explain, transfer and load
88%
Extract, transform and load
2%
Explain, traces and load
1%
Extract, teach and load
β€26π8π₯3π2
Which of the following is not a SQL constraint?
Anonymous Quiz
13%
DEFAULT
12%
NOT NULL
55%
DIFFERENCE
20%
CHECK
β€15π9
Which of the following command is used to group a set of data by common parameter in SQL?
Anonymous Quiz
11%
GROUP DATA
79%
GROUP BY
5%
SORT BY
5%
ORDER BY
π20π₯6
1. Define the term 'Data Wrangling.
Data Wrangling is the process wherein raw data is cleaned, structured, and enriched into a desired usable format for better decision making. It involves discovering, structuring, cleaning, enriching, validating, and analyzing data. This process can turn and map out large amounts of data extracted from various sources into a more useful format.
2. What are the best methods for data cleaning?
Create a data cleaning plan by understanding where the common errors take place and keep all the communications open. Before working with the data, identify and remove the duplicates. This will lead to an easy and effective data analysis process.Focus on the accuracy of the data. Set cross-field validation, maintain the value types of data, and provide mandatory constraints.Normalize the data at the entry point so that it is less chaotic. You will be able to ensure that all information is standardized, leading to fewer errors on entry.
3. Explain 4 steps to use CTE in sql.
All CTE starts with "with" clause.
After with you need to define CTE name and the field names. For instance in the below code snippet I have 3 fields Count,Column and Id. The name of CTE is "MyTemp".
Once you have defined CTE we need to specify the SQL which will give the result for the CTE.
Finally you can use the CTE in your SQL query.
Data Wrangling is the process wherein raw data is cleaned, structured, and enriched into a desired usable format for better decision making. It involves discovering, structuring, cleaning, enriching, validating, and analyzing data. This process can turn and map out large amounts of data extracted from various sources into a more useful format.
2. What are the best methods for data cleaning?
Create a data cleaning plan by understanding where the common errors take place and keep all the communications open. Before working with the data, identify and remove the duplicates. This will lead to an easy and effective data analysis process.Focus on the accuracy of the data. Set cross-field validation, maintain the value types of data, and provide mandatory constraints.Normalize the data at the entry point so that it is less chaotic. You will be able to ensure that all information is standardized, leading to fewer errors on entry.
3. Explain 4 steps to use CTE in sql.
All CTE starts with "with" clause.
After with you need to define CTE name and the field names. For instance in the below code snippet I have 3 fields Count,Column and Id. The name of CTE is "MyTemp".
Once you have defined CTE we need to specify the SQL which will give the result for the CTE.
Finally you can use the CTE in your SQL query.
π29β€9π€2π₯1π1
1. What is a Self-Join?
A self-join is a type of join that can be used to connect two tables. As a result, it is a unary relationship. Each row of the table is attached to itself and all other rows of the same table in a self-join. As a result, a self-join is mostly used to combine and compare rows from the same database table.
2. What is OLTP?
OLTP, or online transactional processing, allows huge groups of people to execute massive amounts of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining term is used when you are combining data from the same source, for example, worksheet in an Excel file or tables in Oracle databaseWhile blending requires two completely defined data sources in your report.
4. How to prevent someone from copying the cell from your worksheet in excel?
If you want to protect your worksheet from being copied, go into Menu bar > Review > Protect sheet > Password.
By entering password you can prevent your worksheet from getting copied.
A self-join is a type of join that can be used to connect two tables. As a result, it is a unary relationship. Each row of the table is attached to itself and all other rows of the same table in a self-join. As a result, a self-join is mostly used to combine and compare rows from the same database table.
2. What is OLTP?
OLTP, or online transactional processing, allows huge groups of people to execute massive amounts of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining term is used when you are combining data from the same source, for example, worksheet in an Excel file or tables in Oracle databaseWhile blending requires two completely defined data sources in your report.
4. How to prevent someone from copying the cell from your worksheet in excel?
If you want to protect your worksheet from being copied, go into Menu bar > Review > Protect sheet > Password.
By entering password you can prevent your worksheet from getting copied.
π32β€9π₯1
Which of the following is not a data visualization tool?
Anonymous Quiz
3%
Tableau
24%
Qlik
7%
Power BI
67%
Pega
π₯21π12π5π₯°2β€1π1
Which of the following is a python library?
Anonymous Quiz
5%
Alteryx
87%
Pandas
2%
Java
4%
Tableau
2%
Javascript
π18π16β€11
Which of the following is/are an example of machine learning usecase?
Anonymous Quiz
14%
Prediction of sales for a specific product
6%
Detecting audience segment for a new movie
5%
Weather forecasting for a city
75%
All of the above
π18β€9π5π4π₯°1