1. What is the difference between SQL and MySQL?
SQL is a standard language for retrieving and manipulating data in relational databases. MySQL, by contrast, is a relational database management system (RDBMS), like SQL Server, Oracle, or IBM DB2, that uses SQL to manage databases.
2. What is a Cross-Join?
A cross join returns the Cartesian product of the two tables in the join: every row of the first table is paired with every row of the second, so the result has as many rows as the row count of the first table multiplied by the row count of the second. If a WHERE clause that relates the two tables is added to a cross join, the query works like an INNER JOIN.
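A minimal sketch of both behaviors, assuming Python's built-in sqlite3 module. The colors and sizes tables are made up for illustration.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE colors (color_id INTEGER, color TEXT);
    CREATE TABLE sizes  (size_id INTEGER, size TEXT);
    INSERT INTO colors VALUES (1, 'red'), (2, 'blue'), (3, 'green');
    INSERT INTO sizes  VALUES (1, 'S'), (2, 'M');
""")

# CROSS JOIN pairs every row of colors with every row of sizes: 3 * 2 = 6 rows.
cross_rows = conn.execute(
    "SELECT color, size FROM colors CROSS JOIN sizes"
).fetchall()

# A WHERE condition that relates the two tables keeps only the matching pairs,
# which is the same result an INNER JOIN ... ON returns.
filtered_rows = conn.execute(
    "SELECT color, size FROM colors CROSS JOIN sizes "
    "WHERE colors.color_id = sizes.size_id ORDER BY color_id"
).fetchall()
inner_rows = conn.execute(
    "SELECT color, size FROM colors INNER JOIN sizes "
    "ON colors.color_id = sizes.size_id ORDER BY color_id"
).fetchall()

print(len(cross_rows))              # 6
print(filtered_rows == inner_rows)  # True
```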
3. What is a Stored Procedure?
A stored procedure is a subroutine available to applications that access a relational database management system (RDBMS). Such procedures are stored in the database data dictionary. The main disadvantages of a stored procedure are that it can be executed only inside the database and that it uses additional memory on the database server.
4. What is Pattern Matching in SQL?
SQL pattern matching lets you search for data when you do not know the exact word you are looking for. This kind of query uses wildcards to match a string pattern instead of an exact value. The LIKE operator is used together with SQL wildcards (% for any sequence of characters, _ for a single character) to fetch the required rows.
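A short sketch of LIKE with the two standard wildcards, assuming Python's built-in sqlite3 module. The customers table and its names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Anna",), ("Anton",), ("Brian",), ("Joan",)],
)

# % matches any sequence of characters, including an empty one.
starts_with_an = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE name LIKE 'An%' ORDER BY name")]

# _ matches exactly one character, so '__an' means four characters ending in "an".
four_chars_ending_an = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE name LIKE '__an' ORDER BY name")]

print(starts_with_an)        # ['Anna', 'Anton']
print(four_chars_ending_an)  # ['Joan']
```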
Which of the following charts is used to show the relationship between different variables?
Anonymous Quiz
33%
Scatter plot
12%
Heatmap
8%
Bubble chart
47%
All of the above
SQL Constraints
SQL constraints are used to specify rules for the data in a table.
Constraints limit the type of data that can go into a table, which ensures the accuracy and reliability of the data. If a data action (such as an INSERT or UPDATE) violates a constraint, the action is aborted.
Constraints can be column level or table level. Column level constraints apply to a column, and table level constraints apply to the whole table.
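A minimal sketch of column-level and table-level constraints, assuming Python's built-in sqlite3 module. The employees table and the try_insert helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        id    INTEGER PRIMARY KEY,          -- column-level constraint
        email TEXT NOT NULL,                -- column-level constraint
        age   INTEGER CHECK (age >= 18),    -- column-level constraint
        dept  TEXT DEFAULT 'unassigned',    -- column-level default value
        UNIQUE (email)                      -- table-level constraint
    )
""")
conn.execute("INSERT INTO employees (email, age) VALUES ('a@example.com', 30)")

def try_insert(email, age):
    """Return True if the row was accepted, False if a constraint aborted the insert."""
    try:
        conn.execute("INSERT INTO employees (email, age) VALUES (?, ?)", (email, age))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "valid row": try_insert("b@example.com", 25),
    "duplicate email": try_insert("a@example.com", 40),  # violates UNIQUE
    "age below 18": try_insert("c@example.com", 12),     # violates CHECK
    "missing email": try_insert(None, 33),               # violates NOT NULL
}
row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(results)
print(row_count)  # 2: only the two valid rows were stored
```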
SQL CHEAT SHEET
SQL stands for Structured Query Language. It is the language used to communicate with databases, and database administrators and developers alike use it to write queries that interact with the database. Here is a quick cheat sheet of some of the most essential SQL commands:
SELECT - Retrieves data from a database
UPDATE - Updates existing data in a database
DELETE - Removes data from a database
INSERT - Adds data to a database
CREATE - Creates an object such as a database or table
ALTER - Modifies an existing object in a database
DROP - Deletes an entire table or database
ORDER BY - Sorts the selected data in ascending or descending order
WHERE - Filters records by a condition
GROUP BY - Groups rows that share a common value
HAVING - Filters groups by a condition, typically one that uses an aggregate function
JOIN - Joins two or more tables together to retrieve data
INDEX - Used as CREATE INDEX to create an index on a table, which speeds up searches
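Several of these commands usually appear together in one query. The sketch below, assuming Python's built-in sqlite3 module and a hypothetical sales table, shows how WHERE, GROUP BY, HAVING, and ORDER BY act in turn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 100), ('north', 250), ('south', 40),
        ('south', 30), ('east', 500);
""")

query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 35          -- filters individual rows first
    GROUP BY region            -- then groups the remaining rows
    HAVING SUM(amount) > 100   -- then filters whole groups
    ORDER BY total DESC        -- finally sorts the result
"""
totals = conn.execute(query).fetchall()

print(totals)  # [('east', 500), ('north', 350)]
```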
1. What are the ways to detect outliers?
Two common methods are used to detect outliers:
Box Plot Method: A value is considered an outlier if it lies more than 1.5*IQR (interquartile range) above the top quartile (Q3) or more than 1.5*IQR below the bottom quartile (Q1).
Standard Deviation Method: A value is considered an outlier if it lies outside the range mean ± (3*standard deviation).
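Both rules can be sketched with Python's standard statistics module. The function names and the sample data are made up for illustration, and the quartiles use the module's default ("exclusive") method, so other tools may place Q1 and Q3 slightly differently.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Box plot rule: flag values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def std_outliers(values, k=3):
    """Standard deviation rule: flag values more than k sample deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > k * sd]

data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]

print(iqr_outliers(data))  # [95]
print(std_outliers(data))  # [95]
```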
2. What is a Recursive Stored Procedure?
A stored procedure that calls itself until a boundary condition is reached is called a recursive stored procedure. Recursion lets programmers reuse the same set of code repeatedly, as and when required.
3. What is the shortcut to add a filter to a table in Excel?
A filter is used when you want to display only specific data from the entire dataset. Filtering does not change the underlying data. The shortcut to add a filter to a table is Ctrl+Shift+L.
4. What is DAX in Power BI?
DAX stands for Data Analysis Expressions. It's a collection of functions, operators, and constants used in formulas to calculate and return values. In other words, it helps you create new info from data you already have.
What is the full form of ETL in the context of data analysis?
Anonymous Quiz
8%
Explain, transfer and load
88%
Extract, transform and load
2%
Explain, traces and load
1%
Extract, teach and load
Which of the following is not a SQL constraint?
Anonymous Quiz
13%
DEFAULT
12%
NOT NULL
55%
DIFFERENCE
20%
CHECK
Which of the following commands is used to group a set of data by a common parameter in SQL?
Anonymous Quiz
11%
GROUP DATA
79%
GROUP BY
5%
SORT BY
5%
ORDER BY
1. Define the term 'Data Wrangling'.
Data Wrangling is the process wherein raw data is cleaned, structured, and enriched into a desired usable format for better decision making. It involves discovering, structuring, cleaning, enriching, validating, and analyzing data. This process can turn and map out large amounts of data extracted from various sources into a more useful format.
2. What are the best methods for data cleaning?
Create a data cleaning plan by understanding where the common errors take place and keep all the communications open. Before working with the data, identify and remove the duplicates. This will lead to an easy and effective data analysis process.
Focus on the accuracy of the data. Set cross-field validation, maintain the value types of data, and provide mandatory constraints.
Normalize the data at the entry point so that it is less chaotic. You will be able to ensure that all information is standardized, leading to fewer errors on entry.
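The three ideas above (normalize at entry, validate, remove duplicates) can be sketched in plain Python. The records, field names, and the age limits are assumptions made up for this example.

```python
raw_records = [
    {"email": " Ana@Example.com ", "age": "34"},
    {"email": "ana@example.com", "age": "34"},   # duplicate once normalized
    {"email": "bo@example.com", "age": "-5"},    # fails validation
    {"email": "cy@example.com", "age": "41"},
]

def normalize(record):
    """Standardize formatting and value types at the entry point."""
    return {"email": record["email"].strip().lower(), "age": int(record["age"])}

def is_valid(record):
    """A simple mandatory constraint: a non-empty email and a plausible age."""
    return bool(record["email"]) and 0 <= record["age"] <= 120

cleaned, seen = [], set()
for record in map(normalize, raw_records):
    # Skip rows that repeat an email already kept, or that break the constraint.
    if record["email"] in seen or not is_valid(record):
        continue
    seen.add(record["email"])
    cleaned.append(record)

print(cleaned)
```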
3. Explain the 4 steps to use a CTE in SQL.
Every CTE starts with a WITH clause.
After WITH, define the CTE name and its field names. For instance, a CTE named "MyTemp" could have 3 fields: Count, Column, and Id.
Once the CTE is declared, specify the SQL query that produces the result for the CTE.
Finally, use the CTE in your SQL query.
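The four steps can be sketched as follows, assuming Python's built-in sqlite3 module. The CTE keeps the name MyTemp, but the orders table and the field names customer_id and order_count are hypothetical, chosen to avoid reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1, 50), (1, 70), (2, 20), (3, 90), (3, 10), (3, 5);
""")

query = """
    WITH MyTemp (customer_id, order_count) AS (  -- steps 1 and 2: WITH, name, fields
        SELECT customer_id, COUNT(*)             -- step 3: the query that fills the CTE
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, order_count              -- step 4: use the CTE like a table
    FROM MyTemp
    WHERE order_count >= 2
    ORDER BY customer_id
"""
frequent_customers = conn.execute(query).fetchall()

print(frequent_customers)  # [(1, 2), (3, 3)]
```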
1. What is a Self-Join?
A self-join is a join in which a table is joined to itself, which makes it a unary relationship. Each row of the table is matched against itself and all other rows of the same table, subject to the join condition. As a result, a self-join is mostly used to combine and compare rows from the same database table.
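A common illustration is pairing each employee with their manager, where both live in the same table. This is a sketch assuming Python's built-in sqlite3 module; the employees table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Asha', NULL), (2, 'Ben', 1), (3, 'Chen', 1), (4, 'Dana', 2);
""")

# The same table appears twice under different aliases: e for the employee row
# and m for the manager row it is compared against.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employees AS e
    JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(pairs)  # [('Ben', 'Asha'), ('Chen', 'Asha'), ('Dana', 'Ben')]
```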
2. What is OLTP?
OLTP, or online transaction processing, allows large numbers of people to execute large volumes of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining is used when you are combining data from the same source, for example, worksheets in an Excel file or tables in an Oracle database. Blending, in contrast, requires two completely defined data sources in your report.
4. How do you prevent someone from copying cells from your worksheet in Excel?
To protect your worksheet from being copied, go to Menu bar > Review > Protect Sheet and set a password.
Once the password is set, the worksheet is protected from being copied.
Which of the following is not a data visualization tool?
Anonymous Quiz
3%
Tableau
24%
Qlik
7%
Power BI
67%
Pega
Which of the following is a python library?
Anonymous Quiz
5%
Alteryx
87%
Pandas
2%
Java
4%
Tableau
2%
Javascript
Which of the following is/are an example of a machine learning use case?
Anonymous Quiz
14%
Prediction of sales for a specific product
6%
Detecting audience segment for a new movie
5%
Weather forecasting for a city
75%
All of the above
Which of the following tools supports ETL and data modelling capabilities?
Anonymous Quiz
7%
Javascript
72%
Power BI
7%
Signavio
15%
Tableau Desktop
Which of the following is a python library to create charts?
Anonymous Quiz
6%
Alteryx
78%
Matplotlib
4%
Javascript
13%
Tableau
Learning and Practicing SQL: Resources and Platforms
1. https://sqlbolt.com/
2. https://sqlzoo.net/
3. https://www.codecademy.com/learn/learn-sql
4. https://www.w3schools.com/sql/
5. https://www.hackerrank.com/domains/sql
6. https://www.windowfunctions.com/
7. https://selectstarsql.com/
8. https://quip.com/2gwZArKuWk7W
9. https://leetcode.com/problemset/database/
10. https://t.me/learndataanalysis/327
11. https://learnsql.com/?ref=analyst
Which of the following is a join operation where a table is joined with itself?
Anonymous Quiz
13%
Full join
71%
Self join
12%
Union
4%
Cross join
1. What is Density-based Clustering?
Density-Based Clustering is an unsupervised machine learning method that identifies different groups or clusters in the data space. These clustering techniques are based on the concept that a cluster in the data space is a contiguous region of high point density, separated from other such clusters by contiguous regions of low point density.
Partition-based (K-means) and hierarchical clustering techniques work well with compact, regularly shaped clusters, while density-based techniques are effective for arbitrarily shaped clusters and for detecting outliers.
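DBSCAN is one well-known density-based algorithm. The following is a simplified pure-Python sketch of it, not a production implementation; the function name, the eps and min_pts values, and the sample points are assumptions chosen for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1  # point i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is also a core point, keep expanding
                queue.extend(reachable)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)

print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two dense groups become clusters 0 and 1, while the isolated point (5, 5) is labeled -1, which is how density-based methods surface outliers.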
2. How to create empty tables with the same structure as another table?
To create empty tables:
Use the INTO operator to fetch the records of one table into a new table while setting a WHERE clause that is false for all rows (for example, WHERE 1 = 0). SQL then creates a new table with the same structure to accept the fetched rows, but nothing is stored in it because no row satisfies the WHERE clause.
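SELECT ... INTO is SQL Server syntax. The sketch below shows the same always-false WHERE trick using Python's built-in sqlite3 module, where the equivalent statement is CREATE TABLE ... AS SELECT. The products table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER, name TEXT, price REAL);
    INSERT INTO products VALUES (1, 'pen', 1.5), (2, 'book', 12.0);
""")

# WHERE 1 = 0 is false for every row, so only the column layout is copied.
conn.execute("CREATE TABLE products_empty AS SELECT * FROM products WHERE 1 = 0")

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(products_empty)")]
row_count = conn.execute("SELECT COUNT(*) FROM products_empty").fetchone()[0]

print(columns)    # ['id', 'name', 'price']
print(row_count)  # 0
```

In SQLite, a table created this way gets the column names but not the original table's constraints or indexes.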
3. What is a Parameter in Tableau? Give an Example.
A parameter is a dynamic value that a user can select, and you can use it to replace constant values in calculations, filters, and reference lines.
For example, when creating a filter to show the top 10 products based on total profit, you can replace the fixed value with a parameter so that the filter shows the top 10, 20, or 30 products.
4. How will you write the formula for the following in Excel? - Multiply the value in cell A1 by 10, add 5 to the result, and divide it by 2.
To write this formula, we have to follow the PEMDAS order of precedence. The correct answer is =((A1*10)+5)/2.
Answers such as =A1*10+5/2 and =(A1*10)+5/2 are not correct, because only the 5 is divided by 2. Parentheses must enclose the whole sum before the division.
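Python applies the same precedence to *, /, and + as Excel, so the difference can be checked with a quick sketch. The value 4 is a hypothetical stand-in for cell A1.

```python
a1 = 4  # hypothetical value standing in for cell A1

# Multiply, then add, then divide the whole sum: (40 + 5) / 2
intended = ((a1 * 10) + 5) / 2

# Without the outer parentheses, only 5 is divided by 2: 40 + 2.5
unparenthesized = a1 * 10 + 5 / 2

print(intended)         # 22.5
print(unparenthesized)  # 42.5
```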
5. How can you remove duplicate values in a range of cells?
1. To delete duplicate values in a column, select the highlighted cells, and press the delete button. After deleting the values, go to the "Conditional Formatting" option present in the Home tab. Choose "Clear Rules" to remove the rules from the sheet. 2. You can also delete duplicate values by selecting the "Remove Duplicates" option under Data Tools present in the Data tab.