Here are the SQL interview questions:
Basic SQL Questions
1. What is SQL, and what is its purpose?
2. Write a SQL query to retrieve all records from a table.
3. How do you select specific columns from a table?
4. What is the difference between WHERE and HAVING clauses?
5. How do you sort data in ascending/descending order?
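Questions 4 and 5 above are easiest to answer with a small query in hand. The sketch below uses Python's built-in sqlite3 module; the sales table and its rows are hypothetical, made up only for illustration.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100), ("north", 50), ("south", 30), ("south", 20), ("east", 500)],
)

# WHERE filters individual rows before grouping;
# HAVING filters the groups produced by GROUP BY.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 30          -- row-level filter
    GROUP BY region
    HAVING SUM(amount) > 100    -- group-level filter
    ORDER BY total DESC         -- DESC = descending; ASC (the default) = ascending
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Here the south group survives the WHERE filter with one row but is dropped by HAVING because its total is not above 100.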
SQL Query Questions
1. Write a SQL query to retrieve the top 10 records from a table based on a specific column.
2. How do you join two tables based on a common column?
3. Write a SQL query to retrieve data from multiple tables using subqueries.
4. How do you use aggregate functions (SUM, AVG, MAX, MIN)?
5. Write a SQL query to retrieve data from a table for a specific date range.
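A possible way to practice questions 1, 2, and 5 above is sketched below with Python's built-in sqlite3 module. The customers and orders tables are hypothetical. LIMIT is the SQLite/MySQL/PostgreSQL spelling of "top N"; SQL Server uses TOP instead.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         amount INTEGER, order_date TEXT);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES
        (1, 1, 250, '2024-01-05'),
        (2, 2, 900, '2024-01-20'),
        (3, 1, 400, '2024-02-11'),
        (4, 2, 100, '2024-03-02');
""")

# Top-N: sort by the column of interest, then keep the first N rows.
top_two = conn.execute(
    "SELECT id, amount FROM orders ORDER BY amount DESC LIMIT 2"
).fetchall()

# Join on the common column and restrict to a date range.
# ISO-8601 date strings compare correctly as plain text.
january = conn.execute("""
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.order_date BETWEEN '2024-01-01' AND '2024-01-31'
    ORDER BY o.order_date
""").fetchall()

print(top_two)
print(january)
```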
SQL Optimization Questions
1. How do you optimize SQL query performance?
2. What is indexing, and how does it improve query performance?
3. How do you avoid full table scans?
4. What is query caching, and how does it work?
5. How do you optimize SQL queries for large datasets?
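For questions 2 and 3 above, it helps to see a query plan change once an index exists. This is a minimal sketch using SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the orders table and the index name idx_orders_customer are hypothetical, and other databases expose plans through their own commands.

```python
import sqlite3

# Hypothetical table with 1000 rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM orders WHERE customer_id = 42"

before = plan(query)  # no index on customer_id yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # the planner can now search via the index

print(before)
print(after)
```

The exact plan wording varies between SQLite versions, so the sketch prints the plans rather than assuming a fixed string.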
SQL Joins and Subqueries
1. Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.
2. Write a SQL query to retrieve data from two tables using a subquery.
3. How do you use EXISTS and IN operators in SQL?
4. Write a SQL query to retrieve data from multiple tables using a self-join.
5. Explain the concept of correlated subqueries.
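The join and subquery questions above can be explored with a tiny dataset. The sketch below (Python's sqlite3, hypothetical customers and orders tables) contrasts INNER JOIN with LEFT JOIN and shows EXISTS with a correlated subquery. RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

# Hypothetical data: Chen has no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (1, 1, 250), (2, 1, 400), (3, 2, 100);
""")

# INNER JOIN keeps only customers that have at least one matching order.
inner = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.amount
""").fetchall()

# LEFT JOIN keeps every customer; unmatched ones get NULL (None in Python).
left = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.amount
""").fetchall()

# Correlated subquery: the inner query refers to c.id from the outer row.
with_orders = conn.execute("""
    SELECT c.name
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""").fetchall()

print(inner, left, with_orders, sep="\n")
```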
SQL Data Modeling
1. Explain the concept of normalization and denormalization.
2. How do you design a database schema for a given application?
3. What is data redundancy, and how do you avoid it?
4. Explain the concept of primary and foreign keys.
5. How do you handle data inconsistencies and anomalies?
SQL Advanced Questions
1. Explain the concept of window functions (ROW_NUMBER, RANK, etc.).
2. Write a SQL query to retrieve data using Common Table Expressions (CTEs).
3. How do you use dynamic SQL?
4. Explain the concept of stored procedures and functions.
5. Write a SQL query to retrieve data using pivot tables.
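For questions 1 and 2 above, here is a small sketch that combines a CTE with ROW_NUMBER and RANK. It runs on Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The employees table is hypothetical.

```python
import sqlite3

# Hypothetical data: Asha and Ben tie on salary within the eng department.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 'eng', 120), ('Ben', 'eng', 120), ('Chen', 'eng', 100),
        ('Dia', 'ops', 90), ('Eli', 'ops', 80);
""")

rows = conn.execute("""
    WITH ranked AS (
        SELECT name, dept, salary,
               -- ROW_NUMBER always gives distinct numbers (ties broken by name here)
               ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC, name) AS row_num,
               -- RANK gives tied rows the same rank and then skips ahead
               RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS salary_rank
        FROM employees
    )
    SELECT name, dept, row_num, salary_rank
    FROM ranked
    ORDER BY dept, row_num
""").fetchall()

for row in rows:
    print(row)
```

The tie in the eng department is where the two functions differ: both tied rows share rank 1 and the next row gets rank 3, while the row numbers run 1, 2, 3.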
SQL Scenario-Based Questions
1. You have two tables, Orders and Customers. Write a SQL query to retrieve all orders for customers from a specific region.
2. You have a table with duplicate records. Write a SQL query to remove duplicates.
3. You have a table with missing values. Write a SQL query to replace missing values with a default value.
4. You have a table with data in an incorrect format. Write a SQL query to correct the format.
5. You have two tables with different data types for a common column. Write a SQL query to join the tables.
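Scenarios 2 and 3 above might be approached as in the sketch below, written with Python's sqlite3 module. The contacts table is hypothetical, and rowid is SQLite-specific; on other databases a primary key or ROW_NUMBER() would typically play the same role.

```python
import sqlite3

# Hypothetical table with one exact duplicate and one missing city.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (email TEXT, city TEXT);
    INSERT INTO contacts VALUES
        ('a@example.com', 'Pune'), ('a@example.com', 'Pune'),
        ('b@example.com', NULL);
""")

# Scenario 2: keep the first row of each duplicate group, delete the rest.
conn.execute("""
    DELETE FROM contacts
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM contacts GROUP BY email, city)
""")

# Scenario 3: replace missing values with a default.
conn.execute("UPDATE contacts SET city = COALESCE(city, 'Unknown')")

rows = conn.execute("SELECT email, city FROM contacts ORDER BY email").fetchall()
print(rows)
```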
SQL Behavioral Questions
1. Can you explain a time when you optimized a slow-running SQL query?
2. How do you handle database errors and exceptions?
3. Can you describe a complex SQL query you wrote and why?
4. How do you stay up-to-date with new SQL features and best practices?
5. Can you walk me through your process for troubleshooting SQL issues?
Hope it helps :)
PySpark Interview Questions!!
Interviewer: "How would you remove duplicates from a large dataset in PySpark?"
Candidate: "To remove duplicates from a large dataset in PySpark, I would follow these steps:
Step 1: Load the dataset into a DataFrame
df = spark.read.csv("path/to/data.csv", header=True, inferSchema=True)
Step 2: Check for duplicates
duplicate_count = df.count() - df.dropDuplicates().count()
print(f"Number of duplicates: {duplicate_count}")
Step 3: Partition the data to optimize performance
df_repartitioned = df.repartition(100)
Step 4: Remove duplicates using the dropDuplicates() method
df_no_duplicates = df_repartitioned.dropDuplicates()
Step 5: Cache the resulting DataFrame to avoid recomputing
df_no_duplicates.cache()
Step 6: Save the cleaned dataset
df_no_duplicates.write.csv("path/to/cleaned/data.csv", header=True)
Interviewer: "That's correct! Can you explain why you partitioned the data in Step 3?"
Candidate: "Yes, partitioning the data helps to distribute the computation across multiple nodes, making the process more efficient and scalable."
Interviewer: "Great answer! Can you also explain why you cached the resulting DataFrame in Step 5?"
Candidate: "Caching the DataFrame avoids recomputing the entire dataset when saving the cleaned data, which can significantly improve performance."
Interviewer: "Excellent! You have demonstrated a clear understanding of optimizing duplicate removal in PySpark."
All the best 👍👍
Top Interview Questions for Apache Airflow 👇👇
1. What is Apache Airflow?
2. Is Apache Airflow an ETL tool?
3. How do we define workflows in Apache Airflow?
4. What are the components of the Apache Airflow architecture?
5. What are Local Executors and their types in Airflow?
6. What is a Celery Executor?
7. How is Kubernetes Executor different from Celery Executor?
8. What are Variables (Variable Class) in Apache Airflow?
9. What is the purpose of Airflow XComs?
10. What are the states a Task can be in? Define an ideal task flow.
11. What is the role of Airflow Operators?
12. How does Airflow communicate with a third party (S3, Postgres, MySQL)?
13. What are the basic steps to create a DAG?
14. What is Branching in Directed Acyclic Graphs (DAGs)?
15. What are ways to Control Airflow Workflow?
16. Explain the ExternalTaskSensor.
17. What are the ways to monitor Apache Airflow?
18. What is the TaskFlow API, and how is it helpful?
19. How are Connections used in Apache Airflow?
20. Explain Dynamic DAGs.
21. What are some of the most useful Airflow CLI commands?
22. How to control the parallelism or concurrency of tasks in Apache Airflow configuration?
23. What do you understand by Jinja Templating?
24. What are Macros in Airflow?
25. What are the limitations of TaskFlow API?
26. How is the Executor involved in the Airflow Life cycle?
27. List the types of Trigger rules.
28. What are SLAs?
29. What is Data Lineage?
30. What is a Spark Submit Operator?
31. What is a Spark JDBC Operator?
32. What is the SparkSQL operator?
33. What is the difference between client mode and cluster mode when deploying a Spark job?
34. How would you approach queuing up multiple DAGs with order dependencies?
35. What if your Apache Airflow DAG failed for the last ten days, and now you want to backfill those ten days' data, but you don't need to run all the tasks of the DAG to do so?
36. What will happen if you set 'catchup=False' in the DAG and 'latest_only = True' for some of the DAG's tasks?
37. What if you need a set of functions to be used across a directed acyclic graph?
38. How would you handle a task that has no dependencies on any other tasks?
39. How can you use a set or a subset of parameters in some of the DAG's tasks without explicitly defining them in each task?
40. Is there any way to restrict the number of variables used in your directed acyclic graph, and why would we need to do that?
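Several questions above (3, 13, 34) rest on one idea: tasks form a directed acyclic graph, and the dependencies decide the run order. The sketch below illustrates only that idea in plain Python with the standard-library graphlib module (Python 3.9+). It is not Airflow's API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
}

# static_order() yields every task after all of its dependencies.
# A cycle in the graph would raise graphlib.CycleError, which is why
# workflows have to be acyclic.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

The two middle tasks have no dependency on each other, so their relative order is not fixed; a scheduler is free to run them in parallel.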
Hope this helps you 😊
ACCENTURE Interview Experience
1) Self-introduction
2) Project: major and minor
3) Difficulties faced in the project
4) Which subject do you like least, and why?
5) Hobbies I mentioned in my self-introduction. From my hobbies (chess), he asked why I like it and whether I play using named moves or randomly.
6) Last but not least: do you have any questions?
Accenture Interview Experience
Self-introduction
Explain your final-year project.
How many members are there in your team?
How did you assign work to your teammates?
Have you given any other interviews?
Problems faced in the project
How do you handle a situation where you have timelines and deadlines?
What changes have you observed in yourself between now and before joining college?
Any questions?
🚨Data Science Interview Questions
1. How many cars are there in Chennai? How do you structure your approach to coming up with that number?
2. Multiple Linear Regression?
3. OLS vs MLE?
4. R2 vs Adjusted R2? During Model Development which one do we consider?
5. Lift chart, drift chart
6. Sigmoid Function in Logistic regression
7. What is ROC? What is AUC, and how do the two differ?
8. Simple linear regression vs multiple linear regression
9. What is a p-value, and what is its significance? What does the P in p-value stand for? What is hypothesis testing? Null hypothesis vs alternative hypothesis?
10. Bias-variance trade-off?
11. Overfitting vs underfitting in machine learning?
12. Estimation of Multiple Linear Regression
13. Forecasting vs Prediction difference? Regression vs Time Series?
14. p,d,q values in ARIMA models
    a. What will happen if d=0?
    b. What is the meaning of the p, d, q values?
15. Is your forecasting data uni- or multi-dimensional?
16. How do you find the node to start with in a decision tree?
17. Types of decision trees: CART vs C4.5 vs ID3
18. Gini index vs entropy
19. Linear vs Logistic Regression
20. Decision Trees vs Random Forests
21. Questions on linear regression and how it works
22. Asked to write some SQL queries
23. Asked about past work experience
24. Some questions on inferential statistics (hypothesis testing, sampling techniques)
25. Some questions on Tableau (how to filter, how to add calculated fields, etc.)
26. Why do you use a licensed platform when open-source packages are available?
27. What certifications have you done?
28. What is a Confidence Interval?
29. What are Outliers? How to Detect Outliers?
30. How to Handle Outliers?
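Questions 18 and 29 above lend themselves to a quick calculation. The sketch below shows the usual textbook definitions of Gini impurity and entropy for a set of class labels, plus one common outlier rule (values beyond 1.5 times the interquartile range). The function names and the sample data are made up for illustration, and the 1.5 multiplier is a convention rather than a fixed rule.

```python
import math
from collections import Counter
from statistics import quantiles

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions p_i."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

balanced = ["yes", "yes", "no", "no"]
print(gini(balanced), entropy(balanced))
print(iqr_outliers([10, 12, 11, 13, 12, 11, 100]))
```

Both impurity measures are zero for a pure node and largest for an evenly mixed one; for two balanced classes Gini peaks at 0.5 and entropy at 1 bit.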
Capgemini Interview Questions for #Automation Engineer (4+ Years)
1. Explain the automation framework you have worked on and its components.
2. What are the different types of waits in Selenium? Provide examples.
3. How do you handle dynamic web elements in Selenium?
4. Write a program to check if a given string is a palindrome.
5. What is the Page Object Model (POM), and why is it used?
6. Write a program to merge two sorted arrays without using inbuilt functions.
7. What is the difference between implicit wait, explicit wait, and fluent wait?
8. How can you rerun failed test cases in TestNG?
9. How do you manage test data in your automation scripts?
10. Explain the difference between Selenium WebDriver and Selenium Grid.
11. How would you handle pop-ups and alerts in Selenium?
12. Write a SQL query to fetch the second-highest salary from a table.
13. Write a Java program to swap two numbers without using a temporary variable.
14. What is the difference between abstraction and encapsulation?
15. What are RESTful APIs? How would you test them using Postman or RestAssured?
16. Write a program to count the number of vowels in a string.
17. Write a Java program to reverse a string without using inbuilt functions.
18. How do you prioritize and plan test automation?
19. Explain the difference between @BeforeTest, @BeforeClass, and @BeforeMethod annotations in TestNG.
20. What is continuous integration? Which CI tools have you worked with?
21. Tricky: Write a Java program to check if a number is prime without using inbuilt functions.
22. Explain the differences between HashMap and ConcurrentHashMap.
23. How do you avoid deadlocks in a multithreaded program?
24. Write a Java program to reverse the digits of a number.
25. What is XPath? Explain the difference between absolute and relative XPath.
26. How do you ensure cross-browser compatibility in Selenium scripts?
27. What are the common challenges faced in automation testing, and how do you overcome them?
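The coding questions above (4, 6, 21, 24) ask for Java, but the logic is language-independent. The sketch below works through them in Python instead, avoiding helpers such as sorting or slicing-based reversal so the loops map directly onto a Java answer. The function names are my own.

```python
def is_palindrome(text):
    """Compare characters from both ends, moving inward."""
    left, right = 0, len(text) - 1
    while left < right:
        if text[left] != text[right]:
            return False
        left += 1
        right -= 1
    return True

def merge_sorted(a, b):
    """Merge two already-sorted lists with two pointers, without calling sort."""
    merged, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these tails is non-empty.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged

def is_prime(n):
    """Trial division up to the square root of n."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

def reverse_digits(n):
    """Reverse the decimal digits of an integer, keeping its sign."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    result = 0
    while n > 0:
        result = result * 10 + n % 10
        n //= 10
    return sign * result

print(is_palindrome("level"), merge_sorted([1, 4, 9], [2, 3, 10]))
print(is_prime(97), reverse_digits(1230))
```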
Nagarro Interview Experience – 25 LPA Cracked! 🎯
I’m thrilled to share my interview journey with Nagarro! Here’s a detailed breakdown of the process:
📌 Round 1: Aptitude and Technical Online Test
👉 The online aptitude test included verbal ability questions and Java program output questions.
💡 Pro Tip: Ensure a stable internet connection and active webcam. Even a single disconnection could lead to disqualification.
📌 Round 2: Technical Round
Here are some of the questions I tackled:
• Shift all even numbers to the left side of an array and odd numbers to the right.
• Can you create an object of an interface or abstract class? Explain.
• Why is String immutable in Java?
• What is the purpose of LinkedHashMap in Java? Have you used it in a framework?
• What is the invocationCount in TestNG?
• How do you wait for the visibility of an element in Selenium?
• How do you use AutoIT to upload a file?
• What is an “Element Click Intercepted Exception,” and how do you resolve it?
• Challenges faced while working with frameworks?
• What is the normalize-space function in XPath, and how is it used?
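The first question of this round (even numbers to the left, odd numbers to the right) can be solved in place with two pointers. The sketch below is one possible answer, written in Python; the function name is hypothetical, and this version does not preserve the original relative order of the elements.

```python
def evens_left(numbers):
    """Partition in place: even values move to the front, odd values to the back."""
    left, right = 0, len(numbers) - 1
    while left < right:
        if numbers[left] % 2 == 0:
            left += 1            # already in the right half of the array
        elif numbers[right] % 2 == 1:
            right -= 1           # already in the right half of the array
        else:
            # odd on the left, even on the right: swap them
            numbers[left], numbers[right] = numbers[right], numbers[left]
            left += 1
            right -= 1
    return numbers

result = evens_left([3, 8, 5, 2, 7, 4])
print(result)
```

Each element is visited at most once, so the pass is linear in the array length and needs no extra array.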
📌 Round 3: Advanced Technical Round
Some key questions in this round included:
• What is the Singleton Design Pattern in Java? What are its advantages?
• How do you disable images in Selenium?
• Difference between Action and Actions in Selenium?
• How do you handle elements with dynamic attributes in Selenium scripts?
• What is the purpose of the ThreadLocal class in Selenium?
• API status codes: What’s the difference between 200, 400, 410, and 403?
• How do you write a test case in Postman to validate the status code?
• Data-driven testing in Postman: How is it done?
• Difference between HEAD and OPTIONS API methods? (Drop your answer in the comments!)
• Basics of JMeter: ThreadGroup, Listeners, and more.
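For the status-code question in this round, Python's standard library already carries the official reason phrases, which makes for a quick reference. This is a small sketch, not a Postman test; the helper name is_client_error is my own.

```python
from http import HTTPStatus

# The four codes from the question, mapped to their standard reason phrases.
codes = [200, 400, 403, 410]
meanings = {code: HTTPStatus(code).phrase for code in codes}

def is_client_error(code):
    """4xx codes indicate a problem with the request rather than the server."""
    return 400 <= code < 500

for code, phrase in meanings.items():
    print(code, phrase, "client error" if is_client_error(code) else "")
```

So 200 means the request succeeded, while 400, 403, and 410 are all client-side errors: a malformed request, a refused one, and a resource that has been permanently removed.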
📌 Round 4: HR Round
• How soon can you join?
• How was your overall interview experience with us?
Interview Experience at Global Logic
Round 1: Technical Questions
1. Tell me about yourself.
2. What are the different types of exceptions you’ve faced in your framework, and how did you resolve them?
3. What is a stale element exception? Why does it occur?
4. What is the use of text() in XPath?
5. Why is WebDriver driver = new ChromeDriver() preferred?
6. What is the parent class of all exceptions in Java?
7. Questions about different API status codes.
8. What is the difference between PUT and PATCH?
9. Write Java code to remove duplicate elements from an array without using a HashMap.
10. How do you take a full-page screenshot in Selenium?
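Question 9 above asks for Java, but the approach can be sketched in any language. The version below is in Python and deliberately avoids hash-based containers (no set or dict), using a nested loop instead. The function name is hypothetical.

```python
def remove_duplicates(values):
    """Return the distinct values in first-seen order, without any hash-based container."""
    unique = []
    for value in values:
        seen = False
        for kept in unique:      # linear search through what we have kept so far
            if kept == value:
                seen = True
                break
        if not seen:
            unique.append(value)
    return unique

print(remove_duplicates([3, 1, 3, 2, 1]))
```

This runs in quadratic time. If the original order does not matter, sorting the array first and skipping equal neighbours is a common faster alternative.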
Round 2: Advanced Technical Questions
1. Explain your current project and your roles and responsibilities.
2. What is the use of dynamic XPath? Write a dynamic XPath for the “Check Availability” button on Rediffmail’s “Create Account” page.
3. Explain XPath axes and mention the XPath functions you’ve used.
4. Questions on RestAssured, including the use of RequestSpecification and ResponseSpecification.
5. What is the full form of REST?
6. Explain JavaScriptExecutor with code.
7. Different ways to click on elements in Selenium.
8. How do you handle multiple windows in Selenium? Provide code.
9. Write code to read data from an Excel file.
10. Are you comfortable working with manual testing if needed?
11. What is the difference between final, finally, and finalize?
12. Can you use multiple catch blocks with a single try block?
Round 3: HR Discussion
1. Why are you looking for a change?
2. Tell us something about your achievements.
3. Why did you leave your last job?
4. What are your salary expectations?
Overall, the interview process covered both technical and behavioral aspects, focusing heavily on Selenium, Java, and API testing.
Data engineering Interview questions: Accenture
Q1. Which Integration Runtime (IR) should be used for copying data from an on-premises database to Azure?
Q2. Explain the differences between a Scheduled Trigger and a Tumbling Window Trigger in Azure Data Factory. When would you use each?
Q3. What is Azure Data Factory (ADF), and how does it enable ETL and ELT processes in a cloud environment?
Q4. Describe Azure Data Lake and its role in a data architecture. How does it differ from Azure Blob Storage?
Q5. What is an index in a database table? Discuss different types of indexes and their impact on query performance.
Q6. Given two datasets, explain how the number of records will vary for each type of join (Inner Join, Left Join, Right Join, Full Outer Join).
Q7. What are the Control Flow activities in Azure Data Factory? Explain how they differ from Data Flow activities and their typical use cases.
Q8. Discuss key concepts in data modeling, including normalization and denormalization. How do security concerns influence your choice of Synapse table types in a given scenario? Provide an example of a scenario-based ADF pipeline.
Q9. What are the different types of Integration Runtimes (IR) in Azure Data Factory? Discuss their use cases and limitations.
Q10. How can you mask sensitive data in Azure SQL Database? What are the different masking techniques available?
Q11. What is Azure Integration Runtime (IR), and how does it support data movement across different networks?
Q12. Explain Slowly Changing Dimension (SCD) Type 1 in a data warehouse. How does it differ from SCD Type 2?
Q13. SQL questions on window functions: rolling sum and lag/lead based. How do window functions differ from traditional aggregate functions?
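For Q13, a rolling sum and a LAG lookup can be sketched as below. The example runs on Python's sqlite3 module, assumes SQLite 3.25 or newer for window-function support, and uses a hypothetical daily_sales table.

```python
import sqlite3

# Hypothetical data: four consecutive days of sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_sales (day TEXT, amount INTEGER);
    INSERT INTO daily_sales VALUES
        ('2024-01-01', 10), ('2024-01-02', 20),
        ('2024-01-03', 30), ('2024-01-04', 40);
""")

# Window functions return one value per input row.
rows = conn.execute("""
    SELECT day,
           amount,
           SUM(amount) OVER (
               ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS rolling_3_day,
           LAG(amount) OVER (ORDER BY day) AS previous_amount
    FROM daily_sales
    ORDER BY day
""").fetchall()

# A plain aggregate collapses all rows into a single value.
total = conn.execute("SELECT SUM(amount) FROM daily_sales").fetchone()[0]

for row in rows:
    print(row)
print(total)
```

That contrast is the short answer to the second half of Q13: the windowed SUM keeps all four rows, while the plain SUM returns one.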
Date: 15-02-2025
Company name: Ikea
Role: Data Scientist
01. What is the meaning of term weight initialization in neural networks?
Answer- Weight initialization is the choice of starting values for a network's weights before training, and it is one of the essential factors in training neural networks. A bad weight initialization can prevent a network from learning, whereas a good one gives quicker convergence and a lower overall error. Biases can be initialized to zero. The standard rule is to set the weights close to zero without being too small.
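One widely used scheme that follows the "close to zero but not too small" rule is Glorot (Xavier) uniform initialization, which draws weights from a range that depends on the layer's input and output sizes. The sketch below is an illustration in plain Python, not something the answer above prescribes; the function name init_layer and the layer sizes are made up.

```python
import math
import random

def init_layer(fan_in, fan_out, seed=0):
    """Glorot-uniform weights in [-limit, limit] and zero biases for one dense layer."""
    rng = random.Random(seed)                      # seeded for reproducibility
    limit = math.sqrt(6.0 / (fan_in + fan_out))    # shrinks as the layer gets wider
    weights = [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]
    biases = [0.0] * fan_out                       # biases can start at zero
    return weights, biases, limit

weights, biases, limit = init_layer(fan_in=4, fan_out=3)
print(limit)
print(weights[0])
```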
02. What is the usage of the NVL() function?
Answer- The NVL() function is used to replace a NULL value with another value. The function returns the value of the second parameter if the first parameter is NULL. If the first parameter is anything other than NULL, it is returned unchanged. NVL() is specific to Oracle; it is not available in SQL Server or MySQL. Instead of NVL(), MySQL has IFNULL() and SQL Server has ISNULL().
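The same NULL-replacement behaviour can be tried without an Oracle instance. The sketch below uses Python's sqlite3 module; SQLite has no NVL(), but it does provide IFNULL() and the standard COALESCE(), which also works in Oracle, MySQL, and SQL Server. The employees table is hypothetical.

```python
import sqlite3

# Hypothetical data: Ben has no bonus recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, bonus INTEGER);
    INSERT INTO employees VALUES ('Asha', 500), ('Ben', NULL);
""")

rows = conn.execute("""
    SELECT name,
           IFNULL(bonus, 0)   AS bonus_ifnull,    -- second argument used when bonus is NULL
           COALESCE(bonus, 0) AS bonus_coalesce   -- first non-NULL argument
    FROM employees
    ORDER BY name
""").fetchall()

print(rows)
```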
03. How do you create a dictionary in Python?
Answer- In Python, a dictionary can be created by placing a sequence of key: value pairs within curly braces {}, separated by commas. Each element of a dictionary is a pair: a key and the value associated with that key. Values in a dictionary can be of any data type and can be duplicated, whereas keys can't be repeated and must be immutable (more precisely, hashable), such as strings, numbers, or tuples.
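A short sketch of the points above; all names and values are made up for illustration.

```python
# Literal syntax: key: value pairs inside curly braces, separated by commas.
employee = {"name": "Asha", "skills": ["SQL", "Python"], "level": 2}

# Equivalent construction with the dict() constructor.
same_employee = dict(name="Asha", skills=["SQL", "Python"], level=2)

# Values may repeat, and an immutable object such as a tuple can be a key.
grid = {(0, 0): "empty", (0, 1): "empty"}

# A mutable object such as a list cannot be used as a key.
try:
    bad = {["a", "b"]: 1}
except TypeError as exc:
    error_message = str(exc)

print(employee == same_employee)
print(grid[(0, 1)])
print(error_message)
```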
04. What is Matplotlib, and what are some of its basic plots?
Answer- Matplotlib is a Python plotting library that comes with a wide variety of plots. Plots help you understand trends and patterns and spot correlations. They're typically instruments for reasoning about quantitative information. Some of the basic plots are the line plot, bar plot, and scatter plot.
————————————————————
Stay Safe & Happy Learning 💙
Date: 26-02-2025
Company name: Adecco
Role: Data Scientist
01. What is a stored procedure?
Answer- A stored procedure is a named set of SQL statements that is saved in the database. Several SQL statements are consolidated into one procedure, which can then be executed whenever and wherever required.
02. What is Dimensionality Reduction?
Answer- In the real world, Machine Learning models are built on top of features and parameters. These features can be high-dimensional and large in number. Some of them may be irrelevant, and with so many dimensions the data becomes difficult to visualize. Dimensionality reduction cuts down irrelevant and redundant features by keeping a smaller set of principal variables. These principal variables preserve most of the information in the original features and are a subset of the parent variables.
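A minimal sketch of one common technique, principal component analysis (PCA), in plain Python. Note the swap: the answer above describes keeping a subset of the original variables, whereas PCA builds new variables as combinations of the originals. The function names and the sample points are hypothetical, and power iteration is used here only to keep the sketch dependency-free.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (means, unit vector) for the direction of greatest variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the dominant eigenvector of cov.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return means, vec


def project(rows, means, vec):
    """Reduce each row to a single coordinate along `vec`."""
    return [
        sum((r[j] - means[j]) * vec[j] for j in range(len(vec)))
        for r in rows
    ]


# Two perfectly correlated features (y = 2x): one dimension is redundant.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, component = first_principal_component(points)
reduced = project(points, means, component)

print(component)
print(reduced)
```

Because the second feature is exactly twice the first, a single coordinate along the direction (1, 2) describes every point without loss.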
03. What are Autoencoders?
Answer- An autoencoder is a kind of artificial neural network. It is used to learn efficient data codings in an unsupervised manner. It is utilised for learning a representation (encoding) for a set of data, mostly for dimensionality reduction, by training the network to ignore signal “noise”. An autoencoder also tries to reconstruct, from the reduced encoding, an output as close as possible to its original input.
04. What are some common Data Preparation Operations you would use for Time Series Data?
Answer-
a. Parsing time series information from various sources and formats.
b. Generating sequences of fixed-frequency dates and time spans.
c. Manipulating and converting date times with time zone information.
d. Resampling or converting a time series to a particular frequency.
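The four operations above can be sketched with only the Python standard library. The timestamps, formats, and readings are hypothetical, and fixed UTC offsets stand in for named time zones to keep the sketch free of time zone database dependencies.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# a. Parse timestamps that arrive in different formats (hypothetical inputs).
raw = ["2025-02-26 09:15:00", "26/02/2025 09:45"]
formats = ["%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M"]


def parse_any(text):
    """Try each known format in turn and return the first match."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


parsed = [parse_any(t) for t in raw]

# b. Generate a fixed-frequency sequence: four 30-minute steps.
start = datetime(2025, 2, 26, 9, 0)
half_hours = [start + i * timedelta(minutes=30) for i in range(4)]

# c. Attach a time zone and convert it (UTC to a fixed UTC+05:30 offset).
ist = timezone(timedelta(hours=5, minutes=30))
utc_time = start.replace(tzinfo=timezone.utc)
local_time = utc_time.astimezone(ist)

# d. Resample from 30-minute readings to hourly totals.
readings = list(zip(half_hours, [1.0, 2.0, 3.0, 4.0]))
hourly = defaultdict(float)
for ts, value in readings:
    hourly[ts.replace(minute=0, second=0, microsecond=0)] += value

print(parsed)
print(half_hours)
print(local_time)
print(dict(hourly))
```

In practice a library such as pandas covers the same steps with less code; the standard-library version is shown only to make each operation explicit.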
————————————————————
Stay Safe & Happy Learning 💙
Cisco Kafka interview questions for Data Engineers 2024.
➤ How do you create a topic in Kafka using the Confluent CLI?
➤ Explain the role of the Schema Registry in Kafka.
➤ How do you register a new schema in the Schema Registry?
➤ What is the importance of key-value messages in Kafka?
➤ Describe a scenario where using a random key for messages is beneficial.
➤ Provide an example where using a constant key for messages is necessary.
➤ Write simple Kafka producer code that sends JSON messages to a topic.
➤ How do you serialize a custom object before sending it to a Kafka topic?
➤ Describe how you can handle serialization errors in Kafka producers.
➤ Write Kafka consumer code that reads messages from a topic and deserializes them from JSON.
➤ How do you handle deserialization errors in Kafka consumers?
➤ Explain the process of deserializing messages into custom objects.
➤ What is a consumer group in Kafka, and why is it important?
➤ Describe a scenario where multiple consumer groups are used for a single topic.
➤ How does Kafka ensure load balancing among consumers in a group?
➤ How do you send JSON data to a Kafka topic and ensure it is properly serialized?
➤ Describe the process of consuming JSON data from a Kafka topic and converting it to a usable format.
➤ Explain how you can work with CSV data in Kafka, including serialization and deserialization.
➤ Write a Kafka producer code snippet that sends CSV data to a topic.
➤ Write a Kafka consumer code snippet that reads and processes CSV data from a topic.
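For the JSON serialization and deserialization questions above, a minimal sketch of the two helper functions involved, in plain Python. No Kafka client is used here, so nothing is actually sent to a broker; the function names and the sample event are hypothetical. Functions of this shape are what one would typically hand to a client as its value serializer and deserializer (for example, the `value_serializer` and `value_deserializer` arguments in kafka-python).

```python
import json


def serialize_json(record):
    """Turn a dict into UTF-8 bytes, the form in which Kafka carries values."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize_json(payload):
    """Turn message bytes back into a dict; return None for malformed input."""
    try:
        return json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # A real consumer might log the payload or route it to a
        # dead-letter topic instead of silently dropping it.
        return None


# Hypothetical event and a deliberately malformed payload.
event = {"order_id": 42, "status": "shipped"}
payload = serialize_json(event)
round_trip = deserialize_json(payload)
bad = deserialize_json(b"{not json")

print(payload)
print(round_trip)
print(bad)
```

Returning None on failure is one design choice among several; it keeps a single bad message from stopping the consumer loop, at the cost of the caller having to check for None.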
All the best 👍👍