Q73. What is a wrapper script? Can you explain it elaborately?
Ans:
Writing a wrapper script lets you run graphs in the sequence you want.
Ex: Suppose you need to run 3 graphs, with the condition that after the first graph runs successfully you must take the feedback it generated and use it in the next graph, and so on.
In that case you write a Unix script that runs the .ksh of the first graph, checks after it finishes that the graph ran successfully, and only then runs the second .ksh, and so on.
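The sequence above can be sketched as a small driver, shown here in Python for readability (the same control flow is typically written directly in ksh). The graph script paths are hypothetical placeholders, not real deployed scripts:

```python
import subprocess
import sys

def run_graphs_in_sequence(graph_commands):
    """Run each graph's start script in order; stop at the first failure.

    graph_commands: a list of command argument lists, e.g.
    [["ksh", "/sandbox/run/graph1.ksh"], ...] (hypothetical paths --
    substitute your own deployed scripts).
    Returns True only if every graph exits with status 0.
    """
    for cmd in graph_commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"{cmd} failed with exit code {result.returncode}; stopping.")
            return False
        # Here you would read the feedback file the previous graph produced
        # and export it (e.g. as a parameter) before starting the next graph.
    return True
```

The key point the wrapper enforces is the exit-status check between graphs; without it, a later graph could run against stale or missing feedback data.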
Q74. I have a 4-way MFS connected to a Rollup, and I have specified the key as null. How many records will come to the output?
Ans:
You have to understand how Rollup works: it makes groups of records based on the key. If the key is null, it considers all the records in a partition as one group, so it emits one record per partition. In this case the output is 4 records, because the input is a 4-way partitioned file.
Q75. If we give a null key in the Scan component, what will the output be? And in Dedup with the keep parameter set to unique-only?
Ans:
With a {} key, Scan gives all the records in the output (one running-aggregate record per input record).
With Dedup, keep = first gives the first record of each partition, keep = last gives the last record, and keep = unique-only gives no records, because the whole partition forms a single group and no record in it is unique (unless a partition contains exactly one record).
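The answers to Q74 and Q75 can be checked with a small Python simulation that treats each partition as a list and an empty key as "the whole partition is one group". The function names are illustrative, not DML:

```python
def rollup_null_key(partition):
    """With an empty ({}) key, every record in the partition forms one
    group, so rollup emits exactly one record per partition (represented
    here by the first record standing in for the aggregate)."""
    return [partition[0]] if partition else []

def dedup_null_key(partition, keep):
    """Dedup sorted with an empty key: the whole partition is one group."""
    if not partition:
        return []
    if keep == "first":
        return [partition[0]]
    if keep == "last":
        return [partition[-1]]
    if keep == "unique_only":
        # A record is kept only if its group has a single member.
        return partition if len(partition) == 1 else []

# A 4-way multifile: rollup with a null key gives one record per
# partition, so 4 records in total.
partitions = [[1, 2], [3], [4, 5, 6], [7]]
total = sum(len(rollup_null_key(p)) for p in partitions)
```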
Tip 16:
Types of parallelism
1. Pipeline Parallelism:
In pipeline parallelism, multiple components process data simultaneously. Each component in the pipeline continuously reads from upstream components, processes data, and writes to downstream components. Since a downstream component can process records previously written by an upstream component, both components can operate in parallel.
For example, the input file may still be being read (say 10 records in all) while only 6 records have been processed downstream so far. This is pipeline parallelism: a component does not wait for all the data to arrive before it starts processing, so records flow through the pipe in parallel.
NOTE: There are cases where pipeline parallelism breaks, for example at a Sort component, since sorting requires all the data to be read first. Pipeline parallelism also breaks at a phase break.
2. Data Parallelism:
In data parallelism, the data is divided into partitions and each partition is processed in parallel, possibly on different servers. Most commonly data parallelism occurs with multifiles, that is, with partitioning.
For example, with a 4-way multifile, after partitioning the data is divided across 4 processes and the same component runs 4 times in parallel.
3. Component Parallelism:
A graph with multiple processes running simultaneously on separate data uses component parallelism.
This kind of parallelism is specific to your graph: when 2 different components are not interrelated, they process their data in parallel.
For example, if you have 2 input files and you sort the data of both in 2 different flows, those 2 Sort components run under component parallelism.
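Pipeline parallelism has a rough Python analogy in generators: a generator pipeline processes each record as soon as it arrives, while a sort must consume everything before emitting anything, which is exactly where the pipeline breaks:

```python
def read_records():
    """Upstream component: yields records one at a time."""
    for i in range(10):
        yield i

def reformat(records):
    """Downstream component: processes each record as it arrives,
    without waiting for the full input -- pipeline parallelism."""
    for r in records:
        yield r * 2

def sort_all(records):
    """Sort must see ALL records before emitting any one of them --
    this is where pipeline parallelism breaks."""
    return sorted(records)

pipelined = reformat(read_records())
first = next(pipelined)  # one record flows all the way through
                         # before the remaining nine are even read
```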
Questions:
Q76. What is .profile and .bash_profile file?
Q77. Which component can we use in place of Reformat?
Q78. Can I read from and write to the same file in a graph?
Q79. What will happen if we have a multifile lookup and we call it in a serial-layout component?
Q80. What will happen if we use the lookup function on a multifile lookup file?
Tip 17:
Use m_env -v to find out which Ab Initio version you are using.
Example:
m_env -v
Output:
ab initio version 2.15.8.0
Here,
2.15 is the release version,
8 is the update version,
and 0 denotes the patch level.
Scenario Questions:
Q81. The input file is a multifile and you have used Dedup Sorted with a null key. How many records will be in the output with keep = first, keep = last, and keep = unique-only?
Q82. The input file is a multifile and you have used Rollup with a null key. How many records will be in the output?
Q83. The input file has 100 million records and you want to create 100 sub-files with 1 million records in each. How will you achieve this?
Q84. In a graph you have two files, one with 100 GB of data and another with 100 MB of data. Which component will you use to match the file data?
Q85. In a graph you have two files, one with 100 GB of data and another with 100 MB of data, and you are using a Join component with an in-memory join. Which file will you choose as the driving input, and why?
Tip 18:
Use the logic below to generate a sequence number for each record of a multifile:
(next_in_sequence() - 1) * number_of_partitions() + this_partition()
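A quick Python check of the formula, where next_in_sequence is the 1-based per-partition counter and this_partition is the 0-based partition index:

```python
def global_sequence(next_in_sequence, number_of_partitions, this_partition):
    """Mimics the DML expression
    (next_in_sequence() - 1) * number_of_partitions() + this_partition().
    next_in_sequence: 1-based counter within one partition.
    this_partition:   0-based index of the partition."""
    return (next_in_sequence - 1) * number_of_partitions + this_partition

# A 4-way multifile with 3 records per partition: every record gets a
# distinct global number even though each partition counts independently.
nums = [global_sequence(rec, 4, part)
        for part in range(4)
        for rec in range(1, 4)]
```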
Questions:
Q86. How do you find the size of MFS file?
Q87. Difference between PBRR and Interleave?
Q88. Explain Metaprogramming with example?
Q89. What is the difference between versioned and nonversioned objects?
Q90. What will happen if we use the reformat component with NEVER ABORT, Force_error, Force_abort?
Tip 19:
String_Split: Splits a string into pieces.
NOTE: If you need to split a sequence of whitespace, use re_split.
Syntax: string[] string_split(string splittee, string splitter)
Details: This function searches for occurrences of the splitter string within the splittee string, returning a vector whose elements contain the pieces of the splittee string that were separated from one another by the splitter string. Its behavior depends on the position and occurrence of the string to find:
• If the splitter string does not occur in the splittee string, the function returns a single-element vector containing the splittee string.
• If the splitter string occurs at the beginning of the splittee string, the function returns a vector where the first element is a zero-length string.
• If the splitter string occurs one or multiple times in succession and the string ends with the splitter string, the function returns a vector where the last element is a zero-length string.
Examples: These examples illustrate the behavior of string_split with various splitter strings.
In this example, the splitter string comma (,) returns a vector containing 3 elements:
string_split("quick,brown,fox", ",")
[vector "quick", "brown", "fox"]
In this example, the splitter string # is not found. The function returns a single element vector containing the splittee string:
string_split("what a bad day dad had", "#")
[vector "what a bad day dad had"]
In this example, the splitter string x is found at the beginning of the splittee string:
string_split("xoanon", "x")
[vector "", "oanon"]
In this example, the splitter string x is found at the end of the splittee string. The final element is a zero-length string:
string_split("flax", "x")
[vector "fla", ""]
In this example, the splitter string occurs multiple times in succession and the string ends with it. The final element is a zero-length string:
string_split("who? what? where? ", "? ")
[vector "who", "what", "where", ""]
To avoid zero-length strings in the result vector, use string_split_no_empty().
re_split: Splits a string into pieces using a supplied regular expression. This function is more powerful than string_split, because the pattern can match a sequence of whitespace characters, not just a single whitespace character. This function supports Unicode characters.
NOTE: All DML regular expression functions accept any native character set/UTF-8, with the exception of functions run on the mainframe. To work on the mainframe, DML must be able to convert the character set to EBCDIC.
Syntax: string(int)[int] re_split(string target_str, string pattern_expr)
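Python's str.split with an explicit separator follows the same rules as string_split, so the examples above can be verified directly; a rough analogue of string_split_no_empty is included (the helper name is illustrative):

```python
# Each case mirrors one of the DML string_split examples above.
assert "quick,brown,fox".split(",") == ["quick", "brown", "fox"]
assert "what a bad day dad had".split("#") == ["what a bad day dad had"]
assert "xoanon".split("x") == ["", "oanon"]
assert "flax".split("x") == ["fla", ""]
assert "who? what? where? ".split("? ") == ["who", "what", "where", ""]

def split_no_empty(s, sep):
    """Rough analogue of string_split_no_empty: drop zero-length pieces."""
    return [piece for piece in s.split(sep) if piece]
```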
Tip 20:
To convert a string into a character array (character vector):
For eg.
Input string:
Kishor
Output:
Vector [K,i,s,h,o,r]
Solution:
string_split_no_empty(string_replace("Kishor", "", "-"), "-")
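The same conversion can be checked in Python: list() gives the character vector directly, and the insert-a-delimiter-then-split trick from the solution above produces an identical result:

```python
s = "Kishor"
chars = list(s)          # the character vector, one element per character

# The trick used in the DML solution: put a delimiter between the
# characters, then split on it and drop the empty pieces.
delimited = "-".join(s)  # "K-i-s-h-o-r"
chars_via_split = [c for c in delimited.split("-") if c]
```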
Questions and Answers:
Q91. What is driving port? When do you use it?
Answer:
When you set the sorted-input parameter of the Join component to "In memory: Input need not be sorted", the driving parameter becomes available.
The driving port is generally used to improve performance in a graph. The driving input is the largest input; all other inputs are read into memory.
For example, suppose the largest input to be joined is on the in1 port. Specify port number 1 as the value of the driving parameter; the component then reads all the other inputs (in0, in2, and so on) into memory. The default is 0, which specifies that the driving input is on port in0. Join improves performance by loading all records from every input except the driving input into main memory.
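A minimal Python sketch of the idea (illustrative only, not Ab Initio's implementation): the non-driving inputs are loaded fully into hash tables, and the driving (largest) input is streamed record by record against them:

```python
def in_memory_join(driving, other_inputs, key):
    """Inner-join sketch: every non-driving input is held in memory as a
    dict keyed on the join key; the driving input is only streamed, so
    its size does not matter for memory use."""
    tables = [{key(r): r for r in inp} for inp in other_inputs]
    for rec in driving:  # streamed, never fully held in memory
        matches = [t.get(key(rec)) for t in tables]
        if all(m is not None for m in matches):
            yield (rec, *matches)

big = [{"id": i} for i in range(5)]                    # stands in for the 100 GB side
small = [{"id": i, "name": f"n{i}"} for i in (1, 3)]   # the 100 MB side, held in memory
joined = list(in_memory_join(big, [small], key=lambda r: r["id"]))
```

This is also the answer to why the largest file should be the driving input: only the smaller inputs have to fit in memory.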
Q92. What is the AB_LOCAL expression? Where do you use it in Ab Initio?
Answer:
ablocal_expr is a parameter of the Input Table component in Ab Initio. ABLOCAL() is replaced by the contents of ablocal_expr, which you can make use of in parallel unloads.
There are two forms of the ABLOCAL() construct: one with no arguments, and one with a single argument, a table name (the driving table). The ABLOCAL() construct is used when a complex SQL statement contains grammar that the Ab Initio parser does not recognize when unloading in parallel.
In that case you can use ABLOCAL() to prevent the Input Table component from parsing the SQL (it is passed through to the database). It also specifies which table to use for the parallel clause.
Q93. How to Create Surrogate Key using Ab Initio?
Answer:
There are many ways to create a surrogate key; which one fits depends on your business logic. You can try the ways below:
1. The next_in_sequence() function in your transform.
2. The Assign Keys component.
3. Write a stored procedure for this and call the stored procedure wherever you need it.
4. If you are writing data into a multifile, use the formula below to generate the numbers sequentially:
(next_in_sequence() - 1) * number_of_partitions() + this_partition()
Scenario Questions:
Q94. How to add header after every fifth record?
Q95. How to read a file having a header after 10 records?
Q96. How to develop CDC logic with only single join component?
Questions and Answers:
Q97. For data parallelism, we can use partition components. For component parallelism, we can use replicate component. Like this which component(s) can we use for pipeline parallelism?
Answer:
You can use components that do not require sorted data (explicitly or in memory) to get pipeline parallelism, such as Reformat, Filter by Expression, and Redefine Format.
Components that need sorted data, such as Join, Rollup, Merge, Sort, and Partition by Key and Sort, break pipeline parallelism.