Abinitio Interview Questions
This group is designed for posting Abinitio tool related interview questions.
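Q5. How to remove the header and trailer from a CSV file?

Ans:
Read the file as newline-delimited records first, remove the header and trailer, then use redefine format to apply the actual record format.

To remove the trailer, use the dedup component with a null key and keep set to last. The trailer comes out of the out port, and all the data records can be retrieved from the dup port. The same approach with keep set to first separates the header.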
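The logic of the answer above can be sketched in Python (not Ab Initio DML). This is only an illustration under assumptions: the function name dedup_null_key and the sample records are hypothetical, and the function imitates a dedup with a null key, where the whole file is treated as one group.

```python
def dedup_null_key(records, keep):
    """Imitate a dedup with a null key: the whole input is one group.

    Returns (kept, dup): the single kept record and everything else.
    """
    if not records:
        return [], []
    if keep == "first":
        return [records[0]], records[1:]
    if keep == "last":
        return [records[-1]], records[:-1]
    raise ValueError("keep must be 'first' or 'last'")


# Hypothetical file content, read line by line.
lines = ["HDR|20240101", "1,Alice", "2,Bob", "TRL|2"]

# keep=last isolates the trailer; the rest arrives on the "dup" side.
trailer, rest = dedup_null_key(lines, keep="last")
# keep=first isolates the header; what remains is the body.
header, body = dedup_null_key(rest, keep="first")

print(header, body, trailer)
```

After this step the body records would still need their real record format applied, which is what the redefine step in the answer is for.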
Q6. How to do count(distinct ID) in Abinitio?

Ans:
1) Dedup on the ID, then use the count function in rollup.
2) Use expanded mode in rollup.

For example, dedup on cust id, then count in a rollup with a null key.

If you want to count only the IDs that occur exactly once, use "keep unique". If you want to count each ID once no matter how often it occurs, use "keep first".
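The difference between the two keep options can be shown with a small Python sketch. It is not Ab Initio code; the function names and sample IDs are made up for illustration.

```python
from collections import Counter


def count_distinct(ids):
    """Like dedup 'keep first' followed by a count: each ID counted once."""
    return len(set(ids))


def count_unique_only(ids):
    """Like dedup 'keep unique' followed by a count: only IDs seen exactly once."""
    counts = Counter(ids)
    return sum(1 for n in counts.values() if n == 1)


ids = [101, 102, 102, 103, 103, 103]
print(count_distinct(ids))     # 3
print(count_unique_only(ids))  # 1
```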
Q7. kill vs m_kill?

Ans:
kill is for a Unix process and m_kill is for an Ab Initio graph. kill needs a process ID, whereas m_kill needs the .rec file name, which is easy to find. m_kill may also do something more in the backend, such as waiting for AB_TIMEOUT.
Q8. What is the difference between the .abinitiorc and abinitiorc files?

Ans:
The abinitiorc file is the server-level configuration file. It is common to all users on the server, and only an admin can change it.

.abinitiorc is the user configuration file and takes precedence over the system configuration file abinitiorc.

Both contain many configuration variables, such as AB_HOME, AB_AIR_ROOT, and host connection settings.

For more details, see the help file.
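The precedence rule can be pictured as a simple merge in which user settings override system settings. The Python sketch below only illustrates that idea; the function name and the values are hypothetical, and it does not parse real configuration files.

```python
def effective_config(system_cfg, user_cfg):
    """Merge two settings dicts; user values win over system values."""
    merged = dict(system_cfg)   # start from the system-wide file
    merged.update(user_cfg)     # the user file overrides matching keys
    return merged


# Hypothetical values for illustration only.
system_cfg = {"AB_HOME": "/opt/abinitio", "AB_AIR_ROOT": "//server/eme/root"}
user_cfg = {"AB_AIR_ROOT": "//server/eme/dev_root"}

print(effective_config(system_cfg, user_cfg))
```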
Tip 5:
We can use the command below to change the version of an existing object in a tag:

air tag change-version <tagname> object_name

We can also pass -version with a specific version number.
Q9. I have 100 records in an input file and I want to send the first 20 to the output file. When I run the graph a 2nd time it should take the 21st to 40th, and so on.

Ans1:
Input file, then a reformat with two out ports.

Connect one port to a ROLLUP and then to a file; this file holds the count of records sent so far and is used as a lookup file on the next run.

Connect the 2nd port to the target file or table.

In the reformat, use the count from that lookup file in select, where invocation_number is the sequence number of the record:

invocation_number > (lookup.count) and invocation_number <= (lookup.count + 20)

Ans2:
Use the read multiple files component with the read count option set to 20 and the skip count set from a sandbox parameter that is increased by 20 after every run.
Export the global parameter in the end script.
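The windowing logic behind both answers can be sketched in Python. This is not Ab Initio code: next_batch, skip_count and read_count are illustrative names, and the offset is kept in a plain variable where the real graph would persist it in a lookup file or sandbox parameter.

```python
def next_batch(records, skip_count, read_count=20):
    """Return the next window of records and the updated skip count."""
    batch = records[skip_count:skip_count + read_count]
    return batch, skip_count + len(batch)


records = list(range(1, 101))   # stand-in for the 100 input records
skip = 0                        # would be persisted between runs

first_run, skip = next_batch(records, skip)
second_run, skip = next_batch(records, skip)

print(first_run[0], first_run[-1])    # 1 20
print(second_run[0], second_run[-1])  # 21 40
```

Note that the window for a run is "greater than the saved count and up to and including the saved count plus 20", which is why the upper bound is inclusive.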
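Q10. How to add a trailer record after every third record of a file having 10 records?

Ans:
In the output index of a reformat, use logic such as: if next_in_sequence() % 3 == 0, send the record to the output that adds the trailer from in1.

Ans2:
You can use normalize and return a length of 2 when next_in_sequence() % 3 == 0 (and 1 otherwise), so that every third record produces one extra trailer record.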
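A minimal Python sketch of the "every third record" rule follows. It is not Ab Initio code; the function name and the trailer text are made up, and the loop counter stands in for the sequence number used in the answers.

```python
def add_trailers(records, every=3):
    """Emit each record, plus a trailer after every `every`-th record."""
    out = []
    for seq, rec in enumerate(records, start=1):
        out.append(rec)
        if seq % every == 0:
            # Hypothetical trailer layout, for illustration only.
            out.append(f"TRAILER after record {seq}")
    return out


result = add_trailers([f"rec{i}" for i in range(1, 11)])
print(len(result))  # 13: ten records plus trailers after records 3, 6 and 9
```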
Q11. I have 100 records in the input file and I want 1 million records in the output. What is the best approach?

Ans:
Use normalize with a fixed length of 10,000, so that each of the 100 input records is emitted 10,000 times (100 x 10,000 = 1,000,000).
The graph will have the components below:
Input file -> normalize -> output file
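The fan-out can be checked with a short Python sketch. The generator below only imitates the idea of emitting each input record a fixed number of times; it is not Ab Initio code and the names are illustrative.

```python
def normalize(records, length):
    """Emit every input record `length` times."""
    for rec in records:
        for _ in range(length):
            yield rec


total = sum(1 for _ in normalize(range(100), 10_000))
print(total)  # 1000000
```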
Tip 6:
The maximum max-core value allowed on a system is calculated as below:

On a 32-bit system it is 2^31 - 1 bytes, which is around 2 GB.
Q12. How can we pass a job id while running air sandbox run?

Ans:
You have to pass -AB_JOB_PREFIX for a graph, not a job id.

AB_JOB_PREFIX is a graph-level parameter, whereas job_id applies at plan level.
Only questions:

Q13. Can rollup remove duplicates? How?

Q14. What will be the result if you don’t provide a key? (sort/dedup/rollup/lookup/join)

Q15. For Rollup, in what case can the output record count be more than the number of unique keys in input file? (show an example)

Q16. Differentiate between gather and fan-in flow.

Q17. How many priorities can be assigned at the max?

Q18. Length prefixed strings?

Q19. What is implicit and explicit reformat?

Q20. What do we receive in reject port of reformat?

Q21. If a record is rejected from reformat, how can I find the reason for that?

Q22. I want to define default values for invalid date values in the date field of a file. What should be done in the transform of the reformat component to achieve this?

Try to solve these questions and send unsolved questions to the abinitio group.
Only questions:

Q23. Is it required to use sorted input record to use lookup function?

Q24. What all types of join can be used with lookup component? Which one type is not possible? Why?

Q25. What do you mean by spilling data on disk?

Q26. What is pipeline parallelism and which component breaks it?

Q27. If I use in-memory sort, will it give sorted output or records in order of input or random order?

Q28. When will you use output index instead of partition by expression?

Q29. Can you join 2 files having keys with different datatypes?

Q30. Which components can be used to reinterpret data?

Q31. What is the difference between redefine format and reformat?

Q32. What are multi-stage transforms? Give examples of components which have multi-stage transforms.

Q33. Explain the run-time behavior of Normalize.

Q34. List the various functions in the normalize component.

Q35. What is the difference between using the length variable and the finished function in normalize?

Q36. Which parameter decides how many times the finalize function will run?

Q37. List all the aggregation functions in rollup.

Q38. What is the difference between rollup and scan?

Q39. What is the difference between is_defined and first_defined?

Q40. I have 6 columns and 5 records in a file and I have to transpose it. What are the possible ways of doing it?

Try to solve these questions and ask unsolved questions in the abinitio group.
Tip 7:
The first_defined function works the same way as the COALESCE function in SQL.

E.g.
COALESCE(id, 00) - if id is NULL, the default value 00 is returned in the SQL query.

first_defined(lookup("lkp", in.id), 00) -
If the lookup output is NULL, first_defined returns the default value 00.
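The same "first non-null value wins" behaviour can be sketched in Python. This is an illustration only: the Python function merely borrows the name first_defined, None stands in for NULL, and the dictionary lkp is a hypothetical stand-in for a lookup file.

```python
def first_defined(*values):
    """Return the first argument that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


lkp = {1: "A", 2: "B"}  # hypothetical lookup data

print(first_defined(lkp.get(1), "00"))  # A
print(first_defined(lkp.get(9), "00"))  # 00
```

A value such as 0 or an empty string still counts as defined here; only a missing value falls through to the default.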
Tip 8:

Using the lookup function you cannot achieve a full outer join. Only a left or right join can be achieved, as we can't pull uncalled records from the lookup file.

E.g.

The input file has the 3 records below:
ID
1
2
3

The lookup file has the 5 records below:
ID Name
1 A
2 B
3 C
4 D
5 E

Suppose you use the lookup function as shown here. You can't pull all the records from the lookup file into the output.
As no ID is present in the input file for numbers 4 and 5, their Name will not get pulled.

lookup("lookup file", in.ID).Name

Hence in Abinitio we can't achieve full outer join functionality using the lookup function. To achieve this we need to use the Join component.
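The example above can be reproduced in a few lines of Python. The sketch is not Ab Initio code; a dictionary stands in for the lookup file, and both function names are illustrative.

```python
input_ids = [1, 2, 3]
lookup_file = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}


def lookup_join(ids, lkp):
    """One output row per input record: behaves like a left join."""
    return [(i, lkp.get(i)) for i in ids]


def full_outer_join(ids, lkp):
    """Also emits lookup rows that no input record asked for."""
    id_set = set(ids)
    keys = sorted(id_set | set(lkp))
    return [(k if k in id_set else None, lkp.get(k)) for k in keys]


print(lookup_join(input_ids, lookup_file))
# [(1, 'A'), (2, 'B'), (3, 'C')]
print(full_outer_join(input_ids, lookup_file))
# [(1, 'A'), (2, 'B'), (3, 'C'), (None, 'D'), (None, 'E')]
```

The lookup version is driven only by the input records, so IDs 4 and 5 never appear in its output.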
Logical questions:

Q41. I have 6 columns and 5 records in a file and I have to transpose it. What are the possible ways of doing it?

Q42. How to convert a 4-way multifile to a 16-way multifile without using a partition component?

Q43. You have one multifile and while working with it you find that one of its partitions is corrupted. How will you correct it? Provide at least three solutions.

Q44. You have a file and you have to extract the header, trailer and body records. How to achieve it without rollup?

Q45. How to calculate the second highest salary of each department? Provide 3 solutions.

For any queries or to submit new questions, email us at
AbiInterviewQuestions@gmail.com

We will add new questions daily, so please keep visiting this group.
Please share this link to people who are preparing for the abi interview:
https://t.me/abi_interview_qstn

Tip 9:
Information on .abi-unc files:

These are temporary files used by the checkout process. Under normal circumstances you should not see them. However, if the checkout failed or was interrupted, these files can be left behind. Delete them and try checking out again.

Details:
The checkout procedure includes two steps to ensure that a checkout does not leave your sandbox in a half-checked-out state:

1. It checks out the files, but gives them the .abi-unc suffix (for "uncommitted").

2. Once every file is successfully checked out, the .abi-unc files replace the real files. If an error occurs midway through checkout, any .abi-unc files that have been created might be left behind.

If you receive the error message "Cannot write project parameters file: /path/filename.abi-unc" during checkout and you see .abi-unc files, delete them and try checking out again. Uncommitted files are usually left behind because a checkout process was aborted.
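The two-step idea described above (write everything under a temporary suffix, then swap the files in) can be sketched in Python. This is only an illustration of the general technique, not how the checkout is actually implemented; the function names and file names are hypothetical.

```python
import os
import tempfile

SUFFIX = ".abi-unc"


def checkout(files, target_dir):
    """Stage every file under a temporary suffix, then swap them all in."""
    staged = []
    # Step 1: write each file with the "uncommitted" suffix.
    for name, content in files.items():
        final_path = os.path.join(target_dir, name)
        temp_path = final_path + SUFFIX
        with open(temp_path, "w") as handle:
            handle.write(content)
        staged.append((temp_path, final_path))
    # Step 2: only after every file was written, replace the real files.
    for temp_path, final_path in staged:
        os.replace(temp_path, final_path)


def leftover_uncommitted(target_dir):
    """List staged files that an interrupted checkout left behind."""
    return sorted(n for n in os.listdir(target_dir) if n.endswith(SUFFIX))


with tempfile.TemporaryDirectory() as sandbox:
    checkout({"graph.mp": "v2", "params.pset": "v2"}, sandbox)
    print(sorted(os.listdir(sandbox)))    # ['graph.mp', 'params.pset']
    print(leftover_uncommitted(sandbox))  # []
```

If step 1 fails partway, the real files are untouched and only the suffixed files remain, which matches the advice to delete them and check out again.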
Tip 10:

How the job recovery file (.rec) works:

The Co>Op monitors and records the state of jobs so that if a job fails, it can be restarted. This state info is stored in files associated with the job and enables the Co>Op to roll back the system to its initial state, or to its state as of the most recent completed checkpoint. Generally, if the application encounters a failure, all hosts and their respective files will be rolled back to their initial state or their state as of the most recent completed checkpoint; you recover the job simply by rerunning it.

Details:
An Ab Initio job is considered completed when the mp run command returns. This means that all the processes associated with the job (excluding commands you might have added in the end script) have completed. These include the process on the host system that executes the script, and all processes the job has started on remote computers. If any of these processes terminate abnormally, the Co>Op terminates the entire job and cleans up as much as possible.

When an Ab Initio job runs, the Co>Op creates a file in the working directory on the host system with the name jobname.rec. This file contains a set of pointers to the log files on the host and on every computer associated with the job. The log files enable the Co>Op to roll back the system to its initial state or to its state as of the most recent checkpoint. If the job completes successfully, the recovery files are removed (they are also removed when a single-phase graph is rolled back).

If the application encounters a software failure (for example, one of the processes signals an error or the operator aborts the application), all hosts and their respective files are rolled back to their initial state, as if the application had not run at all. The files return to the state they were in at the start, all temporary files and storage are deleted, and all processes are terminated. If the program contains checkpoint commands, the state restored is that of the most recent completed checkpoint.

When a job has been rolled back, you recover it simply by rerunning it. Of course, the cause of the original failure might repeat itself when you rerun the failed job. You will have to determine the cause of the failure by investigation or by debugging.

When a checkpointed application is rerun, the Co>Op performs a fast-forward replay of the successful phases. During this replay, no programs run and no data flows; that is, the phases are not actually repeated (although the monitoring system cannot detect the difference between the replay and an actual execution). When the replayed phases are completed, the Co>Op runs the failed phase again.
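The rerun behaviour can be pictured with a small Python sketch. It is a toy model under assumptions, not the Co>Op's mechanism: a dictionary stands in for the recovery file, each phase is a plain function, and all names are hypothetical.

```python
def run_with_checkpoints(phases, recovery):
    """Run phases in order, skipping those already checkpointed.

    `recovery` stands in for the .rec file: recovery["done"] is the number
    of phases whose checkpoint completed in an earlier attempt.
    """
    log = []
    for index, (name, action) in enumerate(phases):
        if index < recovery.get("done", 0):
            log.append(f"replay {name}")   # fast-forward, nothing is rerun
            continue
        action()                           # may raise and abort the job
        recovery["done"] = index + 1       # checkpoint completed
        log.append(f"run {name}")
    recovery.clear()                       # success: recovery info removed
    return log


attempts = {"load": 0}


def extract():
    pass


def load():
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("simulated failure in phase 2")


phases = [("extract", extract), ("load", load)]
recovery = {}

try:
    run_with_checkpoints(phases, recovery)   # first attempt fails in "load"
except RuntimeError:
    state_after_failure = dict(recovery)     # {'done': 1}

log = run_with_checkpoints(phases, recovery)  # rerun resumes at "load"
print(log)  # ['replay extract', 'run load']
```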

Note that it might not always be possible for the Co>Op to restore the system to an earlier state. For example, a failure could occur because a host or its native OS crashed. In this case, it is not possible to cleanly shut down flow or file operations, nor to roll back file operations performed in the current phase. In fact, it is likely that intermediate or temporary files will be left around.

To complete the cleanup and get the job running again, you must perform a manual rollback. You do this with the command m_rollback.

The syntax is:
m_rollback [-d] [-i] [-h] recoveryfile

Running m_rollback recoveryfile rolls the job back to its initial state or the last completed checkpoint. Using the -d option deletes the partially run job and the recovery file.