Spark Interview Questions


Q21. What is the “lineageStartDate” of a FlowFile?

Ans: This FlowFile attribute records the time at which the FlowFile's oldest ancestor was inserted or created in the NiFi system. Whenever a FlowFile is cloned, merged, or split, a child FlowFile is created. The child's lineageStartDate still reports the time of the original ancestor, not the time at which the child itself was created.
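The inheritance rule can be illustrated with a small model. This is only a sketch: the FlowFile class and the create, clone and merge helpers are hypothetical names for illustration, not NiFi's API, and taking the earliest parent date on a merge is an assumption that follows from the "oldest ancestor" wording above.

```python
from dataclasses import dataclass


@dataclass
class FlowFile:
    # Both timestamps are milliseconds since the epoch (hypothetical model).
    entry_date: int          # when this particular FlowFile was created
    lineage_start_date: int  # when its oldest ancestor entered the system


def create(now_ms):
    """A brand-new FlowFile is its own oldest ancestor."""
    return FlowFile(entry_date=now_ms, lineage_start_date=now_ms)


def clone(parent, now_ms):
    """A child gets a new entry date but inherits the lineage start date."""
    return FlowFile(entry_date=now_ms,
                    lineage_start_date=parent.lineage_start_date)


def merge(parents, now_ms):
    """Assumption: a merged child reports the earliest ancestor of all parents."""
    return FlowFile(entry_date=now_ms,
                    lineage_start_date=min(p.lineage_start_date for p in parents))


original = create(now_ms=1_000)
child = clone(original, now_ms=5_000)
merged = merge([child, create(now_ms=3_000)], now_ms=9_000)
```

In this model, child and merged both keep the lineage start date of original, even though they were created later.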

Q22. How do you extract attribute information from a FlowFile?

Ans: Several processors are available for this purpose, e.g. ExtractText and EvaluateXQuery. If no out-of-the-box processor meets the requirement, you can also create your own custom processor.
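As a rough illustration of what a processor such as ExtractText does, the sketch below applies named regular expressions to FlowFile content and stores what they capture as attributes. The function extract_attributes and the sample content are hypothetical, and the sketch does not reproduce the exact attribute naming of the real processor.

```python
import re


def extract_attributes(content, rules):
    """Return attributes captured from content.

    rules maps an attribute name to a regular expression. The first capture
    group of the first match becomes the attribute value, or the whole match
    if the expression has no capture group. Rules that do not match add nothing.
    """
    attributes = {}
    for name, pattern in rules.items():
        match = re.search(pattern, content)
        if match:
            attributes[name] = match.group(1) if match.re.groups else match.group(0)
    return attributes


content = "order_id=42;customer=acme"
rules = {
    "order.id": r"order_id=(\d+)",
    "customer": r"customer=(\w+)",
    "region": r"region=(\w+)",   # not present in the content, so no attribute
}
attributes = extract_attributes(content, rules)
```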


Q23. Can you add your own user-defined attributes to a FlowFile?

Ans: Yes, the UpdateAttribute processor does this. It also has an advanced UI in which you can configure a set of rules that determine which attributes should be added.

Q24. Can a routing decision be made based on the attribute values of a FlowFile?

Ans: Yes, that is one of the features of NiFi. To route a FlowFile based on its attributes, use the RouteOnAttribute processor.
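The idea behind attribute-based routing can be sketched as a set of named conditions evaluated against a FlowFile's attributes. The function route_on_attribute and the route names below are hypothetical. In NiFi itself the conditions are written in the Expression Language rather than as Python functions, and the "unmatched" fallback here only models the processor's behaviour loosely.

```python
def route_on_attribute(attributes, routes):
    """Return the names of all routes whose condition holds.

    routes maps a route name to a function that takes the attribute
    dictionary and returns True or False. If nothing matches, the
    FlowFile goes to the "unmatched" route.
    """
    matched = [name for name, condition in routes.items() if condition(attributes)]
    return matched or ["unmatched"]


routes = {
    "csv": lambda a: a.get("filename", "").endswith(".csv"),
    "large": lambda a: int(a.get("fileSize", "0")) > 1000,
}

small_csv = route_on_attribute({"filename": "a.csv", "fileSize": "10"}, routes)
large_csv = route_on_attribute({"filename": "b.csv", "fileSize": "5000"}, routes)
text_file = route_on_attribute({"filename": "c.txt", "fileSize": "10"}, routes)
```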

Q25. Can the Expression Language be used in every property of a processor?

Ans: No, only some properties of a processor support the Expression Language. To find out which ones, check the help icon of the individual processor's properties or its documentation.

Q26. What is a template in NiFi?

Ans: A template is a reusable workflow that you can export and then import into the same or a different NiFi instance. It can save a lot of time compared with creating the same flow again and again. A template is stored as an XML file.

Q27. What happens if you have stored a password in a DataFlow and create a template from it?

Ans: A password is a sensitive property, so it is dropped when the DataFlow is exported as a template. After you import the template into the same or a different NiFi system, you have to provide the password again.
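A minimal sketch of this behaviour follows, assuming a flow is represented as a plain dictionary. The layout, the export_template function and the property names are hypothetical and are not NiFi's template format. Sensitive values are blanked in the exported copy, so they have to be supplied again after import.

```python
import copy


def export_template(flow, sensitive_keys):
    """Return a copy of the flow with the values of sensitive properties removed."""
    template = copy.deepcopy(flow)   # the running flow itself is left unchanged
    for processor in template["processors"]:
        for key in processor["properties"]:
            if key in sensitive_keys:
                processor["properties"][key] = None  # must be re-entered on import
    return template


flow = {
    "processors": [
        {"name": "FetchData",
         "properties": {"Username": "etl_user", "Password": "s3cret"}},
    ]
}
template = export_template(flow, sensitive_keys={"Password"})
```

The original flow keeps its password. Only the exported copy loses it.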

Q28. What happens to a ControllerService when a template is created from a DataFlow?

Ans: If a ControllerService is attached to the DataFlow from which the template is created, a new copy of that ControllerService is created when the template is imported.

Q29. What is a bulletin and how does it help in NiFi?

Ans: Bulletins let you know when problems occur in a dataflow. Although you can check the logs for anything interesting, it is much more convenient to have notifications pop up on the screen. If a Processor logs anything as a WARNING or ERROR, a "Bulletin Indicator" shows up in the top-right-hand corner of the Processor. This indicator looks like a sticky note and is shown for five minutes after the event occurs. Hovering over the bulletin provides information about what happened, so that the user does not have to sift through log messages to find it. In a cluster, the bulletin also indicates which node emitted it. The log level at which bulletins occur can be changed in the Settings tab of the Configure dialog for a Processor.

Q30. How do you define the Provenance Repository?

Ans: Whenever an event happens to a FlowFile, e.g. it is received, cloned, forked, modified, sent, or dropped, a provenance event is generated and stored in the Provenance Repository. The repository stores this information about each FlowFile as it traverses the system and provides a mechanism for assembling a "Lineage view" of a FlowFile, so that a graphical representation can be shown of exactly how the FlowFile was handled.
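The way a lineage view can be assembled from stored events is sketched below. The ProvenanceRepository class, its methods and the FlowFile identifiers are hypothetical and much simpler than NiFi's real repository. The sketch only shows the idea of recording one event per action and then walking back through parent FlowFiles.

```python
class ProvenanceRepository:
    """Toy in-memory event store (illustrative only, not NiFi's implementation)."""

    def __init__(self):
        self._events = []

    def record(self, event_type, flowfile_id, parent_ids=()):
        """Store one provenance event for a FlowFile."""
        self._events.append({
            "type": event_type,
            "flowfile_id": flowfile_id,
            "parent_ids": tuple(parent_ids),
        })

    def lineage(self, flowfile_id):
        """Return, in recorded order, the events of a FlowFile and its ancestors."""
        related = set()
        pending = [flowfile_id]
        while pending:
            current = pending.pop()
            if current in related:
                continue
            related.add(current)
            for event in self._events:
                if event["flowfile_id"] == current:
                    pending.extend(event["parent_ids"])
        return [e for e in self._events if e["flowfile_id"] in related]


repo = ProvenanceRepository()
repo.record("RECEIVE", "A")
repo.record("CLONE", "B", parent_ids=["A"])
repo.record("RECEIVE", "X")          # unrelated FlowFile
repo.record("SEND", "B")
lineage_types = [e["type"] for e in repo.lineage("B")]
```

Here the lineage of FlowFile B includes the RECEIVE event of its parent A but nothing from the unrelated FlowFile X.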



2014-2017 © | Dont Copy , it's bad Karma |