Spark Interview Questions
11. Can an RDD be shared between SparkContexts?
Ans: No. When an RDD is created, it belongs to and is completely owned by the SparkContext it originated from, so RDDs can't be shared between SparkContexts.
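12. In spark-shell, which contexts are available by default?
Ans: SparkContext (as sc) and SQLContext (as sqlContext). From Spark 2.0 onward the shell exposes a SparkSession (as spark) alongside sc.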
13. Give a few examples of how an RDD can be created using SparkContext.
Ans: SparkContext lets you create RDDs from many kinds of input sources, for example:
· Scala collections: sc.parallelize(0 to 100)
· Local or remote filesystems: sc.textFile("README.md")
· Any Hadoop InputFormat: sc.newAPIHadoopFile
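The three sources listed above can be tried together in one short script. This is only a sketch: the app name, the local[2] master and the temporary file are assumptions made so it runs on its own, and inside spark-shell the existing sc would be reused by getOrCreate.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: a local master and an illustrative app name.
val conf = new SparkConf().setAppName("rdd-creation-sketch").setMaster("local[2]")
val sc = SparkContext.getOrCreate(conf)

// 1. From a Scala collection.
val numbers = sc.parallelize(0 to 100)

// 2. From a text file. A temporary file stands in for README.md,
//    and the file:// URI avoids depending on the default filesystem.
val tmp = Files.createTempFile("rdd-sketch", ".txt")
Files.write(tmp, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8))
val uri = tmp.toUri.toString
val lines = sc.textFile(uri)

// 3. From a Hadoop InputFormat (new MapReduce API), reading the same file.
//    Text objects are reused by Hadoop, so convert them to String right away.
val records = sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat](uri)
val hadoopLines = records.map { case (_, text) => text.toString }

val numberCount = numbers.count()
val lineCount = lines.count()
val hadoopLineList = hadoopLines.collect().toList
```

Here textFile is the convenient route for plain text, while newAPIHadoopFile is the general one: swapping TextInputFormat for another InputFormat changes the key and value types of the resulting RDD.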
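14. How would you broadcast a collection of values over the Spark executors?
Ans: Use SparkContext.broadcast(value). It returns a Broadcast wrapper, and tasks read the data through its value member.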
15. What is the advantage of broadcasting values across a Spark cluster?
Ans: Spark transfers the value to each executor once, and tasks can share it without incurring repeated network transmissions when it is requested multiple times.
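A small lookup table is the typical case for a broadcast. The following is a minimal sketch rather than a recommended pattern: the country-code map, the app name and the local[2] master are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: a local master and an illustrative app name.
val conf = new SparkConf().setAppName("broadcast-sketch").setMaster("local[2]")
val sc = SparkContext.getOrCreate(conf)

// Hypothetical lookup table that every task needs.
val countryNames = Map("IN" -> "India", "US" -> "United States")

// Ship the map to the executors once instead of once per task closure.
val lookup = sc.broadcast(countryNames)

val codes = sc.parallelize(Seq("IN", "US", "IN", "FR"))

// Tasks read the shared copy through lookup.value.
val resolved = codes
  .map(code => lookup.value.getOrElse(code, "unknown"))
  .collect()
  .toList
```

For a map this small the saving is negligible; the benefit grows with the size of the value and the number of tasks that read it.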
16. Can we broadcast an RDD?
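Ans: You should not broadcast an RDD for use in tasks. Spark 1.x logs a warning but does not stop you; later releases reject the call with an error and tell you to call collect() and broadcast the result instead.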
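Rather than broadcasting the RDD itself, the usual route is to collect a small RDD on the driver and broadcast the resulting collection. A sketch of that idea, assuming the RDD is small enough to fit in driver memory (the data, app name and local[2] master are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: a local master and an illustrative app name.
val conf = new SparkConf().setAppName("broadcast-rdd-sketch").setMaster("local[2]")
val sc = SparkContext.getOrCreate(conf)

// A small pair RDD that other tasks need to look values up in.
val small = sc.parallelize(Seq(1 -> "one", 2 -> "two"))

// Collect it on the driver first, then broadcast the plain Map.
val smallAsMap = sc.broadcast(small.collectAsMap())

val big = sc.parallelize(Seq(1, 2, 3))

// Map-side lookup against the broadcast copy; no shuffle is involved.
val joined = big
  .map(key => key -> smallAsMap.value.getOrElse(key, "?"))
  .collect()
  .toList
```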
17. How can we distribute JARs to workers?
Ans: The JAR you specify with SparkContext.addJar is copied to all the worker nodes and is available to tasks that run afterwards.
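As a rough sketch, the call looks like this. The empty placeholder JAR exists only to make the script runnable; a real job would pass the path of an actual library. The listJars() check assumes Spark 2.0 or later, and the app name and local[2] master are illustrative.

```scala
import java.io.FileOutputStream
import java.nio.file.Files
import java.util.jar.{JarOutputStream, Manifest}

import org.apache.spark.{SparkConf, SparkContext}

// Assumption: a local master and an illustrative app name.
val conf = new SparkConf().setAppName("add-jar-sketch").setMaster("local[2]")
val sc = SparkContext.getOrCreate(conf)

// Build an empty placeholder JAR (hypothetical stand-in for a real dependency).
val jarPath = Files.createTempFile("extra-lib", ".jar")
new JarOutputStream(new FileOutputStream(jarPath.toFile), new Manifest()).close()

// Register the JAR so that tasks submitted from now on can load it.
sc.addJar(jarPath.toString)

// listJars() returns the locations Spark serves the registered JARs from.
val registered = sc.listJars()
val jarName = jarPath.getFileName.toString
val isRegistered = registered.exists(_.endsWith(jarName))
```

When the JARs are known before the application starts, the --jars option of spark-submit achieves the same thing without code changes.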
18. How can you stop SparkContext and what is the impact if stopped?
Ans: You can stop a SparkContext with the SparkContext.stop() method. Stopping it shuts down the Spark runtime environment and effectively ends the entire Spark application; the stopped context can no longer be used to create RDDs or run jobs.
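The effect of stop() can be observed directly. In this sketch (local[2] master and app name are assumptions), a job runs before the stop and a second one is attempted afterwards:

```scala
import scala.util.Try

import org.apache.spark.{SparkConf, SparkContext}

// Assumption: a local master and an illustrative app name.
val conf = new SparkConf().setAppName("stop-sketch").setMaster("local[2]")
val sc = SparkContext.getOrCreate(conf)

// A job runs normally while the context is alive.
val total = sc.parallelize(1 to 10).sum()

// Shut the context down; this releases executors and the scheduler.
sc.stop()
val stopped = sc.isStopped

// Any further use of the stopped context fails, captured here in a Try.
val afterStop = Try(sc.parallelize(1 to 3).count())
```

Do not run this inside a spark-shell session you still need, since it stops the shell's own sc.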
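19. Which scheduler is used by SparkContext by default?
Ans: By default, SparkContext uses DAGScheduler. You can develop your own custom DAGScheduler implementation, but because the class is internal to Spark (private[spark]) this means writing code inside Spark's own package rather than setting a configuration option.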
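20. How would you set the amount of memory to allocate to each executor?
Ans: The SPARK_EXECUTOR_MEMORY environment variable sets the amount of memory to allocate to each executor. The same setting is available as the spark.executor.memory configuration property and as the --executor-memory option of spark-submit.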
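Set programmatically, the same value goes on the SparkConf before the SparkContext is created. A minimal sketch, with 2g and the app name chosen only for illustration:

```scala
import org.apache.spark.SparkConf

// Assumption: "2g" is an arbitrary example value, not a recommendation.
val conf = new SparkConf()
  .setAppName("executor-memory-sketch")
  .set("spark.executor.memory", "2g")

// Read the setting back to confirm what the application will request.
val executorMemory = conf.get("spark.executor.memory")
```

The value has to be in place before the context starts, because executors are sized when they are launched.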
2014-2017 © HadoopExam.com | Dont Copy , it's bad Karma |