    • #1162
      Abhishek Tyagi
      Keymaster
      Question 1- Please consider the following code.

      rdd = sc.parallelize(range(100))
      rdd2 = range(100)

      Where is the execution of API calls on “rdd” taking place?

      Answer-
      On the Apache Spark worker nodes

      Question 2- Please consider the following code.

      rdd = sc.parallelize(range(100))
      rdd2 = range(100)

      Where is the data in “rdd2” stored physically?

      Answer-
      On the local driver machine

      Question 3- What is the parallel version of the following code?

      len(range(9999999999))

      Answer-
      sc.parallelize(range(9999999999)).count()

      Question 4- Which storage solutions support seamless modification of schemas? (Select all that apply)

      Answer-
      Object Storage
      NoSQL

      Question 5- Which storage solutions support dynamic scaling on storage? (Select all that apply)

      Answer-
      Object Storage
      NoSQL

      Question 6- Which storage solutions support normalization and integrity checks on data out of the box? (Select all that apply)

      Answer-
      SQL/Relational Databases
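
As an illustration of what "integrity checks out of the box" means for a relational database, here is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are hypothetical; SQLite is used only because it ships with Python, and other relational databases enforce the same kinds of constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Normalized layout: orders refer to customers by key instead of copying names.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " amount REAL NOT NULL CHECK (amount > 0))"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 19.99)")

rejected = []
bad_rows = [
    (2, 5.00),   # customer 2 does not exist: foreign key violation
    (1, -3.00),  # negative amount: CHECK constraint violation
]
for customer_id, amount in bad_rows:
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))  # the database itself refuses the row

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(order_count, rejected)
conn.close()
```

Only the valid order is stored; both bad rows are rejected by the database without any application-side validation code.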

      Question 7- What are the advantages of using Apache Spark SQL over RDDs? (Select all that apply)

      Answer-
      Catalyst and Tungsten can optimize the execution, so a query is likely to run faster than an equivalent implementation written against the RDD API.

      The API is simpler and doesn’t require specific functional programming skills.
