SparkContext:

Main entry point for Spark functionality. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster.

Syntax for SparkContext:

from pyspark import SparkContext

# Create a SparkContext in local mode; local[*] uses as many worker threads as there are logical cores

sc = SparkContext("local[*]", "example")

# Now you can use 'sc' to create RDDs and perform operations on them
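For example, here is a minimal sketch of what you can do with 'sc' once it exists (the data and variable names are purely illustrative):

rdd = sc.parallelize([1, 2, 3, 4, 5])    # distribute a local list as an RDD
squared = rdd.map(lambda x: x * x)       # transformation (lazy, nothing runs yet)
print(squared.collect())                 # action: triggers the job, returns [1, 4, 9, 16, 25]
sc.stop()                                # release the cluster connection when done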


SparkSession:
It is a unified object for performing all Spark operations. In the earlier Spark 1.x releases there were separate objects such as SparkContext, SQLContext, HiveContext, SparkConf, and StreamingContext. With Spark 2.x, these were combined into a single entry point, the SparkSession, and you can perform all of those operations through the SparkSession object itself.
This unification has made life simpler for Spark developers.

Syntax for SparkSession:

from pyspark.sql import SparkSession

spark = SparkSession \
.builder \
.appName("Name") \
.getOrCreate()
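
A short sketch of how the same 'spark' object is then used for DataFrame and SQL work (the column names and data below are just illustrative):

df = spark.createDataFrame([(1, "Alice"), (2, "Bob")], ["id", "name"])   # build a DataFrame from local data
df.filter(df.id > 1).show()                                              # DataFrame API
df.createOrReplaceTempView("people")                                     # register the DataFrame for SQL
spark.sql("SELECT name FROM people WHERE id = 2").show()                 # SQL through the same session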

Why should you use SparkSession over SparkContext?

  • From Spark 2.0 onwards, SparkSession provides a single, common entry point for a Spark application.
  • Instead of creating a SparkContext, HiveContext, and SQLContext separately, everything is now available through one SparkSession.
  • It unifies all of Spark's numerous contexts; before version 2.0 you had to create each of these contexts yourself (and only one SparkContext could be active per JVM).
  • With SparkSession this problem is resolved: the older contexts are still reachable from the one session object, as shown in the sketch below.
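
As a rough illustration of that unification, using the 'spark' session created above (no extra configuration assumed):

print(spark.sparkContext)          # the underlying SparkContext, no separate creation needed
print(spark.catalog.listTables())  # catalog access that previously went through SQLContext/HiveContext
spark.sql("SELECT 1").show()       # SQL support from the same object
spark.stop()                       # stops the session and its SparkContext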