Hi Shubham,

Hope you are doing well.

It was great speaking to you.

As discussed, please find below the details for running a word count in Spark.


Execution in spark-shell:



==> First start spark-shell
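If Spark's bin directory is on your PATH you can launch it simply with the spark-shell command; otherwise run it from your Spark installation directory (the path below is only an example, adjust it for your machine):

   ./bin/spark-shell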

[screenshot: starting spark-shell]

My dataset:

kamini kanchan mishra
kasinath mishra
bijayalaxmi mishra
priyanka vashishta
kartik vashishta

Code:

   // Read the input file into an RDD of lines
   val lines = sc.textFile("file:///home/edureka/Desktop/student")
   // Split each line on spaces to get individual words
   val words = lines.flatMap(_.split(" "))
   // Pair each word with 1 and sum the counts per word
   val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
   // Print each (word, count) pair
   wordCounts.foreach(println)

Screenshot:

[screenshots of the spark-shell execution attached]

Output:

[screenshot of the word count output attached]
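For reference, with the dataset above the counts should come out as shown below (the order of the pairs may vary, since foreach prints them as the partitions are processed):

   (kamini,1)
   (kanchan,1)
   (mishra,3)
   (kasinath,1)
   (bijayalaxmi,1)
   (priyanka,1)
   (vashishta,2)
   (kartik,1)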




Executing the same code in the Eclipse IDE:




Dataset:

kamini kanchan mishra
kasinath mishra
bijayalaxmi mishra
priyanka vashishta
kartik vashishta

Code:

   import org.apache.spark.SparkContext
   import org.apache.spark.SparkConf

   object WordCount extends App {

     // Set up the Spark configuration and context
     val conf = new SparkConf().setAppName("Simple Application")
     val sc = new SparkContext(conf)

     // Read the input file into an RDD of lines
     val lines = sc.textFile("file:///home/edureka/Desktop/student")
     // Split each line on spaces to get individual words
     val words = lines.flatMap(_.split(" "))
     // Pair each word with 1 and sum the counts per word
     val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
     // Print each (word, count) pair
     wordCounts.foreach(println)
   }
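One point to note: if you run this object directly from Eclipse (Run As > Scala Application) instead of packaging it and submitting with spark-submit, the configuration also needs a master URL, for example:

   val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")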

build.sbt:

name := "WordCount"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2"
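For reference, the usual flow is to package the project with sbt and then submit the jar with spark-submit. The exact jar name under target/ depends on your project settings, so treat the commands below only as an example:

   sbt package
   spark-submit --class WordCount --master local[*] target/scala-2.10/wordcount_2.10-1.0.jar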


For the detailed execution steps, please go through this video:

https://edureka.wistia.com/medias/51ptizh8kh


Please try the same on your side and let us know if you face any issues.

We are eagerly waiting for your response.




