Dear Ranganathan,

Hope you are doing well.

Q1) Then what is a Lineage Graph?

When a new RDD is derived from an existing RDD using a transformation, Spark keeps track of all the dependencies between these RDDs. This record of dependencies is called the lineage graph. If data is lost (for example, when a partition is lost because a node fails), Spark uses the lineage graph to rebuild it.
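
To make the idea concrete, here is a minimal toy sketch in plain Python. It is not Spark's API: the class ToyRDD and all of its methods are hypothetical names made up for this illustration. Each derived dataset remembers its parent and the function that produced it, so lost data can be recomputed by replaying that chain.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, source=None, parent=None, fn=None, name="source"):
        self.source = source   # base data, only set on the root dataset
        self.parent = parent   # the dataset this one was derived from
        self.fn = fn           # function applied to the parent's data
        self.name = name
        self.cache = None      # materialized data; may be "lost"

    def map(self, f):
        return ToyRDD(parent=self, fn=lambda data: [f(x) for x in data], name="map")

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda data: [x for x in data if pred(x)], name="filter")

    def lineage(self):
        """Return the chain of dependencies, newest first."""
        node, chain = self, []
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain

    def compute(self):
        """Return the data, rebuilding it from the lineage if the cache is missing."""
        if self.cache is None:
            if self.parent is None:
                self.cache = list(self.source)
            else:
                self.cache = self.fn(self.parent.compute())
        return self.cache


base = ToyRDD(source=[1, 2, 3, 4, 5])
evens_doubled = base.map(lambda x: x * 2).filter(lambda x: x % 4 == 0)

print(evens_doubled.lineage())     # ['filter', 'map', 'source']
first = evens_doubled.compute()    # [4, 8]
evens_doubled.cache = None         # simulate losing the computed data
rebuilt = evens_doubled.compute()  # recomputed by replaying the lineage
```

In real Spark you do not build this yourself; calling toDebugString() on an RDD prints the lineage that Spark has recorded for it.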

Q2) Is there any difference between a Lineage Graph and a Directed Acyclic Graph?

Yes, the DAG and the lineage graph are different.

The lineage graph records how each RDD is derived from its parent RDDs, whereas the DAG shows the different stages of a Spark job.

Q3) Are they both the same? Can they be used interchangeably?

No, they cannot be used interchangeably, because they work differently. The lineage graph deals with RDDs, so it applies only up to the transformations, whereas the DAG shows the complete job, i.e. transformations + action.
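
The distinction between "transformations only" and "transformations + action" can be sketched with another small toy model in plain Python. Again, this is an assumption-laden illustration and not Spark's API: LazyDataset, traced_square and the other names are hypothetical. Transformations only record a step; nothing runs until the action is called.

```python
class LazyDataset:
    """Toy model (hypothetical, not Spark's API)."""

    def __init__(self, data, steps=()):
        self.data = data
        self.steps = tuple(steps)   # recorded transformations, i.e. the lineage

    def map(self, f):
        # Transformation: nothing is computed, only a new step is recorded.
        return LazyDataset(self.data, self.steps + (("map", f),))

    def filter(self, pred):
        # Transformation: same as above, only recorded.
        return LazyDataset(self.data, self.steps + (("filter", pred),))

    def collect(self):
        # Action: only now is the full plan (all transformations + this action) run.
        result = list(self.data)
        for kind, f in self.steps:
            if kind == "map":
                result = [f(x) for x in result]
            else:
                result = [x for x in result if f(x)]
        return result


calls = []

def traced_square(x):
    calls.append(x)   # records when the function actually runs
    return x * x

ds = LazyDataset([1, 2, 3]).map(traced_square).filter(lambda x: x > 1)
calls_before_action = len(calls)   # 0: only the lineage exists so far
result = ds.collect()              # [4, 9]
calls_after_action = len(calls)    # 3: the action triggered the work
```

Here the recorded steps play the role of the lineage, and the work done inside collect() plays the role of the complete job that the DAG describes.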

Hope this resolves your query.

In case of any further issues, feel free to write back.


Please note if you are not happy with the response on this ticket, please escalate it to
We assure you that we will get back to you within 24 hours

Sumit Anand
edureka! Solution Team
