Data Lineage #510
Unanswered
LorenzoLaCognata
asked this question in
Q&A
Replies: 0 comments
Hi, this is already great, but I am thinking of a data lineage feature that would be very helpful.
The idea is to link each output column to every column that is used or needed (directly or indirectly) to calculate it.
I guess this can easily become tricky in the general case, but here is a simple example (note that columns used in a JOIN, a WHERE, etc. should also count towards the lineage):
Input = "SELECT a.col1 AS x, b.col2 AS y FROM a INNER JOIN b ON a.col3 = b.col4"
Output = "{ 'x': ['a.col1', 'a.col3', 'b.col4'], 'y': ['b.col2', 'a.col3', 'b.col4'] }"
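To make the expected behaviour concrete, here is a rough sketch in plain Python. It is only an illustration under strong assumptions, not a proposal for how the library should implement it: `naive_lineage` and `COLUMN_REF` are hypothetical names, every column must be written as `table.column`, every select item must use `AS`, and only a single flat SELECT is handled (no subqueries, CTEs, or function calls containing commas).

```python
import re

# Qualified column references such as "a.col1" (simplified, hypothetical pattern).
COLUMN_REF = re.compile(r"\b[A-Za-z_]\w*\.[A-Za-z_]\w*\b")


def naive_lineage(sql):
    """Map each output alias to the qualified columns it depends on.

    Assumes one flat "SELECT <expr> AS <alias>, ... FROM ..." statement.
    """
    match = re.match(r"\s*SELECT\s+(.*?)\s+FROM\s+(.*)", sql, re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("unsupported statement")
    select_list, rest = match.groups()

    # Columns after FROM (JOIN ... ON, WHERE, etc.) affect every output column.
    shared = list(dict.fromkeys(COLUMN_REF.findall(rest)))

    lineage = {}
    for item in select_list.split(","):
        # Split "<expr> AS <alias>"; fails loudly if the alias is missing.
        expression, alias = re.split(r"\s+AS\s+", item.strip(), flags=re.IGNORECASE)
        direct = COLUMN_REF.findall(expression)
        # dict.fromkeys removes duplicates while keeping first-seen order.
        lineage[alias] = list(dict.fromkeys(direct + shared))
    return lineage


query = "SELECT a.col1 AS x, b.col2 AS y FROM a INNER JOIN b ON a.col3 = b.col4"
print(naive_lineage(query))
```

A real implementation would need to work on the parsed query rather than on regular expressions, so that aliases, nested queries, and unqualified column names can be resolved properly.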
I don't think the current version can help with this request, and I don't know of any related project that has solved this problem, but please let me know if there is one.
Thanks!