IJIEST

Leveraging AI for Data Provenance: Enhancing Tracking and Verification of Data Lineage in FATE Assessment

Swathi Chundru

Team Lead, Motivity Labs Pvt Ltd, Hyderabad, Telangana, India

Pages 87-104, Vol. 7, Issue 1, Jan-Dec 2021

Receiving Date: 2021-03-30; Acceptance Date: 2021-06-05; Publication Date: 2021-06-29


Abstract

Data provenance, a record of the sources and processing of data, opens new possibilities as artificial intelligence (AI)-based systems play an ever-growing role in assisting human decision-making. Responsible AI builds on four key principles (fairness, accountability, transparency, and explainability, abbreviated FATE) to prevent the harmful consequences that biased AI systems can cause. This work describes current biases and explores how data provenance could alleviate them, with the aim of encouraging more research on data provenance that supports responsible AI. We begin by reviewing biases that result from data origins and pre-processing. Next, we discuss current practice, the difficulties it faces, and the solutions that have been proposed. We then give an overview of how our recommendations might help establish data provenance, and hence eliminate biases arising from the origins and pre-processing of the data, in order to create responsible AI-based systems. We conclude by outlining future research directions in our research agenda.

Keywords: Artificial Intelligence; Data lineage; FATE assessment



2454-9584

2454-8111

6.713

6.464


INTERNATIONAL JOURNAL OF INVENTIONS IN ENGINEERING & SCIENCE TECHNOLOGY

International Peer Reviewed (Refereed), Open Access Research Journal

(By Aryavart International University, India)
