Causal Discovery using Dimensionality Reduction Partial Association Tree

Authors

  • Sreeraman.Y Research Scholar, Dept. of CSE, Pondicherry Engineering College, Puducherry, India. Author
  • S. Lakshmana Pandian Associate Professor, Dept. of CSE, Pondicherry Engineering College, Puducherry, India. Author

DOI:

https://doi.org/10.47392/irjash.2021.137

Keywords:

Decision tree, supervised learning, partial association tree, causal rules

Abstract

A decision tree is a supervised learning model that classifies data based on labelled attribute values, so that a new entry can be assigned to an appropriate class. A decision tree cannot, however, reveal the causes behind a classification. Inferring those causes provides richer knowledge for better decision making. Causal Bayesian Networks, Structural Equation Models, and Potential Outcome Models are among the models used to obtain causal relationships. These models need experimental data, but conducting full experiments is often impossible or very expensive. A model is therefore needed that identifies the causes of effects from observational data rather than experimental data. In this paper, a novel approach to causal inference rule mining is proposed that infers causes from observational data quickly and scalably. Statistical tools and techniques, namely the partial association test and correlation, are used to develop the model. A new way of constructing a tree, called the Dimensionality Reduction Partial Association Tree (DRPAT), is introduced. Existing causality sometimes cannot be extracted when dimensions with low association are present in the data and hide the underlying causality; the proposed model extracts causal associations in such cases of hidden causality. The model is applied to the 'Cardiovascular Disease dataset' sourced from Kaggle Progression System. The result is a Partial Association Tree, from which one can derive a set of causal rules that can form a basis for better data analytics and, in turn, better decision making.
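
The abstract names a partial association test but does not state which statistic is used. As a minimal sketch, assuming the test is the Mantel-Haenszel statistic that is commonly used for this purpose, the check for a binary attribute against a binary outcome, stratified by the values of the remaining attributes, might look as follows. The function names, the input layout, and the 0.05 significance level are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical sketch: Mantel-Haenszel partial association test.
# Assumption: the paper's "partial association test" is of this form.

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_1DF_05 = 3.841


def mantel_haenszel_statistic(strata):
    """Return the continuity-corrected Mantel-Haenszel statistic.

    strata: iterable of 2x2 tables ((n11, n12), (n21, n22)), one per
    stratum, where rows are attribute present/absent and columns are
    outcome present/absent.
    """
    diff_sum = 0.0  # sum over strata of observed minus expected n11
    var_sum = 0.0   # sum over strata of the hypergeometric variance of n11
    for (n11, n12), (n21, n22) in strata:
        n = n11 + n12 + n21 + n22
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        diff_sum += n11 - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if var_sum == 0:
        return 0.0
    return max(abs(diff_sum) - 0.5, 0.0) ** 2 / var_sum


def is_partially_associated(strata, critical=CHI2_CRIT_1DF_05):
    """True if the attribute and the outcome remain associated across strata."""
    return mantel_haenszel_statistic(strata) > critical


# Two strata in which the attribute co-occurs with the outcome.
example = [((20, 5), (5, 20)), ((18, 7), (6, 19))]
print(is_partially_associated(example))
```

In a tree-building procedure of the kind the abstract describes, a check like this could serve as the criterion for deciding whether an attribute is kept as a candidate cause at a node; how DRPAT combines it with correlation-based dimensionality reduction is not specified in the abstract.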

Published

2021-05-01