Keywords: Supervised Learning

Causal Discovery using Dimensionality Reduction Partial Association Tree

Sreeraman Y; S. Lakshmana Pandian

International Research Journal on Advanced Science Hub, 2021, Volume 3, Issue Special Issue ICITCA-2021 5S, Pages 38-43
DOI: 10.47392/irjash.2021.137

A decision tree is a model that classifies data based on labelled attribute values. It is a supervised learning approach with which a new entry can be assigned to an appropriate class. A decision tree cannot, however, reveal the cause behind a classification. Inferring those causes provides richer knowledge for better decision making. Causal Bayesian Networks, Structural Equation Models, and Potential Outcome Models are some of the models used to obtain causal relationships. These models need experimental data, but conducting full experiments is often impossible or very expensive. A model is therefore needed that identifies the causes of effects from observational data rather than experimental data. This paper proposes a novel approach to causal inference rule mining that infers causes from observational data quickly and scalably. The model is built on statistical tools and techniques, namely the partial association test and correlation. A new way of constructing a tree, called the Dimensionality Reduction Partial Association Tree (DRPAT), is introduced. Existing causality sometimes cannot be extracted when dimensions with low association are present in the data and hide the underlying causality; this model extracts causal associations in such cases of hidden causality. The model is applied to the “Cardiovascular Disease dataset” sourced from Kaggle Progression System. The result is a Partial Association Tree, from which one can obtain a set of causal rules that can form a basis for better data analytics and, in turn, better decision making.
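The abstract does not state which form of partial association test is used. As a minimal sketch, assuming a Cochran-Mantel-Haenszel style statistic over stratified 2x2 tables (a common way to test whether a binary attribute stays associated with a binary outcome after controlling for other attributes), the test could look as follows. The function names, the table layout, and the 0.05 significance level are illustrative assumptions, not details taken from the paper.

```python
from typing import Iterable, Tuple

# One 2x2 contingency table per stratum, given as (a, b, c, d):
#   a = attribute present, outcome present
#   b = attribute present, outcome absent
#   c = attribute absent,  outcome present
#   d = attribute absent,  outcome absent
# A stratum is one combination of values of the controlled attributes.
Table = Tuple[int, int, int, int]

# Chi-square critical value for 1 degree of freedom at alpha = 0.05
# (the significance level is an assumption made for this sketch).
CHI2_CRITICAL_95 = 3.84


def partial_association_statistic(strata: Iterable[Table]) -> float:
    """Cochran-Mantel-Haenszel style statistic with continuity correction."""
    delta = 0.0      # sum over strata of (observed a - expected a)
    variance = 0.0   # sum over strata of the hypergeometric variance of a
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        delta += (a * d - b * c) / n
        variance += ((a + b) * (c + d) * (a + c) * (b + d)) / (n * n * (n - 1))
    if variance == 0:
        return 0.0
    # Apply the 0.5 continuity correction only when it cannot flip the sign.
    correction = 0.5 if abs(delta) >= 0.5 else 0.0
    return (abs(delta) - correction) ** 2 / variance


def is_partially_associated(strata: Iterable[Table]) -> bool:
    """True when the statistic exceeds the assumed critical value."""
    return partial_association_statistic(strata) > CHI2_CRITICAL_95
```

In a tree-building procedure of the kind the abstract describes, a check like `is_partially_associated` could decide whether an attribute is kept as a candidate cause at a node. How DRPAT combines this check with correlation-based dimensionality reduction is not specified in the abstract, so that step is left out of the sketch.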