1.1. Literature review
Health care captures a huge amount of patient-specific clinical information, e.g. diagnoses, medications, pathological test results and radiological imaging data, along with patients’ socio-demographic characteristics. Although the EMR has been heralded for its potential, integrating scattered, heterogeneous and varied data [16,17] is still a technical challenge for researchers who wish to analyze large amounts of patient data. Data mining has helped many researchers to reveal hidden information in EMR data.
Many researchers have applied supervised machine learning algorithms (a data mining tool) to separate the patient class from the population. Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes, Neural Network, K-Nearest Neighbour and Decision Tree are the most common data mining methods that have helped in diagnosing coronary disease [6,7]. Association rule mining-based classifiers such as Apriori, Predictive Apriori and Tertius have also helped to detect factors that contribute to heart disease in males and females. Ensemble methods (boosting, bootstrap aggregation (bagging), Random Forests) and hybrid data mining models have also helped in predicting the probability of the presence of one sub-type of heart failure and in the treatment of heart diseases, respectively. Wu et al. found Logistic Regression (LR) to be one of the most efficient methods (possibly due to imbalanced data) for predicting heart failure more than 6 months before clinical diagnosis. The k-NN algorithm and ensemble methods (TreeBagger, LPBoost and Subspace) have also helped to determine whether a particular region of interest (ROI) of a digital mammogram image contains benign or malignant masses, in order to detect breast cancer, and to estimate the overall survival rate of women suffering from breast cancer. Jacob and Ramani compared 20 different classification algorithms on the Wisconsin Prognostic Breast Cancer data set to detect breast cancer and concluded that Quinlan's C4.5 algorithm is the best.
Artificial Intelligence In Medicine 95 (2019) 16–26
Data mining techniques have also revealed unknown causes in predicting many other types of disease. A Bayesian Network learning algorithm has helped to find an improved method of detecting lung cancer tumor type based on various properties of protein. Logistic Regression has also been used to assess the health-related quality of life (HRQoL) of Irritable Bowel Syndrome patients and to predict factors associated with pressure ulcers. Brain et al. experimented with an Artificial Neural Network (Multilayer Perceptron) and concluded that the HIV status of a person can be predicted from demographic data. Altikardes et al. observed that machine learning is able to predict the diurnal blood pressure pattern from demographic, clinical and laboratory data. Wuyang et al. experimented with predicting hospitalization due to heart disease with the help of SVM, AdaBoost with trees, LR, the Naïve Bayes Event Model and the K-Likelihood Ratio Test (K-LRT). They recommended a novel method of using K-LRT for feature identification when examining a patient and reported that AdaBoost achieved the highest detection rate at a fixed false alarm rate.
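The criterion "detection rate at a fixed false alarm rate" can be illustrated with a minimal sketch. It is not the procedure of the cited study: the function name `detection_rate_at_far`, the toy scores and the labels below are all hypothetical, and we assume that a higher classifier score indicates a more likely positive case. The sketch sweeps every observed score as a decision threshold and keeps the best detection rate (true positive rate) among thresholds whose false alarm rate (false positive rate) stays within the given budget.

```python
def detection_rate_at_far(scores, labels, max_false_alarm_rate):
    """Best detection rate (TPR) among thresholds with FPR <= max_false_alarm_rate.

    scores: classifier outputs, higher = more likely positive (assumption)
    labels: 1 for a positive (e.g. hospitalized) case, 0 for a negative case
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    best = 0.0
    # Use each distinct score as a threshold: predict positive if score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives <= max_false_alarm_rate:
            best = max(best, tp / positives)
    return best


# Hypothetical toy data: 4 positive and 4 negative cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(detection_rate_at_far(scores, labels, 0.25))  # 0.75
```

Comparing classifiers under this criterion amounts to calling the function with the same false alarm budget for each classifier's scores and ranking the returned detection rates.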
A few works related to esophageal cancer, on feature selection or on the study of esophageal adenocarcinoma after induction therapy, show an application of machine learning in this domain. Also, Hoerres et al. have mentioned the need for improving the diagnostic and prognostic testing related to Barrett's esophagus, an earlier stage of esophageal cancer. Zhang et al. have constructed predictive models for evaluating tumor response to neoadjuvant chemoradiation therapy (CRT) in patients with esophageal cancer and found that the SVM model that used all features accurately and precisely predicted pathologic tumor response to CRT. Another similar study, on predicting pathologic complete response to neoadjuvant chemoradiation in esophageal cancer patients, shows that K-nearest neighbors can achieve fair-to-good prediction accuracy for clinical decision making for patients undergoing trimodality therapy. However, we have not found any study that predicts esophageal cancer with the help of EMR.