A classification scheme for plotting Microarray Gene Expression Datasets to Decision Tree Algorithm using Distributed Systems

Publication Date : 10/07/2015

Neha V. Bhatambarekar

4th International Conference on Recent Trends in Engineering & Technology (ICRTET-2015), July 2-4, 2015. Organized by SNJB's KBJ College of Engineering, Chandwad, Nashik, Maharashtra, India

Analyzing gene expression data is challenging because the large number of features relative to the small number of available samples can lead to overfitting. To circumvent this pitfall and achieve high performance, some schemes build complex classifiers using state-of-the-art or well-established approaches. In medical decision making (classification, diagnosis), there are numerous situations in which decisions must be made competently and reliably. One of the most frequently employed techniques for extracting knowledge from data is decision-tree induction, since its representation of knowledge is intuitive and easily interpreted by humans. Decision trees are a reliable and effective decision-making methodology that provides high classification accuracy with a simple representation of the gathered knowledge, and they have been used in various areas of medical decision making. Instead of manually improving the design components of a decision-tree algorithm, as has been done for the past 40 years, this paper proposes HEAD-DT, a hyper-heuristic evolutionary algorithm that uses distributed systems to optimally combine design components from decision-tree algorithms. The hyper-heuristic thus automatically designs novel decision-tree algorithms tailored to a particular type of data set associated with a given application domain. The hyper-heuristic methodology is capable of providing a faster, less strenuous, and at least equally effective strategy for improving decision-tree algorithms for particular application domains. By the end of the evolution, HEAD-DT is expected to generate a new and possibly better decision-tree algorithm for the given application domain. The performance of HEAD-DT is assessed on real-world microarray gene expression data sets and compared against well-known decision-tree algorithms such as REPTree, CART, and C4.5. HEAD-DT is expected to significantly outperform these manually designed baseline algorithms in terms of predictive accuracy and F-measure.
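The idea of evolving a combination of decision-tree design components can be illustrated with a toy sketch. The code below is not HEAD-DT and does not reproduce its components or its distributed execution. It assumes a hypothetical three-gene search space (`SPACE`: split criterion, maximum depth, minimum leaf size), a small pure-Python tree inducer, synthetic data from a made-up `make_data` helper, and simple truncation selection with uniform crossover and mutation. Fitness here is plain validation accuracy, chosen only for brevity.

```python
import math
import random
from collections import Counter

# Hypothetical design-component search space (an assumption, not HEAD-DT's).
SPACE = {"criterion": ["gini", "entropy"], "max_depth": [1, 2, 3, 4], "min_leaf": [1, 2, 5]}

def impurity(labels, criterion):
    """Gini index or Shannon entropy of a non-empty list of class labels."""
    probs = [c / len(labels) for c in Counter(labels).values()]
    if criterion == "gini":
        return 1.0 - sum(p * p for p in probs)
    return -sum(p * math.log2(p) for p in probs)

def build_tree(rows, labels, genome, depth=0):
    """Greedy top-down induction; the genome fixes criterion and stopping rules."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= genome["max_depth"] or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if min(len(left), len(right)) < genome["min_leaf"]:
                continue
            # Weighted impurity of the two children; lower is better.
            score = sum(
                len(ix) / len(rows) * impurity([labels[i] for i in ix], genome["criterion"])
                for ix in (left, right)
            )
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority
    _, f, t, left, right = best
    def grow(ix):
        return build_tree([rows[i] for i in ix], [labels[i] for i in ix], genome, depth + 1)
    return (f, t, grow(left), grow(right))

def predict(tree, row):
    """Internal nodes are (feature, threshold, low, high) tuples; leaves are labels."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

def fitness(genome, train, valid):
    """Validation accuracy of the tree induced under this genome."""
    tree = build_tree(train[0], train[1], genome)
    hits = sum(predict(tree, r) == y for r, y in zip(*valid))
    return hits / len(valid[1])

def evolve(train, valid, pop_size=8, generations=5, seed=0):
    """Truncation selection with elitism; needs pop_size >= 4."""
    rng = random.Random(seed)
    pop = [{k: rng.choice(v) for k, v in SPACE.items()} for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda g: fitness(g, train, valid), reverse=True)
        parents = ranked[: pop_size // 2]          # the better half survives
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = {k: rng.choice([a[k], b[k]]) for k in SPACE}  # uniform crossover
            if rng.random() < 0.3:                 # mutate one gene
                k = rng.choice(list(SPACE))
                child[k] = rng.choice(SPACE[k])
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda g: fitness(g, train, valid))
    return best, fitness(best, train, valid)

def make_data(n, seed):
    """Synthetic stand-in for expression data: 5 features, 2 classes."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
    return rows, [int(r[0] + r[1] > 0) for r in rows]

if __name__ == "__main__":
    best, score = evolve(make_data(40, seed=1), make_data(20, seed=2))
    print(best, score)
```

Each genome's fitness is computed independently of the others, so in a sketch like this the evaluations within a generation could be spread across workers; how HEAD-DT itself distributes the work is not specified here.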

