Information Theoretic Outlier Detection for Categorical Data

Publication Date : 10/07/2015

Author(s) :

Nilam S. Khairnar, Prof. D.B. Kshirsagar

Conference Name :
4th International Conference on Recent Trends in Engineering & Technology (ICRTET-2015), July 2-4, 2015, organized by SNJB's KBJ College of Engineering, Chandwad, Nashik, Maharashtra, India

Abstract :

Outlier detection is the problem of finding objects in a data set that do not satisfy well-defined criteria of expected behavior. It can be applied as a preprocessing step before a more advanced data analysis method. It can also be used as an effective tool for discovering interesting patterns, such as the spending behavior of a credit cardholder who is about to go bankrupt. Outlier detection is a crucial step in a variety of practical applications, including intrusion detection, health system monitoring, and criminal activity detection in e-commerce, and it can also be used in scientific research for data analysis and knowledge discovery in biology, chemistry, astronomy, oceanography, and other fields. Outlier detection can be performed on numerical as well as categorical data. Most existing methods are designed for numerical data, and many of them run into problems in real-life applications that contain categorical data. Discovering rare events in categorical data is important in data mining but challenging, because a meaningful similarity measure is difficult to define. A further common difficulty with existing methods is the lack of a formal definition of the outlier detection problem; without one, detection is often designed as an ad hoc process. In general, several user-defined parameters are needed to decide whether an object's properties differ enough from those of other objects for it to qualify as an outlier. The results of methods that take input parameters from users depend heavily on suitable parameter settings, which are very difficult to estimate without background knowledge about the data. Many existing methods also suffer from low effectiveness and low efficiency because of the high dimensionality and large size of the data set, complex statistical tests, or inefficient proximity-based measures.
Here we give a formal definition of outliers and an optimization model of outlier detection based on a new concept, holoentropy, which takes both entropy and total correlation into consideration. Based on this model, we define a function for the outlier factor of an object that is determined solely by the object itself and can be updated efficiently. We implement two practical one-parameter outlier detection methods, named ITB-SS and ITB-SP, which require no user-defined parameters for deciding whether an object is an outlier; users need only provide the number of outliers they want to detect. Experimental results show that ITB-SS and ITB-SP are more effective and efficient algorithms for large-scale categorical data.
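
The role of holoentropy can be illustrated with a small sketch. It assumes, as one reading of the abstract, that holoentropy is the sum of the joint entropy and the total correlation of the attributes, which reduces to the sum of the per-attribute Shannon entropies. The function names (attribute_entropy, holoentropy, greedy_outliers) are hypothetical, and the naive greedy loop is not the ITB-SS or ITB-SP procedure; it only shows the idea of picking the k objects whose removal lowers the holoentropy most, without the efficiently updated outlier factor the abstract describes.

```python
from collections import Counter
from math import log2


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def holoentropy(rows):
    """Assumed form of holoentropy: the sum of per-attribute entropies.

    Assumption: joint entropy + total correlation, which equals the sum
    of the marginal entropies of the attributes.
    """
    if not rows:
        return 0.0
    columns = zip(*rows)  # transpose rows into attribute columns
    return sum(attribute_entropy(list(col)) for col in columns)


def greedy_outliers(rows, k):
    """Naive illustration: k times, drop the row whose removal gives the
    lowest remaining holoentropy. Returns the indices of the dropped rows.

    This recomputes the holoentropy from scratch for every candidate, so
    it is only suitable for tiny data sets.
    """
    remaining = list(range(len(rows)))
    outliers = []
    for _ in range(min(k, len(rows))):
        best = min(
            remaining,
            key=lambda i: holoentropy([rows[j] for j in remaining if j != i]),
        )
        outliers.append(best)
        remaining.remove(best)
    return outliers
```

The only input besides the data is k, the number of outliers requested, which mirrors the single parameter mentioned in the abstract. For example, calling greedy_outliers on five identical rows plus one row with different values, with k set to 1, returns the index of the differing row.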
