PERFORMANCE ANALYSIS OF BREAST CANCER CLASSIFICATION USING DECISION TREE CLASSIFIERS
DOI:
https://doi.org/10.22159/ijcpr.2017v9i2.17383
Keywords:
Classification, J48, REPTree, Random Forest, Random Tree, priority, Accuracy
Abstract
Breast cancer is among the dangerous cancers affecting women above 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. According to the medical survey, about 125.0 new cases of breast cancer per 100,000 are diagnosed each year in the United States, and 21.5 per 100,000 women die of this disease. Also, 246,660 new cases of cancer in women were estimated for the year 2016. Early diagnosis of breast cancer is a key factor in the long-term survival of cancer patients. Classification plays an important role in breast cancer detection and is used by researchers to analyse and classify medical data. In this research work, a priority-based decision tree classifier algorithm has been implemented for the Wisconsin Breast Cancer dataset. This paper analyzes different decision tree classifier algorithms on the Wisconsin original, diagnostic and prognostic datasets using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, Entropy, RMSE, TP Rate, FP Rate, Precision, Recall, F-Measure, ROC, Specificity and Sensitivity.
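Several of the evaluation parameters listed in the abstract can be derived directly from a binary confusion matrix. The following is a minimal Python sketch of those standard definitions, not the WEKA procedure used in the paper. The function name `binary_metrics` and the example counts are hypothetical, and the sketch assumes that no denominator is zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common classifier metrics from confusion-matrix counts.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives (hypothetical inputs; all
    denominators below are assumed to be non-zero).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall or TP rate
    specificity = tn / (tn + fp)
    fp_rate = fp / (fp + tn)       # equals 1 - specificity
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fp_rate": fp_rate,
        "precision": precision,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Illustrative counts only; these are not results from the paper.
metrics = binary_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity, recall and TP rate are the same quantity under these definitions, which is why the sketch reports them once.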