Research on Technology of Disease Risk Automatic Warning for Health Assessment
Keywords: Medical data; Classifier; Data cleaning; Feature selection; Deep learning
As people pay increasing attention to their health, automatic disease risk warning is becoming one of the most important tasks for information technology in the medical field. Risk warning technology applies information processing and data mining to large volumes of medical data, with risk prediction as its final result. This paper studies disease risk warning in depth.

First, disease risk warning technology is analyzed. Medical equipment measures physiological values that describe a person's physical condition, and risk warning requires a growing amount of such data to predict risk. In the information field, classification predicts the class of unknown samples from a large set of labeled data, which matches the requirements of this task. Classification methods are therefore the focus of this paper.

Second, the characteristics of medical data are analyzed and summarized: diversity, temporality, inconsistency, redundancy, privacy, and so on. Several classification methods are introduced together with their advantages and disadvantages. Based on these characteristics, the applicability of the different methods to the medical field is analyzed, and four methods are selected for testing.

Support vector machine, decision tree, naive Bayes, and RBF network are selected for in-depth study. The feasibility of these four algorithms for medical data is analyzed, and several medical data sets from the UCI Machine Learning Repository are used for classification tests. The results show that a classification model can accurately describe the disease risk warning problem. However, the performance of these classifiers can still be improved considerably.

Finally, the reasons why the classifiers fail to achieve good results are studied in depth, and a solution is proposed for each case. For the large volume and disorder of medical data, data cleaning is proposed as the solution.
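As a rough illustration of what a data cleaning step might look like, the following minimal Python sketch removes duplicate records, drops records with too many missing values, and fills the remaining gaps with the column median. The cleaning rules, the `clean_records` function, its `max_missing_fraction` parameter, and the sample field names are all assumptions made for illustration; this abstract does not specify the cleaning procedure actually used.

```python
from statistics import median


def clean_records(records, max_missing_fraction=0.5):
    """Deduplicate, drop sparse records, and impute gaps with column medians.

    Assumes every record is a non-empty dict with the same keys and that
    missing values are encoded as None (both are illustrative assumptions).
    """
    # 1. Remove exact duplicates while keeping the original order.
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)

    # 2. Drop records whose share of missing values is too high.
    kept = [
        rec for rec in unique
        if sum(v is None for v in rec.values()) / len(rec) <= max_missing_fraction
    ]

    # 3. Compute the median of the observed values in each column.
    columns = list(kept[0].keys()) if kept else []
    medians = {}
    for col in columns:
        observed = [rec[col] for rec in kept if rec[col] is not None]
        medians[col] = median(observed) if observed else None

    # 4. Fill the remaining missing values with the column median.
    return [
        {col: rec[col] if rec[col] is not None else medians[col] for col in columns}
        for rec in kept
    ]


# Hypothetical records; the field names are placeholders, not a real data set.
records = [
    {"age": 63, "chol": 233, "bp": 145},
    {"age": 63, "chol": 233, "bp": 145},      # exact duplicate
    {"age": 41, "chol": None, "bp": 130},     # one missing value
    {"age": None, "chol": None, "bp": 120},   # mostly missing
    {"age": 56, "chol": 251, "bp": None},     # one missing value
]
cleaned = clean_records(records)
```

Median imputation is chosen here only because it is simple and robust to outliers; other strategies may suit a given medical data set better.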
For the attributes and features of some medical data sets, attribute selection is proposed as the solution, and many tests were carried out. On one data set with 10 thousand attributes, attribute selection raised the classification accuracy from 74% to 81%, which shows that attribute selection is an effective way to improve classifier performance. For the structure of the classifier, a bi-layer structure combining a neural network and naive Bayes is proposed and tested, and it also improves the accuracy. Deep learning is used as a new classification algorithm. Compared with the other four classifiers, the deep learning algorithm improves the accuracy by 1% to 10%, which suggests that deep learning is well suited to the medical field.

In summary, the methods proposed in this paper for improving classification on medical data sets have a good effect on disease risk warning. They can make disease risk warning technology more accurate and help people receive treatment as early as possible.
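The idea behind attribute selection can be illustrated with a small Python sketch. It ranks attributes by a Fisher-style score, keeps the top k, and trains a Gaussian naive Bayes classifier with and without selection. The scoring criterion, the synthetic data, and every function name and parameter below are assumptions made for illustration. They are not the paper's data sets or its selection method, and the sketch does not attempt to reproduce the reported 74% to 81% improvement.

```python
import math
import random
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Score each attribute: squared class-mean gap over summed class variances."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12))
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring attributes."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def project(X, cols):
    """Keep only the chosen attribute columns."""
    return [[row[j] for j in cols] for row in X]


class GaussianNaiveBayes:
    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [row for row, label in zip(X, y) if label == c]
            # Per-attribute mean and variance (small floor avoids division by zero).
            params = [(mean(col), pvariance(col) + 1e-9) for col in zip(*rows)]
            self.stats[c] = (math.log(len(rows) / len(X)), params)
        return self

    def _log_posterior(self, row, c):
        log_prior, params = self.stats[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
            for x, (mu, var) in zip(row, params)
        )

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(row, c)) for row in X]


def accuracy(model, X, y):
    return sum(p == t for p, t in zip(model.predict(X), y)) / len(y)


def make_data(n, n_attrs, rng):
    """Synthetic data: only attributes 0 and 1 depend on the class label."""
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * label if j < 2 else 0.0, 1.0) for j in range(n_attrs)]
         for label in y]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, 20, rng)
X_test, y_test = make_data(100, 20, rng)

selected = select_top_k(X_train, y_train, k=2)
acc_all = accuracy(GaussianNaiveBayes().fit(X_train, y_train), X_test, y_test)
acc_sel = accuracy(
    GaussianNaiveBayes().fit(project(X_train, selected), y_train),
    project(X_test, selected),
    y_test,
)
print("selected attributes:", sorted(selected))
print("accuracy with all attributes:", acc_all)
print("accuracy with selected attributes:", acc_sel)
```

A filter-style score like this is cheap to compute even with thousands of attributes, which is one reason such methods are often tried first on high-dimensional data. Whether selection helps on a given data set has to be checked empirically.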