Application of Data Mining in the Analysis of Higher Vocational Colleges’ Achievement

Author LiuHuaMin
Tutor JiaRuiYu
School Anhui University
Course Computer technology
Keywords data mining; decision tree; high vocational institutions; analysis of students’ grades
CLC TP311.13
Type Master's thesis
Year 2011
Downloads 198
Quotes 1

With the rapid development of higher education in China, higher vocational education has also grown quickly. Private vocational education in particular has expanded sharply during this period, with larger enrollments and teaching staffs, and this expansion has brought a series of problems. The most pressing of them is ensuring teaching quality. Identifying the factors related to students’ grades can guide classroom teaching and educational management and is also important for guaranteeing teaching quality. Applying data mining techniques to the student achievement database, to explore which factors relate to grades, can therefore provide a useful reference for curriculum design, classroom teaching, and management.
Data mining is the process of extracting information and knowledge that is hidden, previously unknown, and potentially valuable from large amounts of noisy, ambiguous, incomplete, and random data. A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data used to support management decisions. Classification is the process of assigning a discrete label to an unlabeled record. The decision tree is the most widely used classification technique. ID3 is the classic decision tree classification algorithm and uses information gain as the criterion for building the tree. C4.5 improves on ID3 by using the information gain ratio to select attributes. However, C4.5 involves logarithm operations, which make the computation relatively complex and expensive. The thesis therefore draws on mathematical theory to introduce an improved C4.5 algorithm that requires only addition, subtraction, multiplication, and division, which simplifies the calculation, increases its speed, and thus improves the efficiency of building decision trees from achievement data.
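The two split criteria named above, information gain (ID3) and information gain ratio (C4.5), can be illustrated with a minimal sketch. The records, the attribute names `attendance` and `student_id`, and the `grade` label are hypothetical and are not taken from the thesis data. The sketch uses the standard logarithm-based formulas, not the arithmetic-only variant the thesis proposes, whose details the abstract does not give.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def partition(rows, attr, label="grade"):
    """Group the class labels of `rows` by the value each row has for `attr`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[label])
    return groups


def information_gain(rows, attr, label="grade"):
    """ID3 criterion: entropy of the labels minus the weighted entropy after the split."""
    before = entropy([row[label] for row in rows])
    after = sum(len(g) / len(rows) * entropy(g)
                for g in partition(rows, attr, label).values())
    return before - after


def gain_ratio(rows, attr, label="grade"):
    """C4.5 criterion: information gain divided by the split information of `attr`."""
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:  # attribute takes a single value, so it cannot split the data
        return 0.0
    return information_gain(rows, attr, label) / split_info


# Hypothetical toy records: `student_id` is unique per row, `attendance` has two values.
rows = [
    {"student_id": i, "attendance": att, "grade": grade}
    for i, (att, grade) in enumerate([
        ("high", "pass"), ("high", "pass"), ("high", "pass"), ("high", "pass"),
        ("low", "fail"), ("low", "fail"), ("low", "fail"), ("low", "pass"),
    ])
]

for attr in ("attendance", "student_id"):
    print(attr, round(information_gain(rows, attr), 3), round(gain_ratio(rows, attr), 3))
```

On this toy data, information gain ranks the unique `student_id` above `attendance`, while the gain ratio reverses the ranking because it divides by the larger split information of the many-valued attribute.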
The experiments show that the decision tree built by C4.5 overcomes ID3’s tendency to select attributes with many values; that the If-Then rules derived from the constructed tree are more regular and better meet the needs of decision makers; that C4.5 prunes continually while constructing the tree, so the resulting trees have comparatively few leaf nodes and a compact structure; and that the improved C4.5 algorithm needs only addition, subtraction, multiplication, and division, which simplifies the computation, increases its speed, and thus improves the efficiency of tree construction. The thesis first introduces the basics of data warehousing, data mining, and classification. Second, it uses SQL Server 2005 Analysis Services to build multidimensional data from student achievement data and, drawing on the college’s student achievement management system, creates a data warehouse centered on a student achievement fact table; this part covers the logical model design, the physical model, and data integration, processing, and analysis. Finally, it discusses data mining and classification techniques and describes the ID3 and C4.5 algorithms in full. Using the student grade management system of Wenda Information Engineering Institute, Anhui, the thesis applies ID3 and C4.5, two typical decision tree algorithms, together with the revised C4.5 to mine students’ grade data and generate a decision tree with each; the trees are then converted into classification rules in order to uncover regularities hidden in the grade data that are useful for teaching.
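The conversion of a decision tree into If-Then rules mentioned above can be sketched as a walk over every root-to-leaf path. The nested-dictionary tree layout, the attribute names, and the toy tree are assumptions made for illustration; the abstract does not describe how the thesis represents its trees or rules.

```python
def tree_to_rules(tree, conditions=()):
    """Return one If-Then rule per root-to-leaf path of a decision tree.

    Assumed layout (hypothetical): an internal node is a dict with the keys
    "attribute" and "branches"; a leaf is a plain class label.
    """
    if not isinstance(tree, dict):  # leaf: emit the rule for the path so far
        premise = " AND ".join(f"{attr} = {value}" for attr, value in conditions) or "TRUE"
        return [f"IF {premise} THEN grade = {tree}"]
    rules = []
    for value, subtree in tree["branches"].items():
        # Extend the path with the test taken on this branch.
        rules.extend(tree_to_rules(subtree, conditions + ((tree["attribute"], value),)))
    return rules


# Hypothetical toy tree, not a result from the thesis.
toy_tree = {
    "attribute": "attendance",
    "branches": {
        "high": "pass",
        "low": {
            "attribute": "homework",
            "branches": {"done": "pass", "missed": "fail"},
        },
    },
}

rules = tree_to_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each leaf yields exactly one rule, so a pruned tree with fewer leaves also produces a shorter rule set.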
