2018-2019 Undergraduate/Graduate Catalog

Official Certificate Program in Data Mining

Program Rationale:

This program is designed for the person who loves data and wants to learn how to uncover actionable results from large data sets using a data science framework. Starting with the first course, students will learn data science by applying it to real-world, large data sets, gaining expertise in state-of-the-art data modeling methodologies in preparation for information-age careers in data science, analytics, data mining, statistics, and actuarial science.

Program Learning Outcomes:

Students in the program will be expected to:

·       Approach data analysis scientifically, that is, through a systematic process that avoids expensive mistakes by assessing and accounting for the true costs of making various errors;

·       Apply data science systematically by implementing an adaptive, iterative, and phased framework, including the research understanding phase, the data understanding phase, the exploratory data analysis phase, the modeling phase, the evaluation phase, and the deployment phase;

·       Demonstrate proficiency with leading open-source analytics coding software such as R and Python, as well as commercial platforms such as IBM/SPSS Modeler;

·       Understand and apply a wide range of clustering, estimation, prediction, and classification algorithms, including k-means clustering, Kohonen clustering, classification and regression trees, logistic regression, k-nearest neighbor, multiple regression, and neural networks; and

·       Learn more specialized techniques in bioinformatics, text analytics, and other current issues.

Program Prerequisites:

Applicants to the Graduate Certificate in Data Mining program are expected to have completed two semesters of applied statistics (such as STAT 104/STAT 453, STAT 200/STAT 201, or STAT 215/STAT 216) with grades of B or better, or two semesters of statistics approved by an advisor with grades of B or better, or to have the permission of the Data Mining Program Director. The second-semester course may be taken concurrently with STAT 521 Intro To Data Mining.

Admissions Criteria:

Students must hold a bachelor's degree from a regionally accredited institution of higher education. The undergraduate record must demonstrate clear evidence of ability to undertake and pursue studies successfully in a graduate field.

A minimum undergraduate GPA of 3.00 on a 4.00 scale (where A is 4.00), or its equivalent, and good standing (3.00 GPA) in all post-baccalaureate course work are required. Conditional admission may be granted to candidates with undergraduate GPAs as low as 2.40, on the condition that the student receives no grade lower than a B in DATA 511, 512, and 513.

In addition to the materials required by the School of Graduate Studies, the following are required by the program:

·      A formal application essay of 500-1000 words that focuses on (a) academic and work history, (b) reasons for pursuing the Graduate Certificate in Data Mining, (c) future professional aspirations, and (d) where and how the applicant has completed the program prerequisites. The essay will also be used to demonstrate a command of the English language; and

·      Two letters of recommendation, one each from academia and the work environment, or two from academia if the candidate has not been employed.

The application to the Data Mining program is completed online. All transcripts should be sent to the Graduate Admissions Office. Send the formal application essay and the letters of recommendation by email to the Director of Data Science.

Course Requirements

Required Courses

DATA 511  Introduction to Data Science                      4
DATA 512  Predictive Analytics: Estimation and Clustering   4
DATA 513  Predictive Analytics: Classification              4

Total Credit Hours: 12

Choose two electives from:

DATA 514  Multivariate Statistics                           4
DATA 521  Introduction to Bioinformatics                    4
DATA 522  Mining Gene and Protein Expression Data           4
DATA 525  Biomarker Discovery                               4
DATA 531  Text Analytics with Information Retrieval         4
DATA 532  Text Analytics with Natural Language Processing   4
DATA 541  Advanced Estimation Methods                       4
DATA 542  Advanced Clustering Methods                       4
DATA 543  Advanced Classification Methods                   4
DATA 551  Predictive Modeling for Insurance Data            4
DATA 565  Web Data Science                                  4

Total Credit Hours: 8

Other graduate-level data mining or statistics courses may be selected with the approval of the program coordinator.

Total Credit Hours: 20

More information can be found at: http://web.ccsu.edu/datamining/