Data Science Courses
Statistical Programming: Introduction to statistical programming using statistical software packages such as R, SAS or Python. Emphasis on methods of data entry, data management, and creation of statistical reports. Topics covered include data manipulation, creation of user-defined functions, simulation methods, random variable generation, permutation methods, the bootstrap, the jackknife and methods of increasing computational efficiency.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
2 Lab Hours
1 Lecture Hour
0 Other Hours
Classification Restrictions:
Excluded Class: DR
Prerequisite(s): (STAT 5380 w/C or better)
Data Visualization: This course provides an introduction to statistical applications and data visualization with R or Python. The main goals of the course are to learn how to use tools for cleaning, exploring, analyzing, and visualizing data; to make data-driven inferences and decisions; and to communicate results effectively.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
1 Lab Hour
2 Lecture Hours
0 Other Hours
Classification Restrictions:
Excluded Class: DR
Prerequisite(s): (STAT 5329 w/B or better)
Mathematical Foundations of Data Science I: This course provides an overview of the foundational mathematical concepts needed to develop a sophisticated understanding of data science. The material covered in this course may stand alone or be built upon in DS 5381.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Classification Restrictions:
Excluded Class: DR
Prerequisite(s): (STAT 5329 w/B or better)
Mathematical Foundations of Data Science II: This course provides an overview of the foundational mathematical concepts needed to develop a sophisticated understanding of data science. The concepts covered in this course build upon the material in DS 5380.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Classification Restrictions:
Excluded Class: DR
Prerequisite(s): (STAT 5329 w/B or better)
Introduction to techniques for data mining and analytics, with emphasis on R programming and hands-on experience with real data. Topics covered include dimension reduction, cluster analysis, ordinary and partial least squares regression, principal components regression, ridge regression, L1 regularization, logistic regression, classifier assessment, decision/regression trees, bagging and boosting, random forests, and data visualization techniques.
Department: Data Science
4 Credit Hours
4 Total Contact Hours
1 Lab Hour
3 Lecture Hours
0 Other Hours
Classification Restrictions:
Excluded Class: DR
Prerequisite(s): (STAT 5428 w/B or better)
General statistical techniques for unsupervised and supervised learning, with more emphasis on methodology. Topics covered include association rules, outlier detection, PageRank, parametric nonlinear regression, optimization, conventional nonparametric regression methods (including kernel smoothing/regression and smoothing and regression splines), generalized additive models (GAM), multivariate adaptive regression splines (MARS), recursive partitioning and extensions, hierarchical mixture of experts (HME), projection pursuit regression, artificial neural networks (ANN), support vector machines (SVM), and the naive Bayes classifier.
Introduction to Data Science Collaborations: This course will develop written and oral communication skills to produce effective data scientists who can work in collaborative and diverse environments. Students will practice consulting in the Statistical Consulting Lab's extension wing, the Quantitative Analytics Lab, and assist the lab with writing up and presenting results. Students will regularly reflect on collaborations with non-data scientists, the ethics of data science, and the context of data science.
Math Applications in Data Science: This course will be a research-centered course for advanced data science graduate students. They will work on projects centered around the course topics and regularly present theory and application results to the class.
Data Visualization: This course provides an introduction to statistical applications and data visualization with R or Python. The main goals of the course are to learn how to use tools for cleaning, exploring, analyzing, and visualizing data; to make data-driven inferences and decisions; and to communicate results effectively.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
1 Lab Hour
2 Lecture Hours
0 Other Hours
Prerequisite(s): (STAT 5329 w/B or better)
Mathematical Foundations of Data Science I: This course provides an overview of the foundational mathematical concepts needed to develop a sophisticated understanding of data science. The material covered in this course may stand alone or be built upon in DS 5381.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Prerequisite(s): (STAT 5329 w/B or better)
Mathematical Foundations of Data Science II: This course provides an overview of the foundational mathematical concepts needed to develop a sophisticated understanding of data science. The concepts covered in this course build upon the material in DS 5380.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Prerequisite(s): (STAT 5329 w/B or better)
Statistical Theory for Big Data: The course gives a thorough introduction to the large-sample theory of estimation and inference for Big Data analysis. Topics include modes of convergence, central limit theorems for averages and quantiles, and asymptotic relative efficiency; estimating equations, including the law of large numbers for random functions, consistency and asymptotic normality for maximum likelihood and M-estimators, the EM algorithm, and asymptotic confidence regions and hypothesis tests; and models of non-identically distributed or dependent random variables.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Prerequisite(s): (MATH 6381 w/B or better)
Linear Models for Data Science: Least squares theory and Gauss-Markov theorem, confidence intervals, significance tests and multiple comparison tests, multiple linear regression, simple, partial and multiple correlation, variable selection, model diagnostics, orthogonal polynomials, analysis of variance in fixed and random effects models, randomized blocks, bases and singular value decomposition.
Multivariate Statistical Methods for High-dimensional Data: General statistical techniques for multivariate data: analysis of variance, principal component analysis, factor analysis, canonical correlation, singular value decomposition (SVD) and multi-dimensional scaling, classification and clustering procedures, recursive partitioning and tree-based methods, k-nearest neighbors (KNN), artificial neural networks (ANN), support vector machines (SVM), variable selection, latent variable methods, high-dimensional testing, and manifold learning.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Prerequisite(s): (STAT 5381 w/B or better)
Data Science Research Collaborative: This research-centered course will provide an opportunity for students to practice their analytical skills on real-world projects. Students will learn effective methods of data science consulting and gain experience presenting results and discussing implications. The consulting projects will focus on current issues in government and industry. The course emphasizes practical experience in consultation and in writing effective data science reports.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
0 Lecture Hours
3 Other Hours
Prerequisite(s): (STAT 5300 w/B or better)
Advanced Computational Data Science: This course introduces the theory behind and the implementation of advanced statistical techniques for performing inference for existing or newly developed models. Course topics include random variable generation, Monte Carlo integration, importance sampling, Markov chain Monte Carlo (MCMC) methods, state space models, particle filters, Hamiltonian MCMC, Bayesian nonparametrics, variational methods, and approximate Bayesian computation (ABC).
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
3 Lecture Hours
0 Other Hours
Prerequisite(s): (STAT 5381 w/B or better)
Dissertation 1: Initial work on dissertation.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
0 Lecture Hours
3 Other Hours
Dissertation 2: Continuous enrollment required while working on dissertation.
Department: Data Science
3 Credit Hours
3 Total Contact Hours
0 Lab Hours
0 Lecture Hours
3 Other Hours
Introduction to techniques for data mining and analytics, with emphasis on R programming and hands-on experience with real data. Topics covered include dimension reduction, cluster analysis, ordinary and partial least squares regression, principal components regression, ridge regression, L1 regularization, logistic regression, classifier assessment, decision/regression trees, bagging and boosting, random forests, and data visualization techniques.
Department: Data Science
4 Credit Hours
4 Total Contact Hours
1 Lab Hour
3 Lecture Hours
0 Other Hours
Prerequisite(s): (STAT 5428 w/B or better)
General statistical techniques for unsupervised and supervised learning, with more emphasis on methodology. Topics covered include association rules, outlier detection, PageRank, parametric nonlinear regression, optimization, conventional nonparametric regression methods (including kernel smoothing/regression and smoothing and regression splines), generalized additive models (GAM), multivariate adaptive regression splines (MARS), recursive partitioning and extensions, hierarchical mixture of experts (HME), projection pursuit regression, artificial neural networks (ANN), support vector machines (SVM), and the naive Bayes classifier.