COURSE DETAIL
This course is an introduction to research study design and the analysis of structured data. It covers blocking, randomization, and replication in designed experiments, as well as clusters, stratification, and weighting in samples.
COURSE DETAIL
This course examines basic concepts and methods for estimating quantities of interest in the analysis of survival data. A distinctive feature of the data covered in this course is that the main response of interest, survival time, is subject to censoring, which leads to incomplete observations. Topics include survival models, censoring and truncation, functions characterizing survival times, nonparametric estimation of the survival function and other functions, comparison of survival functions for different groups, parametric and semiparametric regression models, mathematical and graphical methods for assessing goodness of fit, analysis of multivariate failure time data, competing risks analysis, cohort-sampling designs, and other advanced topics.
COURSE DETAIL
This course provides the basic theoretical and applied tools for rigorous statistical analysis. Specifically, the course focuses on techniques to summarize and visualize data of different types and their possible relations, as well as on basic sampling and inferential procedures, and on the assessment of the risk associated with extrapolation and inference. In particular, students learn how to extract information from data and how to assess the reliability of such information. The course covers the following topics: collection, management, and summary of data using frequency distributions, graphical representations, and summaries; study of the relationship between two variables; statistical inference and sampling variability; theory of point estimation and confidence intervals; hypothesis testing; and simple and multiple regression models. All the descriptive and inferential tools introduced during the course are applied to data using the statistical software R, in particular through the RStudio integrated development environment (IDE). Prerequisites: understanding of the concepts of probability theory and random variables.
COURSE DETAIL
This course examines stochastic processes. It covers the basic concepts of the theory of stochastic processes and explores different types of stochastic processes, including Markov chains, Poisson processes, and Brownian motion.
COURSE DETAIL
This course examines the analysis of the equilibrium and dynamic behavior of mechanical systems. It covers equilibrium of particles and of rigid bodies; distributed forces; analysis of structures, including trusses, frames, cables, and beams; kinematics of particles; kinetics of particles, Newton's second law, energy, momenta, impact dynamics; systems of particles; kinematics of rigid bodies; and kinetics of rigid bodies in two and three dimensions.
COURSE DETAIL
This is an asynchronous online course taught in English. It aims to spark students’ interest in numerical computation for epidemiology and biostatistics and to cultivate their critical thinking and programming logic. The course is intended to support students’ research in biostatistics, epidemiology, or related quantitative fields and to deepen their understanding of quantitative epidemiology and biostatistics.
In most biostatistics courses, instructors introduce theoretical models and then analyze data with statistical software such as SAS and R. However, there is a black box between these two parts. To link statistical theory to software output, this course introduces the numerical computation involved in statistical models. It covers matrix operations, numerical analysis, and Monte Carlo simulation, among other topics. The course also teaches how to construct a log-likelihood function for a given statistical distribution; obtain maximum likelihood estimates for logistic and Poisson regression; find exact confidence intervals; and design Monte Carlo simulations for a given research topic.
Prerequisite: At least one course in biostatistics (or statistics) or epidemiology.
COURSE DETAIL
This course provides a foundation in statistical and probabilistic concepts based on mathematical tools. It explores how to apply statistical concepts to solve real-world problems. Topics include axioms of probability, random variables, the most important discrete and continuous probability distributions, expectation, moment generating functions, conditional probability and conditional expectation, multivariate distributions, and some limit theorems.
COURSE DETAIL
Course Objective
1. To be familiar with the basic concepts of multivariate distribution theory and multivariate statistical analysis methods.
2. To master the statistical ideas and mathematical principles of common multivariate statistical analysis methods.
3. To understand the application of multivariate statistical theory in numerical calculations and machine learning.
Course Content
Basic knowledge of probability theory and linear algebra, definition and properties of multivariate normal distributions (maximum likelihood estimation, properties of estimators, sampling distribution of sample means), applications of multivariate normal distributions in numerical calculations, central limit theorem, testing of multivariate statistics (likelihood ratio test, Hotelling's T² distribution), matrix element distribution (Wishart distribution and inverse Wishart distribution), topic models (Latent Dirichlet Allocation), and principal component analysis (probabilistic principal component analysis and the EM algorithm).
COURSE DETAIL
This course provides a broad introduction to machine learning for students with a solid statistics background. Topics include supervised learning (generative/discriminative learning, parametric/non-parametric learning, support vector machines, neural networks), unsupervised learning (clustering, dimensionality reduction, generative models), and other learning theories.
COURSE DETAIL
The primary focus of this course is on core machine learning techniques in the context of high-dimensional or large datasets (i.e., big data). The first part of the course covers elementary and important statistical methods, including nearest neighbors, linear regression, logistic regression, regularization, cross-validation, and variable selection. The second part of the course deals with more advanced machine learning methods, including regression and classification trees, random forests, bagging, boosting, deep neural networks, k-means clustering, and hierarchical clustering. The course also introduces causal inference, motivated by the analogy between double machine learning and two-stage least squares. All topics are delivered using illustrative real data examples. Students also gain hands-on experience using R or Python (programming languages and software environments for data analysis, computing, and visualization).