Courses
QMSS provides students with a basis in quantitative skills for social science research through its core curriculum, with the flexibility to engage in interdisciplinary pursuits through their elective selections. Students can select these classes from the elective offerings by QMSS, or draw from the wider university through cross-registration in the many departments across Columbia with which QMSS has developed strong relationships over the years. Below are descriptions of the courses offered by QMSS as well as a selection of courses around the university that may be of interest. Students are advised to work with the program to determine a course of study that suits their goals and to receive guidance on their options for study throughout the university.
See our Registration Procedures page for the enrollment process for both QMSS and non-QMSS students.
QMSS students: read about the various degree tracks/focuses HERE.
The five courses below or their equivalents are completed by ALL QMSS Students.
Theory and Methodology (QMSS GR5010)
This interdisciplinary course, taken in the fall semester, is a comprehensive introduction to quantitative research in the social sciences. The course focuses on foundational ideas of social science research, including strengths and weaknesses of different research designs, interpretation of data drawn from contemporary and historical contexts, and strategies for evaluating evidence. The majority of the course consists of two-week units examining particular research designs, each with a set of scholarly articles that utilize that design. Topics include: the “science” of social science and the role of statistical models, causality and causal inference, concepts and measurement, understanding human decision making, randomization and experimental methods, observation and quasi-experimentation, sampling, survey research, and working with archival data.
Data Analysis Requirement (QMSS GR5015)
The data analysis course covers specific statistical tools used in social science research using the statistical program R. Topics to be covered include statistical data structures and basic descriptives, regression models, multiple regression analysis, interactions, polynomials, Gauss-Markov assumptions and asymptotics, heteroskedasticity and diagnostics, models for binary outcomes, naive Bayes classifiers, models for ordered data, models for nominal data, first difference analysis, factor analysis, and a review of models that build upon OLS. Prerequisite: introductory statistics course that includes linear regression.
Equivalents:
Advanced Econometrics (ECON W4412) Economics Focus Only
Students who are planning on pursuing the economics focus are required to take this course. This course is intended for students who already have a firm grasp of introductory-level econometrics and are interested in advanced topics, including asymptotic theory. The prerequisites are linear algebra, intermediate microeconomics, macroeconomics, and econometrics. Topics to be covered include OLS in matrix form, finite sample and asymptotic properties, hypothesis testing, GLS, maximum likelihood, endogeneity, stationary time series, nonstationary time series, panel data, and discrete choice models.
Research Seminar I & II (QMSS GR5021 & GR5022)
This course is designed to expose students in the QMSS degree program to different methods and practices of social science research. QMSS students participate in a weekly seminar, with presentations on a wide range of topics given by faculty from Columbia and other New York City universities, as well as researchers from the numerous corporate, government, and nonprofit settings where quantitative research tools are used. Topics have included: Now-Casting and the Real-Time Data Flow; Art, Design & Science in Data Visualization; Educational Attainment and School Desegregation: Evidence from Randomized Lotteries; Practical Data Science: North American Oil and Gas Drilling Data.
Master's Thesis (GR5999)
All students must complete an MA thesis, which involves original statistical analysis, under the supervision of the student's advisor and the QMSS program director. Students should register for this course in the last semester of their program.
Track-Specific Non-QMSS Requirements
ECONOMICS FOCUS
Advanced Econometrics (ECON GU4212)
Seyhan Erden
Prerequisites: ECON UN3211 and ECON UN3213 and ECON UN3412 and MATH UN2010. Students must register for the required discussion section. The linear regression model will be presented in matrix form and basic asymptotic theory will be introduced. The course will also introduce students to basic time series methods for forecasting and analyzing economic data. Students will be expected to apply the tools to real data.
Advanced Macroeconomics (ECON GU4213)
Andres Drenik
Prerequisites: ECON UN3211 and ECON UN3213 and ECON UN3412 and MATH UN2010. Required discussion section: ECON GU4214. An introduction to the dynamic models used in the study of modern macroeconomics. Applications of the models will include theoretical issues such as optimal lifetime consumption decisions and policy issues such as inflation targeting. This course is strongly recommended for students considering graduate work in economics.
Advanced Microeconomics (ECON GU4211)
Andres Drenik
Prerequisites: ECON UN3211 and ECON UN3213 and ECON UN3412 and MATH UN2010 Required discussion section ECON GU4214 An introduction to the dynamic models used in the study of modern macroeconomics. Applications of the models will include theoretical issues such as optimal lifetime consumption decisions and policy issues such as inflation targeting. This course is strongly recommended for students considering graduate work in economics.
Elective Courses
QMSS students typically take between 4 and 6 elective courses. Any 4000-level or above course offered by QMSS, Computer Science, IEOR, Economics, Statistics, Psychology, Political Science, Sociology, History, Mathematics, or SIPA will satisfy one of these requirements. 4000-level courses outside these departments MAY satisfy an elective requirement but require approval by the Director of QMSS. Send a copy of the syllabus to <[email protected]> for approval to count a course towards your degree progress.

NOTE: Approval of a course does not grant you permission to enroll in a non-QMSS course. That is controlled by the course instructor. CLICK HERE for more detailed registration procedures.
Each focus has its own guidelines regarding elective distribution, so be sure to read your Degree Requirements worksheet carefully.
Some popular elective courses are listed below. Be aware that course listings are always subject to change. You should always check the Columbia Directory of Classes for the most up-to-date information.
*Please note that some QMSS classes are only offered once per academic year. Those courses are enumerated below. All other QMSS classes are offered during both fall and spring semesters. Summer semester classes change yearly depending upon need and availability.*
EXCLUSIVELY FALL CLASSES
GR5010 QUANTITATIVE THEORY & METHODOLOGY
GR5016 REGRESSION MODEL-TEMP PROCESS
GR5070 GIS & SPATIAL ANALYSIS-SOC SCI
GR5058 DATA MINING FOR SOCIAL SCIENCE
EXCLUSIVELY SPRING CLASSES
GR5018 ADV ANALYTIC TECHNIQUES
GR5062 SOCIAL NETWORK ANALYSIS
GR5063 DATA VISUALIZATION
GR5065 BAYESIAN STATS FOR THE SOC SCI
GR5069 APPLIED DATA SCI FOR SOC SCIENTISTS
QMSS students in the Flexible Focus must take TWO Research Methods Electives. We strongly encourage them to fulfill this requirement through QMSS department electives. All QMSS students are guaranteed a seat in QMSS electives (provided they do not directly conflict with other enrollments).
Time Series, Panel Data, and Forecasting (QMSS GR5016)
This course will introduce students to the main concepts and methods behind regression analysis of temporal processes and highlight the benefits and limitations of using temporally ordered data. Students study the complementary areas of time series data and longitudinal (or panel) data. There are no formal prerequisites for the course, but a solid understanding of the mechanics and interpretation of OLS regression will be assumed (we will briefly review it at the beginning of the course). Topics to be covered include regression with panel data, probit and logit regression of pooled cross-sectional data, difference-in-difference models, time series regression, dynamic causal effects, vector autoregressions, cointegration, and GARCH models. Statistical computing will be carried out in R.
Advanced Analytic Techniques (QMSS GR5018)
This course is meant to train students in advanced quantitative techniques in the social sciences. Statistical computing will be carried out in R. Topics include: review of multiple/linear regression, review of logistic regression, generalized linear models, models with limited dependent variables, first differences analysis, fixed effects, random effects, lagged dependent variables, growth curve analysis, instrumental variables and two-stage least squares, natural experiments, regression discontinuity, propensity score matching, multilevel models or hierarchical linear models, and text-based quantitative analysis.
Practicum in Data Analysis (QMSS GR5052)
This practicum mimics the typical conditions that students would face in an internship in a large data-intensive institution. The practicum will focus on four core elements involved in most internships: (1) developing the intuition and skills to properly scope ambiguous project ideas; (2) practicing organizing and accessing a variety of large-scale data sources and formats; (3) conducting basic and advanced analysis of big data; and (4) communicating and “productizing” results and findings from the earlier steps, in things like dashboards, reports, interactive graphics, or apps. The practicum will also give students time to reflect on their work, and how it would best translate into corporate, nonprofit, startup and other contexts.
Data Mining for Social Science (QMSS GR5058)
The class is roughly divided into two parts: (1) programming best practices, exploratory data analysis (EDA), and unsupervised learning; and (2) supervised learning, including regression and classification methods. In the first part of the course we will focus on writing R programs in the context of simulations, data wrangling, and EDA. Unsupervised learning is focused on problems where the outcome variable is not known and the goal of the analysis is to find hidden structure in data, such as different market segments from buying patterns or human population structure from genetic data. Supervised learning deals with prediction problems where the outcome variable is known, such as predicting the price of a house in a certain neighborhood or the outcome of a congressional race.
Internship (QMSS GR5050 & QMSS GR5051)
Students enrolled in the Quantitative Methods in the Social Sciences MA program have a number of opportunities for internships with various organizations in New York City. All internships will be graded on a pass/fail basis.
An internship must meet the following criteria:
 It is related to the core issues of concern to the MA Program in Quantitative Methods in the Social Sciences.
 The work is substantive (although students may perform some administrative tasks, we want to ensure that they receive experience in substantive research).
 It is a practical, professional experience.
Social Network Analysis (QMSS GR5062)
The course is designed to teach students the foundations of network analysis, including how to manipulate, analyze and visualize network data themselves using statistical software. We will focus on using the statistical program R for most of the work. Topics will include measures of network size, density, and tie strength, measures of network diversity, sampling issues, making ego-nets from whole networks, distance, dyads, homophily, balance and transitivity, structural holes, brokerage, measures of centrality (degree, betweenness, closeness, eigenvector, beta/Bonacich), statistical inference using network data, community detection, affiliation/bipartite networks, clustering and small worlds; positions, roles and equivalence; visualization, simulation, and network evolution over time.
Data Visualization (QMSS GR5063)
This course is designed to introduce students to the interdisciplinary and emerging field of data science. It will cover techniques and algorithms for creating effective visualizations based on principles from graphic design, visual art, perceptual psychology, and cognitive science to enhance the understanding of complex data. Students will be required to complete several scripting, data analysis and visualization design assignments as well as a final project. Topics include: data and image models, social and interactive visualizations, principles and designs, perception and attention, mapping and cartography, and network visualization. Computational methods are emphasized; students will be expected to program in R, JavaScript, D3, HTML and CSS, and to submit and peer review work through GitHub. Students will be expected to write up the results of the project in the form of a conference paper submission.
Bayesian Statistics for the Social Sciences (QMSS GR5065)
An introduction to Bayesian statistical methods with applications to the social sciences. Considerable emphasis will be placed on regression modeling and model checking. The primary software used will be Stan, which students do not need to be familiar with in advance. Students in the course will access the Stan library via R, so some experience with R would be helpful but not required. Any QMSS student is presumed to have sufficient background. Any non-QMSS students interested in taking this course should have a comparable background to a QMSS student in basic probability. Topics to be covered are a review of calculus and probability, Bayesian principles, prediction and model checking, linear regression models, Bayesian data collection, Bayesian calculations, Stan, the BUGS language and JAGS, hierarchical linear models, nonlinear regression models, missing data, stochastic processes, and decision theory.
Natural Language Processing (QMSS GR5067)
Social scientists need to engage with natural language processing (NLP) approaches that are found in computer science, engineering, AI, tech and industry. This course will provide an overview of natural language processing as it is applied in a number of domains. The goal is to gain familiarity with a number of critical topics and techniques that use text as data, and then to see how those NLP techniques can be used to produce social science research and insights. This course will be hands-on, with several large-scale exercises. The course will start with an introduction to Python, associated key NLP packages, and GitHub. The course will then cover topics like language modeling; part-of-speech tagging; parsing; information extraction; tokenizing; topic modeling; machine translation; sentiment analysis; summarization; supervised machine learning; and hidden Markov models. Prerequisites are basic probability and statistics, basic linear algebra and calculus. The course will use Python, so students who have programmed in at least one programming language will find it easier to keep up with the course.
Applied Data Science for Social Science (QMSS GR5069)
In his now classic Venn diagram, Drew Conway described Data Science as sitting at the intersection between good hacking skills, math and statistics knowledge, and substantive expertise. As a result of normal instruction, social scientists possess a fluid combination of all three but also bring an additional layer to the mix. We have acquired slightly different training, skills and expertise tailored to understand human behavior, and to explain why things happen the way they do. Social scientists are, thus, a particular kind of data scientist. This course is a collection of topics that fill very specific gaps identified over the years on what a social scientist should know at minimum when entering data science, and what a data scientist should know to hit the ground running and add immediate value to their teams.
GIS and Spatial Analysis for Social Science (QMSS GR5070)
This course introduces students to basic spatial analytic skills. It covers introductory concepts and tools in Geographic Information Systems (GIS) and database management. The course also introduces students to the process of developing and writing an original spatial research project. Topics to be covered include: social theories involving space, place and reflexive relationships; social demography concepts and databases; visualizing social data using geographic information systems; exploratory spatial data analysis of social data and spatially weighted regression models; spatial regression models of social data; and space-time models. Use of open-source software (primarily the R software package) will be taught as well.
Modern Data Structures (QMSS GR5072)
This course is intended to provide a detailed tour of how to access, clean, “munge” and organize data, both big and small. (It should also give students a flavor of what would be expected of them in a typical data science interview.) Each week will have simple, moderate and complex examples in class, with code to follow. Students will then practice additional exercises at home. The end point of each project would be to get the data organized and cleaned enough so that it is in a dataframe, ready for subsequent analysis and graphing. Therefore, no analysis or visualization (beyond just basic tables and plots to make sure everything was correctly organized) will be taught; and this will free up substantial time for the “nitty-gritty” of all of this data wrangling.
Machine Learning for Social Sciences (QMSS GR5073)
This course will provide a comprehensive overview of machine learning as it is applied in a number of domains. Comparisons and contrasts will be drawn between this machine learning approach and more traditional regression-based approaches used in the social sciences. Emphasis will also be placed on opportunities to synthesize these two approaches. The course will start with an introduction to Python, the scikit-learn package and GitHub. After that, there will be some discussion of data exploration, visualization in matplotlib, preprocessing, feature engineering, variable imputation, and feature selection. Supervised learning methods will be considered, including OLS models, linear models for classification, support vector machines, decision trees, random forests, and gradient boosting. Calibration, model evaluation and strategies for dealing with imbalanced datasets, non-negative matrix factorization, and outlier detection will be considered next. This will be followed by unsupervised techniques: PCA, discriminant analysis, manifold learning, clustering, mixture models, and cluster evaluation. Lastly, we will consider neural networks, convolutional neural networks for image classification and recurrent neural networks. This course will primarily use Python. Previous programming experience will be helpful but is not required. Prerequisites: basic probability and statistics, basic linear algebra, and calculus.
Projects in Advanced Machine Learning (QMSS GR5074)
Machine learning algorithms continue to advance in their capacity to predict outcomes and rival human judgment in a variety of settings. This course is designed to offer insight into advanced machine learning models, including deep learning, convolutional neural networks for image and text data, object detection models, recurrent neural networks (time-series data), and adversarial neural networks. Roughly half of the course will engage machine learning methods, while the other half will be devoted to students working in key substantive areas where advanced machine learning will prove helpful, such as computer vision and images, text and natural language processing, and tabular data. Students will be tasked to develop team projects in these areas, and they will develop a public portfolio of three (or four) meaningful projects. By the end of the course, students will be able to show their work by launching their models in live REST APIs and web applications. Prerequisites are basic probability and statistics, basic linear algebra and calculus. Students are expected to have familiarity with using Python, the scikit-learn package, and GitHub.
Independent Study (QMSS GR5998)
Students develop a course of study under the supervision of a faculty member. Please see the QMSS program coordinator for more details.
Below is just a sampling of some popular courses offered through other departments that may be counted towards the QMSS degree. Any 4000-level or above course offered by QMSS, Computer Science, Economics, Statistics, Political Science, Sociology, or Mathematics will satisfy one of these requirements. For full listings from each department, see the Directory of Classes.
Outside GSAS
We encourage students to explore course offerings outside GSAS. Some popular options are listed below. Check each school's website for comprehensive listings.
Be aware that these schools have their own distinct registration procedures. Visit the Registration page for full instructions.
 School of International and Public Affairs
 Teachers College
 Mailman School of Public Health
 Columbia Business School