Affiliated Organizations

We have worked with a number of organizations on critical projects. A partial list is below.

The UN Joint SDG Fund

Team Members

Peishan Li, Qinyue Hao, Jasmine Hwang, Dan Li, Rina Shin, Connie Xu, Hanyu Zhang, Lizabeth Singh

Challenge: The 17 Sustainable Development Goals (SDGs) are defined through 169 SDG targets and 231 unique SDG indicators. The 17 SDGs were designed not as separate and isolated goals, but as a network, in which links among the goals exist through targets and indicators that refer to multiple goals. With less than 10 years remaining to achieve all SDGs globally, there is a growing need for integrated implementation and measurement. In this project, the team was tasked with devising a model or tool that defines and measures linkages and networks among the SDGs and tracks how those linkages and networks progress over time.

Data

SDG Indicators; UN High Level Political Forum Voluntary National Reviews (VNRs) on the SDGs Database

Solution

This project measured the success of, and linkages between, the United Nations’ Sustainable Development Goals (SDGs) as interactive networks. To do this, the team developed (1) a text-based network model and (2) a coefficient-based model using the SDG Indicator Database to identify connections and interdependencies between goals. Read more about the project results on the Joint SDG Fund Blog, Youth Corner: https://www.jointsdgfund.org/article/measuring-integration-and-network-effect-sdgs
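As a rough illustration of what a coefficient-based linkage model might look like, the sketch below links two goals whenever their indicator time series are strongly correlated. This is not the team's actual model: the Pearson coefficient, the 0.7 threshold, the function names, and the toy series are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def build_network(series_by_goal, threshold=0.7):
    """Return edges (goal_a, goal_b, r) for pairs with |r| >= threshold.

    The threshold is an arbitrary, hypothetical cut-off.
    """
    edges = []
    for a, b in combinations(sorted(series_by_goal), 2):
        r = pearson(series_by_goal[a], series_by_goal[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Made-up yearly indicator values, for illustration only (not real SDG data).
toy_series = {
    "SDG1": [10.0, 9.0, 8.0, 7.0, 6.0],
    "SDG3": [50.0, 52.0, 54.0, 56.0, 58.0],
    "SDG13": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = build_network(toy_series)
for goal_a, goal_b, r in edges:
    print(f"{goal_a} <-> {goal_b}: r = {r}")
```

Recomputing the edge list for successive time windows would be one simple way to see how linkages change over time, although the source does not say how the team handled this.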


Lovelytics

Student Team

Andrew Lai, Edmund Lam, Jinghan Ma, Nicole Neo, Xudan Wang, Yixuan Li

Challenge

Conduct comprehensive customer segmentation analysis using credit card data and present possible insights for marketing purposes.

Problem Statement: To successfully sell a product, knowing the target market and customer needs is key. While traditionally marketers had to rely on customer surveys and companies’ sales figures to understand consumer behavior, now they have access to large-scale data on credit card transactions and consumer demographic traits. The objective of this project was to leverage such data to harness insights that serve to inform more responsive marketing strategies across a myriad of industries.

Envisioned Outcome

The team will produce the following deliverables:

  1. Summary deck of the customer segmentation analysis and insights for marketing strategies;
  2. Visualization of data through a Tableau dashboard that presents the different customer segments and their traits and spending habits

Data

Credit card transaction data from Epsilon (Q3 2019 to Q2 2021); household-level demographic and psychographic data from Epsilon

Solution

Using partitional (K-means) and hierarchical (bisecting K-means) clustering algorithms, the team uncovered two primary clusters in the transaction data. They then mapped the transaction data to the demographic and psychographic data to better understand the traits of each cluster. Overall, the analysis found that a majority of consumers (74%) were in Cluster 1 and the minority (26%) were in Cluster 2. Cluster 2 spends more overall than Cluster 1, with the biggest differences in the Supermarket, Other Retail, Restaurants, and Home Improvement categories. Compared with Cluster 1, Cluster 2 includes more married individuals with children and more consumers with higher purchasing power, and these traits could possibly explain the difference in spending patterns between the two clusters.
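The clustering step can be sketched with a minimal, pure-Python K-means. This covers plain K-means only, not the bisecting variant, and it is an illustration under stated assumptions rather than the team's pipeline: the two features (monthly grocery and restaurant spend), the toy values, and the function names are hypothetical.

```python
import random
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Basic K-means (Lloyd's algorithm); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


def cluster_shares(labels):
    """Fraction of observations in each cluster."""
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in counts.items()}


# Hypothetical (grocery_spend, restaurant_spend) per household; not Epsilon data.
toy_points = [(20, 5), (25, 4), (22, 6), (90, 30), (95, 28), (85, 33)]

labels, centroids = kmeans(toy_points, k=2)
shares = cluster_shares(labels)
print(labels, shares)
```

In practice the features would usually be standardized before clustering so that high-spend categories do not dominate the distance calculation.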


The Black List

Student Team

Pruthvi Panati, Lavanya Narayanan, Xintong (Maxxie) Tang, Arielle Herman, Tianqing Zhou

Challenge

The Black List is a platform for TV and film writers to showcase their screenplays for industry members and get their work evaluated by professional readers. Having accumulated a large number of scripts over the years, the challenge is to use these data in ways that help the mission of the organization, as well as provide service to the industry.

Problem Statement

Develop an exploratory tool to visualize some of the trends in the data, improve the tagging system, and provide a better understanding of how to measure quality and feedback.

Envisioned Outcome

A prototype of a dashboard for use by writers and producers interested in exploring the scripts, as well as an algorithm for classifying scripts.

Data

The Black List tags and describes the scripts by genre, roles, characters, and other characteristics, collects data on the writers, and collects quantitative and qualitative feedback from professional readers.

Solution

The team explored three avenues for analysis. First, they consolidated some of the tags associated with scripts and looked for patterns in how tags co-occur: which ones often appear together and which ones never appear together. Next, the team explored and visualized gender and race representation in movie scripts. Finally, the team developed an algorithm to predict script similarity based on various characteristics such as genre, roles, topics, and more.
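The tag co-occurrence analysis and a simple notion of script similarity can be sketched as follows. The source does not say which similarity measure the team used; Jaccard similarity over tag sets is substituted here purely for illustration, and the script names and tags are invented.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(scripts):
    """Count how many scripts carry each unordered pair of tags."""
    counts = Counter()
    for tags in scripts.values():
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts


def never_together(scripts):
    """List tag pairs that never appear on the same script."""
    all_tags = sorted(set().union(*scripts.values()))
    seen = tag_cooccurrence(scripts)
    return [pair for pair in combinations(all_tags, 2) if pair not in seen]


def jaccard(a, b):
    """Jaccard similarity of two tag sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented scripts and tags, for illustration only.
toy_scripts = {
    "script_a": {"drama", "family", "period"},
    "script_b": {"drama", "family"},
    "script_c": {"comedy", "heist"},
}

pair_counts = tag_cooccurrence(toy_scripts)
missing_pairs = never_together(toy_scripts)
similarity_ab = jaccard(toy_scripts["script_a"], toy_scripts["script_b"])
similarity_ac = jaccard(toy_scripts["script_a"], toy_scripts["script_c"])
print(pair_counts.most_common(3), similarity_ab, similarity_ac)
```

A fuller version might weight characteristics differently (genre versus roles versus topics) instead of treating every tag equally.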


The Opportunity Project (US Census)

Student Team

Alisha Gurnani, Michelle A. Zee, Gretchen Streett, Alison Ryland, Kyung Suk Lee, and Asahi Nino, in partnership with the US Census Bureau and the Environmental Protection Agency

Challenge

Create digital tools that help rural communities access and use data to implement solutions to economic, environmental, and human health challenges, taking care to reach places that have limited professional capacity and small budgets.

Problem Statement

In a rural Delaware community, a much-needed new health center is built in an open space designated as coastal land along a six-lane highway. Less than a quarter mile away, the walkable historic downtown is experiencing growing business vacancies. How might public data have led local decision makers to choose a more accessible site that could have catalyzed new business opportunities on Main Street? How much additional coastal open space could be conserved? Considering far-reaching challenges like loss of industry, extreme weather events, other economic shocks, and even lack of access to data and broadband internet, small towns and rural communities are struggling to strengthen their economies and revive downtowns while providing healthier lifestyles and cleaner environments for their residents. By taking advantage of walkable street grids and historic architecture built by generations gone, rural communities can also improve air quality, protect local watersheds, conserve open space, and reduce waste. However, rural communities often lack the capacity, data, strategies, or financial resources to tackle downtown revitalization, as they often have limited resources dedicated to comprehensive planning and regional collaboration. What’s more, rural communities may lack access to private and public capital for sustainable economic development and revitalization. Lack of or limited access to broadband internet also takes a toll on a community’s ability to access economic opportunity. The result can be development that fails to take advantage of the communities’ assets, creates long-term maintenance costs, and undermines health and environmental goals.

Envisioned Outcome

Rural communities can quickly and easily access curated datasets and implementation strategies to support sustainable economic growth—growth that is community-driven, leverages existing local assets, and provides walkable, compact downtowns to support the health of residents and ecosystems.

Data

  • Demographic and socioeconomic data (e.g., human capital, labor force characteristics) – U.S. Census
  • Walkability and transit access – EPA Smart Location Database
  • Environmental, geographic, climatic, cultural, and natural resource profiles – EPA, EnviroAtlas, Fish & Wildlife Service
  • Social Vulnerability Index – CDC
  • Food security and food access – USDA Food Environment Atlas, USDA Atlas of Rural and Small Towns
  • Economic performance factors (e.g., housing, health services, educational, cultural and recreational resources, public safety)
  • Assets such as anchor institutions, access to nature amenities, new and emerging economic drivers
  • Infrastructure assets (e.g., water, sewer, telecommunications/broadband, energy distribution systems, transportation)
  • Emerging or declining clusters or industry sectors
  • Workforce factors (e.g., innovation, supply chains, state and local laws, financial resources, transportation, energy cost, taxes, bonding capacity, land use patterns)

Solution: R Story

The QMSS team partnered with users, data experts, and product development experts in a human-centered design process to develop a tool that would help economic development in small rural towns in the United States. They got to know advocates from rural communities and learned about rural identity, the wide variety of assets in smaller communities, and the unique challenges they face. These initial discussions highlighted the criticality of cross-functional work to ensure that the final deliverable meets the needs of targeted end users. The QMSS team creatively distilled the key takeaways from these discussions to develop a usable prototype that excited the key stakeholders. 

Partnering with the town of Manistee, Michigan, and using publicly available data from the U.S. Census, EPA, BLS, and more, the QMSS team developed R Story: an all-in-one data visualization tool. Targeting residents, entrepreneurs, and developers, the tool helps local leaders champion their communities by saving time and getting easy access to data. Data is organized by region and by audience and is synthesized seamlessly by customized dashboards for each of the community's external stakeholders. R Story helps provide figures and visuals to small rural community leaders to enhance their presentations, meetings, and grant proposals for stakeholders in economic development. The community leader from Manistee, Michigan is already planning on using the tool in preparation for their next round of discussions.


KPMG

Student Team

Sydney (Bolim) Son, Andrew Thvedt, Louisa Ong, Ariel Luo

Challenge

Improve on previous modeling efforts to forecast COVID-19 infection rates based on daily time series data. Evaluate how to forecast daily COVID-19 time series data in one geography (e.g. New York) based on daily COVID-19 time series data from both that geography and other geographies (e.g. South Korea, Italy).

Problem Statement

Given the unprecedented, fast-moving health and economic impacts of COVID-19, a more dynamic forecasting approach was needed to leverage fast-changing external data and adaptive predictive models to inform an organization’s financial outlook. The objective was to generate a solution that harvested daily external signals around virus and social policy impact across countries, along with economic data related to the impact on goods and services at multiple sector and geographic resolutions, taking in the latest data from countries experiencing impacts and combining this with the organization’s historical financial data to forecast potential “shocks.”

Envisioned Outcome

The team will deliver three elements:

  1. Summary deck explaining why the models used are recommended;
  2. A prototype/proof-of-concept model using forecasting / machine learning algorithms; and
  3. Visualization of data through either a dashboard or chart(s) in Python notebook(s)

Data

Johns Hopkins University COVID-19 data, Oxford Policy data, US Census data, and Google Mobility data

Solution

Students developed short- as well as long-term COVID-19 forecasting models using an epidemiological SEIRD (Susceptible, Exposed, Infected, Recovered, Deceased) modeling approach. The team recognized that the trajectory of COVID-19 varied not only by region but also within a state, and that the model needed to account for this. They allowed for flexible, region-level customization that integrates non-traditional epidemic curves (given the nature of COVID-19) that capture different waves. This project involved prescriptive and descriptive understanding of COVID-19 for the 50 biggest cities and 50 states across the U.S., a tailored COVID-19 forecasting model for each region, a mobility forecasting model, and an interactive website with visualizations that could enable businesses to easily understand the trajectory of this pandemic for their specific city or state. This proposed solution would enable organizations to quickly respond by providing the capability to flexibly and continuously adapt their financial forecasting models in order to provide timely guidance.
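For readers unfamiliar with SEIRD, the sketch below shows the basic compartmental structure using a daily forward-Euler step. It is a textbook-style illustration, not the team's region-specific model: the parameter values are arbitrary assumptions, a single set of constant rates is used (so it produces one wave rather than several), and the function name is hypothetical.

```python
def simulate_seird(population, exposed0, infected0,
                   beta, sigma, gamma, mu, days):
    """Simulate a basic SEIRD model with one-day Euler steps.

    beta  : transmission rate per day
    sigma : rate at which exposed people become infectious (1 / incubation period)
    gamma : recovery rate per day
    mu    : death rate per day among the infectious
    Returns a list of (S, E, I, R, D) tuples, one per day including day 0.
    """
    s = float(population - exposed0 - infected0)
    e, i, r, d = float(exposed0), float(infected0), 0.0, 0.0
    history = [(s, e, i, r, d)]
    for _ in range(days):
        new_exposed = beta * s * i / population   # S -> E
        new_infectious = sigma * e                # E -> I
        new_recovered = gamma * i                 # I -> R
        new_deaths = mu * i                       # I -> D
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered - new_deaths
        r += new_recovered
        d += new_deaths
        history.append((s, e, i, r, d))
    return history


# Arbitrary illustrative parameters, not fitted to any real region.
history = simulate_seird(
    population=1_000_000, exposed0=0, infected0=10,
    beta=0.4, sigma=0.2, gamma=0.1, mu=0.005, days=180,
)
peak_day = max(range(len(history)), key=lambda t: history[t][2])
print(f"Peak infections on day {peak_day}")
```

Region-level customization of the kind described above could, for example, let beta vary over time or by region, but how the team implemented this is not stated in the source.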