How to Create Data Subsets for the 2020 Fall Data Challenge
October 9, 2020
The 2020 Fall Data Challenge: Get Out the Vote submission window is almost here! In preparation, you can begin reviewing the dataset with your team now.
For this year’s challenge, all submissions must use the IPUMS-ASA U.S. Voting Behaviors dataset. This rich dataset covers voting behaviors in the U.S. from 2004 through 2018, with 28 variables on more than 640,000 cases from all 50 states and the District of Columbia.
You are welcome to choose a smaller subset of the data for your team’s analysis. You could focus on a particular time or geography, for example.
Three subsets are already provided:
1) Data from 2016 only (~80,000 cases)
2) Data from 2018 only (~73,000 cases)
3) Data from 2016 – 2018 only.
Another interesting option would be to focus on only your own state across time (for example, Virginia has about 13,000 cases from 2004 through 2018).
What kinds of questions do you and your team want to explore within the dataset? Let this inform what kinds of subsets you examine.
How to Create a Subset
Choose questions about the data that interest you, and let them guide how you explore and subset the dataset.
Once you’ve decided on the subset of data you’d like to explore, you’ll need to create it from the provided dataset. Using the CSV or Excel data file, you can do this by filtering on the criterion you have selected (e.g., Virginia cases across time).
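If your team would rather script the filtering than use spreadsheet filters, here is a minimal sketch in Python using only the standard library. The column names (YEAR, STATE, VOTED), the sample rows, and the helper functions are hypothetical stand-ins, not the real file's layout; check the actual CSV header and codebook first, since states may be stored as numeric codes rather than names.

```python
import csv
import io

# Tiny stand-in for the challenge CSV. The column names (YEAR, STATE, VOTED)
# and the rows are hypothetical; the real file's header may differ.
SAMPLE_CSV = """YEAR,STATE,VOTED
2004,Virginia,Yes
2016,Virginia,No
2016,Ohio,Yes
2018,Virginia,Yes
2018,Ohio,No
"""


def filter_rows(rows, **criteria):
    """Keep only the rows whose columns match every given criterion."""
    return [
        row for row in rows
        if all(row[col] == str(val) for col, val in criteria.items())
    ]


def write_subset(rows, fieldnames, out_file):
    """Write the filtered rows, with a header line, to an open file object."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# With the real data, replace io.StringIO(SAMPLE_CSV) with
# open("your_file.csv", newline="") using your own file name.
reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
all_rows = list(reader)

# One state across time, and one state in a single year.
virginia = filter_rows(all_rows, STATE="Virginia")
virginia_2018 = filter_rows(all_rows, STATE="Virginia", YEAR=2018)

# Save the subset as CSV text (an in-memory buffer here, a file in practice).
out = io.StringIO()
write_subset(virginia, reader.fieldnames, out)
subset_csv = out.getvalue()
```

Passing more than one criterion narrows the subset further, so the same helper covers both the "one state across time" and "one state, one year" cases.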
Your journey to this year’s Fall Data Challenge starts now! Submissions for team entries are open from October 19 to November 11.