How to Create Data Subsets for the 2020 Fall Data Challenge

The 2020 Fall Data Challenge: Get Out the Vote submission window is almost here! In preparation, you can begin reviewing the dataset with your team now.   

For this year’s challenge, all submissions must use the IPUMS-ASA U.S. Voting Behaviors dataset. This rich dataset covers voting behaviors in the U.S. over 14 years, from 2004 through 2018, with 28 variables on more than 640,000 cases from all 50 states and the District of Columbia.

You are welcome to choose a smaller subset of the data for your team’s analysis. You could focus on a particular time or geography, for example. 

Three subsets are already provided:

1) Data from 2016 only (~80,000 cases)

2) Data from 2018 only (~73,000 cases)

3) Data from 2016 – 2018 only

Another interesting option would be to focus on only your own state across time (for example, Virginia has about 13,000 cases from 2004 through 2018).  

What kinds of questions do you and your team want to explore within the dataset? Let this inform what kinds of subsets you examine.

How to Create a Subset 

Choose questions about the data that interest you, and let them guide how you explore and subset the dataset.

Once you’ve decided on the subset of data you’d like to explore, you’ll need to create it from the provided dataset. Using the CSV or Excel data file, you can do this by filtering on the criterion you have selected (e.g., Virginia cases across time). 
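
If your team prefers code to spreadsheet filters, the same filtering step can be sketched in Python using only the standard library. This is a minimal illustration, not the challenge's official method: the column names YEAR, STATE, and VOTED and the sample rows are assumptions made up for the example, so check the header row of the real data file and substitute its actual column names and values.

```python
import csv
import io

# Hypothetical sample standing in for the challenge CSV. The column names
# (YEAR, STATE, VOTED) and the rows are assumptions for illustration only.
SAMPLE_CSV = """YEAR,STATE,VOTED
2004,Virginia,Yes
2016,Virginia,No
2016,Ohio,Yes
2018,Virginia,Yes
2018,Ohio,No
"""


def filter_rows(rows, **criteria):
    """Keep only the rows whose columns match every given criterion."""
    return [
        row for row in rows
        if all(row[col] == str(val) for col, val in criteria.items())
    ]


# csv.DictReader turns each line into a dict keyed by the header row.
# All values are read as strings, which is why filter_rows compares strings.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# One state across time, like the Virginia example.
virginia = filter_rows(rows, STATE="Virginia")

# Combine criteria to narrow further: one state in one year.
virginia_2018 = filter_rows(rows, STATE="Virginia", YEAR=2018)

print(len(virginia), "Virginia rows;", len(virginia_2018), "from 2018")
```

To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with a file opened via `open(path, newline="")`, where `path` points to your downloaded CSV.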

Good luck!  

Your journey to this year’s Fall Data Challenge starts now! Submissions for team entries are open from October 19 to November 11.

Learn more about this year’s focus, the dataset, and entry guidelines with these resources:   

Download the dataset and get started!

 
