Fall Data Challenge: Data Cookbook

Data Sources

In your submission to the Fall Data Challenge, your team is required to use data from the U.S. Department of Housing and Urban Development (HUD). The data are available in multiple forms, including raw datasets and a written report.

The Annual Homeless Assessment Report (AHAR) provides a summary of the Point-in-Time (PIT) estimates, and so offers a “snapshot” of homelessness. The report includes a useful “definition of terms” section.

The report is available here

The raw datasets are available here:

Data Dictionary

Dataset terms and descriptions 

Program Type 
  • ES: Emergency shelter
  • TH: Transitional housing
  • HPRP: Homelessness prevention and rapid re-housing program 
  • SH: Safe haven 
  • PSH: Permanent supportive housing 
Bed Type
  • F: Facility-based beds
  • V: Voucher beds  
  • O: Other beds
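
When exploring the raw datasets, it can help to translate these codes into readable labels. The Python sketch below is one possible way to do that. The dictionary names and the decode helper are illustrative only and are not part of the HUD data. The whitespace and capitalization cleanup is an assumption about how codes might appear in a file.

```python
# Lookup tables copied from the Program Type and Bed Type lists above.
PROGRAM_TYPES = {
    "ES": "Emergency shelter",
    "TH": "Transitional housing",
    "HPRP": "Homelessness prevention and rapid re-housing program",
    "SH": "Safe haven",
    "PSH": "Permanent supportive housing",
}

BED_TYPES = {
    "F": "Facility-based beds",
    "V": "Voucher beds",
    "O": "Other beds",
}


def decode(code, table):
    """Return the description for a code, or the cleaned code if it is not listed."""
    # Assumed cleanup: strip stray spaces and ignore letter case.
    key = str(code).strip().upper()
    return table.get(key, key)


print(decode("psh", PROGRAM_TYPES))  # Permanent supportive housing
print(decode("V", BED_TYPES))        # Voucher beds
```

Unlisted codes are returned unchanged rather than raising an error, so unexpected values stay visible in the output instead of stopping the analysis.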

Geocode

The six-digit HUD-assigned geocode corresponding to the jurisdiction in which the program is physically located.
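
Because the geocode is a six-digit identifier, spreadsheet tools and data-loading libraries may read it as a number and drop any leading zeros. The Python sketch below restores a six-character string. The function name normalize_geocode is hypothetical, the sample values are made up, and the assumption that short values should be left-padded with zeros should be checked against the raw data.

```python
def normalize_geocode(value):
    """Return a geocode as a six-character digit string, or raise ValueError."""
    text = str(value).strip()
    # A value read as a float (for example 12345.0) carries a trailing ".0"; drop it.
    if text.endswith(".0"):
        text = text[:-2]
    if not text.isdigit() or len(text) > 6:
        raise ValueError(f"not a valid six-digit geocode: {value!r}")
    # Assumption: values shorter than six digits lost their leading zeros.
    return text.zfill(6)


# Made-up values for illustration only.
print(normalize_geocode(12345))     # 012345
print(normalize_geocode("123456"))  # 123456
```

Keeping the geocode as text rather than a number also avoids accidental arithmetic on what is really a label.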

Inventory Type

Indicates whether the bed inventory is current (C), new (N), or under development (U).
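
One common use of the inventory type is to total beds separately for current, new, and under-development inventory. The Python sketch below shows the idea on a few made-up rows. The field names inventory_type and beds are hypothetical and may not match the column names in the raw datasets, and the numbers are not real HUD figures.

```python
from collections import defaultdict

# Labels taken from the Inventory Type description above.
INVENTORY_TYPES = {"C": "Current", "N": "New", "U": "Under development"}

# Made-up rows for illustration only; not real HUD figures.
rows = [
    {"inventory_type": "C", "beds": 40},
    {"inventory_type": "N", "beds": 12},
    {"inventory_type": "C", "beds": 25},
    {"inventory_type": "U", "beds": 8},
]


def beds_by_inventory_type(records):
    """Sum the bed counts for each inventory type label."""
    totals = defaultdict(int)
    for record in records:
        # Codes outside C, N, and U are grouped under "Unknown".
        label = INVENTORY_TYPES.get(record["inventory_type"], "Unknown")
        totals[label] += record["beds"]
    return dict(totals)


print(beds_by_inventory_type(rows))
# {'Current': 65, 'New': 12, 'Under development': 8}
```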

Target Population A
  • SM: Single males 
  • SF: Single females
  • SMF: Single males and females 
  • CO: Couples only, no children 
  • HC: Households with children
  • SMHC: Single males and households with children
  • SFHC: Single females and households with children
  • SMF+HC: Single males and females plus households with children
  • YM: Unaccompanied males under 18 years old
  • YF: Unaccompanied females under 18 years old
  • YMF: Unaccompanied males and females under 18 years old
Target Population B
  • DV: Domestic violence victims only
  • VET: Veterans only
  • HIV: HIV/AIDS populations only
  • NA: Not applicable
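
Several Target Population A codes overlap. For example, households with children are served under HC, SMHC, SFHC, and SMF+HC. The Python sketch below records which groups each code covers, so that all codes serving a given group can be found at once. The group labels and the codes_serving helper are illustrative choices, not HUD terminology.

```python
# Groups covered by each Target Population A code, per the list above.
TARGET_POPULATION_A = {
    "SM": {"single males"},
    "SF": {"single females"},
    "SMF": {"single males", "single females"},
    "CO": {"couples without children"},
    "HC": {"households with children"},
    "SMHC": {"single males", "households with children"},
    "SFHC": {"single females", "households with children"},
    "SMF+HC": {"single males", "single females", "households with children"},
    "YM": {"unaccompanied males under 18"},
    "YF": {"unaccompanied females under 18"},
    "YMF": {"unaccompanied males under 18", "unaccompanied females under 18"},
}

# Target Population B codes, per the list above.
TARGET_POPULATION_B = {
    "DV": "Domestic violence victims only",
    "VET": "Veterans only",
    "HIV": "HIV/AIDS populations only",
    "NA": "Not applicable",
}


def codes_serving(group):
    """Return the Target Population A codes that include the given group."""
    return sorted(code for code, groups in TARGET_POPULATION_A.items() if group in groups)


print(codes_serving("households with children"))
# ['HC', 'SFHC', 'SMF+HC', 'SMHC']
```

When loading the data, note that some tools treat the text "NA" as a missing value by default, which would hide the "Not applicable" code in Target Population B.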
HUD McKinney-Vento 

Indicates whether the program receives any funds from HUD McKinney-Vento.

HUD McKinney-Vento programs include the Emergency Shelter Grant (ESG), Shelter Plus Care (S+C), Section 8 Moderate Rehabilitation Single-Room Occupancy (SRO), and Supportive Housing Program (SHP).


RStudio Cloud

RStudio Cloud was created to make it easy for professionals, hobbyists, trainers, teachers and students to do, share, teach and learn data science using R. 

Common Online Data Analysis Platform (CODAP)

CODAP is “free educational software for data analysis. This web-based data science tool is designed as a platform for developers and as an application for students in grades 6-14.”

Data Wrangling Cheat Sheet from RStudio 

This data-wrangling cheat sheet has tips and tricks for using dplyr and tidyr. 


iNZight

iNZight is designed to allow students to quickly and easily explore data and understand some statistical ideas. iNZight Lite (https://lite.docker.stat.auckland.ac.nz/) is the online version of the software.

National Coalition for the Homeless 

“The National Coalition for the Homeless is a national network of people who are currently experiencing or who have experienced homelessness, activists and advocates, community-based and faith-based service providers, and others committed to a single mission: To end and prevent homelessness while ensuring the immediate needs of those experiencing homelessness are met and their civil rights are respected and protected.”