I have gathered data for three quarters of the branches under the “Other Data Sources” category. Next up, per the project plan, is real estate data.

Real Estate Data
Finding good real estate data took some work to explore and gather. I’m sure there is a lot of data available privately for a cost, but the point of my project is to focus on what is publicly available so anyone can do what I’m doing.
In the process, I decided to gather real estate information that was not at the level of individual transactions, though I did explore permitting as a means to understand development or investment geographically. This led to exploring economic data pertaining to households, rentals and home ownership as well. While the data does not have the fidelity I would have preferred, it should be interesting to see whether the high-level view lines up with any aggregated activity. My ideal data would contain individual home sales with housing descriptions by date, as well as the available supply of vacant rental units by date. This may be something to explore separately.
Permits
I initially came across permit data while searching the OpenData DC website. The site had building and construction permit information by location throughout the city.
Although this might not give me housing values, it could point to where construction, renovations and improvements were and are taking place. This may help me understand where investment and revitalization efforts are taking place and what density looks like over time.
My initial question is whether this data can confirm or refute that the volume of community investment in a particular area is associated with a decrease (improvement) in criminal activity, or instead with an increase. I would also like to learn about the types of activity that are common in proximity to these permit locations.

Building
I found building permit data from 2009 through 2020. After an initial exploration, I could see aggregating the data along spatial boundaries and time. This aggregation could show what is happening in particular locales, and at what volume the activity is occurring. What types of investments or improvements are being made? Are there trends? What effect do the improvements have on the local community? Exploring this data will likely help me ask additional questions and propose hypotheses.
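To make the aggregation idea concrete, here is a minimal sketch of counting permits by area and year. The records and the field names (`issue_date`, `ward`, `permit_type`) are made up for illustration and are not the actual OpenData DC schema.

```python
from collections import Counter
from datetime import date

# Hypothetical permit records. The field names and values are assumptions,
# not the real OpenData DC columns.
permits = [
    {"issue_date": "2012-03-14", "ward": "Ward 1", "permit_type": "ALTERATION"},
    {"issue_date": "2012-11-02", "ward": "Ward 1", "permit_type": "NEW BUILDING"},
    {"issue_date": "2013-01-20", "ward": "Ward 1", "permit_type": "ALTERATION"},
    {"issue_date": "2012-06-30", "ward": "Ward 6", "permit_type": "ALTERATION"},
]


def count_permits(records, area_field="ward"):
    """Count permits per (area, year) pair."""
    counts = Counter()
    for rec in records:
        year = date.fromisoformat(rec["issue_date"]).year
        counts[(rec[area_field], year)] += 1
    return counts


permit_counts = count_permits(permits)
for (area, year), n in sorted(permit_counts.items()):
    print(area, year, n)
```

Swapping `area_field` for another boundary column (a neighborhood or census tract, if the data has one) would give the same counts at a different spatial grain.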
Construction
The construction permit data I found did not cover my full project time frame, but most of it was available, covering 2012 through 2020. This data has the same structure as the building permit data but pertains to construction. The questions mentioned in the Building section apply here too. This data could complement the building permit data, and will likely raise additional questions and hypotheses of its own.
Federal Reserve Economic Data
This resource provided a lot of spreadsheets with economic data. Some of the data was weekly, monthly, or quarterly. The majority, however, was only available at an annual level. That is not great granularity, but it will have to do. When looking at yearly activity levels, it may add context or general background information for the area. I’m not confident it will tell us much or be leveraged in a significant way, but over the 2009-2020 time frame it’s possible we could see some relationship.
The following is a list of the data to be explored:
- Median Household Income (Annual)
- Rental Vacancy Rates (Annual)
- Home Vacancy Rates (Annual)
- Home Ownership Rates (Annual)
- Housing Price Index (Quarterly)
- New Private Housing Units (Monthly)
- Business Applications (Weekly)
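Since the series above come at different frequencies, one way to compare them is to roll the finer-grained ones up to yearly averages. The following is only a sketch of that idea, using made-up quarterly values rather than real downloaded data; `to_annual` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def to_annual(observations):
    """Average (ISO date string, value) observations by calendar year."""
    by_year = defaultdict(list)
    for day, value in observations:
        by_year[date.fromisoformat(day).year].append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Made-up quarterly index values, for illustration only.
quarterly_index = [
    ("2019-01-01", 100.0),
    ("2019-04-01", 102.0),
    ("2019-07-01", 104.0),
    ("2019-10-01", 106.0),
    ("2020-01-01", 108.0),
    ("2020-04-01", 110.0),
]

annual_index = to_annual(quarterly_index)
print(annual_index)
```

Note that a year with only some of its periods present (2020 in this toy data) still gets an average, so partial years should be treated with care. An average also suits rates and indexes better than counts, where a yearly sum may be the more natural rollup.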
The image below gives some quick context for the data source. Each of the items listed above had its own page, with a feature to download the data in a couple of different formats; I downloaded CSVs. You could also edit the date range for the interactive graph. Beneath the chart, the page provided source information and notes about the particular data.

Housing Data
Sale
While combing through websites, I was able to find a couple of resources with housing-related information, from Realtor.com and Redfin.com. The granularity still was not what I wanted, but it will have to do for now. Also, the date ranges did not cover my entire project time frame.
Redfin Home Prices, Sales & Inventory
Over at Redfin.com they have some aggregated data available by month from February 2012 through October 2020 (at the time of research). The data provides high-level average/median information as well as other metrics to describe prices, sales and inventory.

Redfin COVID-19 Weekly Housing Market Data
Redfin also had weekly housing data going back to 2017 in rolling 1-, 4- or 12-week windows. The data is available by county or metro area. There was a disclaimer that the data is subject to weekly revisions and should be used with caution. I should therefore be critical of the data, but at a bare minimum it could add some context. For more information on the data, Redfin compiled definitions for each metric.
Realtor.com Inventory Metrics
Over at Realtor.com they have some aggregated inventory data available by month from July 2016 through October 2020 (at the time of research). Some of the data fields are similar to those in Redfin’s Home Prices, Sales & Inventory data, so it will be interesting to compare the two sources where possible.
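Because the two sources cover different date ranges, any comparison has to be limited to the months they share. A rough sketch of that step follows; the inventory numbers are invented for illustration and `overlap_differences` is a hypothetical helper, not part of either site's tooling.

```python
def overlap_differences(source_a, source_b):
    """Return {month: a - b} for the months present in both sources."""
    shared = sorted(source_a.keys() & source_b.keys())
    return {month: source_a[month] - source_b[month] for month in shared}


# Invented monthly inventory counts keyed by "YYYY-MM"; not real figures.
redfin_inventory = {"2016-06": 1200, "2016-07": 1250, "2016-08": 1300}
realtor_inventory = {"2016-07": 1400, "2016-08": 1380, "2016-09": 1350}

diffs = overlap_differences(redfin_inventory, realtor_inventory)
for month, diff in diffs.items():
    print(month, diff)
```

Months that appear in only one source are dropped rather than filled in, which keeps the comparison honest about where the sources actually overlap.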
Rental
This is still a shortfall. For the time being, I’m going to leave it be. I will continue to look around, but for my project I think it’s safe to leave this as a gap. I do have economic data for Rental Vacancy Rates (annual) for Washington DC. This isn’t much, but it does provide some information.
Planning Progress
I will refer back to my plan and update it with what we have so far. Using XMind’s icons, I put a task completion status next to each data item to indicate its level of progress. I have expanded a couple of areas to add some granularity. If I expand or modify my plan during the process, I should make sure the plan reflects those changes. Project documentation is a good skill to have. If you document properly as you go, it will save you time and patience later.

What’s Next?
At the present time, I am going to start exploring the data that I have gathered so far. I still have an open item for temporal data (i.e. events), and I will concurrently search for and gather data related to it. When that particular post is ready, I will publish it. In the meantime, there will be other posts, not related to my “Data Search” posts, detailing those processes.
Posts in Project Series
- Criminal Analysis: Planning
- Criminal Analysis: Data Search (part 0)
- Criminal Analysis: Data Search (part 1)
- Criminal Analysis: Data Search (part 2)
- Criminal Analysis: Data Search (part 3)
- Criminal Analysis: Data Storage
- Criminal Analysis: Data Storage (part 2)
- Criminal Analysis: Data Search (part 4)
- Derive a Star Schema By Example
- Criminal Analysis: Data Exploration
- Criminal Analysis: Data Exploration (part 1)
- Criminal Analysis: Data Exploration (part 2a)
- Criminal Analysis: Data Exploration (part 2b)
- Criminal Analysis: Data Storage (Part 3)