Random Forest Model of Building Energy Consumption

I built an R Shiny app to model a building’s energy consumption. The app allows the user to select a baseline period and observe the building’s performance post-baseline. The energy rate can be adjusted to provide a quick estimate of savings (or losses) in the post-baseline period.
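To make the savings estimate concrete, here is a minimal Python sketch (the app itself is written in R). It assumes savings are the baseline model's predicted consumption minus the metered consumption, multiplied by a flat energy rate. The function name, the numbers and the flat rate are all made up for illustration and are not taken from the app.

```python
def estimate_savings(predicted_kwh, actual_kwh, rate_per_kwh):
    """Return (energy_saved_kwh, cost_saved) for a post-baseline period.

    predicted_kwh: what the baseline model expects the building to use
    actual_kwh:    metered consumption over the same hours
    Positive results are savings; negative results are losses.
    """
    if len(predicted_kwh) != len(actual_kwh):
        raise ValueError("both series must cover the same hours")
    energy_saved = sum(p - a for p, a in zip(predicted_kwh, actual_kwh))
    return energy_saved, energy_saved * rate_per_kwh


# Made-up hourly values (kWh) and a hypothetical rate per kWh.
predicted = [50.0, 52.0, 55.0, 53.0]
actual = [48.0, 50.0, 56.0, 49.0]
saved_kwh, saved_cost = estimate_savings(predicted, actual, rate_per_kwh=0.12)
print(saved_kwh, round(saved_cost, 2))
```

Changing `rate_per_kwh` rescales the cost figure without touching the model, which is why adjusting the rate in the app is quick.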

So what: My clients wanted to know the effect of the energy saving measures they were taking, e.g. new AC units, new freezers, new temperature settings or new lighting. Conversely, they might want to assess the negative impact of an event (e.g. an equipment malfunction). In the screenshot below, energy use has increased post-baseline, which would be a concern to any business. This tool gives decision makers and sales engineers the quick, “good enough” information they need to take action.

Screenshot of the app

It is no surprise that ambient temperature has the biggest impact on electricity consumed. Below we see a typical annual profile: higher temperatures in summer lead to higher AC use. Note that if you use electric heating instead of gas, your winter electricity consumption could be just as high as your summer consumption, if not higher.

Typical annual profile of local temperature and energy consumption

The conventional way to model a building’s energy consumption is to use temperature records to compute the number of degree days in each billing period (typically monthly), take the energy consumed from the monthly utility bill, and fit a regression model. With monthly bills, a year of data gives only 12 data points. For more on degree days and a good account of the pros and cons of this modelling approach in general, see here.
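The degree-day approach can be sketched in a few lines of Python. This is an illustration only: it covers cooling degree days, assumes a base temperature of 65°F (a common convention, but an assumption here), and uses made-up monthly figures rather than real bills.

```python
def cooling_degree_days(daily_mean_temps, base_temp=65.0):
    """Sum of degrees by which each day's mean temperature exceeds the base."""
    return sum(max(t - base_temp, 0.0) for t in daily_mean_temps)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Three made-up daily mean temperatures: only the days above 65 contribute.
print(cooling_degree_days([60.0, 70.0, 80.0]))

# Twelve made-up billing periods: degree days and billed kWh.
monthly_cdd = [0, 0, 10, 40, 120, 300, 420, 400, 250, 80, 5, 0]
monthly_kwh = [1000 + 2 * cdd for cdd in monthly_cdd]  # synthetic, exactly linear
intercept, slope = fit_line(monthly_cdd, monthly_kwh)
print(intercept, slope)  # base load (kWh) and kWh per degree day
```

The intercept reads as the temperature-independent base load and the slope as the extra energy per degree day. With one point per bill, that is about all such a model can express.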

When I worked at an energy management company I gained valuable insight into how office, retail, pharmacy and grocery buildings consume electricity. A quick glance at a daily profile of energy consumption (aside: my old statistics lecturer always stressed the importance of drawing pictures early and often) shows me that more than temperature influences energy consumed: namely, business operating hours. I was fortunate to have temperature and energy consumption data at the hourly level, and therefore had the opportunity to develop a much richer model of energy consumption.
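As a rough illustration of what hourly data makes possible, the Python sketch below turns a timestamp and a temperature reading into predictors that can capture operating hours as well as weather. The feature names are hypothetical; the features actually used are in the R script, not reproduced here.

```python
from datetime import datetime


def hourly_features(timestamp, temperature):
    """Build one row of predictors for an hourly consumption model."""
    weekday = timestamp.weekday()  # Monday = 0 ... Sunday = 6
    return {
        "temperature": temperature,
        "hour": timestamp.hour,      # proxy for opening and closing times
        "weekday": weekday,
        "is_weekend": weekday >= 5,  # many businesses run differently at weekends
    }


# Made-up readings: a Monday morning and a Saturday afternoon.
rows = [
    hourly_features(datetime(2024, 1, 1, 9), 41.0),
    hourly_features(datetime(2024, 1, 6, 15), 47.5),
]
for row in rows:
    print(row)
```

Rows like these, paired with the metered kWh for each hour, are the kind of input a random forest can learn from without the relationship between hour, day and consumption having to be specified by hand.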

The Rmd script is available here on Google Drive. It pulls in data from a publicly available Google Sheets page, so you should be able to download it and run it in RStudio without any fuss. Hopefully I have commented my code sufficiently well, but please contact me if you have any questions. For more on random forests, check out this video from about 41:20.

