9.2. Predicting Bike Rentals
The data in this chapter is used with the permission of Capital Bikeshare. You can download the original data from their website. We are using a prepared version that has already been augmented with additional weather data from the UCI Machine Learning Repository. To download the data sets, click here [*].
The basic data for the SQL lessons is in bikeshare.db. The additional weather data is not needed until the last section of this chapter, in which we try to predict bike rentals. Later sections of this chapter use bikeshare_11_12.db, which has the same schema as bikeshare.db but covers two years instead of just one. Both are SQLite database files; feel free to download them and use them with SQLite directly.
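If you want to look inside one of these files before starting the lessons, the sketch below uses Python's built-in sqlite3 module to list the tables in a database. The helper name list_tables is our own, and we make no assumption here about which tables bikeshare.db actually contains.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in an open SQLite connection."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Each row is a one-element tuple holding the table name.
    return [name for (name,) in rows]
```

Assuming bikeshare.db has been downloaded to the current directory, you would call it as list_tables(sqlite3.connect("bikeshare.db")). Be aware that sqlite3.connect creates a new, empty database if the file does not exist, so an empty result may simply mean the path is wrong.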
Predicting bike rental trends is important from both an operational and a planning perspective. Bikeshare companies need to stay up to date on rental trends to know where they should add new facilities and how to reposition bikes to get them to the locations with the highest demand. They do not want to wait until all of the bikes at a particular location are rented before moving additional bikes into position, because every rental they cannot fill in the meantime is lost revenue.
In the zip file you downloaded from the UCI Machine Learning Repository there are two data sets: hour.csv and day.csv. Both have the following fields (with the exception of hr, which is not available in day.csv).
instant: record index
dteday: date
season: season (1: spring, 2: summer, 3: fall, 4: winter)
yr: year (0: 2011, 1: 2012)
mnth: month (1 to 12)
hr: hour (0 to 23)
holiday: whether the day is a holiday or not
weekday: day of the week
workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0
weathersit: weather situation, coded as
    1: Clear, Few clouds, Partly cloudy
    2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
temp: normalized temperature in Celsius
atemp: normalized feeling temperature in Celsius
hum: normalized humidity
windspeed: normalized wind speed
casual: count of casual users
registered: count of registered users
cnt: count of total rental bikes, including both casual and registered
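To make the coded fields concrete, here is a minimal sketch that reads rows in this layout with Python's csv module, decodes season and yr using the mappings listed above, and checks that cnt equals casual plus registered. The function name read_rentals and the two sample rows are made up for illustration; they are not taken from the real data set, and only a subset of the columns is included.

```python
import csv
import io

# Code-to-label mappings taken from the field descriptions above.
SEASONS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}
YEARS = {0: 2011, 1: 2012}


def read_rentals(fileobj):
    """Yield one dict per CSV row with the coded fields decoded."""
    for row in csv.DictReader(fileobj):
        casual = int(row["casual"])
        registered = int(row["registered"])
        cnt = int(row["cnt"])
        yield {
            "date": row["dteday"],
            "season": SEASONS[int(row["season"])],
            "year": YEARS[int(row["yr"])],
            "casual": casual,
            "registered": registered,
            "cnt": cnt,
            # cnt is described as the total of casual and registered users.
            "consistent": casual + registered == cnt,
        }


# Made-up sample rows, for illustration only (not real rental counts).
SAMPLE = """dteday,season,yr,casual,registered,cnt
2011-01-01,1,0,3,13,16
2012-07-04,3,1,40,60,100
"""

rows = list(read_rentals(io.StringIO(SAMPLE)))
```

To run the same function on the downloaded data, open the file with open("day.csv", newline="") and pass the file object to read_rentals in place of the in-memory sample.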