I have a Postgres table where each row shows the number of traffic tickets written by each city per month.
I needed to show the percentage change between the current and prior month’s ticketwriting. For example, if a city wrote 1000 tickets this month and 500 last month, then I wanted a field in the row for this month to show a 200% increase over last month.
Here’s how I did it.
I first had to create two functions:
-- number of traffic tickets for -- the given month and city CREATE FUNCTION traffictickets( cityName CHARACTER VARYING, MONTH TIMESTAMP WITHOUT TIME zone ) RETURNS INTEGER AS 'select "TrafficTickets" FROM raw."OCA tickets per city" WHERE "Date" = $2 AND "City" = $1;' LANGUAGE SQL IMMUTABLE RETURNS NULL ON NULL INPUT; |
Then I made a short SQL script that referenced this function:
UPDATE raw."OCA tickets per city" SET "Change" = "TrafficTickets"::REAL / (CASE WHEN traffictickets("City", "Date" - INTERVAL '1 month') = 0 THEN -1 ELSE traffictickets("City", "Date" - INTERVAL '1 month') END)::REAL; |
Viola, now each of my rows shows the percentage change between the current and prior month’s ticket counts! Except there’s a huge problem–the changes aren’t consistently meaningful.
If ticketwriting dropped month over month, the change field will be between 0 and 1: going from 100 tickets to 20 tickets the next month, the change is 0.2. But here’s the problem: if ticketwriting increased, the number will be between 1 and infinity. If ticketwritng increased from 20 to 100, the change is 5. Or if it changed from 2 to 150, the change is 75!
Yikes, I need some way of bringing these positive changes down to Earth.
I think I have a solution: show the change relative to the city’s average ticketwriting over all its years in my database.
I added a new function:
-- average number of traffic tickets written by -- that city over all months in my table CREATE FUNCTION averageticketsforcity( cityName CHARACTER VARYING ) RETURNS REAL AS 'select avg("TrafficTickets"::real)::real FROM raw."OCA tickets per city" WHERE "City" = $1;' LANGUAGE SQL IMMUTABLE RETURNS NULL ON NULL INPUT; |
Here’s the new version of the query:
UPDATE raw."OCA tickets per city" SET "Change" = ("TrafficTickets"::REAL - traffictickets("City", "Date" - INTERVAL '1 month')::REAL) / CASE WHEN averageticketsforcity("City") = 0 THEN 1 ELSE averageticketsforcity("City") END; |
That seems to have fixed it. Now all my values appear to be equally meaningful to each other.