Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

[8i] Way to calculate a rolling average?

696240Nov 29 2010 — edited Nov 29 2010
I am attempting to calculate a rolling average over a specific time period, but I can't quite figure out how to do that. For this particular problem, I'm also stuck dealing with an 8i database.

Here's some sample data for the problem:
CREATE TABLE	my_data
(	order_no	CHAR(10)
,	seq_nbr		CHAR(4)
,	area_id		CHAR(4)
,	start_date	DATE
,	unit_time	NUMBER(7,2)
);

INSERT INTO my_data VALUES ('0000567890','0010','A001',TO_DATE('05/01/2000','mm/dd/yyyy'),0.34);
INSERT INTO my_data VALUES ('0000567890','0020','A001',TO_DATE('05/02/2000','mm/dd/yyyy'),0.78);
INSERT INTO my_data VALUES ('0000567890','0030','A002',TO_DATE('05/03/2000','mm/dd/yyyy'),0.91);
INSERT INTO my_data VALUES ('0000567890','0040','A003',TO_DATE('05/03/2000','mm/dd/yyyy'),0.27);
INSERT INTO my_data VALUES ('0000123456','0010','A001',TO_DATE('04/01/2001','mm/dd/yyyy'),0.39);
INSERT INTO my_data VALUES ('0000123456','0020','A001',TO_DATE('04/02/2001','mm/dd/yyyy'),0.98);
INSERT INTO my_data VALUES ('0000123456','0030','A002',TO_DATE('04/03/2001','mm/dd/yyyy'),0.77);
INSERT INTO my_data VALUES ('0000123456','0040','A003',TO_DATE('04/03/2001','mm/dd/yyyy'),0.28);
INSERT INTO my_data VALUES ('0000123123','0010','A001',TO_DATE('12/01/2001','mm/dd/yyyy'),0.31);
INSERT INTO my_data VALUES ('0000123123','0020','A001',TO_DATE('12/02/2001','mm/dd/yyyy'),0.86);
INSERT INTO my_data VALUES ('0000123123','0030','A002',TO_DATE('12/03/2001','mm/dd/yyyy'),0.82);
INSERT INTO my_data VALUES ('0000123123','0040','A003',TO_DATE('12/03/2001','mm/dd/yyyy'),0.23);
INSERT INTO my_data VALUES ('0000111111','0010','A001',TO_DATE('06/01/2002','mm/dd/yyyy'),0.29);
INSERT INTO my_data VALUES ('0000111111','0020','A001',TO_DATE('06/02/2002','mm/dd/yyyy'),0.84);
INSERT INTO my_data VALUES ('0000111111','0030','A002',TO_DATE('06/03/2002','mm/dd/yyyy'),0.78);
INSERT INTO my_data VALUES ('0000111111','0040','A003',TO_DATE('06/03/2002','mm/dd/yyyy'),0.26);
INSERT INTO my_data VALUES ('0000654321','0010','A001',TO_DATE('05/01/2003','mm/dd/yyyy'),0.28);
INSERT INTO my_data VALUES ('0000654321','0020','A001',TO_DATE('05/02/2003','mm/dd/yyyy'),0.88);
INSERT INTO my_data VALUES ('0000654321','0030','A002',TO_DATE('05/03/2003','mm/dd/yyyy'),0.75);
INSERT INTO my_data VALUES ('0000654321','0040','A003',TO_DATE('05/03/2003','mm/dd/yyyy'),0.25);
This is a sample of the real data, which has around 50K rows, and are the results of a query, not a table. Additionally, the real data also has a part number associated with each order, and the final result will be partitioned by part number and area id. This sample basically represents something you might see for a single part number. Also, the spacing of these records in time is not predictable. You could have a single year that has 20 orders, and another year (or period of years) that has none.

Here's where I'm at right now in the problem. For the example, let's assume I want to calculate a 2-year rolling average, (an average for the 2 years previous to the date of the current row). For my real data, I will probably want to be able to vary the time period (different time periods in different queries) over which the rolling average is calculated.

I thought that using the AVERAGE analytical function with a window clause might be the way to go, but I'm not exactly sure how to use the window clause. I may even be completely mis-interpreting what its use is, but hopefully it should still show what I'm after.
SELECT	area_id
,	AVG(tot_area_unit_hrs)	
	OVER	(
		PARTITION BY	area_id
		ORDER BY	min_start
		ROWS		BETWEEN	--2 years prior to min_start of current row
				AND	CURRENT ROW
		)
,	--something to indicate the time period the average is over	AS time_period
FROM	(
	SELECT	order_no
	,	area_id
	,	MIN(start_date)	AS min_start
	,	SUM(unit_time)	AS tot_area_unit_hrs
	FROM	my_data
	GROUP BY	order_no
	,		area_id
	)
ORDER BY	area_id
,		time_period
As you can see from the query above, I want to calculate the sum of the unit_time for each order/area_id combo, then get the rolling average of that total time. (I am calculating the average total time a product spends in each area_id per order).

I want to see results like something along these lines... I don't really care how the time period is identified, as long as I can sort by it in chronological order, and it tells me what period of time it covers. The way I show it with period_start and period_end in the sample results below is just one way I thought of to do that.
area_id   period_start period_end   avg_unit_time   
----------------------------------------------------
A001      4/30/1998    5/1/2000     1.120           
A001      3/31/1999    4/1/2001     1.245           
A001      11/30/1999   12/1/2001    1.220           
A001      5/31/2000    6/1/2002     1.223           
A001      4/30/2001    5/1/2003     1.153           
A002      5/2/1998     5/3/2000     0.910            
A002      4/2/1999     4/3/2001     0.840            
A002      12/2/1999    12/3/2001    0.833            
A002      6/2/2000     6/3/2002     0.790            
A002      5/2/2001     5/3/2003     0.783            
A003      5/2/1998     5/3/2000     0.270            
A003      4/2/1999     4/3/2001     0.275            
A003      12/2/1999    12/3/2001    0.260            
A003      6/2/2000     6/3/2002     0.257            
A003      5/2/2001     5/3/2003     0.247            
Any suggestions?
This post has been answered by Frank Kulash on Nov 29 2010
Jump to Answer
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Dec 27 2010
Added on Nov 29 2010
4 comments
1,715 views