Hi,
I have bunch of files each having daily stock price of a company.
say, for a company named "AXIS" (in file axis.csv),
| Date | Price | 
| 1999-11-18 | 31.47353 | 
| 1999-12-27 | 37.77718 | 
| 1999-12-28 | 43.99142 | 
| 1999-12-29 | 51.50215 | 
| 1999-12-30 | 56.68813 | 
| 1999-12-31 | 55.30222 | 
| 2000-01-03 | 51.50215 | 
| 2000-01-04 | 47.56796 | 
| 2000-01-05 | 44.61731 | 
| 2000-01-06 | 42.91845 | 
| 2000-01-07 | 46.49499 | 
i'm trying to:
1. averaging the annual price, and then rearranging the columns and rows as:
| Firmname | 1990 | 1991 | 1992 | … | 2019 | 
| Axis | 31.3 | 32.3 | 35.6 | … | 38.8 | 
2. Merging all files in that folder to a sing file.
I have loaded them into SAS and was trying to get average of a single year by:
proc sql;
	SELECT 
		AVG(price)
	FROM
		axis
	WHERE
		year(date) = 1990;
	;
quit;wish I could iterate through year 1990 to 2019 but seems that DO loop won't work in PROC SQL.
Use group by in sql:
proc sql;
select
  year(date) as year,
  avg(price) 
from axis
group by calculated year
;
quit;
					
				
			
			
				
			
			
			
			
			
			
			
		Hi @jimmychoi
Instead of making a per-company-analysis, you could also append all company datasets to one input dataset and avoid alle looping, like this:
/* Input data - 2 examples */
data axis;
	firmname = 'axis'; date = '01jan1990'd; price = 1; output;
	firmname = 'axis'; date = '01jan1991'd; price = 2; output;
	firmname = 'axis'; date = '01jan1992'd; price = 3; output;
	firmname = 'axis'; date = '01jan1993'd; price = 4; output;
run;
data beto;
	firmname = 'beto'; date = '01jan1990'd; price = 6; output;
	firmname = 'beto'; date = '01jan1991'd; price = 7; output;
	firmname = 'beto'; date = '01jan1992'd; price = 8; output;
	firmname = 'beto'; date = '01jan1993'd; price = 9; output;
run;
/* Append to a work data set */
data have; set axis beto;
run;
/* Calculate average price per company / year */
proc sql;
	create table wdata as 
		select
			firmname,
	  		year(date) as year,
	  		avg(price) as price
		from have
		group by firmname, calculated year
		order by firmname;
quit;
/* Transpose to get years as variables */
proc transpose data=wdata out=want(drop=_NAME_);
	by firmname;
	id year;
run;
Hi ErikLund, your code works beautifully thanks.
one thing, i have used below code to aggregate separate files(example):
data someData;
length _xFilename $ 256;
infile "c:\temp\*.csv" filename=_xfilename;
xFilename = _xFilename;
input
@1 line $10.
;
run;
					
				
			
			
				
			
			
			
			
			
			
			
		"DO LOOP" not permitted in PROC SQL?
Why would you use SQL instead of a DATA step?
data axis;
	firmname = 'axis'; date = '01jan1990'd; price = 1; output;
	firmname = 'axis'; date = '01jan1991'd; price = 2; output;
	firmname = 'axis'; date = '01jan1992'd; price = 3; output;
	firmname = 'axis'; date = '01jan1993'd; price = 4; output;
run;
data beto;
	firmname = 'beto'; date = '01jan1990'd; price = 6; output;
	firmname = 'beto'; date = '01jan1991'd; price = 7; output;
	firmname = 'beto'; date = '01jan1992'd; price = 8; output;
	firmname = 'beto'; date = '01jan1993'd; price = 9; output;
run;
/* Append to a work data set */
data have; set axis beto;
run;
proc tabulate data=have;
class date firmname;
format date year4.;
var price;
table firmname=' ',date=' '*price=' '*mean=' '/box='firmname';
run;
P.S. steal the dataset from Jenson .
If you want a table , then try other's code . my code is for a REPORT .
Is your goal a report that people read or a (poorly structured for many purposes) data set to feed into another process?
Also, are your dates actual SAS date values or some character value impersonating a date? SAS date values allow a lot of manipulation that character values are a pain to work with. For example changing a format allows creating groups for reports.
Below is an example of a report if the data files for the firms are combined and an appropriate format applied to a date-valued variable.
data example;
informat date yymmdd10. price 12.5 firm $10.;
format date yymmdd10.;
input Date Price firm ;
datalines;
1999-11-18 31.47353  Axis
1999-12-27 37.77718  Axis 
1999-12-28 43.99142  Axis 
1999-12-29 51.50215  Axis 
1999-12-30 56.68813  Axis 
1999-12-31 55.30222  Axis 
2000-01-03 51.50215  Axis 
2000-01-04 47.56796  Axis 
2000-01-05 44.61731  Axis 
2000-01-06 42.91845  Axis 
2000-01-07 46.49499  Axis 
1999-11-18 32.47353  Bravo
1999-12-27 33.77718  Bravo 
1999-12-28 44.99142  Bravo 
1999-12-29 51.50215  Bravo 
1999-12-30 56.68813  Bravo 
1999-12-31 55.30222  Bravo 
2000-01-03 41.50215  Bravo 
2000-01-04 43.56796  Bravo 
2000-01-05 42.61731  Bravo 
2000-01-06 52.91845  Bravo 
2000-01-07 45.49499  Bravo 
;
run;
proc tabulate data=example;
   class firm date;
   format date year4.;
   var price;
   table firm,
         date='Year'*price*(mean min max)*f=f12.5
   ;
run;
Changing the format from YEAR4 to YYQ is all that would be needed to create a calendar quarter summary for each firm; YYMON format would create monthly summaries.
I show other statistics to show those are available as well.
Once you have the data combined you could filter the report to specific intervals with a WHERE statement in Proc Tabulate.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.