Hi,
I am using below prog. to output a report in excel using proc report.
As you can see from attachments current output has zero and duplicate dates.
How can i bring 4th row 01Feb2024 data from CENI class to 2nd row(as shown in expected output image attached). Similarly, 7th row 31Jan2024 CENI date to 1st row in CENI class.
And based on date variable how can i insert Year value as shown in expected output file.
data have;
infile datalines dlm=',';
input Date:$10. Year 11-15 Class $16-19 Plant $21-31 vol 32-39 Cum_vol;
datalines;
31-Jan-24 2024 CEMI B 45.5 45.5
01-Feb-24 2024 CEMI B 54.8 100.3
31-Jan-24 2024 CERI B 13.5 13.5
01-Feb-24 2024 CENI B 6.5 6.5
05-Feb-24 2024 CEMI B 26 26
06-Feb-24 2024 CEMI B 137 163
31-Jan-24 2024 CENI B 139 139
01-Feb-24 2024 CESI B 260.5 260.5
02-Feb-24 2024 CEMI B 184 184
;
run;
options missing = 0;
proc report data=have split="*";
/* List columns to be used in the report;*/
column Date Year plant Class, (vol cum_vol);
define class/across "Class";
define plant/display center "Plant";
define Date/display "Date";
define Vol/display right "Vol";
define Cum_vol/display right "Cumulative Volume";
run;
Current output
Expected output
You want to aggregate all obs for each PLANT/DATE combination, so change them from
define plant/display center "Plant";
define Date/display "Date";
to
define plant/group center "Plant";
define Date/group "Date";
and change the variables to aggregate (i.e. to sum) from
define Vol/display right "Vol";
define Cum_vol/display right "Cumulative Volume";
to
define Vol/sum right "Vol";
define Cum_vol/sum right "Cumulative Volume";
But the result is not in the date order you want, in part because you have DATE as a character variable. DATE needs to be a numeric variable (with a date format), which is done in the DATA HAVE step below. Then the report can request date in its internal order (i.e. calendar order). That's done below.
data have;
infile datalines dlm=',';
input Date_text:$10. Year 11-15 Class $16-19 Plant $21-31 vol 7.0 Cum_vol;
date=input(compress(date_text,'-'),date7.);
format date date9. ;
datalines;
31-Jan-24 2024 CEMI B 45.5 45.5
01-Feb-24 2024 CEMI B 54.8 100.3
31-Jan-24 2024 CERI B 13.5 13.5
01-Feb-24 2024 CENI B 6.5 6.5
05-Feb-24 2024 CEMI B 26 26
06-Feb-24 2024 CEMI B 137 163
31-Jan-24 2024 CENI B 139 139
01-Feb-24 2024 CESI B 260.5 260.5
02-Feb-24 2024 CEMI B 184 184
run;
options missing = 0;
proc report data=have split="*";
/* List columns to be used in the report;*/
column Date Year plant Class, (vol cum_vol) ;
define class/across "Class";
define plant/group center "Plant";
define Date/group order=internal "Date";
define year/group; /*Added to avoid treating YEAR as an analysis varible*/
define Vol/sum right "Vol";
define Cum_vol/sum right "Cumulative Volume";
run;
I speak as a neophyte in PROC REPORT, so there may be some better explanations and revisions to come.
Additional notes. The
date=input(compress(date_text,'-'),date7.);
uses the DATE7 informat because the dates use 2-digit years (bad practice). The default (on my SAS system) is to interpret all two-digit years from 00 through 25 are assigned to the 21st century (i.e. 2000 through 2025. 26 through 99 go to the 20th century.
You can control this with the yearcutoff option, as in
options yearcutoff=1951;
which will assigned 00 through 50 to the 21st century. This is an easy way to process two-digit years, but you'll probably be better off using the YEAR variable to generate DATE.
I also define YEAR as a group variable, to avoid it being treated as analysis variable.
You want to aggregate all obs for each PLANT/DATE combination, so change them from
define plant/display center "Plant";
define Date/display "Date";
to
define plant/group center "Plant";
define Date/group "Date";
and change the variables to aggregate (i.e. to sum) from
define Vol/display right "Vol";
define Cum_vol/display right "Cumulative Volume";
to
define Vol/sum right "Vol";
define Cum_vol/sum right "Cumulative Volume";
But the result is not in the date order you want, in part because you have DATE as a character variable. DATE needs to be a numeric variable (with a date format), which is done in the DATA HAVE step below. Then the report can request date in its internal order (i.e. calendar order). That's done below.
data have;
infile datalines dlm=',';
input Date_text:$10. Year 11-15 Class $16-19 Plant $21-31 vol 7.0 Cum_vol;
date=input(compress(date_text,'-'),date7.);
format date date9. ;
datalines;
31-Jan-24 2024 CEMI B 45.5 45.5
01-Feb-24 2024 CEMI B 54.8 100.3
31-Jan-24 2024 CERI B 13.5 13.5
01-Feb-24 2024 CENI B 6.5 6.5
05-Feb-24 2024 CEMI B 26 26
06-Feb-24 2024 CEMI B 137 163
31-Jan-24 2024 CENI B 139 139
01-Feb-24 2024 CESI B 260.5 260.5
02-Feb-24 2024 CEMI B 184 184
run;
options missing = 0;
proc report data=have split="*";
/* List columns to be used in the report;*/
column Date Year plant Class, (vol cum_vol) ;
define class/across "Class";
define plant/group center "Plant";
define Date/group order=internal "Date";
define year/group; /*Added to avoid treating YEAR as an analysis varible*/
define Vol/sum right "Vol";
define Cum_vol/sum right "Cumulative Volume";
run;
I speak as a neophyte in PROC REPORT, so there may be some better explanations and revisions to come.
Additional notes. The
date=input(compress(date_text,'-'),date7.);
uses the DATE7 informat because the dates use 2-digit years (bad practice). The default (on my SAS system) is to interpret all two-digit years from 00 through 25 are assigned to the 21st century (i.e. 2000 through 2025. 26 through 99 go to the 20th century.
You can control this with the yearcutoff option, as in
options yearcutoff=1951;
which will assigned 00 through 50 to the 21st century. This is an easy way to process two-digit years, but you'll probably be better off using the YEAR variable to generate DATE.
I also define YEAR as a group variable, to avoid it being treated as analysis variable.
@mkeintz Thank you for the help.
It did work, as i expected.
How can i get the Year in output as shown in my expected output.
@vnreddy wrote:
@mkeintz Thank you for the help.
It did work, as i expected.
How can i get the Year in output as shown in my expected output.
I have added a define YEAR as a group variable (so it would not be summed liked the VOL variable).
But you will need more experienced PROC REPORT users than me to determined how to "stack" YEAR over the DATE and PLANT variables.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.