BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
vnreddy
Quartz | Level 8

 

Hi,

 

I am using below prog. to output a report in excel using proc report. 

As you can see from attachments current output has zero and duplicate dates.

How can i bring 4th row 01Feb2024  data from CENI class to 2nd row(as shown in expected output image attached). Similarly, 7th row 31Jan2024 CENI date to 1st row in CENI class.

And based on date variable how can i insert Year value as shown in expected output file. 

 



data have;	
	infile datalines dlm=',';
		input Date:$10. Year 11-15 Class $16-19 Plant $21-31 vol 32-39  Cum_vol;
datalines;
31-Jan-24 2024 CEMI	B 	        45.5	 45.5
01-Feb-24 2024 CEMI	B 	        54.8	100.3
31-Jan-24 2024 CERI	B	        13.5	 13.5
01-Feb-24 2024 CENI	B	         6.5	  6.5
05-Feb-24 2024 CEMI	B	          26	   26
06-Feb-24 2024 CEMI	B	         137	  163
31-Jan-24 2024 CENI	B       	 139	  139
01-Feb-24 2024 CESI	B          260.5	260.5
02-Feb-24 2024 CEMI	B            184	  184
;
run;

options missing = 0;
proc report data=have split="*";
/*    List columns to be used in the report;*/
column Date Year plant Class, (vol cum_vol);
define class/across "Class";
define plant/display center "Plant";
define Date/display "Date";
define Vol/display right "Vol";
define Cum_vol/display  right "Cumulative Volume";
run;

Current output

current output.jpg

Expected output

expected output.jpg

  

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You want to aggregate all obs for each PLANT/DATE combination, so change them from

 

define plant/display center "Plant";
define Date/display "Date";

to

define plant/group center "Plant";
define Date/group "Date";

and change the variables to aggregate (i.e. to sum) from

define Vol/display right "Vol";
define Cum_vol/display  right "Cumulative Volume";

to

define Vol/sum right "Vol";
define Cum_vol/sum  right "Cumulative Volume";

But the result is not in the date order you want, in part because you have DATE as a character variable. DATE needs to be a numeric variable (with a date format), which is done in the DATA HAVE step below.   Then the report can request date in its internal order (i.e. calendar order).  That's done below.

 

 

data have;	
  infile datalines dlm=',';
  input Date_text:$10. Year 11-15 Class $16-19 Plant $21-31 vol 7.0  Cum_vol;
  date=input(compress(date_text,'-'),date7.);
  format date date9. ;
datalines;
31-Jan-24 2024 CEMI	B             45.5    45.5
01-Feb-24 2024 CEMI	B             54.8   100.3
31-Jan-24 2024 CERI	B             13.5    13.5
01-Feb-24 2024 CENI	B              6.5     6.5
05-Feb-24 2024 CEMI	B             26      26
06-Feb-24 2024 CEMI	B            137     163
31-Jan-24 2024 CENI	B            139     139
01-Feb-24 2024 CESI	B            260.5   260.5
02-Feb-24 2024 CEMI	B            184     184
run;

options missing = 0;
proc report data=have split="*";
/*    List columns to be used in the report;*/
column Date Year plant Class, (vol cum_vol) ;
define class/across "Class";
define plant/group center "Plant";
define Date/group order=internal "Date";
define year/group; /*Added to avoid treating YEAR as an analysis varible*/ define Vol/sum right "Vol"; define Cum_vol/sum right "Cumulative Volume"; run;

I speak as a neophyte in PROC REPORT, so there may be some better explanations and revisions to come.

 

 

Additional notes.  The 

date=input(compress(date_text,'-'),date7.);

uses the DATE7 informat because the dates use 2-digit years (bad practice).  The default (on my SAS system) is to interpret all two-digit years from 00 through 25 are assigned to the 21st century (i.e. 2000 through 2025.  26 through 99 go to the 20th century.

 

You can control this with the yearcutoff option, as in

options yearcutoff=1951;

which will assigned 00 through 50 to the 21st century.  This is an easy way to process two-digit years, but you'll probably be better off using the YEAR variable to generate DATE.

 

I also define YEAR as a group variable, to avoid it being treated as analysis variable.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

You want to aggregate all obs for each PLANT/DATE combination, so change them from

 

define plant/display center "Plant";
define Date/display "Date";

to

define plant/group center "Plant";
define Date/group "Date";

and change the variables to aggregate (i.e. to sum) from

define Vol/display right "Vol";
define Cum_vol/display  right "Cumulative Volume";

to

define Vol/sum right "Vol";
define Cum_vol/sum  right "Cumulative Volume";

But the result is not in the date order you want, in part because you have DATE as a character variable. DATE needs to be a numeric variable (with a date format), which is done in the DATA HAVE step below.   Then the report can request date in its internal order (i.e. calendar order).  That's done below.

 

 

data have;	
  infile datalines dlm=',';
  input Date_text:$10. Year 11-15 Class $16-19 Plant $21-31 vol 7.0  Cum_vol;
  date=input(compress(date_text,'-'),date7.);
  format date date9. ;
datalines;
31-Jan-24 2024 CEMI	B             45.5    45.5
01-Feb-24 2024 CEMI	B             54.8   100.3
31-Jan-24 2024 CERI	B             13.5    13.5
01-Feb-24 2024 CENI	B              6.5     6.5
05-Feb-24 2024 CEMI	B             26      26
06-Feb-24 2024 CEMI	B            137     163
31-Jan-24 2024 CENI	B            139     139
01-Feb-24 2024 CESI	B            260.5   260.5
02-Feb-24 2024 CEMI	B            184     184
run;

options missing = 0;
proc report data=have split="*";
/*    List columns to be used in the report;*/
column Date Year plant Class, (vol cum_vol) ;
define class/across "Class";
define plant/group center "Plant";
define Date/group order=internal "Date";
define year/group; /*Added to avoid treating YEAR as an analysis varible*/ define Vol/sum right "Vol"; define Cum_vol/sum right "Cumulative Volume"; run;

I speak as a neophyte in PROC REPORT, so there may be some better explanations and revisions to come.

 

 

Additional notes.  The 

date=input(compress(date_text,'-'),date7.);

uses the DATE7 informat because the dates use 2-digit years (bad practice).  The default (on my SAS system) is to interpret all two-digit years from 00 through 25 are assigned to the 21st century (i.e. 2000 through 2025.  26 through 99 go to the 20th century.

 

You can control this with the yearcutoff option, as in

options yearcutoff=1951;

which will assigned 00 through 50 to the 21st century.  This is an easy way to process two-digit years, but you'll probably be better off using the YEAR variable to generate DATE.

 

I also define YEAR as a group variable, to avoid it being treated as analysis variable.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
vnreddy
Quartz | Level 8

@mkeintz Thank you for the help. 

It did work, as i expected. 

How can i get the Year in output as shown in my expected output.

mkeintz
PROC Star

@vnreddy wrote:

@mkeintz Thank you for the help. 

It did work, as i expected. 

How can i get the Year in output as shown in my expected output.


I have added a define YEAR as a group variable (so it would not be summed liked the VOL variable).

 

But you will need more experienced PROC REPORT users than me to determined how to "stack" YEAR over the DATE and PLANT variables.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 463 views
  • 0 likes
  • 2 in conversation