Danglytics
Calcite | Level 5
Hi,

I have a transactional dataset that I want to roll-up to the contact level.
example:
id date sales
1 01Jan11 12
1 02Jan11 20
2 12Jan11 34
2 23Feb11 21
etc..

the rolled up set would look like
id earliestdate sales_sum
1 01Jan11 32
2 12Jan11 55
etc.

The trouble I am having is outputting the 'earliestdate', since my DATA step ends with: if last.id then output ;

Thanks for your help.
1 ACCEPTED SOLUTION

buckeye
Obsidian | Level 7


Editor's Note: The solution provided by buckeye totals SALES while capturing the earliest DATE. Because a SAS date is a numeric value, the MIN function can be used to capture the earliest date. The DATA step version of buckeye's PROC SQL code is shown below as well. Note that for the DATA step version to work, the data set must be sorted by ID and DATE.

 PROC SQL;
 CREATE TABLE mySum AS
 SELECT ID,MIN(date) AS Earliest, SUM(sales) AS SalesTot FROM salesrecord GROUP BY ID;
 QUIT;



/*DATA Step version of the above code*/

data a; 
input id date : date9. sales; 
datalines; 
1 01Jan11 12
1 02Jan11 20
2 12Jan11 34
2 23Feb11 21
;

data b; 
set a; 
by id; 
retain earliest; 
if first.id then do; 
earliest=date;
 sales_sum=0; 
end; 
sales_sum+sales;
if last.id then output; 
drop sales date; 
run; 

proc print; 
format earliest date9.; 
run;
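
The DATA step above takes the date from the first row of each ID group, so it relies on the sort order mentioned in the Editor's Note. A minimal sketch of that sort step, assuming the input is the data set a created above (the sample data happens to be in order already, so the sort only matters for real data):

```sas
/* Sort by ID, then by DATE within ID, so that the first row */
/* of each BY group carries the earliest date for that ID.   */
proc sort data=a;
  by id date;
run;
```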

 

 


7 REPLIES 7
sbb
Lapis Lazuli | Level 10
Share what SAS code you have created thus far to solve the problem/objective. You should be using a RETAIN statement to capture / retain a "temporary date variable" and then on LAST. you will want to output.

Otherwise, why not use PROC SUMMARY/MEANS with the MIN statistic instead?

Scott Barry
SBBWorks, Inc.
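
For readers following along, the PROC SUMMARY suggestion above might look roughly like the sketch below. The data set name transactions and the output names rollup, earliestdate and sales_sum are assumptions borrowed from the question and the later replies, not code posted by sbb.

```sas
/* Assumed names: transactions (input), rollup (output).          */
/* NWAY keeps only the one-row-per-ID level of the summary.       */
proc summary data=transactions nway;
  class id;                        /* CLASS does not require sorted input */
  var date sales;
  output out=rollup (drop=_type_ _freq_)
         min(date)=earliestdate    /* earliest date per ID  */
         sum(sales)=sales_sum;     /* total sales per ID    */
run;

proc print data=rollup;
  format earliestdate date9.;
run;
```

Because a SAS date is stored as a number, the MIN statistic picks out the earliest date without any special handling.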
Danglytics
Calcite | Level 5
This is an overly simplified example of the actual data I'm using, and for that reason PROC SUMMARY / MEANS wouldn't work, so I was hoping to incorporate some sort of code into my data step.

but in this case i would do

data rollup ;
set transactions ;
by id ;
if first.id then sales_sum=. ;
sales_sum + sales ;
if last.id then output ;
run ;

I haven't been able to figure out the part of the code that captures the earliest transaction date yet.
Danglytics
Calcite | Level 5
I think I've figured it out.

data rollup ;
set transactions ;
by id ;
retain earliestdate ;
if first.id then do;
sales_sum=. ;
earliestdate= '01jan3000'd ; /* set date to sometime way in the future, I know probably not the best practice */
end;
earliestdate = min(earliestdate,date) ;
sales_sum + sales ;
if last.id then output ;
run ;
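
A variation on the step above, offered as a sketch rather than as the poster's code: the SAS MIN function ignores missing arguments, so the retained variable can start each ID group as missing instead of as a far-future date. The data set and variable names are the ones used above. The input only needs to be grouped by id for the BY statement; the dates within an ID can be in any order.

```sas
data rollup ;
  set transactions ;
  by id ;
  retain earliestdate ;
  if first.id then do ;
    sales_sum = 0 ;
    earliestdate = . ;          /* missing start value; MIN() skips it */
  end ;
  earliestdate = min(earliestdate, date) ;   /* smallest nonmissing date so far */
  sales_sum + sales ;           /* sum statement; sales_sum is retained automatically */
  if last.id then output ;
  format earliestdate date9. ;
  drop date sales ;
run ;
```

If every date within an ID is missing, earliestdate stays missing for that ID, which makes the gap visible in the output.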
sbb
Lapis Lazuli | Level 10
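Slight coding simplification: if you have FIRST.ID, you can assign the DATE and SALES values directly to capture them, without using the MIN function. That also eliminates the need to set a high value and/or a MISSING value. Note that this assumes the data is sorted by DATE within ID, so that the first row of each group holds the earliest date.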

Honestly, you can accomplish this processing with PROC SUMMARY/MEANS -- have a look at the DOC.

Scott Barry
SBBWorks, Inc.
Peter_C
Rhodochrosite | Level 12
Danglytics
I like the approach of starting "earliest" at a high value.
Going a little further lets the SAS system flag the missing case with ****.
Date constants allow for a year as large as 20000 (count those zeroes).
This SAS log shows what changes once the date goes past the year 20000:

19 data ;
20 retain earliest '31dec20000'd ;
21 do date= earliest to earliest +1 ;
22 put date date11. ;
23 put date date9. ;
24 put date ddmmyy10. ;
25 end ;
26 stop ;
27 run;

31-DEC-****
31DEC****
31/12/****
***********
*********
**********
NOTE: The data set

So, for a high-value date, I would recommend the number 6589336
or (my preference) the syntax
%sysevalf( "31dec20000"d +1 )
Ankitsas
Calcite | Level 5
Friend,

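I hope the code below will suffice for your problem. If not, please let me know so I can make corrections. Sorting by descending date within id means the last row of each id group, which is the one kept, carries the earliest date.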

data test;
input id date date7. sales;
format date date7.;
datalines;
1 01Jan11 12
1 02Jan11 20
2 12Jan11 34
2 23Feb11 21
;
proc sort;
by id descending date;
run;
data test1;
set test;
by id;
if first.id then total_sales=0;
total_sales+sales;
if last.id;
run;
proc print;
run;
