DATA Step, Macro, Functions and more

First. and Last. Group by Processing Syntax

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

First. and Last. Group by Processing Syntax

Hi.  I’m having trouble geting some group by processing syntax correct.

I’m trying to calculate two variables one for first week revenue and one for Last week revenue. I'm not getting the correct looking output.  

 

Here is what I have for my program. 

 

 

 

proc sort data= Campaigns_all; 
by type Campaign_Name week_of; 
run;

data test (keep=type Campaign_Name start_dc end_dc  first_week_rev last_week_rev );
set Campaigns_all;
by type Campaign_Name ;
if first.Campaign_Name then first_week_rev = Revenue;
if last.Campaign_Name then last_week_rev = Revenue;
output ;
run;

I start out with a data set  which looks like this. 

 

 

Type campaign_name start_dc end_dc week_of Revenue visits Orders Units
Desktop 1 Piece 15-Sep-15 19-Oct-15 6-Oct-15 $5,461.38 662 33 197
Desktop 1 Piece  15-Sep-15 19-Oct-15 13-Oct-15 $3,712.08 293 10 101
Desktop  Get Dressed Up 4-Aug-15 19-Oct-15 6-Oct-15 $1,481.36 346 12 44
Desktop  Get Dressed Up 4-Aug-15 19-Oct-15 13-Oct-15 $1,898.78 149 15 85

 

After I run my I get this. Notice the null values in  First. last 

Type campaign_name start_dc end_dc first_week_rev last_week_rev
Desktop 1 Piece 15-Sep-15 19-Oct-15 5461.38 .
Desktop 1 Piece 15-Sep-15 19-Oct-15 . 3712.08
Desktop Get Dressed Up 4-Aug-15 19-Oct-15 1481.36 .
Desktop  Get Dressed Up 4-Aug-15 19-Oct-15 . 1898.78

 

 

What I want one row for each campain. 

 

Type campaign_name start_dc end_dc first_week_rev last_week_rev
Desktop 1 Piece 15-Sep-15 19-Oct-15 5461.38 3712.08
Desktop Get Dressed Up 4-Aug-15 19-Oct-15 1481.36 1898.78

Accepted Solutions
Solution
‎10-20-2015 04:28 PM
Occasional Contributor
Posts: 7

Re: First. and Last. Group by Processing Syntax

Try this:

 

proc sort data= Campaigns_all; 
by type Campaign_Name week_of; 
run;

data test (keep=type Campaign_Name start_dc end_dc  first_week_rev last_week_rev );
retain first_week_rev last_week_rev;
set Campaigns_all;
by type Campaign_Name ;
if first.Campaign_Name then first_week_rev = Revenue;
if last.Campaign_Name then do;
   last_week_rev = Revenue;
   output ;
end;
run;

CatCol

View solution in original post


All Replies
Trusted Advisor
Posts: 1,615

Re: First. and Last. Group by Processing Syntax

 

You can get the results using PROC SQL or PROC SUMMARY (and probably other procedures as well)

 

proc summary nway data=whatever;
    var first_week_rev last_week_rev;
    class type campaign_name start_dc end_dc;
    output out=_final_ sum=;
run;
Solution
‎10-20-2015 04:28 PM
Occasional Contributor
Posts: 7

Re: First. and Last. Group by Processing Syntax

Try this:

 

proc sort data= Campaigns_all; 
by type Campaign_Name week_of; 
run;

data test (keep=type Campaign_Name start_dc end_dc  first_week_rev last_week_rev );
retain first_week_rev last_week_rev;
set Campaigns_all;
by type Campaign_Name ;
if first.Campaign_Name then first_week_rev = Revenue;
if last.Campaign_Name then do;
   last_week_rev = Revenue;
   output ;
end;
run;

CatCol

Contributor
Posts: 55

Re: First. and Last. Group by Processing Syntax

[ Edited ]

You'll want to retain the value of first_week_rev throughout each by group. You'll also only want to output when at the end of the Campaign_Name by group. Something like below.

 

data test (keep=type Campaign_Name start_dc end_dc  first_week_rev last_week_rev );
set Campaigns_all;
by type Campaign_Name ;
retain first_week_rev;
if first.Campaign_Name then first_week_rev = Revenue;
if last.Campaign_Name then do;
  last_week_rev = Revenue;
  output;
end;
run;

Similarly, you could dynamically create the start_dc and end_dc columns in this same data step (assuming the week_of variable determines those values)

 

data test (keep=type Campaign_Name start_dc end_dc  first_week_rev last_week_rev );
set Campaigns_all(drop=start_dc end_dc);
by type Campaign_Name ;
retain first_week_rev start_dc;
if first.Campaign_Name then do;
  first_week_rev = Revenue;
  start_dc = week_of
end;
if last.Campaign_Name then do;
  last_week_rev = Revenue;
  end_dc = week_of
  output;
end;
format start_dc end_dc mmddyyd8.;
run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 225 views
  • 4 likes
  • 4 in conversation