BookmarkSubscribeRSS Feed
Satori
Quartz | Level 8

I'm running the macro below, where lvar is a list of financial variables. There are other variables in the datasets. At some point in the macro I keep the max of multiple entries for the financial variables (lvar). I would like to know what SAS does with the non-financial variables: which one is kept in the case of multiple entries? How can I control this part of the process? 

 

 

option mprint symbolgen;
%macro cc(dsin,dsout);
%do y=1990 %to 2022;

data bv_&y; set &dsin; if FYEAR=&y; proc sort data=bv_&y nouniquekey uniqueout=bvu_&y; by ID;
data bv_&y; set &dsin; if FYEAR=&y; proc sort data=bv_&y nouniquekey out=bvm_&y; by ID;
proc summary data=bvm_&y nway; class ID; var &lvar.; id FYEAR; output out=bvm_&y max=;
%end;

data &dsout; set bvu_1990-bvu_2022 bvm_1990-bvm_2022; run;
data &dsout; set &dsout; drop _FREQ_ _TYPE_ i j;
%mend;

%cc(d1,d2)
%cc(c1,c2)
option nomprint nosymbolgen;

 

 

3 REPLIES 3
PaigeMiller
Diamond | Level 26

@Satori wrote:

I'm running the macro below, where lvar is a list of financial variables. There are other variables in the datasets. At some point in the macro I keep the max of multiple entries for the financial variables (lvar). I would like to know what SAS does with the non-financial variables: which one is kept in the case of multiple entries? How can I control this part of the process? 

 


  • Can you provide a portion of the data as WORKING data step code?
  • Can you be specific about what you are talking about in your last two sentences? What non-financial variables? What do you mean "which one is kept"? Kept by what? Control so that what happens?
--
Paige Miller
Quentin
Super User

When trying to understand what a macro is doing, sometimes it's helpful to de-macrify the code, so you can run it in steps and inspect.  I think the below code will produce your d2 dataset, with two years of data.  You didn't show the value of &lvar, so I left that macro reference in place:

 

data bv_1990; set d1; if FYEAR=1990; proc sort data=bv_1990 nouniquekey uniqueout=bvu_1990; by ID;
data bv_1990; set d1; if FYEAR=1990; proc sort data=bv_1990 nouniquekey out=bvm_1990; by ID;
proc summary data=bvm_1990 nway; class ID; var &lvar.; id FYEAR; output out=bvm_1990 max=;


data bv_1991; set d1; if FYEAR=1991; proc sort data=bv_1991 nouniquekey uniqueout=bvu_1991; by ID;
data bv_1991; set d1; if FYEAR=1991; proc sort data=bv_1991 nouniquekey out=bvm_1991; by ID;
proc summary data=bvm_1991 nway; class ID; var &lvar.; id FYEAR; output out=bvm_1991 max=;

data d2; set bvu_1990-bvu_1991 bvm_1990-bvm_1991; run;

Looking at that, the code is confusing to me.  bvu_1990 will have records from D1 with FYEAR=1990 that have unique values for ID.  It could have variables that are not listed in &lvar.  bvm_1990 will only have variables that are listed in &lvar.  It's odd to stack the original data and the MAX data back together in this way.  It's possible all of this could be done without a macro, by adding FYEAR to the BY statement and CLASS statement.  Or perhaps in single PROC SQL step.

 

I would suggest trying to trace/debug the code without the macro.  Then once you have non-macro code working like you want, you can think about how to update the macro.

BASUG is hosting free webinars Next up: Mike Raithel presenting on validating data files on Wednesday July 17. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ballardw
Super User

@Satori wrote:

I'm running the macro below, where lvar is a list of financial variables. There are other variables in the datasets. At some point in the macro I keep the max of multiple entries for the financial variables (lvar). I would like to know what SAS does with the non-financial variables: which one is kept in the case of multiple entries? How can I control this part of the process? 

At which step???

 

Data steps will have all variables provided in data sets on a SET statement and created in the course of the data step unless they are explicitly dropped somewhere.

 

Proc Summary will generally only have variables used or created by the proc. If you expect "other" variables to appear in Proc Summary output you will have to provide a very explicit example with input data and what you expect the output to look like.

 

I will say that I suspect you have a poor data structure as you have different variables with the year in the name. Often it is better in the long run to have a variable with the Year value and all variables to be the same name.

 

This apparent process with a different data structure would likely remove any need for looping and likely any need for a macro at all. Just use the Year variable as one of the BY or Class variables.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 390 views
  • 1 like
  • 4 in conversation