data adj(keep=region product sub_product); set pd; nondefault=sum(active,-default); if default=. then default=0; if nondefault=. then nondefault=0; if default=0; run;
I find this very strange.
1. if we are only keeping region product sub_product, whatever we do on the following statements were not saved so what is the point..
2. I was told the subsequent line only run if the condition is met, so for the last if statement, what is it doing, it was followed by run.. what does that mean if default=0 is true.. what is left to run...
I point out that in response to your first question
if we are only keeping region product sub_product, whatever we do on the following statements were not saved so what is the point..
you are getting a data set named ADJ that contains variables REGION PRODUCT SUB_PRODUCT, and no other variables, and certain records are selected based up on the logic in the IF statements, that's the point. Perhaps the logic and DATA step could have been coded more cleanly, as it appears NONDEFAULT has no use in this code.
The last IF statement selects certain records to be included in the data set named ADJ.
Also, please see Maxim 4: "If in doubt do a test run and look at the results". Did you do that, and specifically, did you look at the results and compare them to the input data set named PD? This is an extremely effective and useful way to learn and debug.
89   data adj(keep=region product sub_product);
90   set pd;
91   put ;
92   put "1 - " region= product= sub_product= nondefault= active= default= ;
93   nondefault=sum(active,-default);
94   put "2 - " region= product= sub_product= nondefault= active= default= ;
95   if default=. then default=0;
96   put "3 - " region= product= sub_product= nondefault= active= default= ;
97   if nondefault=. then nondefault=0;
98   put "4 - " region= product= sub_product= nondefault= active= default= ;
99   if default=0;
100  put "5 - " region= product= sub_product= nondefault= active= default= ;
101  run;
1 - region=RegionA product=Missing sub_product=SP1 nondefault=64 active=32 default=.
2 - region=RegionA product=Missing sub_product=SP1 nondefault=32 active=32 default=.
3 - region=RegionA product=Missing sub_product=SP1 nondefault=32 active=32 default=0
4 - region=RegionA product=Missing sub_product=SP1 nondefault=32 active=32 default=0
5 - region=RegionA product=Missing sub_product=SP1 nondefault=32 active=32 default=0
1 - region=RegionB product=Zero sub_product=SP1 nondefault=32 active=8 default=0
2 - region=RegionB product=Zero sub_product=SP1 nondefault=8 active=8 default=0
3 - region=RegionB product=Zero sub_product=SP1 nondefault=8 active=8 default=0
4 - region=RegionB product=Zero sub_product=SP1 nondefault=8 active=8 default=0
5 - region=RegionB product=Zero sub_product=SP1 nondefault=8 active=8 default=0
1 - region=RegionA product=Product1 sub_product=SP1 nondefault=8 active=4 default=2
2 - region=RegionA product=Product1 sub_product=SP1 nondefault=2 active=4 default=2
3 - region=RegionA product=Product1 sub_product=SP1 nondefault=2 active=4 default=2
4 - region=RegionA product=Product1 sub_product=SP1 nondefault=2 active=4 default=2
Above log generated from following code:
data pd ;
	infile cards ;
	input region $ product $ sub_product $ nondefault active default ;
cards ; 
RegionA Missing SP1 64 32 . 
RegionB Zero SP1 32 8 0 
RegionA Product1 SP1 8 4 2
;
run ;
data adj(keep=region product sub_product);
set pd;
put ;
put "1 - " region= product= sub_product= nondefault= active= default= ;
nondefault=sum(active,-default);
put "2 - " region= product= sub_product= nondefault= active= default= ;
if default=. then default=0;
put "3 - " region= product= sub_product= nondefault= active= default= ;
if nondefault=. then nondefault=0;
put "4 - " region= product= sub_product= nondefault= active= default= ;
if default=0;
put "5 - " region= product= sub_product= nondefault= active= default= ;
run;
The final IF statement selects all records with default=0. Not sure about the meaning of your data, but it's possible the dataset work.adj has all sub-products that do not have a status of default. So while DEFAULT was not written to the output dataset, it was used to select the records that are written to the output dataset;
Just like:
data female (keep=name);
  set sashelp.class;
  if sex="F";
run;would select the names of the female students.
Hello @HeatherNewton 
This code creates the output data set adj with  region, product and sub_product from records in the source data sets in which default values is missing or zero.
Your observation that subsequent lines run only if the condition is met is correct. Here the action is implied and that is output to the dataset.
When I have doubts of these types I would prefer referring the fundamental concepts in books (My favorite is The Little SAS book A primer). I realize that making a living with SAS, one has to make an investment in learning. I would recommend the same.
data adj(keep=region product sub_product);
/* a new dataset named adj is created, and it will only contain the three variables named in the KEEP= dataset option */
set pd;
/* dataset pd is read, with all variables, so they all will be in the PDV */
nondefault=sum(active,-default);
/* a new variable is created (or an existing one gets a new value), using the SUM() function */
if default=. then default=0;
/* this replaces all missing values in variable default with zero */
if nondefault=. then nondefault=0;
/* the same is done for nondefault */
if default=0;
/* this Subsetting IF terminates a data step iteration if default is not 0 */
run;The code is misleading and inefficient, it is equivalent to this:
data adj (keep=region product sub_product);
set pd;
where default in (.,0);
run;The work on variables default and nondefault is unnecessary because these are not kept, and the condition for default can be handled in the much more performant WHERE than in a subsetting IF.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
