BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
theponcer
Quartz | Level 8

 

I am trying to conditionally assign multiple variables to the same value. Is there a better way to do this than using if/ then do logic?

 


data scores; infile datalines dsd; input Name : $9. Score1-Score3 Team ~ $25. Div $; datalines; Smith,12,22,46,"Green Hornets, Atlanta",AAA Mitchel,23,19,25,"High Volts, Portland",AAA Jones,09,17,54,"Vulcans, Las Vegas",AA ; data test; set scores; if name = "Smith" then do; Score1=10; Score2=10; Score3=10; end; run;

 

 

I tried something like this, but this returns 0, which I think means "no, the two statements are not equal" (because Score2 is not the same as Score3). 

 

data test;
set scores;
	if name = "Smith" then do;
		Score1=10=Score2=Score3;
*or Score1=Score2=Score3=10;
	end;
run;

Any ideas, or is my way the way to go?

1 ACCEPTED SOLUTION

Accepted Solutions
hashman
Ammonite | Level 13

@theponcer :

Actually, you own method is not bad at all; it has only a couple of drawbacks:

 

1. The verbosity of as many assignment statements as many Score variables you have. As @novinosrin and @PeterClemmensen have shown, it is easily addressed using an array and a DO loop. Alternatively, you can generate the code for multiple assignments using a macro, e.g.:

%macro vset (vname, N, value) ;
  %do i = 1 %to &N ;
    &vname&i = &value ;
  %end ;
%mend ;

data want ;
  set have ;
  if name = "Smith" then %vset (score, 3, 10) ;
run ;   

Plus, there exist other methods of assembling executable SAS code (such as CALL EXECUTE, for exampe).

 

2. Each Score variable has to be assigned 10 one at a time every time name="Smith", which can present a performance problem if you have many variables (say, 3000 instead of just 3) and many records where name="Smith" (100 million, say). Thus, the question would be: How to assign a given value to all the variables in question en masse, i.e. at once, rather than one at a time? Well, that can be done in more than one way:

 

(A) Methinks the simplest is to prepare a "fill" 1-record SAS data set beforehand and just read that record when needed:

data have ;                                  
  input name :$5. score1-score5 ;            
  cards ;                                    
Baker  1   2   3  23   5                     
Smith  4  50   6   1  13                     
Clark  7   8   9  19   7                     
Smith  1  13  17   3  29                     
Mason  8   7  12  17  11                     
run ;                                        

%let n = 5 ;
%let v = 10 ;
data fill ; array score [&n] (&n*&v) ; run ;
sasfile fill load ;
data want ; set have ; retain p 1 ; if name = "Smith" then set fill point = p ; run ;

sasfile fill close ;

Note that you can fill the Score array items in FILL with any values you want, not just the same single value. If so, the entire logic of doing so would be confined to the first DATA step. The SASFILE statement keeps FILL in memory for the sake of speed of execution. Of course, this method is good for as many "fill" variables as can be had in the DATA step - in other words, there's no practical limit.   

 

(B) A different method uses the APP functions and works by sticking a preset string of real binary images of the needed "fill" numeric values directly into memory. The necessary prep to compose the string and static memory addresses is prepared at _N_=1, so that it wouldn't have to be done in every record:

data want (drop = _:) ;                                   
  set have ;                                              
  if _n_ = 1 then do ;                                    
    retain _a _f ;                                        
    _a = addrlong (score1) ;                              
    _f = put (repeat (put (&v,rb8.),&n-1),$%eval(8*&n).) ;
  end ;                                                   
  if name = "Smith" then call pokelong (_f, _a) ;         
run ;                                                     

This en masse method is very fast, but it has a limitation: Since the string _F cannot exceed $32767, it's limited to int(32767/8)=4095 variables. 

 

(C) Yet another approach is in principle similar to the POINT= method, only instead of getting the fill values from the single data set record, it gets them from a single hash table item:

data want ;                                           
  if _n_ = 1 then do ;                                
    dcl hash h (dataset: "have(keep=score: obs=0)") ;
    h.definekey ("score1") ;                          
    h.definedata (all:"Y") ;                          
    h.definedone () ;                                 
    if 0 then set have (keep=score:) ;                
    array score [&n] (&n * &v) ;                      
    h.add() ;                                         
  end ;                                               
  set have ;                                          
  if name = "Smith" then h.find (key: &v) ;           
run ;                                                 

I expect this method to be on par with POINT= or a tad faster; and it has no limit on the number of Score variables, either.

 

Kind regards

Paul D.

View solution in original post

23 REPLIES 23
novinosrin
Tourmaline | Level 20
data test;
 set scores;
 array score(3);
 if name = "Smith" then 
 do _i_=1 to dim( score);
 score(_i_)=10;
 end;	
 drop _i_;
run;

hmm since score vars are in the input dataset, try an implicit array

 

data test;
 set scores;
 array s score:;
 if name = "Smith" then 
 do over s;
 s=10;
 end;	
run;

 

Tom
Super User Tom
Super User

Your first method is the way to do it.

You are right. The second syntax is being interpreted as a boolean expression to be evaluated and assigned to the variable listed as the target.

PeterClemmensen
Tourmaline | Level 20
data want(drop=i);
	set scores;
	array Score{*} Score:;
	if name = "Smith" then do i=1 to dim(Score);
		Score[i]=10;
	end;
run;
tomrvincent
Rhodochrosite | Level 12
data test;
set scores;

if name = "Smith" then
do;

%macro foo;
%do i=1 %to 3;
Score&i=10;
%end;
%mend foo;

%foo;
end;
run;
ballardw
Super User

Are you assigning a different set of values for each Name or only name='Smith'?

If you have different values for different names then you likely want something completely different for shorter code but a complete description of the needs will help.

hashman
Ammonite | Level 13

@theponcer :

Actually, you own method is not bad at all; it has only a couple of drawbacks:

 

1. The verbosity of as many assignment statements as many Score variables you have. As @novinosrin and @PeterClemmensen have shown, it is easily addressed using an array and a DO loop. Alternatively, you can generate the code for multiple assignments using a macro, e.g.:

%macro vset (vname, N, value) ;
  %do i = 1 %to &N ;
    &vname&i = &value ;
  %end ;
%mend ;

data want ;
  set have ;
  if name = "Smith" then %vset (score, 3, 10) ;
run ;   

Plus, there exist other methods of assembling executable SAS code (such as CALL EXECUTE, for exampe).

 

2. Each Score variable has to be assigned 10 one at a time every time name="Smith", which can present a performance problem if you have many variables (say, 3000 instead of just 3) and many records where name="Smith" (100 million, say). Thus, the question would be: How to assign a given value to all the variables in question en masse, i.e. at once, rather than one at a time? Well, that can be done in more than one way:

 

(A) Methinks the simplest is to prepare a "fill" 1-record SAS data set beforehand and just read that record when needed:

data have ;                                  
  input name :$5. score1-score5 ;            
  cards ;                                    
Baker  1   2   3  23   5                     
Smith  4  50   6   1  13                     
Clark  7   8   9  19   7                     
Smith  1  13  17   3  29                     
Mason  8   7  12  17  11                     
run ;                                        

%let n = 5 ;
%let v = 10 ;
data fill ; array score [&n] (&n*&v) ; run ;
sasfile fill load ;
data want ; set have ; retain p 1 ; if name = "Smith" then set fill point = p ; run ;

sasfile fill close ;

Note that you can fill the Score array items in FILL with any values you want, not just the same single value. If so, the entire logic of doing so would be confined to the first DATA step. The SASFILE statement keeps FILL in memory for the sake of speed of execution. Of course, this method is good for as many "fill" variables as can be had in the DATA step - in other words, there's no practical limit.   

 

(B) A different method uses the APP functions and works by sticking a preset string of real binary images of the needed "fill" numeric values directly into memory. The necessary prep to compose the string and static memory addresses is prepared at _N_=1, so that it wouldn't have to be done in every record:

data want (drop = _:) ;                                   
  set have ;                                              
  if _n_ = 1 then do ;                                    
    retain _a _f ;                                        
    _a = addrlong (score1) ;                              
    _f = put (repeat (put (&v,rb8.),&n-1),$%eval(8*&n).) ;
  end ;                                                   
  if name = "Smith" then call pokelong (_f, _a) ;         
run ;                                                     

This en masse method is very fast, but it has a limitation: Since the string _F cannot exceed $32767, it's limited to int(32767/8)=4095 variables. 

 

(C) Yet another approach is in principle similar to the POINT= method, only instead of getting the fill values from the single data set record, it gets them from a single hash table item:

data want ;                                           
  if _n_ = 1 then do ;                                
    dcl hash h (dataset: "have(keep=score: obs=0)") ;
    h.definekey ("score1") ;                          
    h.definedata (all:"Y") ;                          
    h.definedone () ;                                 
    if 0 then set have (keep=score:) ;                
    array score [&n] (&n * &v) ;                      
    h.add() ;                                         
  end ;                                               
  set have ;                                          
  if name = "Smith" then h.find (key: &v) ;           
run ;                                                 

I expect this method to be on par with POINT= or a tad faster; and it has no limit on the number of Score variables, either.

 

Kind regards

Paul D.

novinosrin
Tourmaline | Level 20

@hashman  Guru, I should get paid on Friday(Biweekly check at Citizens lol). Let me know should i pay for this enormous knowledge share though priceless. WOW!!!!!!!!!!

 

PS  Understood everything, SASFILE is something new. henceforth, new addition to my continued learning. Cheers!

hashman
Ammonite | Level 13

@novinosrin:

My pleasure. I knew you would understand everything ... in fact, methought you'd chime in with the APP approach yourself.

 

SASFILE helps particularly well when POINT= reads out of order - for typical example, if a sorted file is looked up using the binary or interpolation search. Of course, there should be enough memory to house the whole thing. But in this case, when we have only 1 record in all, it would never present a problem even with a few thousand variables in question.

 

Kind regards

Paul D.  

theponcer
Quartz | Level 8

Thanks! This works perfectly and gives me a lot to think about - I really appreciate your thorough answers (and your new book!)

hashman
Ammonite | Level 13

@theponcer :

 

You are very welcome.

And if by the "new book" you mean our hash book published in July 2018, @DonH deserves as much credit, if not more, than yours truly.

 

Kind regards

Paul D.

Tom
Super User Tom
Super User

Your first solution that is mixing macro and data step coding is missing a required DO/END block.  Currently the program will generate code like:

if name = "Smith" then score1=10;
score2=10;
score3=10;
;

Fix:

if name = "Smith" then do; %vset (score, 3, 10)  end;
hashman
Ammonite | Level 13

@Tom :

 

'Tis true. A truly good attentive eye! Thanks for the fix. 

 

OTOH, the macro can be kept as is but just called from the DATA step differently:

  if name ne "Smith" then return ;
  %vset (score, 3, 10)

Kind regards

Paul D.

 

whymath
Lapis Lazuli | Level 10

Hi, Hashman.

 

I am very interested at your method that using the memory addresses. It is so amazing to me. Could you please give me more examples about this? Many thanks.

hashman
Ammonite | Level 13

@whymath

At the risk of self-aggrandizing, I'd suggest that you read one of my APP papers, for instance (not sure it's the latest):

https://support.sas.com/resources/papers/proceedings14/1510-2014.pdf 

There're many examples in the paper - and, perhaps more importantly, a decent amount of theory. A good theory, as physicists say, is the most practical thing.

 

Kind regards

Paul D. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 23 replies
  • 15757 views
  • 8 likes
  • 11 in conversation