I am trying to conditionally assign multiple variables to the same value. Is there a better way to do this than using if/ then do logic?
data scores;
infile datalines dsd;
input Name : $9. Score1-Score3 Team ~ $25. Div $;
datalines;
Smith,12,22,46,"Green Hornets, Atlanta",AAA
Mitchel,23,19,25,"High Volts, Portland",AAA
Jones,09,17,54,"Vulcans, Las Vegas",AA
;
data test;
set scores;
if name = "Smith" then do;
Score1=10;
Score2=10;
Score3=10;
end;
run;
I tried something like this, but this returns 0, which I think means "no, the two statements are not equal" (because Score2 is not the same as Score3).
data test;
set scores;
if name = "Smith" then do;
Score1=10=Score2=Score3;
*or Score1=Score2=Score3=10;
end;
run;
Any ideas, or is my way the way to go?
Actually, you own method is not bad at all; it has only a couple of drawbacks:
1. The verbosity of as many assignment statements as many Score variables you have. As @novinosrin and @PeterClemmensen have shown, it is easily addressed using an array and a DO loop. Alternatively, you can generate the code for multiple assignments using a macro, e.g.:
%macro vset (vname, N, value) ;
%do i = 1 %to &N ;
&vname&i = &value ;
%end ;
%mend ;
data want ;
set have ;
if name = "Smith" then %vset (score, 3, 10) ;
run ;
Plus, there exist other methods of assembling executable SAS code (such as CALL EXECUTE, for exampe).
2. Each Score variable has to be assigned 10 one at a time every time name="Smith", which can present a performance problem if you have many variables (say, 3000 instead of just 3) and many records where name="Smith" (100 million, say). Thus, the question would be: How to assign a given value to all the variables in question en masse, i.e. at once, rather than one at a time? Well, that can be done in more than one way:
(A) Methinks the simplest is to prepare a "fill" 1-record SAS data set beforehand and just read that record when needed:
data have ;
input name :$5. score1-score5 ;
cards ;
Baker 1 2 3 23 5
Smith 4 50 6 1 13
Clark 7 8 9 19 7
Smith 1 13 17 3 29
Mason 8 7 12 17 11
run ;
%let n = 5 ;
%let v = 10 ;
data fill ;
array score [&n] (&n*&v) ;
run ;
sasfile fill load ;
data want ;
set have ;
retain p 1 ;
if name = "Smith" then set fill point = p ;
run ;
sasfile fill close ;
Note that you can fill the Score array items in FILL with any values you want, not just the same single value. If so, the entire logic of doing so would be confined to the first DATA step. The SASFILE statement keeps FILL in memory for the sake of speed of execution. Of course, this method is good for as many "fill" variables as can be had in the DATA step - in other words, there's no practical limit.
(B) A different method uses the APP functions and works by sticking a preset string of real binary images of the needed "fill" numeric values directly into memory. The necessary prep to compose the string and static memory addresses is prepared at _N_=1, so that it wouldn't have to be done in every record:
data want (drop = _:) ;
set have ;
if _n_ = 1 then do ;
retain _a _f ;
_a = addrlong (score1) ;
_f = put (repeat (put (&v,rb8.),&n-1),$%eval(8*&n).) ;
end ;
if name = "Smith" then call pokelong (_f, _a) ;
run ;
This en masse method is very fast, but it has a limitation: Since the string _F cannot exceed $32767, it's limited to int(32767/8)=4095 variables.
(C) Yet another approach is in principle similar to the POINT= method, only instead of getting the fill values from the single data set record, it gets them from a single hash table item:
data want ;
if _n_ = 1 then do ;
dcl hash h (dataset: "have(keep=score: obs=0)") ;
h.definekey ("score1") ;
h.definedata (all:"Y") ;
h.definedone () ;
if 0 then set have (keep=score:) ;
array score [&n] (&n * &v) ;
h.add() ;
end ;
set have ;
if name = "Smith" then h.find (key: &v) ;
run ;
I expect this method to be on par with POINT= or a tad faster; and it has no limit on the number of Score variables, either.
Kind regards
Paul D.
data test;
set scores;
array score(3);
if name = "Smith" then
do _i_=1 to dim( score);
score(_i_)=10;
end;
drop _i_;
run;
hmm since score vars are in the input dataset, try an implicit array
data test;
set scores;
array s score:;
if name = "Smith" then
do over s;
s=10;
end;
run;
Your first method is the way to do it.
You are right. The second syntax is being interpreted as a boolean expression to be evaluated and assigned to the variable listed as the target.
data want(drop=i);
set scores;
array Score{*} Score:;
if name = "Smith" then do i=1 to dim(Score);
Score[i]=10;
end;
run;
Are you assigning a different set of values for each Name or only name='Smith'?
If you have different values for different names then you likely want something completely different for shorter code but a complete description of the needs will help.
Actually, you own method is not bad at all; it has only a couple of drawbacks:
1. The verbosity of as many assignment statements as many Score variables you have. As @novinosrin and @PeterClemmensen have shown, it is easily addressed using an array and a DO loop. Alternatively, you can generate the code for multiple assignments using a macro, e.g.:
%macro vset (vname, N, value) ;
%do i = 1 %to &N ;
&vname&i = &value ;
%end ;
%mend ;
data want ;
set have ;
if name = "Smith" then %vset (score, 3, 10) ;
run ;
Plus, there exist other methods of assembling executable SAS code (such as CALL EXECUTE, for exampe).
2. Each Score variable has to be assigned 10 one at a time every time name="Smith", which can present a performance problem if you have many variables (say, 3000 instead of just 3) and many records where name="Smith" (100 million, say). Thus, the question would be: How to assign a given value to all the variables in question en masse, i.e. at once, rather than one at a time? Well, that can be done in more than one way:
(A) Methinks the simplest is to prepare a "fill" 1-record SAS data set beforehand and just read that record when needed:
data have ;
input name :$5. score1-score5 ;
cards ;
Baker 1 2 3 23 5
Smith 4 50 6 1 13
Clark 7 8 9 19 7
Smith 1 13 17 3 29
Mason 8 7 12 17 11
run ;
%let n = 5 ;
%let v = 10 ;
data fill ;
array score [&n] (&n*&v) ;
run ;
sasfile fill load ;
data want ;
set have ;
retain p 1 ;
if name = "Smith" then set fill point = p ;
run ;
sasfile fill close ;
Note that you can fill the Score array items in FILL with any values you want, not just the same single value. If so, the entire logic of doing so would be confined to the first DATA step. The SASFILE statement keeps FILL in memory for the sake of speed of execution. Of course, this method is good for as many "fill" variables as can be had in the DATA step - in other words, there's no practical limit.
(B) A different method uses the APP functions and works by sticking a preset string of real binary images of the needed "fill" numeric values directly into memory. The necessary prep to compose the string and static memory addresses is prepared at _N_=1, so that it wouldn't have to be done in every record:
data want (drop = _:) ;
set have ;
if _n_ = 1 then do ;
retain _a _f ;
_a = addrlong (score1) ;
_f = put (repeat (put (&v,rb8.),&n-1),$%eval(8*&n).) ;
end ;
if name = "Smith" then call pokelong (_f, _a) ;
run ;
This en masse method is very fast, but it has a limitation: Since the string _F cannot exceed $32767, it's limited to int(32767/8)=4095 variables.
(C) Yet another approach is in principle similar to the POINT= method, only instead of getting the fill values from the single data set record, it gets them from a single hash table item:
data want ;
if _n_ = 1 then do ;
dcl hash h (dataset: "have(keep=score: obs=0)") ;
h.definekey ("score1") ;
h.definedata (all:"Y") ;
h.definedone () ;
if 0 then set have (keep=score:) ;
array score [&n] (&n * &v) ;
h.add() ;
end ;
set have ;
if name = "Smith" then h.find (key: &v) ;
run ;
I expect this method to be on par with POINT= or a tad faster; and it has no limit on the number of Score variables, either.
Kind regards
Paul D.
@hashman Guru, I should get paid on Friday(Biweekly check at Citizens lol). Let me know should i pay for this enormous knowledge share though priceless. WOW!!!!!!!!!!
PS Understood everything, SASFILE is something new. henceforth, new addition to my continued learning. Cheers!
My pleasure. I knew you would understand everything ... in fact, methought you'd chime in with the APP approach yourself.
SASFILE helps particularly well when POINT= reads out of order - for typical example, if a sorted file is looked up using the binary or interpolation search. Of course, there should be enough memory to house the whole thing. But in this case, when we have only 1 record in all, it would never present a problem even with a few thousand variables in question.
Kind regards
Paul D.
Thanks! This works perfectly and gives me a lot to think about - I really appreciate your thorough answers (and your new book!)
You are very welcome.
And if by the "new book" you mean our hash book published in July 2018, @DonH deserves as much credit, if not more, than yours truly.
Kind regards
Paul D.
Your first solution that is mixing macro and data step coding is missing a required DO/END block. Currently the program will generate code like:
if name = "Smith" then score1=10;
score2=10;
score3=10;
;
Fix:
if name = "Smith" then do; %vset (score, 3, 10) end;
@Tom :
'Tis true. A truly good attentive eye! Thanks for the fix.
OTOH, the macro can be kept as is but just called from the DATA step differently:
if name ne "Smith" then return ;
%vset (score, 3, 10)
Kind regards
Paul D.
Hi, Hashman.
I am very interested at your method that using the memory addresses. It is so amazing to me. Could you please give me more examples about this? Many thanks.
At the risk of self-aggrandizing, I'd suggest that you read one of my APP papers, for instance (not sure it's the latest):
https://support.sas.com/resources/papers/proceedings14/1510-2014.pdf
There're many examples in the paper - and, perhaps more importantly, a decent amount of theory. A good theory, as physicists say, is the most practical thing.
Kind regards
Paul D.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.