Hi,
I have a dataset where fields are: CHANNEL | A | B | C
- CHANNEL is a character field which cointains 'A', 'B', or 'C'
- A, B, C are numeric fileds which contain some numbers
I'd like to create in a datastep a new column VALUE, with the following logic: if CHANNEL = 'A' then VALUE = A, if CHANNEL = 'B' THEN VALUE = B, etc.
In other words, VALUE must contain the respective value of the column indicated in field CHANNEL for each row.
An example:
CHANNEL | A | B | C | VALUE |
---|---|---|---|---|
B | 1 | 2 | 3 | 2 |
A | 4 | 5 | 6 | 4 |
A | 7 | 8 | 9 | 7 |
C | 10 | 11 | 12 | 12 |
In real world CHANNELs are hundreds, so I can't add hundreds of if in the datastep, I'd like to write a more elegant solution in a few rows of code.
Thanks a lot.
Greetings
Use the VVALUEX function
data have;
input (CHANNEL A B C) (:$8.);
cards;
B 1 2 3
A 4 5 6
A 7 8 9
C 10 11 12
;
run;
data want;
set have;
value=vvaluex(channel);
run;
Are you saying that you have hundreds of variables A B C or some non-sequentially named list of a hundred variables that you need to apply?
This works for you example but HUNDREDS of values for Channel are going to get a bit rough.
data work.junk;
input CHANNEL $ A B C;
array ChannelKeys {3} $ _temporary_ ("A" "B" "C"); /* the values of CHANNEL you are searching for go here*/
array vars A B C; /* list of variables with values to assign go here*/
value = vars[whichc(channel, of ChannelKeys
datalines;
B 1 2 3
A 4 5 6
A 7 8 9
C 10 11 12
;
run;
thank you BallardW for the test data.
This is a lookup or mapping problem.
I am sure there are issues with the case of the value of Channel
and the list of variables in the array.
I figured this out with the help of Knuth, who has several programs about sorting and searching.
data work.junk;
input CHANNEL $ A B C;
array vars A B C; /* list of variables with values to assign go here*/
do i = 1 to dim(vars);
if Channel eq vname(vars(i)) then do;
value = vars(i);
leave;
end ;
end ;
datalines;
B 1 2 3
A 4 5 6
A 7 8 9
C 10 11 12
;
run;
proc print data = &syslast;
Ron Fehd Donald E Knuth maven
I have written one solution before using Hash, but here is another one using Revolve Function, based on your simplified scenario:
data have;
input (CHANNEL A B C) (:$8.);
cards;
B 1 2 3
A 4 5 6
A 7 8 9
C 10 11 12
;
data want;
set have;
array var channel--c;
do over var;
call symputx(vname(var),var);
end;
value=resolve('&&&channel');
run;
It can be tweaked to adapt to your numeric variables if you have any.
Art, I think we have a winner: Reeza!
As for my code:
data want;
set have;
array var channel--c; /*This is to assume all of the variable are char, and fit them all in an array*/
do over var;
call symputx(vname(var),var);/*This is to assign the value to the same-name macro variable*/
end;
value=resolve('&&&channel');
/*Not like Symget(), Resolve () will resolve to the bottom of a macro variable for instance of first row:
&&&channel -> &B -> 2*/
run;
Use the VVALUEX function
data have;
input (CHANNEL A B C) (:$8.);
cards;
B 1 2 3
A 4 5 6
A 7 8 9
C 10 11 12
;
run;
data want;
set have;
value=vvaluex(channel);
run;
SAS does have an incredible deep bench for functions! Well done, Reeza!
Yay! What do I win ?
I actually presented a talk on avoiding macro's on Wednesday at our user group meeting - this was one of my examples, so fresh in my head.
*Presented as in wrote, unfortunately not able to actually attend and present
I don't think I've ever seen the call symput and resolve used that way before, its a good solution too
Ouch! Where's my styptic pencil? I think I've just been cut with Occam's Razer!
Minimally, you win both Haikuo's and my gratitude!
While your solution is clearly the best for this problem, I'm still interested in finding out more about Haikuo's approach.
I've always learned that macro variables CAN'T be used in the data step in which they are created.
Quote from Art :"I've always learned that macro variables CAN'T be used in the data step in which they are created."
True in general. Most of Macro elements will first be executed/resolved before go to the SAS compiler, unless you tell SAS not to. In above example,
value=resolve('&&&channel'); used SINGLE quote to skip the Macro facility and directly enter the SAS execution, and only resolved when that statement is hit.
Arthur.T ,
You can do it also. But I have to point out that way is NOT efficient ,including Bian HaiKuo's code .
data want;
set have;
array var channel--c;
do over var;
call symputx(vname(var),var);
end;
value=symgetn(CHANNEL);
run;
Xia Keshan
: I will try to be careful shaving the code but, seriously, I have never thought of these possibilities and usually throw in a CALL EXECUTE when I've needed to utilize variable values and couldn't do it directly.
The following is a rather useless example that simply transposes a file into one long record. I've always wanted a way to pass a value into an array declaration and the following method appears to work. This definitely gives me something extra to think about over the weekend:
data want (keep=all:);
set have nobs=numobs end=last;
array vars channel--c;
if _n_ eq 1 then do;
i=0;
call symputx('nmobs',numobs*4);
array all(&&nmobs.) $;
retain all:;
end;
do over vars;
i+1;
all(i)=catt(vars);
end;
if last then output;
run;
Art -
That is not going to work.
The value of the macro variable used in the ARRAY ALL statement is the value from before the data step starts executing.
Try adding %SYMDEL NMOBS ; statement before the DATA step and check the log for error messages.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.