Hi all,
I have a problem how to proceed in sas for the conversion of two dataset in one, based on a reference dataset.
These is my problem in simple words:
"Reference data"
G1 | G2 | |
var1 | A | B |
var2 | C | D |
var3 | E | F |
Letters in variable G1 are equivalent to '1' and variable G2 are equivalent to '0'. But are not the same letter for all var1--3
"Real data"
var1 | var2 | var3 | |
ID1 | AA | CD | FE |
ID2 | BA | CC | . |
ID3 | BB | . | FF |
ID4 | AB | CD | EF |
I have a combination of G1 and G2, and also missing values...
What I want:
var1 | var2 | var3 | |
ID1 | 2 | 1 | 1 |
ID2 | 1 | 2 | . |
ID3 | 0 | . | 0 |
ID4 | 1 | 1 | 1 |
I was thinking in counting letter one based on G1 of my reference dataset.. but I'm not sure about it.
I must to confess that I'm lost in how to solved it. Is very simple what I want, but I'm pretty sure that the solution is not so simple.
Thanks a lot
I very like this kind of question.
data x;
infile cards expandtabs;
input (vname G1 G2) ($);
cards;
var1 A B
var2 C D
var3 E F
;
run;
data have;
infile cards expandtabs;
input (id var1 var2 var3) ($);
cards;
ID1 AA CD FE
ID2 BA CC .
ID3 BB . FF
ID4 AB CD EF
;
run;
data key;
set x;
g=g1;value=1;output;
g=g2;value=0;output;
drop G1 G2;
run;
data want;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key');
h.definekey('vname','g');
h.definedata('value');
h.definedone();
end;
set have;
array x{*} $ var1-var3;
array y{*} new_var1-new_var3;
do i=1 to dim(x);
do j=1 to length(x{i});
vname=vname(x{i});g=char(x{i},j);
if h.find()=0 then y{i}=sum(y{i},value);
end;
end;
keep id var1-var3 new_var1-new_var3;
run;
proc print;run;
Please post usable examples (tested data steps with datalines) for the reference and real datasets. Right now I can't make sense of your post.
Note that "G1=1" is not a valid SAS column name.
I very like this kind of question.
data x;
infile cards expandtabs;
input (vname G1 G2) ($);
cards;
var1 A B
var2 C D
var3 E F
;
run;
data have;
infile cards expandtabs;
input (id var1 var2 var3) ($);
cards;
ID1 AA CD FE
ID2 BA CC .
ID3 BB . FF
ID4 AB CD EF
;
run;
data key;
set x;
g=g1;value=1;output;
g=g2;value=0;output;
drop G1 G2;
run;
data want;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key');
h.definekey('vname','g');
h.definedata('value');
h.definedone();
end;
set have;
array x{*} $ var1-var3;
array y{*} new_var1-new_var3;
do i=1 to dim(x);
do j=1 to length(x{i});
vname=vname(x{i});g=char(x{i},j);
if h.find()=0 then y{i}=sum(y{i},value);
end;
end;
keep id var1-var3 new_var1-new_var3;
run;
proc print;run;
Thank you very much! 🙂 It works perfectly...
Likes x 10000 !
Another suggestion:
data ref;
input varname $ (g1 g2)(:$1.);
cards;
var1 A B
var2 C D
var3 E F
;
data have;
input (id var1-var3)($);
cards;
1 AA CD FE
2 BA CC .
3 BB . FF
4 AB CD EF
;
proc transpose data=ref out=tref(drop=_:);
var g1;
run;
data want(drop=c:);
if _n_=1 then set tref;
set have(rename=(var1-var3=cvar1-cvar3));
array col[3];
array cvar[3];
array var[3];
do c=1 to dim(var);
var[c]=ifn(cvar[c]>' ',countc(cvar[c],col[c]),.);
end;
run;
data want;
set realdata;
array vars[3] $ var1-var3;
array resultvars[3] rvar1-rvar3;
do i = 1 to dim(vars);
if vars[i] ne "" then resultvars[i]=countc(vars[i],"ACE");
end;
drop var1-var3 i;
run;
Possibly the simplest solution. Don't bother transforming them just count the amount of times the letters ACE appear.
Edit: Ah. Just noticed that you have done the same but in a more dynamic way. oops.
Hi @DanielLangley Very trivial extension to your idea ,
a one extra pass will just macrotise it
data _null_;
set ref end=lr;
length t $10;
retain t;
t=cats(t,g1);
if lr;
call symputx('t',t);
run;
%put &t;
/*and then */
if vars[i] ne "" then resultvars[i]=countc(vars[i],"&t");
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.