BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
CarolBarahona
Fluorite | Level 6

Hi all, 

 

I have a problem how to proceed in sas for the conversion of two dataset in one, based on a reference dataset.

These is my problem in simple words: 

 

"Reference data"

 G1 G2
var1AB
var2CD
var3EF

 

Letters in variable G1  are equivalent to '1' and variable G2 are equivalent to '0'. But are not the same letter for all var1--3

 

"Real data"

 var1var2var3
ID1AACDFE
ID2BACC.
ID3BB.FF
ID4ABCDEF

 

I have a combination of G1 and G2, and also missing values... 

 

What I want: 

 

 var1var2var3
ID1211
ID212.
ID30.0
ID4111

 

I was thinking in counting letter one based on G1 of my reference dataset.. but I'm not sure about it. 

 

I must to confess that I'm lost in how to solved it. Is very simple what I want, but I'm pretty sure that the solution is not so simple.

 

Thanks a lot

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

I very like this kind of question.

 

data x;
infile cards expandtabs;
input (vname	G1 	G2) ($);
cards;
var1	A	B
var2	C	D
var3	E	F
;
run;
data have;
infile cards expandtabs;
input (id	var1	var2	var3) ($);
cards;
ID1	AA	CD	FE
ID2	BA	CC	.
ID3	BB	.	FF
ID4	AB	CD	EF
;
run;

data key;
 set x;
 g=g1;value=1;output;
 g=g2;value=0;output;
 drop G1 G2;
run;

data want;
 if _n_=1 then do;
  if 0 then set key;
  declare hash h(dataset:'key');
  h.definekey('vname','g');
  h.definedata('value');
  h.definedone();
 end;
set have;
array x{*} $ var1-var3;
array y{*} new_var1-new_var3;
do i=1 to dim(x);
 do j=1 to length(x{i});
  vname=vname(x{i});g=char(x{i},j);
  if h.find()=0 then y{i}=sum(y{i},value);
 end;
end;
keep id var1-var3 new_var1-new_var3;
run;
proc print;run;

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Please post usable examples (tested data steps with datalines) for the reference and real datasets. Right now I can't make sense of your post.

Note that "G1=1" is not a valid SAS column name.

Ksharp
Super User

I very like this kind of question.

 

data x;
infile cards expandtabs;
input (vname	G1 	G2) ($);
cards;
var1	A	B
var2	C	D
var3	E	F
;
run;
data have;
infile cards expandtabs;
input (id	var1	var2	var3) ($);
cards;
ID1	AA	CD	FE
ID2	BA	CC	.
ID3	BB	.	FF
ID4	AB	CD	EF
;
run;

data key;
 set x;
 g=g1;value=1;output;
 g=g2;value=0;output;
 drop G1 G2;
run;

data want;
 if _n_=1 then do;
  if 0 then set key;
  declare hash h(dataset:'key');
  h.definekey('vname','g');
  h.definedata('value');
  h.definedone();
 end;
set have;
array x{*} $ var1-var3;
array y{*} new_var1-new_var3;
do i=1 to dim(x);
 do j=1 to length(x{i});
  vname=vname(x{i});g=char(x{i},j);
  if h.find()=0 then y{i}=sum(y{i},value);
 end;
end;
keep id var1-var3 new_var1-new_var3;
run;
proc print;run;

CarolBarahona
Fluorite | Level 6

Thank you very much! 🙂 It works perfectly...

Likes  x 10000 ! 

 

FreelanceReinh
Jade | Level 19

Another suggestion:

data ref;
input varname $ (g1 g2)(:$1.);
cards;
var1	A	B
var2	C	D
var3	E	F
;

data have;
input (id var1-var3)($);
cards;
1 AA CD FE
2 BA CC .
3 BB .  FF
4 AB CD EF
;

proc transpose data=ref out=tref(drop=_:);
var g1;
run;

data want(drop=c:);
if _n_=1 then set tref;
set have(rename=(var1-var3=cvar1-cvar3));
array col[3];
array cvar[3];
array var[3];
do c=1 to dim(var);
  var[c]=ifn(cvar[c]>' ',countc(cvar[c],col[c]),.);
end;
run;
DanielLangley
Quartz | Level 8
data want;
set realdata;
array vars[3] $ var1-var3;
array resultvars[3] rvar1-rvar3;

do i = 1 to dim(vars);
	if vars[i] ne "" then resultvars[i]=countc(vars[i],"ACE");
end;

drop var1-var3 i;
run;

Possibly the simplest solution. Don't bother transforming them just count the amount of times the letters ACE appear.

Edit: Ah. Just noticed that you have done the same but in a more dynamic way. oops.

novinosrin
Tourmaline | Level 20

Hi @DanielLangley Very trivial extension to your idea , 

 

a one extra pass will just macrotise it

data _null_;
set ref end=lr;
length t $10;
retain t;
t=cats(t,g1);
if lr;
call symputx('t',t);
run;


%put &t;

/*and then */


if vars[i] ne "" then resultvars[i]=countc(vars[i],"&t");

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 932 views
  • 3 likes
  • 6 in conversation