Hi everyone,
I have two datasets.
I want to be able to go through dataset 1, and look at each dx_n variable. If the value is, for example, 707.00, I want to assign that discharge a "1" for the variable group_1.
Essentially, I want to create these new indicator variables in dataset 1, but I am defining them based on group definitions from dataset 2. Should I do this in a datastep? Is there a SAS procedure that would make this easier?
Thanks!
Dealing with wide data structures involves more complications. Here is a way to get there:
data d1;
input id (dx_1-dx_3) (:$8.);
datalines;
1 707.00 707.01 707.02
2 707.00 707.04 .
;
data d2;
input diag :$8. group_1-group_6;
datalines;
707.00 1 0 0 0 0 0
707.01 1 0 0 0 0 1
707.02 0 0 0 1 0 0
707.03 1 0 1 0 0 0
707.04 1 0 1 0 0 0
;
proc transpose data=d1 out=d1List;
by id;
var dx:;
run;
proc sql;
create table d1Groups as
select
d1List.id,
d1List._name_,
d2.*
from
d1List left join
d2 on d1List.col1 = d2.diag
order by id, _name_;
create table diagList as
select
id,
_name_,
diag,
max(group_1) as group_1,
max(group_2) as group_2,
max(group_3) as group_3,
max(group_4) as group_4,
max(group_5) as group_5,
max(group_6) as group_6
from d1Groups
group by id;
quit;
proc transpose data=diagList out=want(drop=_name_);
by id group_:;
id _name_;
var diag;
run;
proc print data=want noobs;
var id dx_: group_:;
run;
Can you show a bit of your example data sets and what the output should actually look like.
If the data in your dataset 2 is appropriate then it may be possible to create a group of informats that might terribly simplify and comparison syntax.
Would you feel comfortable transforming your second data set so that it ends up with just two variables: ICD9 and group (taking on values from 1 through 6). That would be step 1.
The next step would be to use that version of the data to create a format, and then applying the format to the first data set. I just wasn't sure if you would need help with step 1.
Dealing with wide data structures involves more complications. Here is a way to get there:
data d1;
input id (dx_1-dx_3) (:$8.);
datalines;
1 707.00 707.01 707.02
2 707.00 707.04 .
;
data d2;
input diag :$8. group_1-group_6;
datalines;
707.00 1 0 0 0 0 0
707.01 1 0 0 0 0 1
707.02 0 0 0 1 0 0
707.03 1 0 1 0 0 0
707.04 1 0 1 0 0 0
;
proc transpose data=d1 out=d1List;
by id;
var dx:;
run;
proc sql;
create table d1Groups as
select
d1List.id,
d1List._name_,
d2.*
from
d1List left join
d2 on d1List.col1 = d2.diag
order by id, _name_;
create table diagList as
select
id,
_name_,
diag,
max(group_1) as group_1,
max(group_2) as group_2,
max(group_3) as group_3,
max(group_4) as group_4,
max(group_5) as group_5,
max(group_6) as group_6
from d1Groups
group by id;
quit;
proc transpose data=diagList out=want(drop=_name_);
by id group_:;
id _name_;
var diag;
run;
proc print data=want noobs;
var id dx_: group_:;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.