Hi everyone,
I have two datasets.
I want to be able to go through dataset 1, and look at each dx_n variable. If the value is, for example, 707.00, I want to assign that discharge a "1" for the variable group_1.
Essentially, I want to create these new indicator variables in dataset 1, but I am defining them based on group definitions from dataset 2. Should I do this in a datastep? Is there a SAS procedure that would make this easier?
Thanks!
Dealing with wide data structures involves more complications. Here is a way to get there:
data d1;
input id (dx_1-dx_3) (:$8.);
datalines;
1 707.00 707.01 707.02
2 707.00 707.04 .
;
data d2;
input diag :$8. group_1-group_6;
datalines;
707.00 1 0 0 0 0 0
707.01 1 0 0 0 0 1
707.02 0 0 0 1 0 0
707.03 1 0 1 0 0 0
707.04 1 0 1 0 0 0
;
proc transpose data=d1 out=d1List;
by id;
var dx:;
run;
proc sql;
create table d1Groups as
select
d1List.id,
d1List._name_,
d2.*
from
d1List left join
d2 on d1List.col1 = d2.diag
order by id, _name_;
create table diagList as
select
id,
_name_,
diag,
max(group_1) as group_1,
max(group_2) as group_2,
max(group_3) as group_3,
max(group_4) as group_4,
max(group_5) as group_5,
max(group_6) as group_6
from d1Groups
group by id;
quit;
proc transpose data=diagList out=want(drop=_name_);
by id group_:;
id _name_;
var diag;
run;
proc print data=want noobs;
var id dx_: group_:;
run;
Can you show a bit of your example data sets and what the output should actually look like.
If the data in your dataset 2 is appropriate then it may be possible to create a group of informats that might terribly simplify and comparison syntax.
Would you feel comfortable transforming your second data set so that it ends up with just two variables: ICD9 and group (taking on values from 1 through 6). That would be step 1.
The next step would be to use that version of the data to create a format, and then applying the format to the first data set. I just wasn't sure if you would need help with step 1.
Dealing with wide data structures involves more complications. Here is a way to get there:
data d1;
input id (dx_1-dx_3) (:$8.);
datalines;
1 707.00 707.01 707.02
2 707.00 707.04 .
;
data d2;
input diag :$8. group_1-group_6;
datalines;
707.00 1 0 0 0 0 0
707.01 1 0 0 0 0 1
707.02 0 0 0 1 0 0
707.03 1 0 1 0 0 0
707.04 1 0 1 0 0 0
;
proc transpose data=d1 out=d1List;
by id;
var dx:;
run;
proc sql;
create table d1Groups as
select
d1List.id,
d1List._name_,
d2.*
from
d1List left join
d2 on d1List.col1 = d2.diag
order by id, _name_;
create table diagList as
select
id,
_name_,
diag,
max(group_1) as group_1,
max(group_2) as group_2,
max(group_3) as group_3,
max(group_4) as group_4,
max(group_5) as group_5,
max(group_6) as group_6
from d1Groups
group by id;
quit;
proc transpose data=diagList out=want(drop=_name_);
by id group_:;
id _name_;
var diag;
run;
proc print data=want noobs;
var id dx_: group_:;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.