BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ly2105
Fluorite | Level 6

Hi everyone, 

 

I have two datasets.

  1. Dataset 1 is a discharge dataset. Each obsercation has a unique id (var name "ID") and10 diagnosis variables (var names dx_1, dx_2, dx_3, etc). The diagnosis variable values are ICD9 codes such as 707.00, 707.01, etc.  
  2. Dataset 2 contains a list of ICD9 codes and group-level variables. So, for each ICD9 code (unique, 1 observation each), I have 6 1/0 variables that indicate if the code is part of a larger group (var names group_1, group_2). I.e., ICD9 code 707.00 will get a "1" for group_1. 

 

I want to be able to go through dataset 1, and look at each dx_n variable. If the value is, for example, 707.00, I want to assign that discharge a "1" for the variable group_1. 

 

Essentially, I want to create these new indicator variables in dataset 1, but I am defining them based on group definitions from dataset 2. Should I do this in a datastep? Is there a SAS procedure that would make this easier? 

 

Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Dealing with wide data structures involves more complications. Here is a way to get there:

 

data d1;
input id (dx_1-dx_3) (:$8.);
datalines;
1 707.00 707.01 707.02
2 707.00 707.04 .
;

data d2;
input diag :$8. group_1-group_6;
datalines;
707.00 1 0 0 0 0 0
707.01 1 0 0 0 0 1
707.02 0 0 0 1 0 0
707.03 1 0 1 0 0 0
707.04 1 0 1 0 0 0
;

proc transpose data=d1 out=d1List;
by id;
var dx:;
run;

proc sql;
create table d1Groups as
select 
    d1List.id,
    d1List._name_, 
    d2.*
from 
    d1List left join 
    d2 on d1List.col1 = d2.diag
order by id, _name_;

create table diagList as
select 
    id, 
    _name_,
    diag,
    max(group_1) as group_1,
    max(group_2) as group_2,
    max(group_3) as group_3,
    max(group_4) as group_4,
    max(group_5) as group_5,
    max(group_6) as group_6
from d1Groups
group by id;
quit;

proc transpose data=diagList out=want(drop=_name_);
by id group_:;
id _name_;
var diag;
run;

proc print data=want noobs; 
var id dx_: group_:;
run;
PG

View solution in original post

4 REPLIES 4
ballardw
Super User

Can you show a bit of your example data sets and what the output should actually look like.

If the data in your dataset 2 is appropriate then it may be possible to create a group of informats that might terribly simplify and comparison syntax.

Astounding
PROC Star

Would you feel comfortable transforming your second data set so that it ends up with just two variables:  ICD9 and group (taking on values from 1 through 6). That would be step 1.

 

The next step would be to use that version of the data to create a format, and then applying the format to the first data set.  I just wasn't sure if you would need help with step 1.

PGStats
Opal | Level 21

Dealing with wide data structures involves more complications. Here is a way to get there:

 

data d1;
input id (dx_1-dx_3) (:$8.);
datalines;
1 707.00 707.01 707.02
2 707.00 707.04 .
;

data d2;
input diag :$8. group_1-group_6;
datalines;
707.00 1 0 0 0 0 0
707.01 1 0 0 0 0 1
707.02 0 0 0 1 0 0
707.03 1 0 1 0 0 0
707.04 1 0 1 0 0 0
;

proc transpose data=d1 out=d1List;
by id;
var dx:;
run;

proc sql;
create table d1Groups as
select 
    d1List.id,
    d1List._name_, 
    d2.*
from 
    d1List left join 
    d2 on d1List.col1 = d2.diag
order by id, _name_;

create table diagList as
select 
    id, 
    _name_,
    diag,
    max(group_1) as group_1,
    max(group_2) as group_2,
    max(group_3) as group_3,
    max(group_4) as group_4,
    max(group_5) as group_5,
    max(group_6) as group_6
from d1Groups
group by id;
quit;

proc transpose data=diagList out=want(drop=_name_);
by id group_:;
id _name_;
var diag;
run;

proc print data=want noobs; 
var id dx_: group_:;
run;
PG
ly2105
Fluorite | Level 6
Thank you both! Using sql to create a new transposed dataset 1 worked. I was trying to avoid that initially since my data set was very large, but in the end it was easiest to code.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1695 views
  • 0 likes
  • 4 in conversation