I have 40 variables (Drug1-40) with data in them being either 1 (drug given) or 0 (not given) or . (missing). I have created a variable NumofRx that sums up presence of a drug (sum(drug1--drug40)) for individual patient. The range of NumofRx is from 0 to 10, I would like to create a variable where I want to see what drugs ( drug1 or drug6 or drug 13 ...) comes under numofrx=2 , numofrx=3 ......numofrx=10.
Sample data:
Data pattern;
infile datalines;
input id year drug1 drug2 drug3 drug4 numofrx ;
datalines;
1 2000 1 0 1 0 2
2 2005 1 0 0 0 1
3 2006 0 1 1 1 3
4 2005 0 0 1 0 1
5 2005 0 1 1 0 2
6 2006 1 0 1 0 2
7 2000 . 0 1 0 1
8 2000 0 0 0 0
9 2001 1 1 1 1 4
I would like to have a dataset/table that looks like below
ID Numofrx Drugs
1 2 drug1+drug3
2 1 drug1
3 3 drug2+drug3+drug4
4 1 drug1
5 2 drug2+drug3
6 2 drug1+drug3
7 1 drug3
8 0 0
9 4 drug1+drug2+drug3+drug4
Yes. That's fairly easy using the VLABEL function.
Change the code as follows:
DATA Pattern_Details (KEEP=ID NumOfRx Drugs);
DROP _i;
SET Pattern;
ARRAY Drug [*] Drug1 - Drug4;
LENGTH Drugs $256;
IF MISSING(NumOfRx) THEN
NumOfRx = 0;
DO _i = 1 TO DIM(Drug);
IF Drug[_i] THEN
Drugs = CATS(Drugs, '+', VLABEL(Drug[_i]), _i);
END;
Drugs = SUBSTR(Drugs, 2);
RUN;
Jim
I think this should do it:
Data pattern;
infile datalines MISSOVER;
input ID year drug1 drug2 drug3 drug4 NumOfRx ;
datalines;
1 2000 1 0 1 0 2
2 2005 1 0 0 0 1
3 2006 0 1 1 1 3
4 2005 0 0 1 0 1
5 2005 0 1 1 0 2
6 2006 1 0 1 0 2
7 2000 . 0 1 0 1
8 2000 0 0 0 0
9 2001 1 1 1 1 4
;
RUN;
DATA Pattern_Details (KEEP=ID NumOfRx Drugs);
DROP _i;
SET Pattern;
ARRAY Drug [*] Drug1 - Drug4;
LENGTH Drugs $256;
IF MISSING(NumOfRx) THEN
NumOfRx = 0;
DO _i = 1 TO DIM(Drug);
IF Drug[_i] THEN
Drugs = CATS(Drugs, '+Drug', _i);
END;
Drugs = SUBSTR(Drugs, 2);
RUN;
Jim
Yes. That's fairly easy using the VLABEL function.
Change the code as follows:
DATA Pattern_Details (KEEP=ID NumOfRx Drugs);
DROP _i;
SET Pattern;
ARRAY Drug [*] Drug1 - Drug4;
LENGTH Drugs $256;
IF MISSING(NumOfRx) THEN
NumOfRx = 0;
DO _i = 1 TO DIM(Drug);
IF Drug[_i] THEN
Drugs = CATS(Drugs, '+', VLABEL(Drug[_i]), _i);
END;
Drugs = SUBSTR(Drugs, 2);
RUN;
Jim
@ihtishamsultan wrote:
There are labels for each drug, would it be possible to have the names?
Use the LABEL= option in the PROC TRANSPOSE statement.
As usual when there is a series of variables containing the basically same information, moving data from structure to content by transposing makes the job easier:
Data pattern;
infile datalines;
input id year drug1 drug2 drug3 drug4;
datalines;
1 2000 1 0 1 0
2 2005 1 0 0 0
3 2006 0 1 1 1
4 2005 0 0 1 0
5 2005 0 1 1 0
6 2006 1 0 1 0
7 2000 . 0 1 0
8 2000 0 0 0 .
9 2001 1 1 1 1
;
proc transpose
data=pattern
out=long (rename=(_name_=drug col1=yn))
;
by id;
var drug:;
run;
data want;
set long;
length numofrx 8 drugs $100;
retain numofrx drugs;
by id;
if first.id
then do;
drugs = "";
numofrx = 0;
end;
if yn
then do;
drugs = catx('+',drugs,drug);
numofrx + 1;
end;
if last.id;
keep id numofrx drugs;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.