Good afternoon all! Here is the sample dataset (real data has 9 variable conditions)
data have;
input Neuroses $ Psychoses $ SignsSimpt $ Antisocial $ Depression $ Anomalies_Neurological $ ;
datalines;
40891863A 51330023A 84501233A 77699061A 32596703A 21189153A
77425921A 80328413A 61073933A 81441794A 10418753A 14050203A
13885263A 57306061A 08707021A 58633031A 44606543A 22757783A
01247023A 42638553A 32907423A 47662551A 52508373A 30956893A
44606543A 58393081A 64904003A 28133791A 20982933A 75235783A
;
run;
So basically under each diagnosis condition, there are patient id's. A patient could have more than one condition. What i want is an output which tells that how many patients have 9 conditions at once, 8 conditions at once, 7 conditions at once and so on. So, essentially something like this
No of conditions 1 2 3 4 5 6 7 8 9
No of members 450 378 245 200 145 100 56 34 12
I tried playing around with simple proc transpose, but im not going anywhere. Thanks so much!
data HAVE;
input Neuroses $ Psychoses $ SignsSimpt $ Antisocial $ Depression $ Anomalies_Neurological $ ;
datalines;
40891863A 51330023A 84501233A 77699061A 32596703A 21189153A
77699061A 80328413A 61073933A 81441794A 10418753A 14050203A
13885263A 57306061A 08707021A 58633031A 44606543A 22757783A
01247023A 42638553A 32907423A 47662551A 52508373A 30956893A
44606543A 58393081A 64904003A 28133791A 20982933A 75235783A
;
run;
data have;
set have;
n+1;
run;
proc transpose data=have out=temp;
by n;
var _character_;
run;
proc sql;
create table temp1 as
select col1,count(distinct _name_) as n_condition
from temp
group by col1;
create table want as
select n_condition,count(*) as n_member
from temp1
group by n_condition;
quit;
Let convert file HAVE to HELP with two variables: Diagnosys and Patient_ID,
thus can be done in a code like:
data HELP;
set HAVE;
Patient_ID = Neuroses; Diagnosys = 'Neuroses'; output;
Patient_ID = Psychoses; Diagnosys = 'Psychoses'; output;
... etc all 9 kinds of diagnosys ....
run;
Then use PROC FREQ with the OUT option twice to get your wanted output:
1st time TABLE Patient_ID (assuming no patients with duplicates of same diagnosys)
2nd time TABLE Count (or _FREQ_ depends on SAS version)
Like this?
data HAVE;
input Neuroses $ Psychoses $ SignsSimpt $ Antisocial $ Depression $ Anomalies_Neurological $ ;
datalines;
40891863A 51330023A 84501233A 77699061A 32596703A 21189153A
77699061A 80328413A 61073933A 81441794A 10418753A 14050203A
13885263A 57306061A 08707021A 58633031A 44606543A 22757783A
01247023A 42638553A 32907423A 47662551A 52508373A 30956893A
44606543A 58393081A 64904003A 28133791A 20982933A 75235783A
run;
data NORMAL;
set HAVE;
length CONDITION $32;
do CONDITION='Neuroses','Psychoses','SignsSimpt','Antisocial','Depression','Anomalies_Neurological';
PATIENT=vvaluex(CONDITION); output;
end;
keep CONDITION PATIENT;
run;
proc sql;
create table COUNTS as select count(*) as NB_CONDITIONS from NORMAL group by PATIENT;
create table WANT as select NB_CONDITIONS, count(*) as NB_PATIENTS from COUNTS group by NB_CONDITIONS;
quit;
data HAVE;
input Neuroses $ Psychoses $ SignsSimpt $ Antisocial $ Depression $ Anomalies_Neurological $ ;
datalines;
40891863A 51330023A 84501233A 77699061A 32596703A 21189153A
77699061A 80328413A 61073933A 81441794A 10418753A 14050203A
13885263A 57306061A 08707021A 58633031A 44606543A 22757783A
01247023A 42638553A 32907423A 47662551A 52508373A 30956893A
44606543A 58393081A 64904003A 28133791A 20982933A 75235783A
;
run;
data have;
set have;
n+1;
run;
proc transpose data=have out=temp;
by n;
var _character_;
run;
proc sql;
create table temp1 as
select col1,count(distinct _name_) as n_condition
from temp
group by col1;
create table want as
select n_condition,count(*) as n_member
from temp1
group by n_condition;
quit;
Thanks Chriznz and Ksharp. Both the solutions worked, i understood Ksharp's better as im not aware of the function used by chrisnz. I will dig deeper into it to learn more.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.