Dear,
The following data values present. I have to calculate number of distinct subjects by variables (dos,sev)
data one;
input id term$ sev $ dos;
datalines;
1 vo mild 100
1 na moderate 200
1 ga moderate 100
1 pa moderate 200
2 fe severe 200
2 it mild 200
3 vo mild 100
3 na mild 100
4 ds moderate 200
4 ma severe 200
5 hg moderate 100
5 ty mild 200
6 fd mild 200
;
if a subject has sev=mild and sev=moderate for a dos, then that subject should be counted under sev=moderate for that dos.
Please help in my code2 as the output numbers are not that I need.
code1;
proc sql;
create table three as
select count(distinct id) as ns,dos
from one
group by dos;
quit;
code1 output;
dos ns
100 3
200 5
Code2:
proc sql;
create table two as
select count(distinct id) as ns,dos,sev
from one
group by dos,sev;
quit;
Code2 output:
dos sev ns
100 mild 2
100 moderate 2
200 mild 3
200 moderate 2
200 severe 2
Output needed; I am just dispalying the numbers in a tabular form to better expalin
dos(variable)
100 200
ns=3 ns=5
sev(variable)
mild 1 2
moderate 2 1
severe 2
You can try this proc freq
proc freq data=one;
table sev*dos / norow nocol nopct ;
run;
My take looks like this:
data have;
input id term$ sev $ dos;
datalines;
1 vo mild 100
1 na moderate 200
1 ga moderate 100
1 pa moderate 200
2 fe severe 200
2 it mild 200
3 vo mild 100
3 na mild 100
4 ds moderate 200
4 ma severe 200
5 hg moderate 100
5 ty mild 200
6 fd mild 200
;
run;
proc sort data=have;
by id dos sev;
run;
data want;
set have;
by id dos;
retain mild moderate severe;
if first.dos
then do;
mild = 0;
moderate = 0;
severe = 0;
end;
select (sev);
when ('mild') mild = 1;
when ('moderate') do;
mild = 0;
moderate = 1;
end;
when ('severe') severe = 1;
end;
if last.dos then output;
keep id dos mild moderate severe;
run;
proc summary data=want sum print;
class dos;
var mild moderate severe;
run;
but I get different numbers for dos=200:
The SUMMARY Procedure N dos Obs Variable Sum ----------------------------------------------- 100 3 mild 1.0000000 moderate 2.0000000 severe 0 200 5 mild 3.0000000 moderate 2.0000000 severe 2.0000000 -----------------------------------------------
Please check your rules against your data.
For dos=200, I see three ID's with mild without moderate (2, 5, 6), two ID's with moderate (1, 4) and two with severe (2, 4)
Or did you want that 'severe' overrides both 'mild' and 'moderate'?
First find max severity for each ID DOS. Then count those for each SEV DOS. You can use your SEV values directly to find the MAX(SEV) using PROC SUMMARY IDGROUP.
data ae;
input id term$ sev $ dos;
datalines;
1 vo mild 100
1 na moderate 200
1 ga moderate 100
1 pa moderate 200
2 fe severe 200
2 it mild 200
3 vo mild 100
3 na mild 100
4 ds moderate 200
4 ma severe 200
5 hg moderate 100
5 ty mild 200
6 fd mild 200
;;;;
run;
proc print;
run;
proc summary data=ae nway;
class id dos;
output out=ae1 idgroup(max(sev) out(sev)=);
run;
proc print;
run;
proc summary data=ae1 nway completetypes;
class sev dos;
output out=ae2(drop=_type_);
run;
proc print;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.