Hello everyone,
I have a dataset with variables year, patientid, Proc1 through proc15. PROC1-PROC15 have several character values. I would like to create a count variable which sums the number of times "J1000" or "J1001" or "J1201" or "J2201" appear within proc1-proc15 and would like to group by patientid and if possible by year as well.
Sample data set
Data have;
input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;
1 2000 9999
2 2005 999 989 921 J1000
2 2006 977 989 911 J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
I am hoping to get two tables in the format below
Table1 -Year wise count by patientid
PatientID Year Countbyyear
1 2000 2
1 2001 1
2 2005 1
2 2006 1
3 2005 2
3 2006 1
Table 2 Totalcount by patienid
PatientID TotalCount
1 3
2 2
3 3
Any help is appreciated.
Transpose to a long dataset layout, and it's a simple SQL count with a WHERE condition.
You can get both of the "want" outputs by using a couple of arrays like:
Data have;
INFILE DATALINES MISSOVER;
input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;
DATALINES;
1 2000 9999
2 2005 999 989 921 J1000
2 2006 977 989 911 J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;
RUN;
PROC SORT Data=HAVE;
BY PatientID Year;
RUN;
data Want1(KEEP=PatientID Year Proc_Cnt) want2(KEEP=PatientID Proc_Cnt);
ARRAY Target_Procs [4] $5 _TEMPORARY_ ("J1000" "J1001" "J1201" "J2201");
SET Have;
by PatientID Year;
ARRAY Procs [*] $5 Proc1 - Proc5;
DO _i = 1 TO DIM(Procs);
IF Procs[_i] IN Target_Procs THEN
Proc_Cnt + 1;
Ttl_Proc_Cnt + 1;
END;
IF LAST.Year THEN
DO;
OUTPUT Want1;
Proc_Cnt = 0;
END;
IF LAST.PatientID THEN
DO;
OUTPUT Want2;
Ttl_Proc_Cnt = 0;
DELETE;
END;
ELSE
DO;
DELETE;
END;
RUN;
Which would give you the following two outputs:
Want1
Want2
Jim
Create a view that performs an array based transposition. Feed the view to Proc SUMMARY.
Example:
Data have; infile cards missover; input patientid year (proc1-proc15) ($); datalines; 1 2000 9999 2 2005 999 989 921 J1000 2 2006 977 989 911 J1001 582 3 2005 J1201 921 910 954 987 3 2005 J1201 911 911 984 887 3 2006 J1201 9151 913 884 687 1 2000 J2201 921 910 954 987 1 2000 J2201 921 910 954 987 1 2001 J2201 921 910 954 987 ; data eachJ / view=eachJ; set have; array procs proc1-proc15; do _n_ = 1 to dim(procs); if procs(_n_) in ('J1000','J1001','J1201','J2201') then do; output; end; end; keep patientid year; run; proc summary data=eachJ; class patientid year; output out=freqJ; run;
Output
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.