Hello everyone,
I have a dataset with variables year, patientid, Proc1 through proc15. PROC1-PROC15 have several character values. I would like to create a count variable which sums the number of times "J1000" or "J1001" or "J1201" or "J2201" appear within proc1-proc15 and would like to group by patientid and if possible by year as well.
Sample data set
Data have;
input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;
1 2000 9999
2 2005 999 989 921 J1000
2 2006 977 989 911 J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
I am hoping to get two tables in the format below
Table1 -Year wise count by patientid
PatientID Year Countbyyear
1 2000 2
1 2001 1
2 2005 1
2 2006 1
3 2005 2
3 2006 1
Table 2 Totalcount by patienid
PatientID TotalCount
1 3
2 2
3 3
Any help is appreciated.
Transpose to a long dataset layout, and it's a simple SQL count with a WHERE condition.
You can get both of the "want" outputs by using a couple of arrays like:
Data have;
INFILE DATALINES MISSOVER;
input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;
DATALINES;
1 2000 9999
2 2005 999 989 921 J1000
2 2006 977 989 911 J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;
RUN;
PROC SORT Data=HAVE;
BY PatientID Year;
RUN;
data Want1(KEEP=PatientID Year Proc_Cnt) want2(KEEP=PatientID Proc_Cnt);
ARRAY Target_Procs [4] $5 _TEMPORARY_ ("J1000" "J1001" "J1201" "J2201");
SET Have;
by PatientID Year;
ARRAY Procs [*] $5 Proc1 - Proc5;
DO _i = 1 TO DIM(Procs);
IF Procs[_i] IN Target_Procs THEN
Proc_Cnt + 1;
Ttl_Proc_Cnt + 1;
END;
IF LAST.Year THEN
DO;
OUTPUT Want1;
Proc_Cnt = 0;
END;
IF LAST.PatientID THEN
DO;
OUTPUT Want2;
Ttl_Proc_Cnt = 0;
DELETE;
END;
ELSE
DO;
DELETE;
END;
RUN;
Which would give you the following two outputs:
Want1
Want2
Jim
Create a view that performs an array based transposition. Feed the view to Proc SUMMARY.
Example:
Data have; infile cards missover; input patientid year (proc1-proc15) ($); datalines; 1 2000 9999 2 2005 999 989 921 J1000 2 2006 977 989 911 J1001 582 3 2005 J1201 921 910 954 987 3 2005 J1201 911 911 984 887 3 2006 J1201 9151 913 884 687 1 2000 J2201 921 910 954 987 1 2000 J2201 921 910 954 987 1 2001 J2201 921 910 954 987 ; data eachJ / view=eachJ; set have; array procs proc1-proc15; do _n_ = 1 to dim(procs); if procs(_n_) in ('J1000','J1001','J1201','J2201') then do; output; end; end; keep patientid year; run; proc summary data=eachJ; class patientid year; output out=freqJ; run;
Output
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.