BookmarkSubscribeRSS Feed
ihtishamsultan
Obsidian | Level 7

Hello everyone,

I have a dataset with variables year, patientid, Proc1 through proc15. PROC1-PROC15 have several character values. I would like to create a count variable  which sums the number of times "J1000" or "J1001" or "J1201" or "J2201" appear within proc1-proc15 and would like to group by patientid and if possible by year as well.

 

Sample data set

Data have;

input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;

1 2000 9999  

2 2005 999  989  921  J1000 

2 2006 977  989  911  J1001 582

3 2005 J1201 921 910 954 987

3 2005 J1201 911 911 984 887

3 2006 J1201 9151 913 884 687

1 2000 J2201 921 910 954 987

1 2000 J2201 921 910 954 987

1 2001 J2201 921 910 954 987

 

 

 

I am hoping to get two tables in the format below

Table1 -Year wise count by patientid

PatientID Year     Countbyyear

1               2000     2

1               2001     1

2              2005     1

2              2006     1

3              2005     2

3              2006     1

 

 

Table 2 Totalcount by patienid

PatientID     TotalCount

1                    3

2                    2

3                    3

 

Any help is appreciated.

4 REPLIES 4
jimbarbour
Meteorite | Level 14

You can get both of the "want" outputs by using a couple of arrays like:

Data have;
	INFILE	DATALINES	MISSOVER;
	input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;

DATALINES;
1 2000 9999  
2 2005 999  989  921  J1000 
2 2006 977  989  911  J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;
RUN;

PROC	SORT	Data=HAVE;
	BY	PatientID	Year;
RUN;

data	Want1(KEEP=PatientID Year Proc_Cnt)	want2(KEEP=PatientID Proc_Cnt);
	ARRAY	Target_Procs	[4]	$5	_TEMPORARY_	("J1000" "J1001" "J1201" "J2201");
	SET	Have;
		by	PatientID	Year;
	ARRAY	Procs			[*]	$5	Proc1 - Proc5;

	DO	_i	=	1	TO	DIM(Procs);
		IF	Procs[_i]	IN	Target_Procs	THEN
		Proc_Cnt		+	1;
		Ttl_Proc_Cnt	+	1;
	END;

	IF	LAST.Year		THEN
		DO;
			OUTPUT	Want1;
			Proc_Cnt	=	0;
		END;

	IF	LAST.PatientID	THEN
		DO;
			OUTPUT	Want2;
			Ttl_Proc_Cnt	=	0;
			DELETE;
		END;
	ELSE
		DO;
			DELETE;
		END;
RUN;

Which would give you the following two outputs:

 

Want1

jimbarbour_0-1601848481537.png

 

Want2

jimbarbour_0-1601847930029.png

 

Jim

ihtishamsultan
Obsidian | Level 7
Thank you for the code, the WANT1 GIVES THE PERFECT DATASET BUT WANT 2 GIVES ME ALL O's in proc_cnt
RichardDeVen
Barite | Level 11

Create a view that performs an array based transposition.  Feed the view to Proc SUMMARY.

 

Example:

Data have;
infile cards missover;
input patientid year (proc1-proc15) ($);
datalines;
1 2000 9999  
2 2005 999  989  921  J1000 
2 2006 977  989  911  J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;

data eachJ / view=eachJ;
  set have;
  array procs proc1-proc15;
  do _n_ = 1 to dim(procs);
    if procs(_n_) in ('J1000','J1001','J1201','J2201') then do;
      output;
    end;
  end;
  keep patientid year;
run;

proc summary data=eachJ;
  class patientid year;
  output out=freqJ;
run;

Output

RichardADeVenezia_0-1601852986949.png

 

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1293 views
  • 1 like
  • 4 in conversation