BookmarkSubscribeRSS Feed
ihtishamsultan
Obsidian | Level 7

Hello everyone,

I have a dataset with variables year, patientid, Proc1 through proc15. PROC1-PROC15 have several character values. I would like to create a count variable  which sums the number of times "J1000" or "J1001" or "J1201" or "J2201" appear within proc1-proc15 and would like to group by patientid and if possible by year as well.

 

Sample data set

Data have;

input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;

1 2000 9999  

2 2005 999  989  921  J1000 

2 2006 977  989  911  J1001 582

3 2005 J1201 921 910 954 987

3 2005 J1201 911 911 984 887

3 2006 J1201 9151 913 884 687

1 2000 J2201 921 910 954 987

1 2000 J2201 921 910 954 987

1 2001 J2201 921 910 954 987

 

 

 

I am hoping to get two tables in the format below

Table1 -Year wise count by patientid

PatientID Year     Countbyyear

1               2000     2

1               2001     1

2              2005     1

2              2006     1

3              2005     2

3              2006     1

 

 

Table 2 Totalcount by patienid

PatientID     TotalCount

1                    3

2                    2

3                    3

 

Any help is appreciated.

4 REPLIES 4
jimbarbour
Meteorite | Level 14

You can get both of the "want" outputs by using a couple of arrays like:

Data have;
	INFILE	DATALINES	MISSOVER;
	input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;

DATALINES;
1 2000 9999  
2 2005 999  989  921  J1000 
2 2006 977  989  911  J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;
RUN;

PROC	SORT	Data=HAVE;
	BY	PatientID	Year;
RUN;

data	Want1(KEEP=PatientID Year Proc_Cnt)	want2(KEEP=PatientID Proc_Cnt);
	ARRAY	Target_Procs	[4]	$5	_TEMPORARY_	("J1000" "J1001" "J1201" "J2201");
	SET	Have;
		by	PatientID	Year;
	ARRAY	Procs			[*]	$5	Proc1 - Proc5;

	DO	_i	=	1	TO	DIM(Procs);
		IF	Procs[_i]	IN	Target_Procs	THEN
		Proc_Cnt		+	1;
		Ttl_Proc_Cnt	+	1;
	END;

	IF	LAST.Year		THEN
		DO;
			OUTPUT	Want1;
			Proc_Cnt	=	0;
		END;

	IF	LAST.PatientID	THEN
		DO;
			OUTPUT	Want2;
			Ttl_Proc_Cnt	=	0;
			DELETE;
		END;
	ELSE
		DO;
			DELETE;
		END;
RUN;

Which would give you the following two outputs:

 

Want1

jimbarbour_0-1601848481537.png

 

Want2

jimbarbour_0-1601847930029.png

 

Jim

ihtishamsultan
Obsidian | Level 7
Thank you for the code, the WANT1 GIVES THE PERFECT DATASET BUT WANT 2 GIVES ME ALL O's in proc_cnt
RichardDeVen
Barite | Level 11

Create a view that performs an array based transposition.  Feed the view to Proc SUMMARY.

 

Example:

Data have;
infile cards missover;
input patientid year (proc1-proc15) ($);
datalines;
1 2000 9999  
2 2005 999  989  921  J1000 
2 2006 977  989  911  J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;

data eachJ / view=eachJ;
  set have;
  array procs proc1-proc15;
  do _n_ = 1 to dim(procs);
    if procs(_n_) in ('J1000','J1001','J1201','J2201') then do;
      output;
    end;
  end;
  keep patientid year;
run;

proc summary data=eachJ;
  class patientid year;
  output out=freqJ;
run;

Output

RichardADeVenezia_0-1601852986949.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1206 views
  • 1 like
  • 4 in conversation