BookmarkSubscribeRSS Feed
ihtishamsultan
Obsidian | Level 7

Hello everyone,

I have a dataset with variables year, patientid, Proc1 through proc15. PROC1-PROC15 have several character values. I would like to create a count variable  which sums the number of times "J1000" or "J1001" or "J1201" or "J2201" appear within proc1-proc15 and would like to group by patientid and if possible by year as well.

 

Sample data set

Data have;

input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;

1 2000 9999  

2 2005 999  989  921  J1000 

2 2006 977  989  911  J1001 582

3 2005 J1201 921 910 954 987

3 2005 J1201 911 911 984 887

3 2006 J1201 9151 913 884 687

1 2000 J2201 921 910 954 987

1 2000 J2201 921 910 954 987

1 2001 J2201 921 910 954 987

 

 

 

I am hoping to get two tables in the format below

Table1 -Year wise count by patientid

PatientID Year     Countbyyear

1               2000     2

1               2001     1

2              2005     1

2              2006     1

3              2005     2

3              2006     1

 

 

Table 2 Totalcount by patienid

PatientID     TotalCount

1                    3

2                    2

3                    3

 

Any help is appreciated.

4 REPLIES 4
jimbarbour
Meteorite | Level 14

You can get both of the "want" outputs by using a couple of arrays like:

Data have;
	INFILE	DATALINES	MISSOVER;
	input patientid year proc1$ proc2$ proc3$ proc4$ proc5$;

DATALINES;
1 2000 9999  
2 2005 999  989  921  J1000 
2 2006 977  989  911  J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;
RUN;

PROC	SORT	Data=HAVE;
	BY	PatientID	Year;
RUN;

data	Want1(KEEP=PatientID Year Proc_Cnt)	want2(KEEP=PatientID Proc_Cnt);
	ARRAY	Target_Procs	[4]	$5	_TEMPORARY_	("J1000" "J1001" "J1201" "J2201");
	SET	Have;
		by	PatientID	Year;
	ARRAY	Procs			[*]	$5	Proc1 - Proc5;

	DO	_i	=	1	TO	DIM(Procs);
		IF	Procs[_i]	IN	Target_Procs	THEN
		Proc_Cnt		+	1;
		Ttl_Proc_Cnt	+	1;
	END;

	IF	LAST.Year		THEN
		DO;
			OUTPUT	Want1;
			Proc_Cnt	=	0;
		END;

	IF	LAST.PatientID	THEN
		DO;
			OUTPUT	Want2;
			Ttl_Proc_Cnt	=	0;
			DELETE;
		END;
	ELSE
		DO;
			DELETE;
		END;
RUN;

Which would give you the following two outputs:

 

Want1

jimbarbour_0-1601848481537.png

 

Want2

jimbarbour_0-1601847930029.png

 

Jim

ihtishamsultan
Obsidian | Level 7
Thank you for the code, the WANT1 GIVES THE PERFECT DATASET BUT WANT 2 GIVES ME ALL O's in proc_cnt
RichardDeVen
Barite | Level 11

Create a view that performs an array based transposition.  Feed the view to Proc SUMMARY.

 

Example:

Data have;
infile cards missover;
input patientid year (proc1-proc15) ($);
datalines;
1 2000 9999  
2 2005 999  989  921  J1000 
2 2006 977  989  911  J1001 582
3 2005 J1201 921 910 954 987
3 2005 J1201 911 911 984 887
3 2006 J1201 9151 913 884 687
1 2000 J2201 921 910 954 987
1 2000 J2201 921 910 954 987
1 2001 J2201 921 910 954 987
;

data eachJ / view=eachJ;
  set have;
  array procs proc1-proc15;
  do _n_ = 1 to dim(procs);
    if procs(_n_) in ('J1000','J1001','J1201','J2201') then do;
      output;
    end;
  end;
  keep patientid year;
run;

proc summary data=eachJ;
  class patientid year;
  output out=freqJ;
run;

Output

RichardADeVenezia_0-1601852986949.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 907 views
  • 1 like
  • 4 in conversation