BookmarkSubscribeRSS Feed
mounikag
Obsidian | Level 7

Hi Team,

data x;
infile;
datalines;
subj    astdt(NUm)    inj1(char)        inj2(char)        inj3(char)        inj4(char)      No.of injs before astdt
101    2020-06-03    2019-07-18     2019-08-15     2019-09-10     2019-11-14      4
102    2019-08-07    2019-06-12     2019-07-08     2019-08-12     2019-09-30      2
103    2020-01-09    2019-11-13     2019-12-16     2020-01-08      2020-03-04      3
104    2019-12-13    2019-11-13     2019-12-16     2020-01-08      2020-03-04      1
105    2019-05-15    2019-05-13                                                                             1
106    2019-08-05    2019-07-17     2019-08-14                                                      1
107    2019-08-21    2019-07-17     2019-08-14                                                      2
108    2020-06-30    2020-05-27     2020-06-24     2020-07-22                               3
;
run;

I have data as above so, I would like to have help in coding.
I have four injections(inj1, inj2, inj3, inj4)which is in character format and i have astdt(ae start date) in numeric format.
i would like to know number of prior injections before astdt for each subject.
so, it should count the number of injections(non-missing injs) prior to the astdt.
please provide me a code it would be helpful .Thanks in advance.

 

 

Best,
Mounika.

 

 

 

 

 

 

 

 

3 REPLIES 3
qoit
Pyrite | Level 9

Below should do it:

data have;
	infile datalines missover truncover;
	input subj $ astdt yymmdd10. inj1 $11. inj2 $11. inj3 $11. inj4 $11.;
	format astdt date9.;
	datalines;
101 2020-06-03 2019-07-18 2019-08-15 2019-09-10 2019-11-14
102 2019-08-07 2019-06-12 2019-07-08 2019-08-12 2019-09-30
103 2020-01-09 2019-11-13 2019-12-16 2020-01-08 2020-03-04
104 2019-12-13 2019-11-13 2019-12-16 2020-01-08 2020-03-04
105 2019-05-15 2019-05-13                                 
106 2019-08-05 2019-07-17 2019-08-14                      
107 2019-08-21 2019-07-17 2019-08-14                      
108 2020-06-30 2020-05-27 2020-06-24 2020-07-22           
;
run;

data want;
	set have;
	array inj{4} $;
	array cnt{4};

	do i = 1 to dim(inj);
		if input(inj{i},yymmdd10.) < astdt and ^missing(inj{i}) then
			cnt{i} = 1;
		else cnt{i} = 0;
	end;

	total = sum(of cnt1 - cnt4);
	drop cnt1 - cnt4 i;
run;
mkeintz
PROC Star

If your INJ values are really arranged as

  1. All actual dates are to the left (i.e. missing dates are always on the right)
  2. INJ dates are in ascending order

Then you can use a WHILE expression and/or a leave statement to stop the loop - you don't always have to go from 1 to dim(INJ):

 

data have;
	infile datalines missover truncover;
	input subj $ astdt yymmdd10. inj1 $11. inj2 $11. inj3 $11. inj4 $11.;
	format astdt date9.;
	datalines;
101 2020-06-03 2019-07-18 2019-08-15 2019-09-10 2019-11-14
102 2019-08-07 2019-06-12 2019-07-08 2019-08-12 2019-09-30
103 2020-01-09 2019-11-13 2019-12-16 2020-01-08 2020-03-04
104 2019-12-13 2019-11-13 2019-12-16 2020-01-08 2020-03-04
105 2019-05-15 2019-05-13                                 
106 2019-08-05 2019-07-17 2019-08-14                      
107 2019-08-21 2019-07-17 2019-08-14                      
108 2020-06-30 2020-05-27 2020-06-24 2020-07-22           
;
run;

data want (drop=i);
	set have;
	array inj  $ inj: ;

	do i = 1 to dim(inj) while (inj{i}^=' ');
      if input(inj{i},yymmdd10.)<astdt then total=sum(total,1);
      else leave;
	end;
run;

Even more compact, but likely more obtuse:

data want (rename=(i=total));
  set have;
  array inj $ inj: ;
  do i=0 to dim(inj)-1 while (input(inj{i+1},yymmdd10.)<astdt and inj{i+1}^=' ');
  end;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
LeonidBatkhan
Lapis Lazuli | Level 10

Hi mounikag,

You can do it using the following approach:

data x;
   length subj astdt 8 inj1-inj4 $10;
   infile datalines truncover;
   input subj astdt yymmdd10. inj1-inj4 $;
   format astdt yymmdd10.;
   datalines;
101 2020-06-03    2019-07-18     2019-08-15     2019-09-10     2019-11-14
102 2019-08-07    2019-06-12     2019-07-08     2019-08-12     2019-09-30
103 2020-01-09    2019-11-13     2019-12-16     2020-01-08     2020-03-04
104 2019-12-13    2019-11-13     2019-12-16     2020-01-08     2020-03-04  
105 2019-05-15    2019-05-13
106 2019-08-05    2019-07-17     2019-08-14 
107 2019-08-21    2019-07-17     2019-08-14 
108 2020-06-30    2020-05-27     2020-06-24     2020-07-22
;

DATA WANT;
   set x;
   array inj $ inj1-inj4;
   array injn injn1-injn4;
   no_injs_before_astdt = 0;
   do i=1 to dim(inj);
      injn[i] = input(inj[i],yymmdd10.);
      no_injs_before_astdt + (astdt > injn[i] > .);
   end;
run;

You can drop any variables you don't like to keep. Also, in your data last line you have an error, No.of injs before astdt should be equal 2, not 3.

 

This code essentially converts you character dates into numeric, and then loops through array elements and increments no_injs_before_astdt every time when (astdt > injn[i] > .)

 

Hope, this helps.

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 677 views
  • 1 like
  • 4 in conversation