Solved: Re: If multiple values within timeframe, then make a row of all these ...

andyec · Posted 05-19-2020 12:06 PM

I'm working with a dataset of lab values, and if an ID has at least 2 lab values within 7 days of each other, I would like to create a row of those values and associated dates. That way, I can look at the rows that have non-missing value2, value3, etc. Please see a simulated dataset below and the dataset I would like to create.

Table 1. Have

obs	id	date	value
1	1	5/30/19	20
2	1	5/31/19	20
3	1	6/1/19	19
4	1	6/10/19	177
5	1	7/7/19	350
6	1	7/10/19	200
7	2	1/3/19	100
8	3	1/4/19	99
9	3	3/6/19	19
10	3	3/8/19	20
11	3	12/8/19	1500

Table 2. Want

obs	id	date1	date2	date3	value1	value2	value3
1	1	5/30/19	5/31/19	6/1/19	20	20	19
2	1	6/10/19			177
3	1	7/7/19	7/10/19		350	200
4	2	1/3/19			100
5	3	1/4/19			99
6	3	3/6/19	3/8/19		19	20
6	3	12/8/19			1500

novinosrin · Posted 05-19-2020 12:54 PM

Hi @andyec The idea is basically to group the intervals and transpose the contents of the group to wide structure.


data have;
	infile datalines ;
	input obs id date:MMDDYY10. value;
	format date MMDDYY10.;
	drop obs;
	datalines;
1	1	5/30/19	20
2	1	5/31/19	20
3	1	6/1/19	19
4	1	6/10/19	177
5	1	7/7/19	350
6	1	7/10/19	200
7	2	1/3/19	100
8	3	1/4/19	99
9	3	3/6/19	19
10	3	3/8/19	20
11	3	12/8/19	1500
;
run;

data temp;
 set have;
 by id;
 if dif(date)>7 then grp+1;
 if first.id then grp=1;
run;

proc sql noprint;
select max(m) into :m trimmed
from (select max(grp) as m from temp group by id);
quit;

%put &=m;

data want;
 do _n_= 1 by 1 until(last.grp);
  set temp;
  by id grp;
  array dt  date1-date&m;
  array v value1-value&m;
  dt(_n_)=date;
  v(_n_)=value;
 end;
 format date: mmddyy10.;
 drop date value grp;
run;

View solution in original post

ed_sas_member · Posted 05-19-2020 12:28 PM

Hi @andyec

You can try this:

data have;
	infile datalines dlm="09"x;
	input obs id date:MMDDYY10. value;
	format date MMDDYY10.;
	datalines;
1	1	5/30/19	20
2	1	5/31/19	20
3	1	6/1/19	19
4	1	6/10/19	177
5	1	7/7/19	350
6	1	7/10/19	200
7	2	1/3/19	100
8	3	1/4/19	99
9	3	3/6/19	19
10	3	3/8/19	20
11	3	12/8/19	1500
;
run;

data have_flag;
	do i=1 until(last.id);
		set have;
		format date7 MMDDYY10.;
		by id;
		retain date7;
		if first.id then date7 = date;
		if date-date7 > 7 then date7 = date;
	end;
run;

proc transpose data=have_flag out=have_tr1 prefix=date;
	var date;
	by id date7;
run;

proc transpose data=have_flag out=have_tr2 prefix=value;
	var value;
	by id date7;
run;

data want;
	merge have_tr1 have_tr2;
	by id date7;
	drop _name_ date7;
run;

Best,

novinosrin · Posted 05-19-2020 12:54 PM

Hi @andyec The idea is basically to group the intervals and transpose the contents of the group to wide structure.


data have;
	infile datalines ;
	input obs id date:MMDDYY10. value;
	format date MMDDYY10.;
	drop obs;
	datalines;
1	1	5/30/19	20
2	1	5/31/19	20
3	1	6/1/19	19
4	1	6/10/19	177
5	1	7/7/19	350
6	1	7/10/19	200
7	2	1/3/19	100
8	3	1/4/19	99
9	3	3/6/19	19
10	3	3/8/19	20
11	3	12/8/19	1500
;
run;

data temp;
 set have;
 by id;
 if dif(date)>7 then grp+1;
 if first.id then grp=1;
run;

proc sql noprint;
select max(m) into :m trimmed
from (select max(grp) as m from temp group by id);
quit;

%put &=m;

data want;
 do _n_= 1 by 1 until(last.grp);
  set temp;
  by id grp;
  array dt  date1-date&m;
  array v value1-value&m;
  dt(_n_)=date;
  v(_n_)=value;
 end;
 format date: mmddyy10.;
 drop date value grp;
run;

andyec · Posted 05-19-2020 01:23 PM

@ed_sas_member and @novinosrin thank you very much for your replies. I was able to get both to work with the simulated data provided.

@novinosrin when I use your code on the real dataset, I get an excessive number of value and date variables, and it seems to be the maximum amount of times an ID appears (not the maximum amount of times an ID appears to have a certain amount of values within 7 days). Any idea why this might be?

Lastly, I was wondering how I would add in one more additional variable that corresponds to every value. The variable is psource and can be seen in this post Count number of times within a timeframe by id, multiple occurrences by id possible .

ed_sas_member · Posted 05-19-2020 01:42 PM

Hi @andyec

Would the additional variable psource be splitted in psource1, psource2, psource3 as well?

andyec · Posted 05-19-2020 01:43 PM

@ed_sas_member that's what I'm aiming for.

ed_sas_member · Posted 05-19-2020 01:55 PM

Hi @andyec

I have added a proc transpose step at the end of the program, just before the merge:

data have_flag;
	do i=1 until(last.id);
		set have;
		format date7 MMDDYY10.;
		by id;
		retain date7;
		if first.id then date7 = date;
		if date-date7 > 7 then date7 = date;
	end;
run;

proc transpose data=have_flag out=have_tr1 prefix=date;
	var date;
	by id date7;
run;

proc transpose data=have_flag out=have_tr2 prefix=value;
	var value;
	by id date7;
run;

proc transpose data=have_flag out=have_tr3 prefix=psource;
	var psource;
	by id date7;
run;

data want;
	merge have_tr1 have_tr2 have_tr3;
	by id date7;
	drop _name_ date7;
run;

Best,

novinosrin · Posted 05-19-2020 01:58 PM

Hi @andyec My apologies, I overlooked something and that's very silly of me. First try this change and test in full and let me know if this works, and then I shall delve into your additional requirement. Sorry again

proc sql noprint;
select max(m) into :m trimmed
from (select count(grp) as m from temp group by id,grp);
quit;

%put &=m;

andyec · Posted 05-19-2020 02:48 PM

@novinosrin no problem - that fixes it.

novinosrin · Posted 05-19-2020 02:54 PM

Okay good @andyec Are you able to modify the code to get the additional variables too. Methinks, another array and assignment statement like the following should do

array p psource1-psource&m;
p(_n_)=psource;

should do. 🙂

If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Re: If multiple values within timeframe, then make a row of all these values from a long dataset

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away