Hello All,
I have this following sample data set as per the excel screenshot below (also attached as excel document ).
The data can be read in using the code below (for some reason, "date" isnt getting read in using datalines):
data have;
infile datalines dsd truncover;
input Patient 2. Code $2. Index 2. Date mmddyy10.;
format date MMDDYY10.;
datalines;
1 AA 0 1/1/2015
1 BB 0 02/12/2016
1 CC 1 3/5/2016
1 DD 0 4/16/2016
1 EE 1 5/28/2016
1 FF 1 6/19/2016
1 GG 0 7/11/2016
1 HH 0 8/12/2016
1 AA 1 9/23/2016
1 JJ 1 12/15/2016
2 CC 1 11/26/2017
2 EE 0 12/1/2017
2 CC 0 1/29/2018
2 AA 1 2/20/2018
2 BB 1 3/14/2018
2 JJ 1 6/5/2018
2 AA 0 5/17/2018
2 HH 0 6/28/2018
;
I want the frequency of Code for the prior 6 months corresponding to Index=1
So whenever index=1 say on 5th March 2016, I want the frequency of 'Code' for 6 months prior to 6th March 2016.
In the above excel screenshot, the bold font are the variables that are in the dataset. The Index1, DiffDates: are just for demonstration which shows the difference between date (corresponding to Index=1) and the dates that occur prior to that per patient. Whenever this difference is <=180, I want to include such records in the frequency calculation of Code.
I want this frequency across all patients per screenshot below.
I can see there is an INTNX function that allows me to output dates 180 days prior to the begin_date whenever index=1 but I think what I am trying to achieve is much more complicated than that. Any ideas how to produce such a frequency table would be greatly appreciated!!
Regards,
Tina
I've added a few rows to your sample data. You're asking for a look-back period of 6 months but in your sample data all gaps are bigger within a patient and code.
Does below return what you're after? And if not then can you explain how the desired result should look like using exactly the sample data posted.
data have;
infile datalines dsd dlm=' ' truncover;
input Patient :$8. Code :$2. Index :$1. Date :mmddyy10.;
format date date9.;
datalines;
1 AA 0 1/1/2015
1 AA 1 3/1/2015
1 AA 1 5/1/2015
1 AA 0 7/1/2015
1 AA 1 9/1/2015
1 AA 1 9/2/2015
1 AA 1 9/3/2015
1 BB 0 02/12/2016
1 CC 1 3/5/2016
1 DD 0 4/16/2016
1 EE 1 5/28/2016
1 FF 1 6/19/2016
1 GG 0 7/11/2016
1 HH 0 8/12/2016
1 AA 1 9/23/2016
1 JJ 1 12/15/2016
2 CC 1 11/26/2017
2 EE 0 12/1/2017
2 CC 0 1/29/2018
2 AA 1 2/20/2018
2 BB 1 3/14/2018
2 JJ 1 6/5/2018
2 AA 0 5/17/2018
2 HH 0 6/28/2018
;
proc sql;
/* create table counts as*/
select l.patient, l.code, l.date, count(*) as CountCodes
from
have l left join have r
on l.patient=r.patient and l.code=r.code
and r.date between intnx('month',l.date,-6,'s') and l.date
where l.index='1'
group by l.patient, l.date, l.code
;
quit;
I just wanted to add that here is my progress code which is not exactly getting the output but is almost there:
proc sql;
create table counts as
select h1.patient, h1.date, h1.index, count(Code) as CountCodes
from have as h1
INNER JOIN
have as h2
ON h2.date between intnx("day",h1.date,-180) and h1.date
group by patient, date ;
select * from count_codes
;quit;
I think I am making a mistake in group by ...Any ideas?
I've added a few rows to your sample data. You're asking for a look-back period of 6 months but in your sample data all gaps are bigger within a patient and code.
Does below return what you're after? And if not then can you explain how the desired result should look like using exactly the sample data posted.
data have;
infile datalines dsd dlm=' ' truncover;
input Patient :$8. Code :$2. Index :$1. Date :mmddyy10.;
format date date9.;
datalines;
1 AA 0 1/1/2015
1 AA 1 3/1/2015
1 AA 1 5/1/2015
1 AA 0 7/1/2015
1 AA 1 9/1/2015
1 AA 1 9/2/2015
1 AA 1 9/3/2015
1 BB 0 02/12/2016
1 CC 1 3/5/2016
1 DD 0 4/16/2016
1 EE 1 5/28/2016
1 FF 1 6/19/2016
1 GG 0 7/11/2016
1 HH 0 8/12/2016
1 AA 1 9/23/2016
1 JJ 1 12/15/2016
2 CC 1 11/26/2017
2 EE 0 12/1/2017
2 CC 0 1/29/2018
2 AA 1 2/20/2018
2 BB 1 3/14/2018
2 JJ 1 6/5/2018
2 AA 0 5/17/2018
2 HH 0 6/28/2018
;
proc sql;
/* create table counts as*/
select l.patient, l.code, l.date, count(*) as CountCodes
from
have l left join have r
on l.patient=r.patient and l.code=r.code
and r.date between intnx('month',l.date,-6,'s') and l.date
where l.index='1'
group by l.patient, l.date, l.code
;
quit;
Thank you very much Patrick! I was able to get it to work finally. I really appreciate it..
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.