Hello all. I am working with data similar to the following
(this is the individual row level data with many other columns for a given lab test received):
labname received_date
Quest 2/1/17
Quest 2/4/17
Quest 2/4/17
Labcorp 2/1/17
Labcorp 2/2/17
and I would like to produce a final summary table that looks like this, which includes rows for all days in a range, not only
the days that had data associated with them for that lab, i.e.,
labname date count
Quest 2/1/17 1
Quest 2/2/17 0
Quest 2/3/17 0
Quest 2/4/17 2
Labcorp 2/1/17 1
Labcorp 2/2/17 1
Labcorp 2/3/17 0
Labcorp 2/4/17 0
...
I am able to get the counts in a table from proc freq where there were records for a given lab on a particular day. But I would like to produce the final table/report to include all days within a range, so that we can look at the report and see when there were no records sent on particular days.
I am not sure how to approach this. Something in my brain says I could create a separate table of all dates of interest, then somehow join or merge to have them included in the data, but I don't know how to do that nor am I sure that's the path to go down.
Any ideas are appreciated, thank you.
Assuming:
then:
data want;
do count=1 by 1 until (last.received_date);
set have ;
by labname received_date notsorted;
if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
end;
output;
lastdate=ifn(last.labname,'04feb2017'd,nxtdate-1);
count=0;
if lastdate>received_date then do received_date=received_date+1 to lastdate;
output;
end;
run;
If not all lab's begin on 2/1/2017, then you could modify to:
%let begdate=01feb2017;
%let enddate=04feb2017;
data want;
do count=1 by 1 until (last.received_date);
set have ;
by labname received_date notsorted;
if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
end;
/* Fill any gaps at the beginning of a lab */
if lag(labname)^=labname and received_date> "&begdate"d then do;
store_count=count; store_date=received_date;
count=0;
do received_date="&begdate"d to store_date-1;
output;
end;
count=store_count; received_date=store_date;
end;
/* Output current count */
output;
/* Take care of any gaps following this date */
lastdate=ifn(last.labname,"&enddate"d,nxtdate-1);
count=0;
if lastdate>received_date then do received_date=received_date+1 to lastdate;
output;
end;
run;
In this program only the expected date range is known in advance (see the %let begdate= and %let enddate= statements). You could also run a PROC SQL with "select into :" to gets those dates, as in:
proc sql noprint;
select min(received_date) format=date9., max(received_date) format=date9.
into :begdate , :enddate
from have;
quit;
Hi UCDCRush,
So many firsts today! Had my first question and my first (hopefully helpful) answer.
I've quickly written some code you may find useful. In general, proc sql works great for these kinds of issues. The only downside is I don't really know how to generate 0's for days where you have no data for a particular lab.
*Input data;
data test;
input lab: $
date: mmddyy10.;
format date mmddyy10.;
datalines;
Quest 2/1/17
Quest 2/4/17
Quest 2/4/17
Labcorp 2/1/17
Labcorp 2/2/17
;
run;
*Print to make sure results are as expected;
proc print data=test;
attrib _all_ format=;
run;
*Use SQL to create a table. You want the count of total entries for each day grouped by both the date and the lab;
proc sql;
create table sums as
select lab as lab, date as date, count(lab) as count
from test
group by date, lab;
quit;
*Print new table, check for accuracy;
proc print data=sums;
run;
You can use "where" statements after you generate your count data to limit to the dates you want. So, if you wanted to limit to dates between 2/1/17 and 2/4/17, you could use this code:
*Print new table, check for accuracy;
proc print data=sums;
where 20851<=date<=20854;
run;
You just have to be familiar with how SAS codes dates. using "attrib _all_ format=.;" is a statement you can add to almost any proc to see the unformatted values of variables.
Hope this helps!
Sorry for the double post. Here is the table you get when you print the "sums" dataset:
Obs | lab | date | count |
1 | Labcorp | 02/01/2017 | 1 |
2 | Quest | 02/01/2017 | 1 |
3 | Labcorp | 02/02/2017 | 1 |
4 | Quest | 02/04/2017 | 2 |
Assuming:
then:
data want;
do count=1 by 1 until (last.received_date);
set have ;
by labname received_date notsorted;
if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
end;
output;
lastdate=ifn(last.labname,'04feb2017'd,nxtdate-1);
count=0;
if lastdate>received_date then do received_date=received_date+1 to lastdate;
output;
end;
run;
If not all lab's begin on 2/1/2017, then you could modify to:
%let begdate=01feb2017;
%let enddate=04feb2017;
data want;
do count=1 by 1 until (last.received_date);
set have ;
by labname received_date notsorted;
if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
end;
/* Fill any gaps at the beginning of a lab */
if lag(labname)^=labname and received_date> "&begdate"d then do;
store_count=count; store_date=received_date;
count=0;
do received_date="&begdate"d to store_date-1;
output;
end;
count=store_count; received_date=store_date;
end;
/* Output current count */
output;
/* Take care of any gaps following this date */
lastdate=ifn(last.labname,"&enddate"d,nxtdate-1);
count=0;
if lastdate>received_date then do received_date=received_date+1 to lastdate;
output;
end;
run;
In this program only the expected date range is known in advance (see the %let begdate= and %let enddate= statements). You could also run a PROC SQL with "select into :" to gets those dates, as in:
proc sql noprint;
select min(received_date) format=date9., max(received_date) format=date9.
into :begdate , :enddate
from have;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.