DATA Step, Macro, Functions and more

Including dates/rows with no frequency counts

Accepted Solution Solved
Reply
Contributor
Posts: 38
Accepted Solution

Including dates/rows with no frequency counts

Hello all. I am working with data similar to the following

(this is the individual row level data with many other columns for a given lab test received):

labname     received_date    

Quest            2/1/17              

Quest            2/4/17                 

Quest            2/4/17

Labcorp         2/1/17

Labcorp        2/2/17

 

and I would like to produce a final summary table that looks like this, which includes rows for all days in a range, not only

the days that had data associated with them for that lab, i.e.,

 

labname        date     count 

Quest           2/1/17       1

Quest           2/2/17       0

Quest           2/3/17       0

Quest           2/4/17       2

Labcorp       2/1/17        1  

Labcorp       2/2/17        1  

Labcorp       2/3/17        0

Labcorp       2/4/17        0

...

 

I am able to get the counts in a table from proc freq where there were records for a given lab on a particular day. But I would like to produce the final table/report to include all days within a range, so that we can look at the report and see when there were no records sent on particular days.

 

I am not sure how to approach this. Something in my brain says I could create a separate table of all dates of interest, then somehow join or merge to have them included in the data, but I don't know how to do that nor am I sure that's the path to go down.

 

Any ideas are appreciated, thank you.


Accepted Solutions
Solution
‎02-08-2017 06:09 PM
Valued Guide
Posts: 797

Re: Including dates/rows with no frequency counts

Assuming:

 

  1. data are grouped (not necessarily sorted) by labname
  2. data are sorted by received_date within labname
  3. every labname begins with are received_date=2/1/17
  4. the maximum date is known to be 2/4/17

then:

 

data want;
  do count=1 by 1 until (last.received_date);
    set have ;
    by labname received_date notsorted;
    if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
  end;
  output;

  lastdate=ifn(last.labname,'04feb2017'd,nxtdate-1);
  count=0;
  if lastdate>received_date then do received_date=received_date+1 to lastdate;
    output;
  end;
run;

 

 

If not all lab's begin on 2/1/2017, then you could modify to:

 

%let begdate=01feb2017;
%let enddate=04feb2017;

data want;
  do count=1 by 1 until (last.received_date);
    set have ;
    by labname received_date notsorted;
    if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
  end;

  /* Fill any gaps at the beginning of a lab */
  if lag(labname)^=labname and received_date> "&begdate"d then do;
    store_count=count; store_date=received_date;
    count=0;
    do received_date="&begdate"d to store_date-1;
      output;
    end;
    count=store_count; received_date=store_date;
  end;

  /* Output current count */
  output;

  /* Take care of any gaps following this date */
  lastdate=ifn(last.labname,"&enddate"d,nxtdate-1);
  count=0;
  if lastdate>received_date then do received_date=received_date+1 to lastdate;
    output;
  end;
run;

 

In this program only the expected date range is known in advance (see the %let begdate=  and %let enddate= statements).  You could also run a PROC SQL with "select into :" to gets those dates, as in:

 

proc sql noprint;

  select min(received_date) format=date9., max(received_date) format=date9.

  into :begdate , :enddate

  from have;

quit;

View solution in original post


All Replies
New Contributor
Posts: 4

Re: Including dates/rows with no frequency counts

[ Edited ]

Hi UCDCRush,

 

So many firsts today! Had my first question and my first (hopefully helpful) answer.

 

I've quickly written some code you may find useful. In general, proc sql works great for these kinds of issues. The only downside is I don't really know how to generate 0's for days where you have no data for a particular lab.

 

*Input data;
data test;
	input lab: $ 
		  date: mmddyy10.;
		format date mmddyy10.;
	datalines;
Quest	2/1/17
Quest	2/4/17
Quest	2/4/17
Labcorp	2/1/17
Labcorp	2/2/17
;

run;

*Print to make sure results are as expected;
proc print data=test;
	attrib _all_ format=;
run;

*Use SQL to create a table. You want the count of total entries for each day grouped by both the date and the lab;
proc sql;
	create table sums as
	select lab as lab,  date as date, count(lab) as count
	from test
	group by date, lab;
quit;

*Print new table, check for accuracy;
proc print data=sums;
run;

You can use "where" statements after you generate your count data to limit to the dates you want. So, if you wanted to limit to dates between 2/1/17 and 2/4/17, you could use this code:

 

*Print new table, check for accuracy;
proc print data=sums;
	where 20851<=date<=20854;
run;

You just have to be familiar with how SAS codes dates. using "attrib _all_ format=.;" is a statement you can add to almost any proc to see the unformatted values of variables.

 

 

Hope this helps!

New Contributor
Posts: 4

Re: Including dates/rows with no frequency counts

Sorry for the double post. Here is the table you get when you print the "sums" dataset:

 

Obs

lab

date

count

1

Labcorp

02/01/2017

1

2

Quest

02/01/2017

1

3

Labcorp

02/02/2017

1

4

Quest

02/04/2017

2

Solution
‎02-08-2017 06:09 PM
Valued Guide
Posts: 797

Re: Including dates/rows with no frequency counts

Assuming:

 

  1. data are grouped (not necessarily sorted) by labname
  2. data are sorted by received_date within labname
  3. every labname begins with are received_date=2/1/17
  4. the maximum date is known to be 2/4/17

then:

 

data want;
  do count=1 by 1 until (last.received_date);
    set have ;
    by labname received_date notsorted;
    if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
  end;
  output;

  lastdate=ifn(last.labname,'04feb2017'd,nxtdate-1);
  count=0;
  if lastdate>received_date then do received_date=received_date+1 to lastdate;
    output;
  end;
run;

 

 

If not all lab's begin on 2/1/2017, then you could modify to:

 

%let begdate=01feb2017;
%let enddate=04feb2017;

data want;
  do count=1 by 1 until (last.received_date);
    set have ;
    by labname received_date notsorted;
    if eod=0 then set have (firstobs=2 keep=received_date rename=(received_date=nxtdate)) end=eod;
  end;

  /* Fill any gaps at the beginning of a lab */
  if lag(labname)^=labname and received_date> "&begdate"d then do;
    store_count=count; store_date=received_date;
    count=0;
    do received_date="&begdate"d to store_date-1;
      output;
    end;
    count=store_count; received_date=store_date;
  end;

  /* Output current count */
  output;

  /* Take care of any gaps following this date */
  lastdate=ifn(last.labname,"&enddate"d,nxtdate-1);
  count=0;
  if lastdate>received_date then do received_date=received_date+1 to lastdate;
    output;
  end;
run;

 

In this program only the expected date range is known in advance (see the %let begdate=  and %let enddate= statements).  You could also run a PROC SQL with "select into :" to gets those dates, as in:

 

proc sql noprint;

  select min(received_date) format=date9., max(received_date) format=date9.

  into :begdate , :enddate

  from have;

quit;

Contributor
Posts: 38

Re: Including dates/rows with no frequency counts

mkeintz - thank you very much, and for providing the alternate methods to specify the date range. It works perfectly!
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 153 views
  • 0 likes
  • 3 in conversation