Hello all,
Please see the dataset below:
ID | SVCDATE |
1 | 22016 |
1 | 22016 |
1 | 22016 |
1 | 32016 |
1 | 82016 |
2 | 72016 |
2 | 72016 |
2 | 82016 |
2 | 102016 |
What I want is:
ID | SVCDATE | COUNT |
1 | 22016 | 3 |
1 | 30216 | 1 |
1 | 82016 | 1 |
2 | 72016 | 2 |
2 | 82016 | 1 |
2 | 102016 | 1 |
This is best done in SQL:
data have;
input ID $ 1 SVCDATE;
cards;
1 22016
1 22016
1 22016
1 32016
1 82016
2 72016
2 72016
2 82016
2 102016
;
run;
proc sql;
create table want as
select ID
,SVCDATE
,count(*) as Count
from have
group by ID
,SVCDATE
;
quit;
proc freq data=have;
tables id*svdate/list;
run;
Please, in the future do not provide data as screen captures. Provide the data according to these instructions.
This is best done in SQL:
data have;
input ID $ 1 SVCDATE;
cards;
1 22016
1 22016
1 22016
1 32016
1 82016
2 72016
2 72016
2 82016
2 102016
;
run;
proc sql;
create table want as
select ID
,SVCDATE
,count(*) as Count
from have
group by ID
,SVCDATE
;
quit;
Thanks! that worked.
Are your data already arranged such that each id/svcdate appears in consecutive records? If so, then
data have;
input ID SVCDATE;
datalines;
1 22016
1 22016
1 22016
1 32016
1 82016
2 72016
2 72016
2 82016
2 102016
run;
data want;
set have;
by id svcdate notsorted;
if last.svcdate;
count=dif(_n_);
if count=. then count=_n_;
run;
The automatic variable _N_ is the "iteration number" of the data step - in this case equivalent to the observation number. The DIF function is defined as DIF(_N_)=_N_-lag(_N_).
Since the dif function is only executed at the end of each id/svcdate group, due to the "subsetting IF" statement
if last.svcdate;
the lagged value of _N_ will always be the _N_ at the end of the prior group - i.e. you are calculating
COUNT= _N_(at end of current group) - _N_(of end of prior group).
For the first group COUNT will be missing, so it has to be corrected.
@raajdesaii wrote:
This does not get me what I have in the output.
I want to know how many unique date counts are by each id. For e.g. ID1 has 3 counts for 22016, ID1 has 1 count for 32016. I want to do this in a datastep.
It works for me.
Doing it in a DATA step is the hard way to do it. You have to write your own code, debug it, and so on, whereas using PROC FREQ, SAS has already done the hard work of programming it, testing it, and debugging it.
data have;
input ID SVCDATE;
cards;
1 22016
1 22016
1 22016
1 32016
1 82016
2 72016
2 72016
2 82016
2 102016
;
data want;
do until(last.svcdate);
set have;
by id svcdate notsorted;
count=sum(count,1);
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.