In Analytics, Data speaks better than words ! So please have a look at the following test data which I arranged using sort command.
Number | Date |
1234 | 12-Sep- 15 |
1234 | 12-Jan-16 |
1234 | 12-Jan-16 |
1234 | 12-Jan-16 |
2254 | 12-Sep-15 |
2254 | 13-Sep-15 |
2235 | 13-Sep-15 |
2510 | 13-Sep-15 |
2510 | 13-Sep-15 |
I have sorted the data by using sort number, date and now I want to insert a new column called counter as below
Number | Date | Counter |
1234 | 12-Sep-15 | 1 |
1234 | 12-Jan-16 | 2 |
1234 | 12-Jan-16 | 2 |
1234 | 12-Jan-16 | 2 |
2254 | 12-Sep-15 | 1 |
2254 | 13-Sep-15 | 2 |
2235 | 13-Sep-15 | 1 |
2510 | 13-Sep-15 | 1 |
2510 | 13-Sep-15 | 1 |
This counter gives the number of dates for particular group of numbers. Example in the first group namely '1234' has two dates so counter is 1 for first date and 2 for second date irrespective of its number of occurences.
After adding this column, I also want to see the statistics of this new column called 'counter' like maximum value, average value etc., if there is some predefined procedure for the same, I will be grateful for knowledge sharing.
Thanks in advance !
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
Like this?
data HAVE;
input Number Date date7.;
cards;
1234 12Sep15
1234 12Jan16
1234 12Jan16
1234 12Jan16
2254 12Sep15
2254 13Sep15
2265 13Sep15
2510 13Sep15
2510 13Sep15
run;
data WANT;
set HAVE; by NUMBER ;
if first.NUMBER then COUNTER=0;
if first.NUMBER | DATE ne lag(DATE) then COUNTER+1;
run;
proc print data=WANT noobs;
run;
proc means data=WANT max mean;
class NUMBER;
var COUNTER;
run;
Number | Date | COUNTER |
---|---|---|
1234 | 20343 | 1 |
1234 | 20465 | 2 |
1234 | 20465 | 2 |
1234 | 20465 | 2 |
2254 | 20343 | 1 |
2254 | 20344 | 2 |
2265 | 20344 | 1 |
2510 | 20344 | 1 |
2510 | 20344 | 1 |
Analysis Variable : COUNTER | |||
---|---|---|---|
Number | N Obs | Maximum | Mean |
1234 | 4 | 2.0000000 | 1.7500000 |
2254 | 2 | 2.0000000 | 1.5000000 |
2265 | 1 | 1.0000000 | 1.0000000 |
2510 | 2 | 1.0000000 | 1.0000000 |
data want;
do until(first.date);
set have;
by number date;
retain count;
if first.number then count=1;
else count+1;
end;
do until(last.date);
set have;
by number date;
output;
end;
run;
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
@Ksharp why 'not sorted' is used here ? Is it because the file is already sorted ?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.