In Analytics, Data speaks better than words ! So please have a look at the following test data which I arranged using sort command.
| Number | Date |
| 1234 | 12-Sep- 15 |
| 1234 | 12-Jan-16 |
| 1234 | 12-Jan-16 |
| 1234 | 12-Jan-16 |
| 2254 | 12-Sep-15 |
| 2254 | 13-Sep-15 |
| 2235 | 13-Sep-15 |
| 2510 | 13-Sep-15 |
| 2510 | 13-Sep-15 |
I have sorted the data by using sort number, date and now I want to insert a new column called counter as below
| Number | Date | Counter |
| 1234 | 12-Sep-15 | 1 |
| 1234 | 12-Jan-16 | 2 |
| 1234 | 12-Jan-16 | 2 |
| 1234 | 12-Jan-16 | 2 |
| 2254 | 12-Sep-15 | 1 |
| 2254 | 13-Sep-15 | 2 |
| 2235 | 13-Sep-15 | 1 |
| 2510 | 13-Sep-15 | 1 |
| 2510 | 13-Sep-15 | 1 |
This counter gives the number of dates for particular group of numbers. Example in the first group namely '1234' has two dates so counter is 1 for first date and 2 for second date irrespective of its number of occurences.
After adding this column, I also want to see the statistics of this new column called 'counter' like maximum value, average value etc., if there is some predefined procedure for the same, I will be grateful for knowledge sharing.
Thanks in advance !
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
Like this?
data HAVE;
input Number Date date7.;
cards;
1234 12Sep15
1234 12Jan16
1234 12Jan16
1234 12Jan16
2254 12Sep15
2254 13Sep15
2265 13Sep15
2510 13Sep15
2510 13Sep15
run;
data WANT;
set HAVE; by NUMBER ;
if first.NUMBER then COUNTER=0;
if first.NUMBER | DATE ne lag(DATE) then COUNTER+1;
run;
proc print data=WANT noobs;
run;
proc means data=WANT max mean;
class NUMBER;
var COUNTER;
run;
| Number | Date | COUNTER |
|---|---|---|
| 1234 | 20343 | 1 |
| 1234 | 20465 | 2 |
| 1234 | 20465 | 2 |
| 1234 | 20465 | 2 |
| 2254 | 20343 | 1 |
| 2254 | 20344 | 2 |
| 2265 | 20344 | 1 |
| 2510 | 20344 | 1 |
| 2510 | 20344 | 1 |
| Analysis Variable : COUNTER | |||
|---|---|---|---|
| Number | N Obs | Maximum | Mean |
| 1234 | 4 | 2.0000000 | 1.7500000 |
| 2254 | 2 | 2.0000000 | 1.5000000 |
| 2265 | 1 | 1.0000000 | 1.0000000 |
| 2510 | 2 | 1.0000000 | 1.0000000 |
data want;
do until(first.date);
set have;
by number date;
retain count;
if first.number then count=1;
else count+1;
end;
do until(last.date);
set have;
by number date;
output;
end;
run;
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
@Ksharp why 'not sorted' is used here ? Is it because the file is already sorted ?
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.