In Analytics, Data speaks better than words ! So please have a look at the following test data which I arranged using sort command.
Number | Date |
1234 | 12-Sep- 15 |
1234 | 12-Jan-16 |
1234 | 12-Jan-16 |
1234 | 12-Jan-16 |
2254 | 12-Sep-15 |
2254 | 13-Sep-15 |
2235 | 13-Sep-15 |
2510 | 13-Sep-15 |
2510 | 13-Sep-15 |
I have sorted the data by using sort number, date and now I want to insert a new column called counter as below
Number | Date | Counter |
1234 | 12-Sep-15 | 1 |
1234 | 12-Jan-16 | 2 |
1234 | 12-Jan-16 | 2 |
1234 | 12-Jan-16 | 2 |
2254 | 12-Sep-15 | 1 |
2254 | 13-Sep-15 | 2 |
2235 | 13-Sep-15 | 1 |
2510 | 13-Sep-15 | 1 |
2510 | 13-Sep-15 | 1 |
This counter gives the number of dates for particular group of numbers. Example in the first group namely '1234' has two dates so counter is 1 for first date and 2 for second date irrespective of its number of occurences.
After adding this column, I also want to see the statistics of this new column called 'counter' like maximum value, average value etc., if there is some predefined procedure for the same, I will be grateful for knowledge sharing.
Thanks in advance !
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
Like this?
data HAVE;
input Number Date date7.;
cards;
1234 12Sep15
1234 12Jan16
1234 12Jan16
1234 12Jan16
2254 12Sep15
2254 13Sep15
2265 13Sep15
2510 13Sep15
2510 13Sep15
run;
data WANT;
set HAVE; by NUMBER ;
if first.NUMBER then COUNTER=0;
if first.NUMBER | DATE ne lag(DATE) then COUNTER+1;
run;
proc print data=WANT noobs;
run;
proc means data=WANT max mean;
class NUMBER;
var COUNTER;
run;
Number | Date | COUNTER |
---|---|---|
1234 | 20343 | 1 |
1234 | 20465 | 2 |
1234 | 20465 | 2 |
1234 | 20465 | 2 |
2254 | 20343 | 1 |
2254 | 20344 | 2 |
2265 | 20344 | 1 |
2510 | 20344 | 1 |
2510 | 20344 | 1 |
Analysis Variable : COUNTER | |||
---|---|---|---|
Number | N Obs | Maximum | Mean |
1234 | 4 | 2.0000000 | 1.7500000 |
2254 | 2 | 2.0000000 | 1.5000000 |
2265 | 1 | 1.0000000 | 1.0000000 |
2510 | 2 | 1.0000000 | 1.0000000 |
data want;
do until(first.date);
set have;
by number date;
retain count;
if first.number then count=1;
else count+1;
end;
do until(last.date);
set have;
by number date;
output;
end;
run;
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
@Ksharp why 'not sorted' is used here ? Is it because the file is already sorted ?
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.