In Analytics, Data speaks better than words ! So please have a look at the following test data which I arranged using sort command.
Number | Date |
1234 | 12-Sep- 15 |
1234 | 12-Jan-16 |
1234 | 12-Jan-16 |
1234 | 12-Jan-16 |
2254 | 12-Sep-15 |
2254 | 13-Sep-15 |
2235 | 13-Sep-15 |
2510 | 13-Sep-15 |
2510 | 13-Sep-15 |
I have sorted the data by using sort number, date and now I want to insert a new column called counter as below
Number | Date | Counter |
1234 | 12-Sep-15 | 1 |
1234 | 12-Jan-16 | 2 |
1234 | 12-Jan-16 | 2 |
1234 | 12-Jan-16 | 2 |
2254 | 12-Sep-15 | 1 |
2254 | 13-Sep-15 | 2 |
2235 | 13-Sep-15 | 1 |
2510 | 13-Sep-15 | 1 |
2510 | 13-Sep-15 | 1 |
This counter gives the number of dates for particular group of numbers. Example in the first group namely '1234' has two dates so counter is 1 for first date and 2 for second date irrespective of its number of occurences.
After adding this column, I also want to see the statistics of this new column called 'counter' like maximum value, average value etc., if there is some predefined procedure for the same, I will be grateful for knowledge sharing.
Thanks in advance !
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
Like this?
data HAVE;
input Number Date date7.;
cards;
1234 12Sep15
1234 12Jan16
1234 12Jan16
1234 12Jan16
2254 12Sep15
2254 13Sep15
2265 13Sep15
2510 13Sep15
2510 13Sep15
run;
data WANT;
set HAVE; by NUMBER ;
if first.NUMBER then COUNTER=0;
if first.NUMBER | DATE ne lag(DATE) then COUNTER+1;
run;
proc print data=WANT noobs;
run;
proc means data=WANT max mean;
class NUMBER;
var COUNTER;
run;
Number | Date | COUNTER |
---|---|---|
1234 | 20343 | 1 |
1234 | 20465 | 2 |
1234 | 20465 | 2 |
1234 | 20465 | 2 |
2254 | 20343 | 1 |
2254 | 20344 | 2 |
2265 | 20344 | 1 |
2510 | 20344 | 1 |
2510 | 20344 | 1 |
Analysis Variable : COUNTER | |||
---|---|---|---|
Number | N Obs | Maximum | Mean |
1234 | 4 | 2.0000000 | 1.7500000 |
2254 | 2 | 2.0000000 | 1.5000000 |
2265 | 1 | 1.0000000 | 1.0000000 |
2510 | 2 | 1.0000000 | 1.0000000 |
data want;
do until(first.date);
set have;
by number date;
retain count;
if first.number then count=1;
else count+1;
end;
do until(last.date);
set have;
by number date;
output;
end;
run;
data HAVE; input Number Date date7.; cards; 1234 12Sep15 1234 12Jan16 1234 12Jan16 1234 12Jan16 2254 12Sep15 2254 13Sep15 2265 13Sep15 2510 13Sep15 2510 13Sep15 run; data WANT; set HAVE; by NUMBER Date notsorted; if first.NUMBER then COUNT=0; count+first.Date; run;
@Ksharp why 'not sorted' is used here ? Is it because the file is already sorted ?
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.