I am trying to create a count variable based on a numeric date variable (format yyq6.), but I am unable to get what I want where each new date value is ennumerated
DATE TCOUNT
2013Q1 1
2013Q1 1
2013Q2 2
2013Q2 2
2013Q2 2
2013Q3 3
2013Q3 3
2013Q3 3
2013Q4 4
2013Q4 4
2014Q1 5
2014Q1 5
2014Q2 6
2014Q2 6
Below is the code that I have:
DATA WORK.TEST1;
SET HAVE;
LABEL TCOUNT = "Time Count";
TCOUNT = 0;
RUN;
PROC SORT DATA = WORK.TEST1;
BY DATE ;
RUN;
DATA WORK.TEST1;
SET WORK.TEST1;
BY DATE;
IF FIRST.DATE THEN TCOUNT = TCOUNT + 1;
RUN;
The output is the following:
DATE TCOUNT
2013Q1 1
2013Q1 0
2013Q1 0
2013Q1 0
2013Q1 0
2013Q2 0
2013Q2 0
2013Q2 0
2013Q3 0
2013Q3 0
2013Q4 0
2013Q4 0
2013Q4 0
2013Q4 0
2014Q1 0
2014Q1 0
2014Q1 0
... 0
You are assigning the value zero to TCOUNT for every observation, but that is the problem. Also, RETAIN TCOUNT; is not wrong, but it is also not necessary.
data work.test1;
set have;
label tcount = "time count";
/*tcount = 0;*/
run;
proc sort data = work.test1;
by date ;
run;
data work.test1;
set work.test1;
/* retain tcount; */
by date;
if first.date then tcount + 1;
run;
Thus, the code can be simplified to
proc sort data = have;
by date ;
run;
data work.test1;
set have;
by date;
if first.date then tcount + 1;
run;
Have you tried putting a RETAIN on TCOUNT? Or, alternatively, change it to TCOUNT + 1 instead of TCOUNT = TCOUNT + 1. TCOUNT + 1 syntax causes an automatic retain.
I would also get rid of the Data step before the Sort. Apply the label in the final step. If you create TCOUNT before the final Data step, TCOUNT will not be a counter. TCOUNT will get read in as zero each time.
I would fix it like this:
DATA Have;
FORMAT DATE YYQ6.;
INFILE DATALINES TRUNCOVER;
INPUT DATE : YYQ6.;
DATALINES;
2013Q1
2013Q1
2013Q2
2013Q2
2013Q2
2013Q3
2013Q3
2013Q3
2013Q4
2013Q4
2014Q1
2014Q1
2014Q2
2014Q2
;
RUN;
PROC SORT DATA = Have
OUT = WORK.TEST1;
BY DATE;
RUN;
DATA WORK.TEST1;
SET WORK.TEST1;
BY DATE;
LABEL TCOUNT = "Time Count";
IF FIRST.DATE THEN
TCOUNT + 1;
RUN;
Results:
Jim
You are assigning the value zero to TCOUNT for every observation, but that is the problem. Also, RETAIN TCOUNT; is not wrong, but it is also not necessary.
data work.test1;
set have;
label tcount = "time count";
/*tcount = 0;*/
run;
proc sort data = work.test1;
by date ;
run;
data work.test1;
set work.test1;
/* retain tcount; */
by date;
if first.date then tcount + 1;
run;
Thus, the code can be simplified to
proc sort data = have;
by date ;
run;
data work.test1;
set have;
by date;
if first.date then tcount + 1;
run;
Yes, actually quite the opposite. By adding TCOUNT to the dataset before the Data step counting TCOUNT, you caused TCOUNT to be read in each time the SET statement was executed, wiping out any count and replacing it with the initial value (zero). You could have initialized TCOUNT in the same Data step as the counting, but it would not have been necessary.
Glad we got you working,
Jim
Read my response in detail. I added an example after I initially posted it.
This Data step is screwing things up. It needs to be deleted.
DATA WORK.TEST1;
SET HAVE;
LABEL TCOUNT = "Time Count";
TCOUNT = 0;
RUN;
Jim
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.