BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

In the following code my understanding is to sort the input dataset by the provided variables. Curve_day value will be missing for the first curve value for the same combination of the variables PO, CURVE and POSTING_DT.

 

Can someone clarify  how the values of CURVE_DAY are 2,3 and 4 in the results as the POSTING_DT values are different in each record?

 

DATA WORK.JAXMERGE;
     SET WORK.JAXMERGE;
     BY PO CURVE POSTNG_DT;
     IF FIRST.CURVE THEN CURVE_DAY = .;
     CURVE_DAY+1;
RUN;

Results:

 

First..jpg

6 REPLIES 6
mkeintz
PROC Star

In the program 

DATA WORK.JAXMERGE;
     SET WORK.JAXMERGE;
     BY PO CURVE POSTNG_DT;
     IF FIRST.CURVE THEN CURVE_DAY = .;
     CURVE_DAY+1;
RUN

you set CURVE_DAY to missing at the start of a CURVE

but then you immediately incremented it by 1.

 

Your syntax for the increment was to use a "summing statement", i.e.

CURVE_DAY+1

which will treat a missing value for CURVE_DAY as a zero.

 

Now if you want a missing, followed by 1,2,3 etc, you could

DATA WORK.JAXMERGE;
     SET WORK.JAXMERGE;
     BY PO CURVE POSTNG_DT;
     IF FIRST.CURVE THEN CURVE_DAY = .;
     else CURVE_DAY+1;
RUN

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Astounding
PROC Star

You could answer these questions and more by adding to your DATA step.

DATA WORK.JAXMERGE;
     SET WORK.JAXMERGE;
     BY PO CURVE POSTNG_DT;
put 'Start:  ' curve_day=;
     IF FIRST.CURVE THEN CURVE_DAY = .;
put 'Middle:  ' curve_day=;
     CURVE_DAY+1;
put 'End:  ' curve_day=;
RUN;

That way, you can inspect the value of CURVE_DAY at various points, as the programming statements execute.

Tom
Super User Tom
Super User

It is only missing until the SUM statement executes.  Then it becomes 1 because SUM(.,1) is 1.

 

POSTING_DT does not impact the group, just the ordering.

So the generated CURVE_DAY values are generated in POSTING_DT order. 

 

Note that if you have multiple observations for the same POSTING_DT value (within the other two grouping variables) they will each get a different value of CURVE_DAY.  Also if there are gaps in the values or POSTING_DT so that one or more dates are skipped the numbers assigned to CURVE_DAY will not reflect that gap.  So CURVE_DAY is not really a relative day value, but just a number you can use to order the values.

David_Billa
Rhodochrosite | Level 12

@Tom @Kurt_Bremser So in the following code we're doing operations as shown for first PO value and if it is last PO value, we just need to result it in output?

 

DATA WORK.HAVE;
     SET WORK.HAVE;
     BY FISC_YR FISC_PD FISC_WK PO;
     IF FIRST.PO THEN DO;
           TOT = 0;
           CS_TOT = 0;
           LB_TOT = 0;
     END;
           ZNL_TOT +ZNL;
           CS_TOT +CS;
           LB_TOT +LB;
     IF LAST.PO;
RUN;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1203 views
  • 5 likes
  • 5 in conversation