Ronein
Meteorite | Level 14

Hello,

I want to create a serial number such that all rows with the same VAR get the same serial number.

The code below works as intended.

However, I don't understand how SAS calculates it.

Is it possible to see the SAS iterations step by step, to understand how SAS performs the calculation?

As I understand it, in the first row for sex the value is 1 (0+1).

But I don't understand why the second row for sex is also 1, because 1+1 is 2.

I also don't understand why the first row for VAR age is 2.

 

data have;
    input VAR $ category estimate;
    cards;
sex 1 0.7257
sex 2 0
age 1 0.8356
age 2 0.6093
age 3 0
;
run;

proc sort data=have;
    by VAR category;
run;

data want;
    set have;
    by VAR;
    retain n 0;                    /* keep n across iterations, starting at 0 */
    if first.VAR then n = n + 1;   /* increment only on the first row of each VAR group */
run;

 

 

 

Patrick
Opal | Level 21

The modified code below shows which rows have first.VAR equal to 1 (and, for comparison, which rows have first.category equal to 1). I hope this helps you understand the result you're getting.

Data want;
	set have;
	by VAR category;
	retain n 0;
	if first.VAR then n=n+1;
	first_var_flg=first.var;
	first_cat_flg=first.category;
Run;

proc print data=want;
run;

[Image: PROC PRINT output of WANT, showing n, first_var_flg and first_cat_flg for each row]

 

@Ronein wrote:

Is it possible to see the SAS iterations step by step, to understand how SAS performs the calculation?

There is a DATA step debugger that works with "PC SAS", and it is now also available in recent SAS Enterprise Guide and SAS Studio clients. I haven't used it much myself, so I can't say how useful it is.
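Another way to watch the iterations, without a debugger, is to write the relevant values to the log with PUT statements. The following is only a sketch, assuming HAVE already exists and is sorted by VAR as in the question; the 'Before:' and 'After:' labels are just illustrative text.

```sas
/* Sketch: trace each DATA step iteration in the log.             */
/* Assumes HAVE exists and is sorted by VAR (as in the question). */
data want;
    set have;
    by VAR;
    retain n 0;

    /* Values at the start of this iteration, before n is updated */
    put 'Before: ' _n_= VAR= first.VAR= last.VAR= n=;

    if first.VAR then n = n + 1;

    /* Values after the update, i.e. what is written to WANT */
    put 'After:  ' _n_= VAR= first.VAR= last.VAR= n=;
run;
```

With the data sorted so that age precedes sex, the log should show n going from 0 to 1 on the first age row, staying at 1 for the remaining age rows, and going from 1 to 2 on the first sex row.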

quickbluefish
Lapis Lazuli | Level 10

Yes, and since

first.<your variable name>

is just a temporary (meaning, it only exists during the execution of the current data step) 0/1 variable, you can simply add its value to n each time and get the same result:

data want;
set have;
by VAR;
n+first.var;
run;

For every row, this adds either 0 or 1 to the current value of n. Because it uses the sum statement form ("n+1;") rather than an assignment ("n=n+1;"), the variable n is automatically retained across rows and starts at 0.
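To make the role of retention visible, here is a small comparison sketch. It assumes HAVE is sorted by VAR; the names compare, n_plain, n_retain and n_sum are made up for illustration and are not part of the original code.

```sas
/* Sketch: three ways of counting BY groups side by side.  */
/* Assumes HAVE is sorted by VAR.                          */
data compare;
    set have;
    by VAR;
    retain n_retain 0;

    /* No RETAIN: n_plain is reset to missing at the start of every */
    /* iteration, so missing + 1 stays missing (see the log note).  */
    if first.VAR then n_plain = n_plain + 1;

    /* Explicit RETAIN plus assignment, as in the question */
    if first.VAR then n_retain = n_retain + 1;

    /* Sum statement: implicitly retained, initial value 0 */
    n_sum + first.VAR;
run;

proc print data=compare noobs;
run;
```

In the printed output, n_retain and n_sum should agree on every row, while n_plain should be missing throughout.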

Tom
Super User

@Ronein wrote:

As I understand it, in the first row for sex the value is 1 (0+1).

But I don't understand why the second row for sex is also 1, because 1+1 is 2.

I also don't understand why the first row for VAR age is 2.

 


Read this line out loud:

if first.VAR then n=n+1;

If first dot var then n gets n+1.

That code adds one to N only on the FIRST observation for a particular value of VAR.  

So obviously the value of N does not change on the SECOND observation with VAR=SEX but it will change on the FIRST observation with VAR=AGE.

 

Note that if you run your code, you should get N=1 when VAR=age and N=2 when VAR=sex, because age comes before sex in alphabetical order. If you are seeing sex before age, then check the actual values of the variables. Perhaps you have "SEX" and "age"? In ASCII, lowercase letters sort after uppercase letters. Or perhaps you have " sex" and "age". A space sorts before letters.
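The collating order can be checked with a small sketch. It assumes an ASCII-based SAS session (for example Windows or UNIX) and PROC SORT's default collating sequence; the dataset name order_demo and the helper column VAR_hex are made up for illustration.

```sas
/* Sketch: how differently cased or padded values sort.          */
/* Assumes an ASCII session and the default collating sequence.  */
data order_demo;
    length VAR $4 VAR_hex $8;
    do VAR = 'sex', 'age', 'SEX', ' sex';   /* last value has a leading blank */
        VAR_hex = put(VAR, $hex8.);         /* byte values, so blanks are visible */
        output;
    end;
run;

proc sort data=order_demo;
    by VAR;
run;

proc print data=order_demo noobs;
run;
```

Under those assumptions the sorted order should be " sex", "SEX", "age", "sex": a blank (hex 20) comes before uppercase letters (hex 41 to 5A), which come before lowercase letters (hex 61 to 7A).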

quickbluefish
Lapis Lazuli | Level 10

I think the confusion is coming from the (mistaken) idea that, e.g., FIRST.VAR is the first *value* of VAR within a group -- so for example, if the first value of SEX is 1, then first.VAR would be equal to 1 when var=sex.  And if the first value of age were 38, then first.VAR would equal 38 when var=age.  

 

But that is NOT what first.VAR or last.VAR are doing at all.  first.VAR is simply an indicator of whether or not we are at the first row within that BY group -- 1 if true, 0 if false.  So first.VAR is a separate variable.

 

It is also tied to the BY statement -- first.<variable> and last.<variable> do not exist if you do not have a corresponding BY statement.  For example, let's say you have the following dataset, which we sort by state, and within state, by year, and within year, by month. 

 

That allows us to have a BY statement that says  BY state year month;  (order has to match the sorting).  In order to see the values of first.state, first.year and first.month, we're just saving them in permanent variables (normally, there's not a need to do this - you can just use the first. or last. variables directly and be done with them).  

data test;
infile cards dsd truncover firstobs=1 dlm=',';
length state $2 year month 3 rate 8;
input state year month rate;
cards;
NY,2020,1,0.37
NY,2020,2,0.39
NY,2020,3,0.32
NY,2020,4,0.36
NY,2021,1,0.09
NY,2021,2,0.14
NY,2021,3,0.14
NY,2021,4,0.20
MA,2020,1,0.29
MA,2020,2,0.41
MA,2020,3,0.35
MA,2020,4,0.33
MA,2021,1,0.18
MA,2021,2,0.17
MA,2021,3,0.22
MA,2021,4,0.27
;
run;

proc sort data=test; by state year month; run;

data test2;
set test;
by state year month;
is_first_dot_state=first.state;
is_first_dot_year=first.year;
is_first_dot_month=first.month;
run;

proc print data=test2 noobs; run;

Result is the following:

[Image: PROC PRINT output of test2, showing is_first_dot_state, is_first_dot_year and is_first_dot_month for each row]
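Tying this back to the original question about serial numbers, the same first. flags can drive a running group number at each BY level. This is only a sketch, assuming TEST has been sorted by state, year and month as above; test3, state_n and year_n are illustrative names.

```sas
/* Sketch: running group numbers at two BY levels.       */
/* Assumes TEST is sorted by state year month.           */
data test3;
    set test;
    by state year month;
    state_n + first.state;   /* +1 on the first row of each state */
    year_n  + first.year;    /* +1 on the first row of each state-year combination */
run;

proc print data=test3 noobs;
run;
```

Since MA sorts before NY, the MA rows should get state_n=1 and the NY rows state_n=2, while year_n should run from 1 to 4 (MA 2020, MA 2021, NY 2020, NY 2021). The year counter also increases when the state changes, because first.year is 1 whenever first.state is 1.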

 

 
