BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ybz12003
Rhodochrosite | Level 12

Hello,

I would like to assign 1,2,3,4... by counting obs, I use the code below.  I found the repeater_ID starting with 38 instead of 1, how could this happened?  Thanks.

 

repeat_ID=_N_+0;
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You are using the subsetting IF statement.  This means that every observation in the dataset gets passed on by the data engine to the data step. The data step will increment _N_ for each such incoming record, PRIOR to the filtering by IF.   Apparently the first 37 records don't satisfy the condition, but that's only after _N_ is incremented.

 

But if you replace the IF statement with a WHERE statement the filtering task is offloaded to the data engine.  Those 37 records don't even make it as far as the data step processing, and therefore _N_ is not incremented for them.   So using WHERE will allow your first record satisfying the conditions to have _N_=1, no matter how many non-qualifying observatinos precede it.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

8 REPLIES 8
Reeza
Super User
_n_ starts at 38 somehow...what does the rest of your code look like?
Do you have a WHERE/subsetting IF?
ybz12003
Rhodochrosite | Level 12

I have if statement in front of +0.

 

if caseid_1 ^=' ' or caseid_2 ^=' ' or caseid_3 ^=' ' or caseid_4 ^=' ' or caseid_5 ^=' ';
Reeza
Super User

So you are only accessing the _n_ conditional and it increments in between. Either change your logic so you don't use the automatic _n_ variable or change it to be an unconditional execution.

 

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

 


@ybz12003 wrote:

I have if statement in front of +0.

 

if caseid_1 ^=' ' or caseid_2 ^=' ' or caseid_3 ^=' ' or caseid_4 ^=' ' or caseid_5 ^=' ';

 

 

mkeintz
PROC Star

You are using the subsetting IF statement.  This means that every observation in the dataset gets passed on by the data engine to the data step. The data step will increment _N_ for each such incoming record, PRIOR to the filtering by IF.   Apparently the first 37 records don't satisfy the condition, but that's only after _N_ is incremented.

 

But if you replace the IF statement with a WHERE statement the filtering task is offloaded to the data engine.  Those 37 records don't even make it as far as the data step processing, and therefore _N_ is not incremented for them.   So using WHERE will allow your first record satisfying the conditions to have _N_=1, no matter how many non-qualifying observatinos precede it.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ybz12003
Rhodochrosite | Level 12

 I changed if to where, I still have the same result.  The repeat_ID started with 38.

 

where caseid_1 ^=' ' or caseid_2 ^=' ' or caseid_3 ^=' ' or caseid_4 ^=' ' or caseid_5 ^=' ';
mkeintz
PROC Star

The time has come to show the data step code, and the log.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ybz12003
Rhodochrosite | Level 12
It's strange that when I use where to ran today, it works!
Tom
Super User Tom
Super User

_N_ is NOT an observation counter. It is a data step iteration counter.

The easiest way to create a counter variable is to use a SUM statement.

data want;
   set have;
   obs_no+1;
run;

From the name of your variable however it looks like you are instead trying to count the observations by some grouping variable.

data want;
   set have;
   by id;
   repeat_id+1;
   if first.id then repeat_id=1;
run;

Note: Make sure that you use this to make a NEW variable.  If the variable already exists in the source dataset then the counting will not work right since the value from the previous iteration will be overwritten by the value read in from the source dataset.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1004 views
  • 1 like
  • 4 in conversation