Hello,
I would like to assign 1, 2, 3, 4, ... by counting observations, using the code below. I found that repeat_ID starts at 38 instead of 1. How could this happen? Thanks.
repeat_ID=_N_+0;
I have an IF statement in front of the +0 line.
if caseid_1 ^=' ' or caseid_2 ^=' ' or caseid_3 ^=' ' or caseid_4 ^=' ' or caseid_5 ^=' ';
So you are only accessing _N_ conditionally, and it keeps incrementing on the iterations in between. Either change your logic so you don't use the automatic _N_ variable, or change it so that the assignment executes unconditionally.
https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/
@ybz12003 wrote:
I have an IF statement in front of the +0 line.
if caseid_1 ^=' ' or caseid_2 ^=' ' or caseid_3 ^=' ' or caseid_4 ^=' ' or caseid_5 ^=' ';
You are using the subsetting IF statement. This means that every observation in the dataset gets passed on by the data engine to the data step. The data step increments _N_ for each such incoming record, PRIOR to the filtering by IF. Apparently the first 37 records don't satisfy the condition, but they are rejected only after _N_ has been incremented.
But if you replace the IF statement with a WHERE statement, the filtering task is offloaded to the data engine. Those 37 records don't even make it as far as the data step processing, and therefore _N_ is not incremented for them. So using WHERE will allow your first record satisfying the conditions to have _N_=1, no matter how many non-qualifying observations precede it.
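To make the difference concrete, here is a minimal sketch on made-up data. The dataset names have, with_if and with_where and the single caseid_1 variable are assumptions for illustration only, not the poster's actual data. The first three rows are blank, so with the subsetting IF the surviving rows should get repeat_ID 4 and 5, while with WHERE they should get 1 and 2.

```sas
/* Hypothetical test data: rows 1-3 have a blank caseid_1, rows 4-5 do not */
data have;
  length caseid_1 $8;
  do i = 1 to 5;
    if i > 3 then caseid_1 = cats('C', i);
    else caseid_1 = ' ';
    output;
  end;
  drop i;
run;

/* Subsetting IF: all five rows reach the data step, so _N_ also counts */
/* the rows that are rejected afterwards                                */
data with_if;
  set have;
  if caseid_1 ^= ' ';
  repeat_ID = _N_;
run;

/* WHERE: rows are filtered before they reach the data step,            */
/* so _N_ only counts the rows that qualify                             */
data with_where;
  set have;
  where caseid_1 ^= ' ';
  repeat_ID = _N_;
run;

proc print data=with_if; run;
proc print data=with_where; run;
```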
I changed IF to WHERE, but I still get the same result: repeat_ID starts at 38.
where caseid_1 ^=' ' or caseid_2 ^=' ' or caseid_3 ^=' ' or caseid_4 ^=' ' or caseid_5 ^=' ';
The time has come to show the data step code, and the log.
_N_ is NOT an observation counter. It is a data step iteration counter.
The easiest way to create a counter variable is to use a SUM statement.
data want;
  set have;
  obs_no+1;   /* sum statement: obs_no is retained and starts at 0 */
run;
From the name of your variable, however, it looks like you are instead trying to count the observations within some grouping variable.
data want;
  set have;
  by id;                          /* have must be sorted (or indexed) by id */
  repeat_id+1;
  if first.id then repeat_id=1;   /* restart the count at each new id */
run;
Note: Make sure that you use this to make a NEW variable. If the variable already exists in the source dataset then the counting will not work right since the value from the previous iteration will be overwritten by the value read in from the source dataset.
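Applied to the filter from the original question, the sum statement approach might look like the sketch below. It assumes the input dataset is called have (a placeholder name, as in the examples above), that caseid_1 through caseid_5 are character variables, and that repeat_ID does not already exist in have. Because the sum statement comes after the subsetting IF, it only runs for observations that pass the filter, so the count starts at 1 regardless of how many rows were rejected first.

```sas
data want;
  set have;   /* "have" is a placeholder for the real input dataset */
  /* keep only rows where at least one case ID is non-blank */
  if caseid_1 ^= ' ' or caseid_2 ^= ' ' or caseid_3 ^= ' '
     or caseid_4 ^= ' ' or caseid_5 ^= ' ';
  /* runs only for rows that passed the IF; retained, starts at 0 */
  repeat_ID + 1;
run;
```

Unlike _N_, this counter does not depend on whether the filtering is done with IF or WHERE.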