Assume I have a character variable called FLAG to which I have to assign values as follows.
A. First 10 obs should have value 'O'
B. Next 10 obs should have value 'MODO'
C. Last 10 obs should have missing values ' '
E.g.
FLAG
O
O
O
..
.
MODO
MODO
..
.
.
Any help in providing me the data step for this?
Hi @David_Billa
In your example, you are not passing values from any existing variable to a new variable. You are creating a new variable that depends on no variable in the input data set, only on the current observation's position, so a different sort order of the same input would yield different results.
As @Kurt_Bremser says, the position can be obtained from the automatic variable _N_, which counts data step iterations. Note that it is necessary to declare FLAG with a length of 4. Otherwise its length is fixed at 1 by the first assignment (the value 'O'), and the value 'MODO' would be truncated to 'M' in obs. 11-20.
I would recommend that you get yourself a copy of "The Little SAS Book". It is an excellent introduction that takes you all the way from newbie to a SAS programmer ready to take on most problems in day-to-day work. Excerpt here:
data have;
do dummy = 1 to 50; output; end;
run;
data want; set have;
length flag $4;
if _N_ <= 10 then flag = 'O';
else if _N_ <= 20 then flag = 'MODO';
else flag = '';
run;
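To see the truncation described above, here is a minimal sketch of the same step without the LENGTH statement. It assumes the HAVE dataset created above; the output name TRUNCATED is only a hypothetical name for illustration.

```sas
/* Same logic, but with no LENGTH statement.                      */
/* FLAG takes its length from the first assignment: 'O', so $1.   */
data truncated;
  set have;
  if _N_ <= 10 then flag = 'O';
  else if _N_ <= 20 then flag = 'MODO';  /* stored as 'M' */
  else flag = '';
run;

/* Inspect the declared length of FLAG and the values actually stored */
proc contents data=truncated;
run;

proc freq data=truncated;
  tables flag / missing;
run;
```

Adding length flag $4; before the first assignment, as in the step above, avoids the problem.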
Use the automatic variable _N_ to determine the number of the current observation. Take care to set a proper length for your new variable.
You can use a series of if-then-else-if, a select() block, or a format. If you have lots of values to load dynamically from data, a hash object can be the tool of choice.
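As a rough sketch of two of the alternatives mentioned, here is the same rule written with a select block and with a format. It assumes an input dataset named HAVE; the output names WANT_SELECT and WANT_FORMAT and the format name POSFLAG are made up for illustration.

```sas
/* Variant 1: SELECT block without a select-expression,  */
/* so each WHEN clause is evaluated as a condition.      */
data want_select;
  set have;
  length flag $4;
  select;
    when (_N_ <= 10) flag = 'O';
    when (_N_ <= 20) flag = 'MODO';
    otherwise        flag = ' ';
  end;
run;

/* Variant 2: a numeric format that maps the observation number to the flag */
proc format;
  value posflag
    1 - 10  = 'O'
    11 - 20 = 'MODO'
    other   = ' ';
run;

data want_format;
  set have;
  length flag $4;
  flag = put(_N_, posflag.);
run;
```

The format keeps the rule outside the data step, which can help if the ranges change or are reused in several steps.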
Variables are not stored independently of datasets. Are you talking about creating a NEW dataset? Or adding a new variable to an existing dataset?
Here is a straightforward translation of your rules into a data step to create a dataset named WANT with a variable named FLAG.
data want;
length flag $4 ;
flag='O';
do i=1 to 10; output; end;
flag='MODO';
do i=1 to 10; output; end;
flag=' ';
do i=1 to 10; output; end;
drop i;
run;
@David_Billa wrote:
I talked about adding a new variable in the existing dataset.
So if you are starting with an existing dataset then you will want to make a new dataset that has the new values.
data want;
set have;
row+1;
length flag $4;
if 1 <= row <=10 then flag='O';
else if row <=20 then flag='MODO';
else if row <=30 then flag=' ';
else flag=' ';
run;
What do you do if the dataset has 50 observations instead of 30? What if it has only 10 observations? Is there some variable in the existing dataset that can be used to determine how to set the new FLAG variable?
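One possible way to at least detect those cases is the NOBS= option on the SET statement. This is only a sketch: the variable name TOTAL is hypothetical, and the choice to write a warning to the log rather than change the rule is an assumption. What should actually happen with 50 or 10 observations is for you to decide.

```sas
data want;
  /* NOBS= stores the number of observations in HAVE in TOTAL.  */
  /* TOTAL is a temporary variable and is not written to WANT.  */
  set have nobs=total;
  length flag $4;

  /* Check once, on the first iteration, that the input has the expected size */
  if _N_ = 1 and total ne 30 then
    put 'WARNING: expected 30 observations in HAVE, found ' total;

  if _N_ <= 10 then flag = 'O';
  else if _N_ <= 20 then flag = 'MODO';
  else flag = ' ';
run;
```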