Hi,
I have the following dataset,
Col1/Col2/Co3:
A/1/0;
A/3/1;
A/5/0;
A/8/0;
B/1/0;
B/2/1;
B/3/0;
B/4/1;
B/5/0;
Col1/2/3 are strings, numeric and boolean(1 for true and 0 for false).
Data are ordered by Col1 and 2.
Now I would like to add an extra column (Col4), beginning with 1, based on the following rules:
If Col3=1 then Col4=1 else Col4 is set to the Col3 value of the previous Col2 for the same Col1 plus+1.
Hence, Col4 is a counter and Col3 serves as an indicator to reset the counter.
The new dataset looks like this:
Col1/Col2/Col3/Col4:
A/1/0/1;
A/3/1/1;
A/5/0/2;
A/8/0/3;
B/1/0/1;
B/2/1/1;
B/3/0/2;
B/4/1/1;
B/5/0/2;
I am newbie to SAS so appreciate if anyone can help me out wth this.
Many thank in advance.
The new dataset looks like this: Col1/Col2/Col3/Col4: A/1/0/1; A/3/1/1; A/5/0/2; A/8/0/3;
Does not match your description. You state "If Col3=1 then Col4=1 else Col4 is set to the Col3 value of the previous Col2". There is no "previous Col2 for the first observation. So where does the 1 for Col4 come from?
Same for the first Col1 = B. There is no previous Col1=B so where does that 1 come from?
Please provide data in the form of working data step code pasted into a text box opened with the </> icon such as:
data have; input Col1 $ Col2 Col3; datalines; A 1 0 A 3 1 A 5 0 A 8 0 B 1 0 B 2 1 B 3 0 B 4 1 B 5 0 ;
This seems to do what you show for the given example:
data want; set have; by col1; retain col4; if first.col1 then col4=1; else col4+1; if col3=1 then col4=1; run;
The By statement creates automatic variables that are not added to the data set that indicate whether an observation is the first or last of the by group. These values are boolean. So we can reset based on the First of a group.
Retain establishes a new variable to keep values across the data step boundary so is the most common tool for counting like this.
If I understand correctly, here's one way:
data want;
set have;
by col1;
col4 + 1;
if first.col1 or col3=1 then col4=1;
run;
As a newbie, learning about the effects of a BY statement in a DATA step is an important topic. There are many online resources available.
The new dataset looks like this: Col1/Col2/Col3/Col4: A/1/0/1; A/3/1/1; A/5/0/2; A/8/0/3;
Does not match your description. You state "If Col3=1 then Col4=1 else Col4 is set to the Col3 value of the previous Col2". There is no "previous Col2 for the first observation. So where does the 1 for Col4 come from?
Same for the first Col1 = B. There is no previous Col1=B so where does that 1 come from?
Please provide data in the form of working data step code pasted into a text box opened with the </> icon such as:
data have; input Col1 $ Col2 Col3; datalines; A 1 0 A 3 1 A 5 0 A 8 0 B 1 0 B 2 1 B 3 0 B 4 1 B 5 0 ;
This seems to do what you show for the given example:
data want; set have; by col1; retain col4; if first.col1 then col4=1; else col4+1; if col3=1 then col4=1; run;
The By statement creates automatic variables that are not added to the data set that indicate whether an observation is the first or last of the by group. These values are boolean. So we can reset based on the First of a group.
Retain establishes a new variable to keep values across the data step boundary so is the most common tool for counting like this.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.