data have;
input
name : $1.
id : 8.
code : $8.
num_1 : 8.
num_2 : 8.
;
datalines;
A 10 12345678 1 0
A 10 09876543 0 1
A 34 23456789 1 1
B 9 88997700 1 0
B 9 88997700 0 0
B 9 88997700 0 1
C 28 11223344 1 1
C 28 34343434 1 0
C 28 67676767 0 0
C 30 78679870 1 0
C 30 78679870 0 1
;
What I want:
A 10 12345678 1 0
A 10 12345678 0 1
A 34 23456789 1 1
B 9 88997700 1 0
B 9 88997700 0 0
B 9 88997700 0 1
C 28 11223344 1 1
C 28 11223344 1 0
C 28 11223344 0 0
C 30 78679870 1 0
C 30 78679870 0 1
It would help to say exactly which variable(s) you want to set, because your desired output does not make "all obs in a group equal": only CODE is changed, and NUM_1 and NUM_2 still differ from the first obs.
My try:
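data want;
  set have;
  by notsorted name id;
  length firstcode $ 8;
  retain firstcode;                        /* keep the value across iterations */
  if first.id then firstcode = code;       /* first obs of a group: remember its CODE */
  else code = firstcode;                   /* other obs: overwrite CODE with it */
  drop firstcode;
run;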
This assumes your data is at least grouped by the Name and Id values. With NOTSORTED the groups only need to be contiguous, not in sorted order.
RETAIN keeps the value of a variable from one iteration of the data step to the next, instead of resetting it to missing.
The BY statement creates the SAS automatic variables First.<variable> and Last.<variable> (do note the dot), which are 1/0 (true/false) flags that tell you whether the current observation is the first or last of a BY group. Since the data is grouped by Name and then Id, we test First.Id: on the first observation of each group we store Code in the retained variable, and on the other observations we assign the retained value back to Code.
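To see the flags described above, here is a minimal sketch that copies them into ordinary variables so they can be printed. FLAGS, FIRST_ID and LAST_ID are names made up for this illustration; First.Id and Last.Id themselves are temporary and are never written to the output data set.

```sas
/* Sketch only: flags, first_id and last_id are hypothetical names. */
data flags;
  set have;
  by notsorted name id;
  first_id = first.id;   /* 1 on the first obs of each NAME/ID group, else 0 */
  last_id  = last.id;    /* 1 on the last obs of each NAME/ID group, else 0 */
run;

proc print data=flags noobs;
  var name id code first_id last_id;
run;
```

With the sample data, the single A 34 row is both first and last in its group, so both flags are 1 on that row.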
@ballardw's solution is the most effective if you really are carrying forward just one or two variables. With lots of variables, though, it takes more code, since each one needs its own retained copy and assignment. In that case a simple DROP= list of the variables to carry forward may be your best solution.
In your case, you might use:
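data want;
  if 0 then set have;                    /* never executes; only fixes the variable order */
  set have (drop=code);                  /* read every variable except CODE */
  by name id;
  if first.id then set have point=_n_;   /* first obs of a group: re-read it, CODE included */
run;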
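Just list the variables to be carried forward in the DROP= option. Those variables are then read only on the first observation of each NAME/ID group, when the IF condition is met. On every other observation they keep the value read earlier, because variables read by SET are retained automatically.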
Note that the "if 0 then set have;" statement is not necessary. I use it here so that variable order is preserved. Without it, the variables in the DROP= list, which could be anywhere in the list of variables, would be output at the right end of the new data set.
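As a sketch of the many-variables case, suppose (hypothetically, since the question only asks for CODE) that NUM_1 should also be carried forward from the first observation of each group. Only the DROP= list changes. WANT2 is a name made up for this illustration.

```sas
/* Sketch only: carrying NUM_1 forward is an assumption, not part of the question. */
data want2;
  if 0 then set have;              /* keeps the original variable order */
  set have (drop=code num_1);      /* list every variable to carry forward */
  by name id;
  if first.id then set have point=_n_;
run;
```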
My take:
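data want;
  set have (rename=(code=_code));    /* read the incoming CODE under a temporary name */
  by name id;
  retain code;                       /* CODE keeps its value across iterations */
  if first.id then code = _code;     /* refresh it only on the first obs of a group */
  drop _code;
run;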
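A plain BY NAME ID statement (without NOTSORTED) expects the input to be in sorted order. The sample data already is, but if yours is not, a sort step like the sketch below could come first. HAVE_SORTED is a made-up name, and the later SET statements would then need to read it instead of HAVE. By default PROC SORT keeps the relative order of observations with equal BY values, so the "first" observation of each group stays the same.

```sas
/* Sketch only: have_sorted is a hypothetical data set name. */
proc sort data=have out=have_sorted;
  by name id;
run;
```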