Hi,
I have a new data set that I want to normalize. For example, I want to get the most recent Code identifier for John Smith at each rank. So in the new data set I want to keep only the row for Smith, John with Code AC and rank Junior (the 2nd row).
I want to go through each entry and make sure the right rank is paired with the right code as of the most current date.
How do I write a SAS program to do this?
Assuming your data is properly sorted by ID and RANK and DATE
/* UNTESTED CODE */
data want;
set have;
by id rank;
if last.rank;
run;
If you want tested code, you need to supply data according to the instructions given.
Many of us will not download Excel files because they are a security threat. Instead of attaching files, include your data in your reply as SAS data step code (instructions).
Oh okay, here is the data set
Name ID Code Rank Date
Smith, John 1 AB Junior 20170801
Smith, John 1 AC Junior 20170901
Smith, John 1 AD Senior 20180101
Smith, Peter 2 TT Manager 20170801
Smith, Peter 2 TP Manager 20171001
Ask, Teom 3 AB Junior 20170801
Powell, H 4 TT Manager 20170801
Powell, H 4 TP Senior 20190801
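For anyone who wants to run the suggested approach against these rows, here is a minimal sketch of the table as SAS data step code, followed by the sort that the solution assumes. The pipe delimiter, the variable lengths, and the date informat are my own assumptions, chosen so that names containing a comma and a space read cleanly; they are not from the original post.

```sas
/* Sketch only: the posted rows as a DATA step.                   */
/* Assumption: '|' is used as the delimiter so that names such as */
/* "Smith, John" are read as a single value.                      */
data have;
    infile datalines dlm='|' truncover;
    input name :$20. id code :$2. rank :$10. date :yymmdd8.;
    format date yymmddn8.;
    datalines;
Smith, John|1|AB|Junior|20170801
Smith, John|1|AC|Junior|20170901
Smith, John|1|AD|Senior|20180101
Smith, Peter|2|TT|Manager|20170801
Smith, Peter|2|TP|Manager|20171001
Ask, Teom|3|AB|Junior|20170801
Powell, H|4|TT|Manager|20170801
Powell, H|4|TP|Senior|20190801
;
run;

/* The BY-group logic relies on this sort order */
proc sort data=have;
    by id rank date;
run;

/* Keep the latest-dated row within each ID and RANK */
data want;
    set have;
    by id rank;
    if last.rank;
run;
```

With the rows above, this should keep one observation per ID and RANK combination (six in total), including the AC / Junior row for Smith, John.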
Thanks for the solution. That worked!!! I do have a follow-up question, if you don't mind. I am now merging this data set with another data set that has more columns. I need to check one column in that data set against a column in the data set you just showed me. Let's say the want data set has a column named correct_a, and my original data set has the corresponding column a. I want to make sure column a has the correct values according to correct_a. Some values are equal and some may be different. Would this code work?
data new;
merge original(in=a)
want(in=b);
by id rank;
if a=1;
if a ne correct_a then a=correct_a;
run;
I doubt it, depending on what you expect. If your data already has a variable named A, using the IN=A data set option can cause unexpected things to happen, as shown here:
64   data junk;
65      set sashelp.class (in=age);
66   run;

WARNING: The variable age exists on an input data set and is also set by an I/O statement option. The variable will not be included on any output data set and unexpected results can occur.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.JUNK has 19 observations and 4 variables.
Sashelp.class started with 5 variables.
You should likely use a different name for your IN variable. I typically use In<datasetname>, such as InOriginal, so the code documents what that variable is for.
If I understand what you are attempting I would try something like:
data new;
merge original(in=InOriginal)
want(in=b)
;
by id rank;
if InOriginal=1;
a = coalesce(correct_a,a);
run;
The above assumes a and correct_a are numeric values. If the variables are character then use CoalesceC instead. The Coalesce function returns the first non-missing value in the list compared left to right.
Note that the following statement will also execute when Correct_a is missing, because a missing value is not equal to any actual value. So you might end up wiping out values of A when there is no observation in the Want data set with a matching Id and Rank.
if a ne correct_a then a=correct_a;
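To make the difference concrete, here is a small sketch that applies both approaches side by side. The data set names original_demo and want_demo, the values, and the variables a_coalesce and a_overwrite are made up for illustration; both inputs are already in ID and RANK order.

```sas
/* Hypothetical example data: id 3 has no match in want_demo */
data original_demo;
    input id rank :$10. a;
    datalines;
1 Junior 10
2 Manager 20
3 Junior 30
;
run;

data want_demo;
    input id rank :$10. correct_a;
    datalines;
1 Junior 10
2 Manager 25
;
run;

data compare;
    merge original_demo(in=InOriginal)
          want_demo(in=InWant);
    by id rank;
    if InOriginal;

    /* COALESCE keeps the original value when correct_a is missing */
    a_coalesce = coalesce(correct_a, a);

    /* The unconditional comparison overwrites with missing instead */
    a_overwrite = a;
    if a_overwrite ne correct_a then a_overwrite = correct_a;
run;
```

For id 3, which has no matching row in want_demo, a_coalesce should keep the original 30 while a_overwrite ends up missing. For id 2, both pick up the corrected value 25.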