kmin87
Fluorite | Level 6

Hi,

 

I have a data set that I want to normalize. For example, I want to get the most recent Code for John Smith at each rank. For rank Junior, that means keeping only the row with Code AC (the 2nd row) in a new data set.

I want to go through each entry and make sure each rank is paired with the code from the most recent date.

How do I write a SAS program to do this?

Accepted Solutions
PaigeMiller
Diamond | Level 26

Assuming your data is properly sorted by ID, RANK, and DATE:

/* UNTESTED CODE */
data want;
    set have;
    by id rank;
    if last.rank;
run;

If you want tested code, you need to supply data according to the instructions given.
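For anyone who wants to try the step above against the sample rows posted later in this thread, here is a minimal sketch. The pipe delimiter, the variable lengths, and the YYMMDD8. informat are assumptions made only so the rows can be read inline, and the PROC SORT step is added so that the sort-order assumption actually holds.

```sas
/* Sample rows copied from the thread; delimiter, lengths, and informats are assumptions */
data have;
    infile datalines dlm='|' truncover;
    input name :$20. id code :$2. rank :$10. date :yymmdd8.;
    format date yymmddn8.;
    datalines;
Smith, John|1|AB|Junior|20170801
Smith, John|1|AC|Junior|20170901
Smith, John|1|AD|Senior|20180101
Smith, Peter|2|TT|Manager|20170801
Smith, Peter|2|TP|Manager|20171001
Ask, Teom|3|AB|Junior|20170801
Powell, H|4|TT|Manager|20170801
Powell, H|4|TP|Senior|20190801
;
run;

/* Guarantee the order the BY statement relies on */
proc sort data=have;
    by id rank date;
run;

/* Keep only the latest-dated row within each ID/RANK group */
data want;
    set have;
    by id rank;
    if last.rank;
run;
```

With these rows, Smith, John should end up with Code AC for Junior and Code AD for Senior, one row per ID and RANK combination.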

 

--
Paige Miller

PaigeMiller
Diamond | Level 26

Many of us will not download Excel files, as they are a security threat. Instead of file attachments, include your data in your reply as SAS data step code (instructions).

--
Paige Miller
kmin87
Fluorite | Level 6

Oh okay, here is the data set

Name	ID	Code	Rank	Date
Smith, John	1	AB	Junior	20170801
Smith, John	1	AC	Junior	20170901
Smith, John	1	AD	Senior	20180101
Smith, Peter	2	TT	Manager	20170801
Smith, Peter	2	TP	Manager	20171001
Ask, Teom	3	AB	Junior	20170801
Powell, H	4	TT	Manager	20170801
Powell, H	4	TP	Senior	20190801
​

 

kmin87
Fluorite | Level 6

Thanks for the solution. That worked! I do have a follow-up question, if you don't mind. I am now merging this data set with another data set that has more columns. I want to check one of its columns against a column in the data set you just showed me. Let's say the want data set has a column titled correct_a, and my original data set has the corresponding column a. I want to make sure column a has the correct values according to correct_a. Some are equal and some may be different. Would this code work?

 

data new;
merge original(in=a)
want(in=b);
by id rank;
if a=1;
if a ne correct_a then a=correct_a;
run;


 

ballardw
Super User

I doubt it, depending on what you expect. If you have a variable named A in your data and you also use the IN=A data set option, unexpected things can happen, as shown here:

64   data junk;
65      set sashelp.class (in=age);
66   run;

WARNING: The variable age exists on an input data set and is also set by an I/O statement option.
          The variable will not be included on any output data set and unexpected results can
         occur.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.JUNK has 19 observations and 4 variables.

Sashelp.class started with 5 variables, so AGE was dropped from the output data set.

You should probably use a different name for your IN variable. Typically I use In<datasetname>, such as InOriginal, so the code documents what that variable is for.

 

If I understand what you are attempting I would try something like:

data new;
merge original(in=InOriginal)
want(in=b)
;
by id rank;
if InOriginal=1;
a = coalesce(correct_a,a);

run;

The above assumes a and correct_a are numeric values. If the variables are character then use CoalesceC instead. The Coalesce function returns the first non-missing value in the list compared left to right.
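As a quick illustration of the two functions, here is a small sketch. The variable names c and correct_c and all of the values are made up for the example and are not from the original data.

```sas
data _null_;
    /* Numeric case: correct_a is missing, so a keeps its own value */
    a = 5;
    correct_a = .;
    a = coalesce(correct_a, a);
    put a=;

    /* Character case: correct_c is non-missing, so it replaces c */
    length c correct_c $2;
    c = 'AB';
    correct_c = 'AC';
    c = coalescec(correct_c, c);
    put c=;
run;
```

The log should show a=5 and c=AC, since each call returns the first non-missing argument reading left to right.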

 

Note that the following statement will also execute when Correct_a is missing, because a missing value is not equal to any non-missing value. So you might end up wiping out values of A when no observation from the Want data set has a matching Id and Rank.

if a ne correct_a then a=correct_a;
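If you prefer to keep the IF/THEN form, one possible guard is sketched below. The two small data sets and their values are hypothetical, and the name InWant is my own choice for the second IN variable.

```sas
/* Hypothetical data: 1/Senior has no matching row in WANT */
data original;
    input id rank :$10. a;
    datalines;
1 Junior 10
1 Senior 20
2 Manager 30
;
run;

data want;
    input id rank :$10. correct_a;
    datalines;
1 Junior 11
2 Manager 30
;
run;

data new;
    merge original(in=InOriginal)
          want(in=InWant);
    by id rank;
    if InOriginal;
    /* Overwrite A only when WANT contributed a non-missing value */
    if InWant and not missing(correct_a) and a ne correct_a then a = correct_a;
run;
```

Here A becomes 11 for 1/Junior and stays 20 for 1/Senior and 30 for 2/Manager. Without the guard, the 1/Senior row would have its A set to missing.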

 
