Help using Base SAS procedures

How to replace the dataset

Accepted Solution Solved
Reply
Contributor
Posts: 65
Accepted Solution

How to replace the dataset

Hello everyone,

I have a large dataset containing many missing values. Below I provide a simple example of dataset:

Institution_NameIdUnique_Numbrdate
A11183/14/14
A1113/15/14
A11293/16/14
A1133/17/14
B21113/18/14
B2113/19/14
C31143/20/14
C3113/21/14
C3113/22/14
C31223/23/14

I need to replace the missing values with the unique number that has exactly the same Institution_name and id, otherwise let it to be missing value. So the output table will be like below:

Institution_NameIdUnique_Numbrdate
A11183/14/14
A11183/15/14
A11293/16/14
A1133/17/14
B21113/18/14
B21113/19/14
C31143/20/14
C31143/21/14
C31143/22/14
C3122

3/23/14

I appreciate your suggestions.


Accepted Solutions
Solution
‎08-06-2015 08:49 AM
Super User
Posts: 9,681

Re: How to replace the dataset

Code: Program

data have;
infile cards expandtabs;
input (Institution_Name Id Unique_Numbr date) ($);
cards;
A 111 8 3/14/14
A 111 . 3/15/14
A 112 9 3/16/14
A 113 . 3/17/14
B 211 1 3/18/14
B 211 . 3/19/14
C 311 4 3/20/14
C 311 . 3/21/14
C 311 . 3/22/14
C 312 2 3/23/14
;
run;
data want;
set have;
by Institution_Name Id;
retain _Unique_Numbr;
if first.Id then _Unique_Numbr=Unique_Numbr;
if not missing(Unique_Numbr) then _Unique_Numbr=Unique_Numbr;
drop Unique_Numbr ;
run;

View solution in original post


All Replies
Frequent Contributor
Posts: 85

Re: How to replace the dataset

Here is an SQL solution:

proc sql ;

  create table want as

  select a.institution_name, a.id, b.unique_numbr, a.date

  from have a

    left join

  (select distinct institution_name, id, unique_numbr

     from have

     where unique_numbr is not missing)  b

  on a.institution_name = b.institution_name

  and a.id = b.id;

Frequent Contributor
Posts: 85

Re: How to replace the dataset

And here's a solution using hash object - which would be more efficient.

data want2;
  if _n_ = 0 then  set have;

  declare hash h(dataset:'have(where=(unique_numbr is not missing))');
    h.definekey('institution_name', 'id');
    h.definedata('unique_numbr');
    h.definedone();

  do until (the_end);
     set have end=the_end;
     if unique_numbr = . then
        rc = h.find();
     output;
  end;
  stop;
  drop rc;
  run;

Super User
Super User
Posts: 7,401

Re: How to replace the dataset

Hi,

There are many topics on here like this, if you want to update the dataset inplace then use the update method. 

Solution
‎08-06-2015 08:49 AM
Super User
Posts: 9,681

Re: How to replace the dataset

Code: Program

data have;
infile cards expandtabs;
input (Institution_Name Id Unique_Numbr date) ($);
cards;
A 111 8 3/14/14
A 111 . 3/15/14
A 112 9 3/16/14
A 113 . 3/17/14
B 211 1 3/18/14
B 211 . 3/19/14
C 311 4 3/20/14
C 311 . 3/21/14
C 311 . 3/22/14
C 312 2 3/23/14
;
run;
data want;
set have;
by Institution_Name Id;
retain _Unique_Numbr;
if first.Id then _Unique_Numbr=Unique_Numbr;
if not missing(Unique_Numbr) then _Unique_Numbr=Unique_Numbr;
drop Unique_Numbr ;
run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 358 views
  • 6 likes
  • 4 in conversation