BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ElodieT
Fluorite | Level 6

Hello,

 

I would like to complete de empty cells of column Email with those already existing if they have the same name and stay in the same state.

I tried with If then do to create another column and fill it out, but it doesn't work. 

 

Please, is there a way to do it?

 

ElodieT_1-1627679583926.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

This code builds a lookup table in a hash memory-resident object.  Then it uses it to populate missing emails for a given full_name/state.

 

It's untested in the absence of sample data in the form of a working data step:  

 

data want (drop=_:);
  set have;
  if _n_=1 then do;
    declare hash h (dataset:'have (where=(not missing(email)) keep=full_name state email)');
      h.definekey('full_name','state');
      h.definedata('email');
      h.definedone();
  end;
  if email=' ' then _rc=h.find();
run;

If a given FULL_NAME/STATE has multiple observations with non-blank emails, this code retains the first one in the hash object (default behavior of the hash object).  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

In the future, do not provide your data as a screen capture, because we can't work with screen captures and provide tested code. Instead, please provide data as SAS data step code, instructions are here: https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/ and then we can actually use your data and create tested code.

 

Here is a possible solution (UNTESTED)

proc sql;
    create table has_email as select * from have where not missing(email);
    create table want as select a.*,coalescec(a.email,b.email) from have a left join has_email b
         on a.full_name=b.full_name;
quit;
--
Paige Miller
mkeintz
PROC Star

And what if a given name/state combination has more than one non-blank email value?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

Below one way to go (in case of multiple email addresses will only pick one per name and state).

data have;
  infile datalines truncover dsd;
  input (name state email) ($);
  datalines;
name1,state1,email1
name1,state2,
name1,state1,
name1,state2,email2
;

proc sort data=have out=have_sorted;
  by name state DESCENDING email ;
run;

data want;
  set have_sorted;
  by name state;
  retain r_email;
  if first.state then r_email=email;
  else email=r_email;
  drop r_email;
run;
mkeintz
PROC Star

This code builds a lookup table in a hash memory-resident object.  Then it uses it to populate missing emails for a given full_name/state.

 

It's untested in the absence of sample data in the form of a working data step:  

 

data want (drop=_:);
  set have;
  if _n_=1 then do;
    declare hash h (dataset:'have (where=(not missing(email)) keep=full_name state email)');
      h.definekey('full_name','state');
      h.definedata('email');
      h.definedone();
  end;
  if email=' ' then _rc=h.find();
run;

If a given FULL_NAME/STATE has multiple observations with non-blank emails, this code retains the first one in the hash object (default behavior of the hash object).  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1981 views
  • 4 likes
  • 4 in conversation