BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ismahero2
Obsidian | Level 7

Hi,

 

I have a dataset with two variables; one is named "Names" and the other one is named "Descriptions".  I have 71 rows in this dataset.   I have another dataset with multiple variables and one variable named "Names".   I want to load dataset #1 into an array.  Then I want to match the variable "names" in dataset #2 to the array "Name", if it matches, then create a new variable with the content of the array "Description".  If it doesn't match, then leave the new variable empty. 

 

Based on the put statement in the second data step, I am creating the arrays correctly (With the right content), but the problem comes i the last data step.  Any help?

 

 

Data _null_;
Set DS1 nobs=num_obs;
call symput('num_obs_array',num_obs);
stop;
run;
%Put  Observations= &num_obs_array.;

 
Data _null_;
Set DS1 end=eof;

if _N_=1 then do;
     array Name[&num_obs_array.] $32.;
     array Description[&num_obs_array.] $32000;
End;

Name[_N_]=Names;
Description[_N_]=Descriptions;

 if eof then do;
     do i = 1 to &num_obs_array.;
           put 'Name['i'] = ' name[i];
           put 'Description['i'] = ' Description[i];
     end;
end;

retain Name Description;
run;


Data DS_Want;
Set DS2;

if _N_=1 then do;
     array Name[&num_obs_array.] $32.;
     array Description[&num_obs_array.] $32000;
end;

 Length Description_New $32000.;

do i = 1 to &num_obs_array.;

     if strip(names)=strip(Name[i]) then do;
           Description_New=Description[i];
           leave;
     end;
end;
Run;
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

In the future if you're after actual code please provide sample data via tested SAS data step code the way I've done below for tables lookup and have. This allows us to spend the time answering your question instead of preparing sample data.

data lookup;
  infile datalines dsd truncover;
  input Names :$20. Descriptions :$40.;
  datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;

data have;
  infile datalines dsd truncover;
  input ID :$10. Names :$20.;
  datalines;
5842,Jose
6359,Toby
8852,Jaime
7821,Eclipse
5636,Brenda
;

data want;
  set have;
  if _n_=1 then
    do;
      if 0 then set lookup(keep=Descriptions);
      dcl hash h1(dataset:'lookup');
      h1.defineKey('Names');
      h1.defineData('Descriptions');
      h1.defineDone();
    end;
  if h1.find() ne 0 then call missing(Descriptions);
run;

proc print data=want;
run;

 

 

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

@ismahero2 wrote:

Hi,

 

I have a dataset with two variables; one is named "Names" and the other one is named "Descriptions".  I have 71 rows in this dataset.   I have another dataset with multiple variables and one variable named "Names".   I want to load dataset #1 into an array.  Then I want to match the variable "names" in dataset #2 to the array "Name", if it matches, then create a new variable with the content of the array "Description".  If it doesn't match, then leave the new variable empty. 

 

Based on the put statement in the second data step, I am creating the arrays correctly (With the right content), but the problem comes i the last data step.  Any help?


It would be extremely helpful if you could show us a small example of this problem, so we can see the datasets (or small examples of these datasets) that you are working with. Please follow these examples and instructions to provide us example SAS data sets we can work with.

 

It would be extremely helpful if you could show us the output you are getting, and then next show us the output you want to get. Saying that "the problem comes in the last data step" doesn't mean anything to anyone, you haven't even described what is wrong, and you haven't described the desired result.

--
Paige Miller
Patrick
Opal | Level 21

Why does this need to be an array. The usual (and efficient) two approaches for such lookups of values over a key are:

Option 1: Create a format from the table with key/value pairs, add an other case that return a blank, use the format in the data step.

Option 2: Load the table into a data step hash lookup table and lookup the values via the find() method.

ismahero2
Obsidian | Level 7
Could you give me a sample code? I have never used hash lookup.
Patrick
Opal | Level 21

And here sample code for creating a format. Creating a format has in your case the advantage that once created you can use it in multiple data steps or even directly in Proc's without even having to create a new variable.

data lookup;
  infile datalines dsd truncover;
  input Names :$20. Descriptions :$40.;
  datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;

data v_lookup /view=v_lookup;
  set lookup(keep=names descriptions rename=(names=start descriptions=label)) end=last;
  retain fmtname '$name_desc' type 'c';
  output;
  if last then
    do;
      hlo='O';
      label=' ';
      output;
    end;
run;

proc format cntlin=v_lookup;
run;

data have;
  infile datalines dsd truncover;
  input ID :$10. Names :$20.;
  datalines;
5842,Jose
6359,Toby
8852,Jaime
7821,Eclipse
5636,Brenda
;

data want;
  set have;
  descriptions=put(names,$name_desc.);
run;

proc print data=want;
run;

proc print data=have;
  format names $name_desc.;
run;
ismahero2
Obsidian | Level 7

 

This is a sample of dataset #1 (DS1)

Names Descriptions
John Tall guy
Eclipse Red Car
Toby Back and white dog
Eliazar Refugee
Jose Naturalized

 

This is an example of the second dataset (DS2)

ID Names
5842 Jose
6359 Toby
8852 Jaime
7821 Eclipse
5636 Brenda

 

 

This is what I would like to have:

ID Names Descriptions
5842 Jose Naturalized
6359 Toby Back and white dog
8852 Jaime  
7821 Eclipse Red Car
5636 Brenda  

 

 

The results I am receiving from the last data set have the third column empty.

 

Thank you for your help.

 

 

 

 

 

 

Patrick
Opal | Level 21

In the future if you're after actual code please provide sample data via tested SAS data step code the way I've done below for tables lookup and have. This allows us to spend the time answering your question instead of preparing sample data.

data lookup;
  infile datalines dsd truncover;
  input Names :$20. Descriptions :$40.;
  datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;

data have;
  infile datalines dsd truncover;
  input ID :$10. Names :$20.;
  datalines;
5842,Jose
6359,Toby
8852,Jaime
7821,Eclipse
5636,Brenda
;

data want;
  set have;
  if _n_=1 then
    do;
      if 0 then set lookup(keep=Descriptions);
      dcl hash h1(dataset:'lookup');
      h1.defineKey('Names');
      h1.defineData('Descriptions');
      h1.defineDone();
    end;
  if h1.find() ne 0 then call missing(Descriptions);
run;

proc print data=want;
run;

 

 

ismahero2
Obsidian | Level 7

Thank you so much!

ballardw
Super User

Examples of providing your data in the form of data step code. Then using a data step merge, another way to match  up data on the value of one or more variables after sorting. Only tricky bit is the data set option to indicate which names come from DS2 to create a boolean (true/false 1/0) temporary variable that can be tested to filter on.

data ds1;
  infile datalines dlm=',';
  input Names $ Descriptions :$25.;
datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;

 

data ds2;
   input ID $	Names $;
datalines;
5842 	Jose
6359 	Toby
8852 	Jaime
7821 	Eclipse
5636 	Brenda
;

proc sort data=ds1;
  by names;
run;
proc sort data=ds2;
  by names;
run;

data want;
  merge ds2 (in=inds2)
        ds1
  ;
  by names;
  if inds2;
run;

Or Proc sql left join:

proc sql;
   create table want2 as
   select ds2.id, ds2.names, ds1.descriptions
   from ds2
        left join
        ds1
        on ds2.names=ds1.names
   ;
quit;

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1617 views
  • 1 like
  • 4 in conversation