Hi,
I have a dataset with two variables; one is named "Names" and the other one is named "Descriptions". I have 71 rows in this dataset. I have another dataset with multiple variables and one variable named "Names". I want to load dataset #1 into an array. Then I want to match the variable "names" in dataset #2 to the array "Name", if it matches, then create a new variable with the content of the array "Description". If it doesn't match, then leave the new variable empty.
Based on the put statement in the second data step, I am creating the arrays correctly (With the right content), but the problem comes i the last data step. Any help?
Data _null_; Set DS1 nobs=num_obs; call symput('num_obs_array',num_obs); stop; run; %Put Observations= &num_obs_array.; Data _null_; Set DS1 end=eof; if _N_=1 then do; array Name[&num_obs_array.] $32.; array Description[&num_obs_array.] $32000; End; Name[_N_]=Names; Description[_N_]=Descriptions; if eof then do; do i = 1 to &num_obs_array.; put 'Name['i'] = ' name[i]; put 'Description['i'] = ' Description[i]; end; end; retain Name Description; run; Data DS_Want; Set DS2; if _N_=1 then do; array Name[&num_obs_array.] $32.; array Description[&num_obs_array.] $32000; end; Length Description_New $32000.; do i = 1 to &num_obs_array.; if strip(names)=strip(Name[i]) then do; Description_New=Description[i]; leave; end; end; Run;
In the future if you're after actual code please provide sample data via tested SAS data step code the way I've done below for tables lookup and have. This allows us to spend the time answering your question instead of preparing sample data.
data lookup;
infile datalines dsd truncover;
input Names :$20. Descriptions :$40.;
datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;
data have;
infile datalines dsd truncover;
input ID :$10. Names :$20.;
datalines;
5842,Jose
6359,Toby
8852,Jaime
7821,Eclipse
5636,Brenda
;
data want;
set have;
if _n_=1 then
do;
if 0 then set lookup(keep=Descriptions);
dcl hash h1(dataset:'lookup');
h1.defineKey('Names');
h1.defineData('Descriptions');
h1.defineDone();
end;
if h1.find() ne 0 then call missing(Descriptions);
run;
proc print data=want;
run;
@ismahero2 wrote:
Hi,
I have a dataset with two variables; one is named "Names" and the other one is named "Descriptions". I have 71 rows in this dataset. I have another dataset with multiple variables and one variable named "Names". I want to load dataset #1 into an array. Then I want to match the variable "names" in dataset #2 to the array "Name", if it matches, then create a new variable with the content of the array "Description". If it doesn't match, then leave the new variable empty.
Based on the put statement in the second data step, I am creating the arrays correctly (With the right content), but the problem comes i the last data step. Any help?
It would be extremely helpful if you could show us a small example of this problem, so we can see the datasets (or small examples of these datasets) that you are working with. Please follow these examples and instructions to provide us example SAS data sets we can work with.
It would be extremely helpful if you could show us the output you are getting, and then next show us the output you want to get. Saying that "the problem comes in the last data step" doesn't mean anything to anyone, you haven't even described what is wrong, and you haven't described the desired result.
Why does this need to be an array. The usual (and efficient) two approaches for such lookups of values over a key are:
Option 1: Create a format from the table with key/value pairs, add an other case that return a blank, use the format in the data step.
Option 2: Load the table into a data step hash lookup table and lookup the values via the find() method.
And here sample code for creating a format. Creating a format has in your case the advantage that once created you can use it in multiple data steps or even directly in Proc's without even having to create a new variable.
data lookup;
infile datalines dsd truncover;
input Names :$20. Descriptions :$40.;
datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;
data v_lookup /view=v_lookup;
set lookup(keep=names descriptions rename=(names=start descriptions=label)) end=last;
retain fmtname '$name_desc' type 'c';
output;
if last then
do;
hlo='O';
label=' ';
output;
end;
run;
proc format cntlin=v_lookup;
run;
data have;
infile datalines dsd truncover;
input ID :$10. Names :$20.;
datalines;
5842,Jose
6359,Toby
8852,Jaime
7821,Eclipse
5636,Brenda
;
data want;
set have;
descriptions=put(names,$name_desc.);
run;
proc print data=want;
run;
proc print data=have;
format names $name_desc.;
run;
This is a sample of dataset #1 (DS1)
Names | Descriptions |
John | Tall guy |
Eclipse | Red Car |
Toby | Back and white dog |
Eliazar | Refugee |
Jose | Naturalized |
This is an example of the second dataset (DS2)
ID | Names |
5842 | Jose |
6359 | Toby |
8852 | Jaime |
7821 | Eclipse |
5636 | Brenda |
This is what I would like to have:
ID | Names | Descriptions |
5842 | Jose | Naturalized |
6359 | Toby | Back and white dog |
8852 | Jaime | |
7821 | Eclipse | Red Car |
5636 | Brenda |
The results I am receiving from the last data set have the third column empty.
Thank you for your help.
In the future if you're after actual code please provide sample data via tested SAS data step code the way I've done below for tables lookup and have. This allows us to spend the time answering your question instead of preparing sample data.
data lookup;
infile datalines dsd truncover;
input Names :$20. Descriptions :$40.;
datalines;
John,Tall guy
Eclipse,Red Car
Toby,Back and white dog
Eliazar,Refugee
Jose,Naturalized
;
data have;
infile datalines dsd truncover;
input ID :$10. Names :$20.;
datalines;
5842,Jose
6359,Toby
8852,Jaime
7821,Eclipse
5636,Brenda
;
data want;
set have;
if _n_=1 then
do;
if 0 then set lookup(keep=Descriptions);
dcl hash h1(dataset:'lookup');
h1.defineKey('Names');
h1.defineData('Descriptions');
h1.defineDone();
end;
if h1.find() ne 0 then call missing(Descriptions);
run;
proc print data=want;
run;
Thank you so much!
Examples of providing your data in the form of data step code. Then using a data step merge, another way to match up data on the value of one or more variables after sorting. Only tricky bit is the data set option to indicate which names come from DS2 to create a boolean (true/false 1/0) temporary variable that can be tested to filter on.
data ds1; infile datalines dlm=','; input Names $ Descriptions :$25.; datalines; John,Tall guy Eclipse,Red Car Toby,Back and white dog Eliazar,Refugee Jose,Naturalized ; data ds2; input ID $ Names $; datalines; 5842 Jose 6359 Toby 8852 Jaime 7821 Eclipse 5636 Brenda ; proc sort data=ds1; by names; run; proc sort data=ds2; by names; run; data want; merge ds2 (in=inds2) ds1 ; by names; if inds2; run;
Or Proc sql left join:
proc sql; create table want2 as select ds2.id, ds2.names, ds1.descriptions from ds2 left join ds1 on ds2.names=ds1.names ; quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.