Hello,
I want to merge 2 tables using Hash Object by keyA and keyB that have the same values(but not the same name in the tables), but in my output I want just a few values from both tables,
Example
table A : keyA, Cl_A_1, Cl_A_2, Cl_A_3, Cl_A_4, CL_A5
table B : keyB, Cl_B_1,Cl_B_2,Cl_B_3,Cl_B_4,Cl_B_5,Cl_B_6
------------------------------------------------------------------
Output : KeyA, keyB, Cl_A_1, Cl_A_2, Cl_B_1,Cl_B_2,Cl_B_3
Thanks!
Below how I would code for this. Make sure that you load the table with the lower volume into the hash. That's from a memory consumption and performance perspective much more important than if you've got a few more variables than you would need in the PDV. As long as you only write variables to the output table that you need (drop/keep) the difference to performance is negligible.
data Set1;
if _n_ = 1 then
do;
if 0 then set class(keep=name sex age);
declare Hash Match1 (dataset:'class');
Match1.DefineKey ('Name');
Match1.DefineData ('Name','Sex','Age');
Match1.DefineDone ();
end;
call missing(of _all_);
set cities(keep=Persona department);
if Match1.find(key:Persona) = 0;
/* potentially more statements */
drop persona;
run;
Also be aware that at least in your sample data column Persona has a length of 10 but Name only a length of 8. Should the string length stored in Persona exceed 8 characters then the hash lookup will return an error like: ERROR: Argument length greater than length of key variable Name
Plesse provide sample data in the form of a working DATA step, and a working DATA step with the desired result.
With that clarification, we'll be able to determine a way to get from sample input to desired output, with tested code.
Help us help you.
Easiest is to just rename the key variable to what's in your base table when loading into the hash table.
dcl hash h1 (dataset:'your_table(rename=(keyB=KeyA))');
"but in my output I want just a few values from both tables"
You need to further specify what this is about AND you need to provide sample data and desired result. Please provide the sample data via a working SAS datastep that creates such data.
Here is an example :
Add these two statements to your IF N_ = 1 block:
length name $8 sex $8 age 8;
call missing(name,sex,age);
You have to tell the data step compiler about the new variables, so it can compile them into the PDV. The second statement is just there to avoid the "uninitialized" NOTE.
Hello,
In fact, in my output I just want de keep Name Sex Age and department and not (zipcode, City, Persona)
I dont even want that the variables (zipcode, City, Persona) went to PVD
Thanks
Use either the DROP= or KEEP= dataset option in your SET statement.
In this case it will be process in PDV, but I want to avoid that for efficiency because my data are very big!
Thanks
When you exclude variables in the SET statement, which controls the reading of a dataset, those variables will not be part of the PDV. You can verify this with a PUT _ALL_; statement.
BTW since the whole dataset file will be p?hysically read anyway, the number of variables in the PDV won't have an influence on processing time, only on memory used, and with so few as you describe, the effect of dropping them will be nil.
I put a keep just after If 0 (I dont know if it is that you mean by Set reading statment)
but still processing all the variables in PVD (please see log below)
Code :
log :
Hello,
It's what I did but still not solve the problem!
Thanks!
@ssafmed wrote:
Hello,
It's what I did but still not solve the problem!
Thanks!
if 0 then set class cities;keep Name Sex Age Persona Department;if _n_ = 1 then do;declare Hash Match1 (dataset:'class');
You are still using the statement instead of the dataset option. Please read the documentation I linked for you.
Let's see what your code does:
if 0 then set class cities;
Since SAS will evaluate zero as FALSE the SET statement will never run. But the data step compiler will see it and add all of the variables from both the CLASS and CITIES datasets.
keep Name Sex Age Persona Department;
This says to only output these 5 specific variables. If CLASS is like SASHELP.CLASS then the first three will be defined as in CLASS and the second two will be defined as in CITIES. If any of those five variables do not exist in either CLASS or CITIES and are not later defined explicitly or implicitly by other code in the step then SAS not create them and instead write a WARNING message to the SAS log.
Perhaps you meant to do something like:
if 0 then set class(keep=Name Sex Age) cities(keep=Persona Department);
Now the data step compile step will only add those specific 5 variables. But you will have to also check other places where you use either CLASS or CITIES in another SET or MERGE statement to make sure that does not bring in other variables.
Below how I would code for this. Make sure that you load the table with the lower volume into the hash. That's from a memory consumption and performance perspective much more important than if you've got a few more variables than you would need in the PDV. As long as you only write variables to the output table that you need (drop/keep) the difference to performance is negligible.
data Set1;
if _n_ = 1 then
do;
if 0 then set class(keep=name sex age);
declare Hash Match1 (dataset:'class');
Match1.DefineKey ('Name');
Match1.DefineData ('Name','Sex','Age');
Match1.DefineDone ();
end;
call missing(of _all_);
set cities(keep=Persona department);
if Match1.find(key:Persona) = 0;
/* potentially more statements */
drop persona;
run;
Also be aware that at least in your sample data column Persona has a length of 10 but Name only a length of 8. Should the string length stored in Persona exceed 8 characters then the hash lookup will return an error like: ERROR: Argument length greater than length of key variable Name
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.