Hello team,
Would would be the context left and right join in SAS without using in proc sql?
Regards,
Blue blue
Thanks for sending the link.
It is very hard to understand from this document. I have read it several times!
Regards,
blueblue
For an easy entry into the documentation, start with the examples first. Copy the code and run it, then play around with it.
You add dataset option IN= to each data set to identify if the current observation comes from the desired set. The select for output those that you want.
Dummy code:
Data want; merge data1 (in=inone) data2 (in=intwo) ; by keyvariable; /*in a given data step likely only want one of the following for most cases. */ /* select records where data1 contributed*/ if inone; /* or select records where data2 contributed*/ if intwo; /* or select records where both data1 and data2*/ if inone and intwo; /* select records from data1 that had no match in data2*/ if inone and not intwo; run;
The IN variables are basically boolean: 1/0 for Yes/no so can be used in if statements. (Or summed to identify how many data sets contributed to given record since a MERGE statement can have many data sets ).
However the Merge is still not the same as a "join on" when by values are duplicated in both sets.
Basically there isn't, because SQL works different when a many-to-many relationship is encountered.
A DATA step can only replicate a left or right join when there is either a one-to-one, one-to-many or many-to-one relationship.
Once that is verified, read the documentation of the MERGE Statement
As @Kurt_Bremser pointed out, if you have a 1:many, many:1, or 1:1 for every key value, then a MERGE statement can easily work like a left join.
But many-to-many takes more effort, and use of a hash object.
data want;
set right (in=inR) left (in=inL);
by key;
if _n_=1 then do;
declare hash r (dataset:'right (obs=0)',multidata:'Y');
r.definekey('key');
r.definedata(all:'Y');
r.definedone();
end;
if first.key then r.clear();
if inR then r.add();
if inL ;
if r.num_items=0 then output;
else do rc=r.find() by 0 until (r.find_next()^=0);
output;
end;
call missing(of _all_);
run;
Basically, for each KEY, all the right-side observations are read first and loaded into a hash object in memory. Then for each subsequent left observation, the program iterates through the hash object, retrieving values from the right-side and outputting for each iteration.
Hello,
Basically, for each KEY, all the right-side observations are read first and loaded into a hash object in memory. Then for each subsequent left observation, the program iterates through the hash object, retrieving values from the right-side and outputting for each iteration.
Does this
set right (in=inR) left (in=inL);
by key;
determine which table should be read first?
If we had:
set Left (in=inL) right(in=inR)
Is the top reads the right table first even if it is the second table in a row?
Respectfully,
blueblue
@GN0001 wrote:
Hello,
Basically, for each KEY, all the right-side observations are read first and loaded into a hash object in memory. Then for each subsequent left observation, the program iterates through the hash object, retrieving values from the right-side and outputting for each iteration.
Does this
set right (in=inR) left (in=inL); by key;
determine which table should be read first?
If we had:
set Left (in=inL) right(in=inR)
Is the top reads the right table first even if it is the second table in a row?
Respectfully,
blueblue
This is one of those questions that can well be answered by
More generally, since you want to do a left join, you first want to load the RIGHT data into memory, so that it can be repeatedly and inexpensively looped over for each su7bsequent matching LEFT observation. This in turn means you have to read the RIGHT prior to the LEFT, in order to put it in memory. Thus "set right left". The opposite ("set left right") would yield NO data from RIGHT, since it isn't in memory when processing the LEFT dataset.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: