i am using the below step to get obs with origin which are not in _assignments_ and only present in _values_ but not able to get.
proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column;
quit;
I am not sure where i am wrong as i am using left join.
any help?
With a left join, you will always get all observations from the first table, and where a match is found in the other tables, their data will be added (left missing otherwise).
What you want might be achieved with
proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column
where b.origin is missing;
quit;
or
data sv1;
merge
_values (in=a)
_assignments (in=b keep=origin column)
;
by origin column;
if a and not b;
run;
(proper sorting required)
With a left join, you will always get all observations from the first table, and where a match is found in the other tables, their data will be added (left missing otherwise).
What you want might be achieved with
proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column
where b.origin is missing;
quit;
or
data sv1;
merge
_values (in=a)
_assignments (in=b keep=origin column)
;
by origin column;
if a and not b;
run;
(proper sorting required)
Thanks Kurt.
just a small clarification
does
where b_origin is missing;
in proc sql gives origin which is missing ?
i need if origin and column is present in both then i do not need that observation
When you do a left join on a condition, and no match is found from the right, then the variables from the "right" table will all be set to missing. If a match is found, the variable(s) used in the condition can not be missing, by definition.
Edit: your post called my attention to a typo I made (corrected in the meantime). The condition should be
where b.origin is missing;
Thanks once again kurt.
For my final clarification i will get only observations which will not have a common origin and column observations in both the datasets(which is what i need). i.e if i have same origin and column name in both of my datasets then i will ommit that record and only take records which has _values_ where origin and column are different and not present in _assignments_.
Sorry for this. its just for my understanding
If in doubt, do a test run and look at the results. If puzzled, inquire further.
(Maxim 4 from the Maxims of Maximally Efficient SAS Programmers).
@Kurt_Bremser wrote:
If in doubt, do a test run and look at the results. If puzzled, inquire further.
(Maxim 4 from the Maxims of Maximally Efficient SAS Programmers).
@Kurt_Bremser do you have more for that list? My google-fu only returns the Mercenary list (which I strongly suspect was an inspiration to your response).
And now I need some Ovalqwik!
@ballardw wrote:
@Kurt_Bremser wrote:
If in doubt, do a test run and look at the results. If puzzled, inquire further.
(Maxim 4 from the Maxims of Maximally Efficient SAS Programmers).
@Kurt_Bremser do you have more for that list? My google-fu only returns the Mercenary list (which I strongly suspect was an inspiration to your response).
And now I need some Ovalqwik!
I already have more for that list, and I plan to release it here as an ongoing project, as I expect that many would have valid maxims to contribute. And you are, of course, completely right where the inspiration came from. Tagon's Toughs to the rescue!
A first draft of the MoMESP has been saved at https://communities.sas.com/t5/SAS-Communities-Library/Maxims-of-Maximally-Efficient-SAS-Programmers...
Hi Kurt,
I tried the same but still facing problems as the variable origin has multiple observations and due the NULL condition it is removing all and i only want to remove if it is same with origin and column and originwhere.
Attached is the sample datasets.
Can you guide me if possible
These are not SAS datasets, but Excel spreadsheets. Use the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your example datasets into data steps for posting as text here.
Those data steps will preserve attributes (lengths, formats), and can be easily copied into SAS. MS Office files are often blocked by firewalls for security reasons.
Also post the code you used, and a quick representation of the expected result (no data step code needed there), as it differs from what you got with the code.
I am using the below code
proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column
where b.origin is missing;
quit;
i need only observations which are in _values_ and not in _assignments_. so if i have a common observation in bothe the datasets with origin,originwhere i do not need them.
Currently it is not working as there are multiple observations of origin with diffirent column and originwhere with the code.
So you just need originwhere as an additional condition for the join?
Yes, i tried adding it in and a.originwhere=b.originwhere but i get few observations which are in _assignments_ coming in my dataset
Which of the 230 observations you get with:
proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column and a.originwhere=b.originwhere
where b.origin is missing;
quit;
should not be in the result dataset (just provide observation numbers)?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.