DATA Step, Macro, Functions and more

proc sql - help needed

Accepted Solution Solved
Reply
Contributor
Posts: 41
Accepted Solution

proc sql - help needed

i am using the below step to get obs with origin which are not in _assignments_ and only present in _values_ but not able to get.

 

proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column;
quit;

 

I am not sure where i am wrong as i am using left join.

 

any help?


Accepted Solutions
Solution
‎04-20-2017 08:12 AM
Super User
Posts: 7,863

Re: proc sql - help needed

[ Edited ]
Posted in reply to alexdsa310

With a left join, you will always get all observations from the first table, and where a match is found in the other tables, their data will be added (left missing otherwise).

What you want might be achieved with

proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column
where b.origin is missing;
quit;

or

data sv1;
merge
  _values (in=a)
  _assignments (in=b keep=origin column)
;
by origin column;
if a and not b;
run;

(proper sorting required)

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Solution
‎04-20-2017 08:12 AM
Super User
Posts: 7,863

Re: proc sql - help needed

[ Edited ]
Posted in reply to alexdsa310

With a left join, you will always get all observations from the first table, and where a match is found in the other tables, their data will be added (left missing otherwise).

What you want might be achieved with

proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column
where b.origin is missing;
quit;

or

data sv1;
merge
  _values (in=a)
  _assignments (in=b keep=origin column)
;
by origin column;
if a and not b;
run;

(proper sorting required)

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 41

Re: proc sql - help needed

Posted in reply to KurtBremser

Thanks Kurt.

 

just a small clarification

does 

where b_origin is missing;

in proc sql gives origin which is missing ?

i need if origin and column is present in both then i do not need that observation

Super User
Posts: 7,863

Re: proc sql - help needed

[ Edited ]
Posted in reply to alexdsa310

When you do a left join on a condition, and no match is found from the right, then the variables from the "right" table will all be set to missing. If a match is found, the variable(s) used in the condition can not be missing, by definition.

 

Edit: your post called my attention to a typo I made (corrected in the meantime). The condition should be

where b.origin is missing;

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 41

Re: proc sql - help needed

Posted in reply to KurtBremser

Thanks once again kurt.

 

For my final clarification i will get only observations which will not have a common origin and column observations in both the datasets(which is what i need). i.e if i have same origin and column name in both of my datasets then i will ommit that record and only take records which has _values_  where origin and column are different and not present in _assignments_.

 

Sorry for this. its just for my understanding

Super User
Posts: 7,863

Re: proc sql - help needed

[ Edited ]
Posted in reply to alexdsa310

If in doubt, do a test run and look at the results. If puzzled, inquire further.

(Maxim 4 from the Maxims of Maximally Efficient SAS Programmers).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 11,343

Re: proc sql - help needed

Posted in reply to KurtBremser

KurtBremser wrote:

If in doubt, do a test run and look at the results. If puzzled, inquire further.

(Maxim 4 from the Maxims of Maximally Efficient SAS Programmers).


@KurtBremser do you have more for that list? My google-fu only returns the Mercenary list (which I strongly suspect was an inspiration to your response).

And now I need some Ovalqwik!

Super User
Posts: 7,863

Re: proc sql - help needed


ballardw wrote:

KurtBremser wrote:

If in doubt, do a test run and look at the results. If puzzled, inquire further.

(Maxim 4 from the Maxims of Maximally Efficient SAS Programmers).


@KurtBremser do you have more for that list? My google-fu only returns the Mercenary list (which I strongly suspect was an inspiration to your response).

And now I need some Ovalqwik!


I already have more for that list, and I plan to release it here as an ongoing project, as I expect that many would have valid maxims to contribute. And you are, of course, completely right where the inspiration came from. Tagon's Toughs to the rescue!

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 7,863

Re: proc sql - help needed

A first draft of the MoMESP has been saved at https://communities.sas.com/t5/SAS-Communities-Library/Maxims-of-Maximally-Efficient-SAS-Programmers...

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 41

Re: proc sql - help needed

Posted in reply to KurtBremser

Hi Kurt,

 

I tried the same but still facing problems as the variable origin has multiple observations and due the NULL condition it is removing all and i only want to remove if it is same with origin and column and originwhere.

Attached is the sample datasets.

 

Can you guide me if possible

Super User
Posts: 7,863

Re: proc sql - help needed

Posted in reply to alexdsa310

These are not SAS datasets, but Excel spreadsheets. Use the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your example datasets into data steps for posting as text here.

Those data steps will preserve attributes (lengths, formats), and can be easily copied into SAS. MS Office files are often blocked by firewalls for security reasons.

Also post the code you used, and a quick representation of the expected result (no data step code needed there), as it differs from what you got with the code.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 41

Re: proc sql - help needed

Posted in reply to KurtBremser

I am using the below code

proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column
where b.origin is missing;
quit;

i need only observations which are in _values_ and not in _assignments_. so if i have a common observation in bothe the datasets with origin,originwhere i do not need them.

 

Currently it is not working as there are multiple observations of origin with diffirent column and originwhere with the code.

 

Attachment
Attachment
Super User
Posts: 7,863

Re: proc sql - help needed

Posted in reply to alexdsa310

So you just need originwhere as an additional condition for the join?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 41

Re: proc sql - help needed

Posted in reply to KurtBremser

Yes, i tried adding it in and a.originwhere=b.originwhere but i get few observations which are in _assignments_ coming in my dataset 

Super User
Posts: 7,863

Re: proc sql - help needed

Posted in reply to alexdsa310

Which of the 230 observations you get with:

proc sql;
create table sv1 as
select a.* from _values_ a left join _assignments_ b
on a.origin=b.origin and a.column=b.column and a.originwhere=b.originwhere
where b.origin is missing;
quit;

should not be in the result dataset (just provide observation numbers)?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 22 replies
  • 341 views
  • 3 likes
  • 3 in conversation