Solved: Re: Merge and proc sql

Pooja98 · Posted 01-06-2020 06:14 AM

Difference between merge and proc sql

unison · Posted 01-06-2020 09:11 AM

Try this

data have;
	input name :$10.;
	datalines;
Ram
Seetha
Radha
Bheem
Arjun
Seetha
Raj
Ram
;
run;

data desired_output;
	input name :$10.;
	datalines;
Ram
Seetha
Radha
Bheem
Arjun
Raj
;
run;

data have_with_rownum;
	set have;
	rownum=_n_;
run;

proc sort data=have_with_rownum nodupkey out=dedup;
	by name;
run;

proc sort data=dedup out=want(keep=name);
	by rownum;
run;

/*Check that want=desired_output*/
proc compare base=desired_output compare=want;
run;

-unison

View solution in original post

ed_sas_member · Posted 01-06-2020 06:40 AM

Hi @Pooja98

Both DATA step merges and PROC SQL joins enable to join two or more tables.

Sometimes they can produce the same outputs.

Here are some of the main differences:

- contrary to DATA step merges, you don't need to sort your data when you use PROC SQL joins

- you need to add a "create table xxx as" clause in the PROC SQL as the default output is a report

- common columns need to have the same name in a DATA step merge, which is not the case for PROC SQL.

- PROC SQL does not overlay common columns by default -> you need to mention one explicitly in the select clause or use the COALESCE option

- Another thing is that you can use another operator that = in you joint expression.

- in case your input datasets have not a unique "key" for merging, it is useful to use PROC SQL as by design, it will create the Cartesian product of input datasets as the result and then delete observations that don't meet the join expression condition(s) -> so depending on the method you choose, you can have different results.

PROC SQL can be also more efficient, especially if you have large datasets.

Hope this help!

Best,

Pooja98 · Posted 01-06-2020 07:53 AM

Thanks a lot!!

Pooja98 · Posted 01-06-2020 07:53 AM

What is %include statement??

ed_sas_member · Posted 01-06-2020 07:55 AM

Hi @Pooja98

The %include statement allow you to bring a SAS program stored in an external file in your current program.

Here is the documentation about this statement: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000214504.htm

ChrisHemedinger · Posted 01-06-2020 07:58 AM

Long-time SAS user and community member @Rick_Andrews summarizes the differences and techniques in this article:

Traditional SAS Programming versus PROC SQL

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!

Pooja98 · Posted 01-06-2020 08:21 AM

Hi, can anyone help me with this???
I need to remove duplicates and at the same time i don't want to change my order

Name

Ram
Seetha
Radha
Bheem
Arjun
Seetha
Raj
Ram

The output should be :

Ram
Seetha
Radha
Bheem
Arjun
Raj

I tried both nodupkey and distinct, i got the results but name variable displaying in ascending order

unison · Posted 01-06-2020 09:11 AM

Try this

data have;
	input name :$10.;
	datalines;
Ram
Seetha
Radha
Bheem
Arjun
Seetha
Raj
Ram
;
run;

data desired_output;
	input name :$10.;
	datalines;
Ram
Seetha
Radha
Bheem
Arjun
Raj
;
run;

data have_with_rownum;
	set have;
	rownum=_n_;
run;

proc sort data=have_with_rownum nodupkey out=dedup;
	by name;
run;

proc sort data=dedup out=want(keep=name);
	by rownum;
run;

/*Check that want=desired_output*/
proc compare base=desired_output compare=want;
run;

-unison

Pooja98 · Posted 01-07-2020 08:30 AM

SAS Training: Just a Click Away