BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
buechler66
Barite | Level 11

 

I would like to merge these two tables. Both tables have the exact same variable names. Is there any way to append a string to the front of each variable name? Like having all variables coming from table aliased as 'A' would have 'A_' as part of the variable name.  And columns coming from the table aliased as 'B' would be prefixed with 'B_'.  Or is there a built-in way to handle this?  I would be doing this so that I could later subset based on variable comparisons rules between like variables.  For example:  

 

if A_DELIVERY > B_DELIVERY;

 

 

proc sql;
	create table MergedData as 
		select  a.delivery, b.delivery, a.type_cd, b.type_cd, ...more
	        from iv_ora.ods_iv_recon_selected_mp a,  bids_ora.ods_bi_recon_selected_mp b                                                                                                                                                                                                                                                                                                                                                                                                                                                   
		where a.imb_dlvry_zip_5 = '14221'
		  and a.imb_code = b.imb_code;
quit;

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

To get the IDs where a.DELIVERY > b.DELIVERY

 

proc sql;
create table A_gt_B_IDs as
select a.id
from
	have as a inner join 
	have as b on 
		a.id = b.id and 
		a.DELIVERY > b.DELIVERY
where a.source="A" and b.source="B";
quit;

/*Then you can, for example, extract all records for those IDs with: */

proc sql;
create table A_gt_B as
select *
from have
where id in (select id from A_gt_B_IDs);
quit;

 

PG

View solution in original post

12 REPLIES 12
gamotte
Rhodochrosite | Level 12
Hi,

SELECT a.delivery AS A_delivery, b.delivery AS B_delivery, ,...
buechler66
Barite | Level 11
yah, but there are hundreds of columns so I was hoping for a less manual way of doing this. Any thoughts?
gamotte
Rhodochrosite | Level 12

If you want to do this for all columns of two datasets having the same structure, you can use proc contents with the out parameter. Then you can use the generated output table to create a macrovariable contaning the list of variables in your select :

 

[not tested]

SELECT cat("a.",NAME," AS A_",NAME), cat("b.",NAME," AS B_",NAME)

INTO :a_cols SEPARATED BY ',', :b_cols SEPARATED BY ','

FROM contents_a; /* Table generated by proc contents on table A */

 

Then

 

SELECT &a_cols., &_bcols

 

If the structure differ, you can adapt by creating intermediary tables containing only the relevant columns

ballardw
Super User

Minor recomendation JOIN ON the imb_code instead of selecting WHERE after doing the join of all records. It doesn't take many records in the two sets to start seeing performance hits : 1000(records in A) * 1000 (records in B) get combined for 1,000,000 records that you then select from.

Also use

 

from (select * from iv_ora.ods_iv_recon_selected_mp where a.imb_dlvry_zip_5 = '14221') as a

to further reduce the size of the comparison sets.

buechler66
Barite | Level 11

So if I understand you correctly it would look like this?

 

	proc sql;
	create table QueryData as 
		select  b.actual_dlvry_date as AD_DT, 
			b.imb_code length = 31, 
			b.spm_calc_batch_date
	         from (select * from iv_ora.ods_iv_recon_selected_mp where imb_dlvry_zip_5 = '14221') a
                 inner join (select * from bids_ora.ods_bi_recon_selected_mp where imb_dlvry_zip_5 = '14221') b                                                                                                                                                                                                                                                                                                                                                                                                                                                   
		         on a.imb_code = b.imb_code
		where &rule;
	quit;
PGStats
Opal | Level 21

Insteat of creating a wide data structure

 

A_id B_id A_x B_x ...

 

i.e. multiple columns with the same type of information, you should try to work with a long structure

 

'A' id x ...

'B' id x ...

 

that will allow simpler data manipulation and scaling.

 

For example, your two tables could be concatenated with

 

data want;

set iv_ora.ods_iv_recon_selected_mp(in=a) bids_ora.ods_bi_recon_selected_mp;

if a then source="A" else source="B";

run;

 

or

 

proc sql;

create table want as

select "A" as source, * from iv_ora.ods_iv_recon_selected_mp

union all corresponding

select "B" as source, * from bids_ora.ods_bi_recon_selected_mp;

quit;

PG
buechler66
Barite | Level 11

Interesting, but then how would I do my rule comparison in the long structure? In the wide structure the variables are on the same record and the comparison would look like:

if A_DELIVERY > B_DELIVERY then do...

 

How would this look in the long structure?

PGStats
Opal | Level 21

With SQL, you would do:

 

select a.id, a.DELIVERY > b.DELIVERY as comparison

from have as a inner join have as b on a.id=b.id

where a.source="A" and b.source="B"

 

in a data step, you would do:

 

merge

    have(where=(source="A") rename=delivery=a_delivery)

    have(where=(source="B") rename=delivery=b_delivery);

by id;

comparison = a_delivery > b_delivery;

 

But, in most cases, source is not going to be pertinent information. Fields such as Year, Location, or Product will more likely distinguish information from different sources.

PG
buechler66
Barite | Level 11

I keep coming back to your example, but how does creating a variable named Comparison (e.g. a.DELIVERY > b.DELIVERY as comparison) help me to actually select the records where a.DELIVERY is greater than b.DELIVERY?  I need to actually create a ds of records where a.DELIVERY > b.DELIVERY.

PGStats
Opal | Level 21

To get the IDs where a.DELIVERY > b.DELIVERY

 

proc sql;

create table AgtB as

select a.id

from

have as a inner join have as b on a.id=b.id and a.DELIVERY > b.DELIVERY

where a.source="A" and b.source="B"

PG
PGStats
Opal | Level 21

To get the IDs where a.DELIVERY > b.DELIVERY

 

proc sql;
create table A_gt_B_IDs as
select a.id
from
	have as a inner join 
	have as b on 
		a.id = b.id and 
		a.DELIVERY > b.DELIVERY
where a.source="A" and b.source="B";
quit;

/*Then you can, for example, extract all records for those IDs with: */

proc sql;
create table A_gt_B as
select *
from have
where id in (select id from A_gt_B_IDs);
quit;

 

PG
buechler66
Barite | Level 11

I've got it working! Thanks so much for taking the time to help. I really appreciate it.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 26226 views
  • 4 likes
  • 4 in conversation