BookmarkSubscribeRSS Feed
imanojkumar1
Quartz | Level 8

 

What I want to do is to find an alternative to the following code:

   

PROC SQL;
       CREATE TABLE WORK.XX AS
       SELECT DISTINCT t2.WC, t2.CWC
          FROM WORK.YY t1
               INNER JOIN WORK.ZZ t2 ON (t1.MC = t2.WC)
    ;
QUIT;



Could someone please help in doing the same thing using hash or any other method?

11 REPLIES 11
Kurt_Bremser
Super User

If there is a 1(YY) to n(ZZ) relation, then

data work.xx;
merge
  work.yy (in=a keep=mc rename=(mc=wc))
  work.zz (in=b keep=wc cwc)
;
by wc;
if a and b;
run;

proc sort data=work.xx nodupkey;
by wc cwc;
run;

should deliver the same result.

imanojkumar1
Quartz | Level 8

Many thanks for your help.

 

unfortunately, it returned me an error:

 

ERROR: BY variables are not properly sorted on data set WORK.YY
a=1 b=1 WC=7507 CWC=  FIRST.WC=0 LAST.WC=1 _ERROR_=1 _N_=585904753

imanojkumar1
Quartz | Level 8
These are relatively large tables. sorting might take a long time. I am providing a sample data in a reply in the trailing communication (please see below). Thanks.
Kurt_Bremser
Super User

The solution depends very much on the size(s). If one of the tables fits into MEMSIZE, creating a format or using a hash object can speed your process up.

If that is not the case, you will have to sort the tables for the merge. In my experience, using explicit sorts and a merge will out-perform SQL when the contributing tables grow large

PeterClemmensen
Tourmaline | Level 20

Do the variables WC and CWC also exist in WORK.YY? 🙂

imanojkumar1
Quartz | Level 8

XX Table has following vars:
MC LC MCC MCN TLC DD

YY:
WC CWC TS MS Y X

 

Where MC and WC are same datatype (char) and common values but only var names are different i.e. in XX it is MC and in YY it is WC

 

CWC is only in YY and not in XX

 

LinusH
Tourmaline | Level 20
That's a quite neat SQL, why do you need an alternative?
Data never sleeps
imanojkumar1
Quartz | Level 8

I have the following tables:

 

data have01;
infile cards truncover expandtabs;
input MC $ LC $ MCC $ MCN $ TLC $ DD $ ODS_TimeStamp ODS_LUpd zTPl $ PuD $;
cards;
1853 DR14 1 Vetu SM3 . 24SEP2013:10:06:53.580 20JUL2016:12:55:39.240 0 .
1856 DR14 1 Vetu SM3 . 24SEP2013:10:06:53.580 20JUL2016:12:55:39.240 0 .
1869 DR14 1 Vetu SM3 . 24SEP2013:10:06:53.580 20JUL2016:12:55:39.240 0 .
2024 DV16 1 Vetu SM3 2008-01-31 24SEP2013:10:06:53.580 20JUL2016:12:55:39.240 47 .
2025 DV16 1 Vetu SM3 2008-01-31 24SEP2013:10:06:53.580 20JUL2016:12:55:39.240 47 .
run;

 

You might have to format date column in the above table.

 

data have02;
infile cards truncover expandtabs;
input WPMVId ToSTimeStamp TId ASN WC $ CWC $ TSide $ MNo Y X;
cards;
1 21AUG2012:17:57:39.000 20949 1 7604 HPUS230 R 1 -82140 2468
2 21AUG2012:17:57:39.000 20949 1 7604 HPUS230 R 2 -81940 2466
3 21AUG2012:17:57:39.000 20949 1 7604 HPUS230 R 3 -81739 2463
4 21AUG2012:17:57:39.000 20949 1 7604 HPUS230 R 4 -81539 2459
5 21AUG2012:17:57:39.000 20949 1 7604 HPUS230 R 5 -81339 2456
6 21AUG2012:17:57:39.000 20949 1 7604 HPUS230 R 6 -81139 2453
run;


 

You might have to format date column in the above table.

 

Please help me using some alternative to SQL code above, specifically when I have issue that my Table 2 above is almost 0.8 billion rows data and it takes hell a lot time to run SQL query as above.

LinusH
Tourmaline | Level 20
But how large is t1?
Are the tables by any chance indexed on MC/WC respectively?
And how large/small does t1 subsets t2 aproxmately?
Unless small sub with indexes hash table or user def format are your best shots.
Data never sleeps

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2594 views
  • 3 likes
  • 4 in conversation