BookmarkSubscribeRSS Feed
LaurieF
Barite | Level 11

The main difference when you've only got one source dataset is that the Extract transformation doesn't support having. But the extra flexibility (ease of changing procedure and dataset options, for example) make it more superior again. I still use Extract, but only when I know that the requirements for using it are simple and will (in all likelihood) not change.

 

Personally, when I'm writing SAS by hand I tend towards using a data step, because of the control and often sheer speed it gives me. 

 

When joining two or more tables together, I chop and change between an SQL join and a dataset merge/update. It depends on the structure (especially width) and overall size of the datasets. I do have a generic DI transformation that I've written that does a dataset merge - it requires that the two source datasets are sorted. Even with massively denormalised tables on both sides, it screams through - something that SQL has trouble doing.

jwhite
Quartz | Level 8

@LaurieF, how would this logic work of the rows have different values in the grouped columns?

 

For example:

 

A B C D 10
A B C D 10
A B C D 30
A B C D 50 
A B H D 10
A B H D 10
A B H D 10
A B H D 20
F F F D 10
F F F D 30

 With the above and doing a max() on the fifth column...the value in the 5th column for all rows will be 50, which is the highest. Seemingly even when I group the first four. I get:

A B C D 50

A B H D 50

F F F D 50

 

What I desire would be:

A B C D 50

A B H D 20

F F F D 30

 

I guess I would think that the max() would evaluate for the highest value within the group.

 

Is that not correct?

 

Thanks!

jwhite
Quartz | Level 8
Nevermind. I think I've got it.

I appreciate all the help and insight!
ballardw
Super User

If the other columns are numeric and amenable to statistic such as max or min (and few others) as an alternate approach:

 

proc summary data=have nway;

   class col1 col2 col3 col4 ;

   var var1 var2 var3;

   output out=want(drop = _type_ _freq_) max= ;

run;

LinusH
Tourmaline | Level 20
I think that @LaurieF nailed it with "I use Extract when I know that the transformation will remain a simple one".
Extract is easier to use and understand, so as long as Join is not simplified I think it has its place.
If you have only 1-1 mappings it's not that much of penalty to switch to join if you later find out that the Extract isn't enough.
Data never sleeps

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 19 replies
  • 7154 views
  • 1 like
  • 7 in conversation