Extract versus SQL Join in DI Studio: Best practices?

Frequent Contributor
Posts: 89

I'd appreciate it if someone could tell me, or point me to a place where I can learn, about best practices on when and why to use an Extract instead of an SQL Join, and vice versa. For example, if I have an Extract with a very long and complex WHERE clause, is it best practice to switch to an SQL Join to make it look more orderly, even though there's only one input table and no need for joins?

I ask because I remember many jobs where people have used SQL Joins instead of Extracts despite there only being one input table.

Thanks.

Super Contributor
Posts: 644

Re: Extract versus SQL Join in DI Studio: Best practices?

In this situation my preference would be to use SQL rather than a DATA step, purely because SQL syntax is widely used and understood by programmers from a variety of application backgrounds. This can be important if (1) your code will be audited by a third party, or (2) [wash your mouth!] management decides to move away from SAS. So unless you need particular DATA step functionality (first. or last. processing, or arrays), I would advocate SQL if you have the choice.

Richard in Oz

Super User
Posts: 5,254

Re: Extract versus SQL Join in DI Studio: Best practices?

Both Extract and SQL Join generate SQL, so the question of SQL versus a DATA step doesn't arise here...

As you said yourself, SQL Join offers a lot more possibilities in the GUI, and more metadata-driven syntax, than Extract.

I can't say that one or the other is best practice, as long as you have "ordinary" WHERE clauses. But if you feel that the WHERE clause builder in SQL Join is better than the advanced expression builder in Extract, use SQL Join.

If you intend to use more complex things such as sub-queries, SQL Join would be the preferred transformation.
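
As a rough illustration of the kind of single-table filter with a sub-query meant here, a hand-written PROC SQL sketch might look like the following. The table and column names (work.orders, work.high_value, customer_id, order_date, amount) are made up for the example, and this is not the code DI Studio would generate for either transformation.

```sas
/* Hypothetical example: one input table, no join, but a sub-query in the WHERE clause. */
/* work.orders and its columns are assumed names, not taken from any real job.          */
proc sql;
   create table work.high_value as
   select customer_id,
          order_date,
          amount
   from work.orders
   /* keep only the rows whose amount is above the overall average */
   where amount > (select avg(amount) from work.orders);
quit;
```

A plain filter such as `where amount > 1000` is easy enough to express in either transformation; it is conditions like the sub-query above where the SQL Join tends to be easier to build and read.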

Data never sleeps