SAS Programming

Geo- · Posted 11-17-2018 01:23 AM

Hi, could anyone who is good at SAS please turn this SQL into a data step?

create table tableD nologging as
select a.acct,
b.app_dt,
case when a.acct in (select acct from tableC) then 1 else 0 end prom_ind
from tableA a
inner join tableB b
on a.application_no = b.application_no
where to_char(b.app_dt,'yyyymmdd') between '20150101' and '20150630'
;
quit;

Kurt_Bremser · Posted 11-17-2018 02:30 AM

Please provide examples (data steps with datalines) for tablea, tableb and tablec.
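For illustration, example data posted as data steps with datalines might look like the sketch below. The column names are taken from the query in the question; the types, lengths, and values are assumptions, and it is only a guess that ACCT sits in TABLEA and that APP_DT is a SAS date rather than a datetime.

```sas
/* Hypothetical sample data: names come from the posted query,
   types, lengths, and values are assumed for illustration only. */
data tableA;
  input acct :$10. application_no :$10.;
  datalines;
A001 APP01
A002 APP02
A003 APP03
;

data tableB;
  input application_no :$10. app_dt :yymmdd10.;
  format app_dt yymmdd10.;
  datalines;
APP01 2015-02-15
APP02 2015-08-01
APP03 2015-06-30
;

data tableC;
  input acct :$10.;
  datalines;
A001
;
```

Posting data this way lets anyone recreate the tables with the same attributes by running the code.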

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Geo- · Posted 11-17-2018 03:07 AM

Hi expert, I am only looking for the code; anything with the right SAS data step syntax is fine.

Kurt_Bremser · Posted 11-17-2018 03:10 AM

@Geo- wrote:

Hi expert, I am only looking for the code; anything with the right SAS data step syntax is fine.

Code is driven by the data. Help us to help you.

LinusH · Posted 11-17-2018 06:29 AM

Why do you want to translate the code?
Is there any particular problem with the existing one?

Data never sleeps

VDD · Posted 11-17-2018 08:34 AM

Why do you want to change a working process into a possible POS?

If it works, leave it alone. Why make a sports car run like a VW?

mkeintz · Posted 11-17-2018 07:06 PM

@VDD

There may be many good reasons for converting proc sql code to data step code.

One of the most common is when the tables to be joined on equality of variables are already sorted by their respective join variables. A data step can take advantage of that order, which is almost always a speed enhancer for large data sets.

The data step might be the sports car.
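As a rough sketch of what such a translation could look like for the query in the question, a sort-merge data step might read as follows. It rests on several assumptions that the thread has not confirmed: ACCT comes from TABLEA and APP_DT from TABLEB, APP_DT is a SAS date value (a datetime would need DATEPART), APPLICATION_NO is unique in at least one of the two tables (a data step MERGE does not reproduce a many-to-many SQL join), and ACCT has the same type and length in TABLEA and TABLEC. The names a_sorted and b_sorted are made up for the example. Note also that NOLOGGING and TO_CHAR in the original are Oracle syntax, so the date filter is written here with SAS date literals.

```sas
/* Sort both inputs by the join key (skip if they are already sorted). */
proc sort data=tableA (keep=acct application_no) out=a_sorted;
  by application_no;
run;

proc sort data=tableB (keep=application_no app_dt) out=b_sorted;
  by application_no;
run;

data tableD;
  /* Load the TABLEC accounts into a hash object once;
     this stands in for the IN (subquery) test. */
  if _n_ = 1 then do;
    declare hash c(dataset: 'tableC (keep=acct)');
    c.defineKey('acct');
    c.defineDone();
  end;

  merge a_sorted (in=ina)
        b_sorted (in=inb);
  by application_no;

  /* Inner join: keep only keys present in both tables. */
  if ina and inb;

  /* Same date window as the WHERE clause (assumes APP_DT is a SAS date). */
  if '01jan2015'd <= app_dt <= '30jun2015'd;

  /* CHECK() returns 0 when ACCT is found in TABLEC. */
  prom_ind = (c.check() = 0);

  keep acct app_dt prom_ind;
run;
```

Whether this is actually faster than the SQL depends on the data: if the tables are not already sorted, the two PROC SORT steps may cost more than the join saves.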

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

VDD · Posted 11-17-2018 11:26 PM

Thank you @mkeintz, that makes sense. In the programming that I do, I find myself using the data step about 70 percent of the time, and I never really knew why I used it so much. Others always ask me why I do it in the data step when I could use a case statement, but your answer must reflect inner knowledge that I am just not aware of but spontaneously apply.

mkeintz · Posted 11-18-2018 12:22 PM

I would say, as a general rule, if the data sets are sorted advantageously, that most data step code utilizing that order is likely to be faster than the equivalent proc sql. That includes:

MERGE (or SET) with BY statement (where the by is utilizing the physical record sequence instead of data set index on the by variable), which I mentioned earlier.
LAG functions (which aren't even supported in proc sql).
Self-MERGEs to get lead values, e.g. MERGE have have (firstobs=2 keep=x rename=(x=xlead1));.
rolling windows for time series (sometimes better than the time-series-oriented PROC EXPAND)
interpolating observations.
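A minimal sketch of two items from the list above, the LAG function and the self-MERGE for lead values, could look like this. The data set names have and want, the variable x, and the sample values are hypothetical.

```sas
/* Hypothetical input: one numeric variable x, already in the desired order. */
data have;
  input x;
  datalines;
10
20
30
;

data want;
  /* The second copy of HAVE starts at observation 2,
     so xlead1 holds the next row's value of x. */
  merge have
        have (firstobs=2 keep=x rename=(x=xlead1));

  /* Previous row's x; missing on the first row. */
  xlag1 = lag(x);
run;
```

Because the second copy of HAVE runs out one observation early, xlead1 is missing on the last row, just as xlag1 is missing on the first.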

Edited addition: Also, in cases where the use of arrays is beneficial in data steps, I am not aware of an analogous coding structure in proc sql. But my knowledge of proc sql is quite inferior to my knowledge of the data step, so it might be out there in the documentation. If it is, I'd welcome a pointer.

Tom · Posted 11-18-2018 10:31 AM

You need to show the data.

Are ACCT and APP_DT variables in TABLEA or TABLEB? Are TABLEA and TABLEB sorted by APPLICATION_NO? How does the relationship between ACCT and APPLICATION_NO work? Is ACCT a unique key on TABLEC? Does TABLEC have an index on ACCT?

Geo- · Posted 11-19-2018 09:23 AM

Hi expert, I have updated the question; please take a look.

Kurt_Bremser · Posted 11-19-2018 09:33 AM

@Geo- wrote:
Hi expert, I have updated the question; please take a look.

Still no example data.

Geo- · Posted 11-19-2018 06:24 PM

Yes, I have put some samples in the attachment.

Kurt_Bremser · Posted 11-20-2018 05:53 AM

Excel files are NOT SAS datasets. They can tell us nothing about column attributes like lengths, formats, etc., as that information is mostly lost during the export.

Please supply example data as advised.

proc sql into data step
