BookmarkSubscribeRSS Feed
Geo-
Quartz | Level 8

Hi~Anyone who is good at sas  please  trun sql in data step.

create table tableD nologging as
select a.acct,
b.app_dt,
case when a.acct in (select acct from tableC) then 1 else 0 end prom_ind
from tableA a
inner join tableB b
on a.application_no = b.application_no
where to_char(b.app_dt,'yyyymmdd') between '20150101' and '20150630'
;
quit;

 

13 REPLIES 13
Geo-
Quartz | Level 8

hi expert,only looking for code the right with SAS data step syntax is fine.

LinusH
Tourmaline | Level 20
Why do you want to translate the code?
Any particular problem with existing one..?
Data never sleeps
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

Why do you want to change a working process into a possible POS.

If it works leave it alone.  Why make a sports car run like a VW.

mkeintz
PROC Star

@VDD

 

There may be many good reasons for converting proc sql code to data step code.

 

One of the most common is to take advantage of situations in which tables to be joined based on equality of variables arise from data sets sorted on the respective join variables.  This would almost always be a speed enhancer for large data sets.

 

The data step might be the sports car.

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

thank you @mkeintz that makes since.  In the programming that I do I find myself using the datastep about 70 percent of the time and I never really know why I used it so much.  Others always ask me why I do it in the datastep when I could use a case statement but your answer must reflect inner knowledge that I just not aware of but spontaneously do.

 

mkeintz
PROC Star

I would say, as a general rule, if the data sets are sorted advantageously, that most data step code utilizing that order is likely to be faster than the equivalent proc sql.  That includes:

  1. MERGE (or SET) with BY statement (where the by is utilizing the physical record sequence instead of data set index on the by variable), which I mentioned earlier.
  2. LAG functions (which aren't even supported in proc sql).
  3. self-MERGEs to get lead values,
      e.g. MERGE have have (firstobs=2 keep=x rename=(x=xlead1));.
  4. rolling windows for time series (sometimes better than the time-series-oriented PROC EXPAND)
  5. interpolating observations.

 

Edited addition:  Also, in cases where use of arrays are beneficial in data steps, I am not aware of an analogous coding structure in proc sql.  But my knowledge of proc sql is quite inferior to my knowledge of the data step, so it might be out there in the documentation.  If it is, I'd welcome a pointer.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

You need to show the data.

Are ACCT and APP_DT variables in TABLEA or TABLEB? Are TABLEA and TABLEB sorted by APPLICATION_NO?  How does relationship between ACCT and APPLICATION_NO work?  Is ACCT a unique key on TABLEC?  Does TABLEC have an index on ACCT?

 

 

Geo-
Quartz | Level 8
hi,expert~I have updated the question,please take a look
Geo-
Quartz | Level 8
Yes,I have made some samples in the attachment
Kurt_Bremser
Super User

Excel files are NOT SAS datasets. They can tell us nothing about column attributes like lengths, formats etc, as those informations are mostly lost during the export.

Please supply example data as advised.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2128 views
  • 0 likes
  • 6 in conversation