- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi~Anyone who is good at sas please trun sql in data step.
create table tableD nologging as
select a.acct,
b.app_dt,
case when a.acct in (select acct from tableC) then 1 else 0 end prom_ind
from tableA a
inner join tableB b
on a.application_no = b.application_no
where to_char(b.app_dt,'yyyymmdd') between '20150101' and '20150630'
;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please provide examples (data steps with datalines) for tablea, tableb and tablec.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
hi expert,only looking for code the right with SAS data step syntax is fine.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Geo- wrote:
hi expert,only looking for code the right with SAS data step syntax is fine.
Code is driven by the data. Help us to help you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Any particular problem with existing one..?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why do you want to change a working process into a possible POS.
If it works leave it alone. Why make a sports car run like a VW.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There may be many good reasons for converting proc sql code to data step code.
One of the most common is to take advantage of situations in which tables to be joined based on equality of variables arise from data sets sorted on the respective join variables. This would almost always be a speed enhancer for large data sets.
The data step might be the sports car.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
thank you @mkeintz that makes since. In the programming that I do I find myself using the datastep about 70 percent of the time and I never really know why I used it so much. Others always ask me why I do it in the datastep when I could use a case statement but your answer must reflect inner knowledge that I just not aware of but spontaneously do.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would say, as a general rule, if the data sets are sorted advantageously, that most data step code utilizing that order is likely to be faster than the equivalent proc sql. That includes:
- MERGE (or SET) with BY statement (where the by is utilizing the physical record sequence instead of data set index on the by variable), which I mentioned earlier.
- LAG functions (which aren't even supported in proc sql).
- self-MERGEs to get lead values,
e.g. MERGE have have (firstobs=2 keep=x rename=(x=xlead1));. - rolling windows for time series (sometimes better than the time-series-oriented PROC EXPAND)
- interpolating observations.
Edited addition: Also, in cases where use of arrays are beneficial in data steps, I am not aware of an analogous coding structure in proc sql. But my knowledge of proc sql is quite inferior to my knowledge of the data step, so it might be out there in the documentation. If it is, I'd welcome a pointer.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You need to show the data.
Are ACCT and APP_DT variables in TABLEA or TABLEB? Are TABLEA and TABLEB sorted by APPLICATION_NO? How does relationship between ACCT and APPLICATION_NO work? Is ACCT a unique key on TABLEC? Does TABLEC have an index on ACCT?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Geo- wrote:
hi,expert~I have updated the question,please take a look
Still no example data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Excel files are NOT SAS datasets. They can tell us nothing about column attributes like lengths, formats etc, as those informations are mostly lost during the export.
Please supply example data as advised.