Hi mates,
How can i transform this proc sql in data step, to check witch one is faster ?
PROC SQL;
CREATE TABLE MMM2 AS
SELECT DISTINCT A.X_ACCT_KEY, ( SELECT B.IND_QTD_SMS
FROM MMM B
WHERE B.X_ACCT_KEY = A.X_ACCT_KEY AND
B.IND_QTD_SMS NOT IN ('0', '')
) AS IND_SMS,
( SELECT B.IND_QTD_EMAIL
FROM MMM B
WHERE B.X_ACCT_KEY = A.X_ACCT_KEY AND
B.IND_QTD_EMAIL NOT IN ('0', '')
) AS IND_EMAIL
FROM MMM A;
QUIT;
Thanks in Advance
I don't have the time to program this, but here are some questions to consider that will make life easier for some of the other posters.
Evidently you can have multiple records per X_ACCT_KEY. In that case, what matches do you want to keep? For example, what if the same incoming record has both IND_QTD_SMS='0' and IND_QTS_EMAIL='12345'? What if a second record for the same X_ACCT_KEY has both IND_QTD_SMS='2458' and IND_QTS_EMAIL=' '?
I want to transform this:
Key | SMS | Email
6456545 | 2 |
6456545 | | 2
Into this:
Key | SMS | Email
6456545 | 2 | 2
With this proc sql i can do this, but it takes a long time to run.
I'd like to test with datastep.
Thanks
Given your sample data, unless you've oversimplified wouldn't it be:
proc sql;
create table want as
select key, max(sms) as sms, max(email) as email
from have
group by key;
quit;
or
data want;
update have(obs=0) have;
by key;
run;
Reeza, the possibility of '0' in the data (which must be ignored ... see the original post) complicates things. In the case of UPDATE, there's an obvious problem where '0' could replace a valid value. In the case of SQL, it depends on both the operating system and what might be in the realm of valid values. On some operating systems, '0' is lower than letters. On some, it is greater.
Can you post sample data for testing? I'm fairly certain I understand the query but wouldn't bet on it. Especially when it can be easily tested. Cross joins are an interesting beast in a data step.
Sorry, what is the question, we don't do contract work here? At a breif glance, datastep each subquery for the where clause, then merge all three datasets, then sort nodupkey.
I believe this will work, but without data to test it's hard to say for certain.
data MMM2;
set MMM;
keep X_ACCT_KEY IND_SMS IND_EMAIL;
if IND_QTD_SMS IN ('0', '') then IND_SMS=''; else IND_SMS=IND_QTD_SMS;
if IND_QTD_EMAIL IN ('0', '') then IND_EMAIL=''; else IND_EMAIL=IND_QTD_EMAIL;
run;
proc sort noduplicates; by X_ACCT_KEY IND_SMS IND_EMAIL; run;
OK, given your subsequent description of the goal you are trying to reach:
data mmm2;
set mmm;
by x_acct_key;
if ind_qtd_sms not in ('0', ' ') then new_sms=ind_qtd_sms;
if ind_qtd_email not in ('0', ' ') then new_email = ind_qts_email;
retain new_sms new_email;
if last.x_aact_key;
output;
drop ind_qtd_sms ind_qts_email;
rename new_sms = ind_qtd_sms new_email = ind_qts_email;
new_sms=' ';
new_email=' ';
run;
It requires that your data set is already sorted, but virtually any DATA step approach will require that. If there are multiple records (not just 2) for an account key, you still end up with only one record for the account key. And it should be MUCH faster than the SQL approach.
I solved this by creating two separated data sets:
DATA SMS_30; SET TEST_30D_V3 (WHERE=(CHANNEL_CD EQ '_MP') DROP=IND_QTD_EMAIL);RUN;
DATA EMAIL_30;SET TEST_30D_V3 (WHERE=(CHANNEL_CD EQ '_EM') DROP=IND_QTD_SMS);RUN;
PROC SORT DATA=TEST_30D_V3;
BY X_ACCT_KEY;
RUN;
PROC SORT DATA=SMS_30;
BY X_ACCT_KEY;
RUN;
PROC SORT DATA=EMAIL_30;
BY X_ACCT_KEY;
RUN;
DATA TESTE_EV_30D(DROP=CHANNEL_CD);
MERGE SMS_30 (IN=A DROP=MOMENTO)
EMAIL_30 (IN=B DROP=MOMENTO)
TEST_30D_V3 (IN=C KEEP=X_ACCT_KEY MOMENTO);
BY X_ACCT_KEY;
RUN;
Thanks in Advance
No need to shout, code runs just as well in lower case with consistent identations
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.