Solved: proc sql to data step

Ranjeeta · Posted 04-04-2019 01:31 PM

I am getting an error in the data step but not proc sql what am i doing wrong?

proc sql;
create table icd as
select distinct key_enc from
pa.dad_px_all
where key_enc in (select key_enc from pa.dad_px_all
where px in ('I210', 'I211'))
;quit;

data icd ;

set pa.dad_px_all;

where px in ('I210', 'I211');

if first.key_enc and last.key_enc;

run;

novinosrin · Posted 04-04-2019 01:46 PM

data icd ;

set pa.dad_px_all;

by key_enc ;

where px in ('I210', 'I211');

if first.key_enc ;

run;

See the bold

Make sure your pa.dad_px_all is sorted by by key_enc ;

View solution in original post

novinosrin · Posted 04-04-2019 01:39 PM

Hi @Ranjeeta

I am afraid if you happen to think

if first.key_enc and last.key_enc; /*this picks only the unique occurrence*/

is an equivalent to

select distinct key_enc from /*this picks only one occurrence of each key from multiples or unique*/
pa.dad_px_all

your understanding is incorrect. Anyways, can you post a sample of what you HAVE and WANT so that we can provide you with the correct solution

Ranjeeta · Posted 04-04-2019 01:45 PM

Thanks so how would i rewite the above code in the data step

novinosrin · Posted 04-04-2019 01:46 PM

data icd ;

set pa.dad_px_all;

by key_enc ;

where px in ('I210', 'I211');

if first.key_enc ;

run;

See the bold

Make sure your pa.dad_px_all is sorted by by key_enc ;

Reeza · Posted 04-04-2019 01:49 PM

@novinosrin wrote:

Hi @Ranjeeta

I am afraid if you happen to think

if first.key_enc and last.key_enc; /*this picks only the unique occurrence*/

is an equivalent to

select distinct key_enc from /*this picks only one occurrence of each key from multiples or unique*/
pa.dad_px_all

your understanding is incorrect. Anyways, can you post a sample of what you HAVE and WANT so that we can provide you with the correct solution

Since the final results is only the ID, it doesn't matter here, but does in general

novinosrin · Posted 04-04-2019 01:53 PM

Then plz explain this difference in the results


data have;
input transaction_id;
cards;
1
1
1
1
2
2
3
;

proc sql;
create table want_sql as
select distinct transaction_id
from have;
quit;

data want_datastep;
set have;
by transaction_id;
if first.transaction_id and last.transaction_id;
run;

Ranjeeta · Posted 04-04-2019 01:59 PM

Thanks This makes it very clear

novinosrin · Posted 04-04-2019 02:07 PM

Hi @Ranjeeta I had a similar misunderstanding too. The person who corrected me was @mkeintz who picked a nit in mine when my understanding wasn't clear and I posted a rather incorrect solution stressing that was right. Well Mark has helped me correct and taught me tons in the past but he doesn't seem to do that lately. Probably had enough of me Haha lol

Ok furthermore,

select distinct does an implicit sort and picks one

so using select distinct with an unsorted below

data have;
input transaction_id;
cards;
1
1
1
1
6
6
2
2
3
;

would result in sorted output.

So basically the SQL processor does an implicit sort and picks each unique key whereas the above sample would require a presort for the datastep compiler to execute. So lazy happy go lucky folks like me prefer proc sql unless a datastep is warranted. 🙂

novinosrin · Posted 04-04-2019 02:20 PM

If you want to avoid presort or implicit sort assuming the output order doesn't matter, hash is a good idea


data have;
input transaction_id;
cards;
1
1
1
1
6
6
2
2
3
;

data _null_;
dcl hash h(dataset:'have');
   h.definekey  ("transaction_id") ;
   h.definedata ("transaction_id") ;
   h.definedone () ;
h.output(dataset:'want');
stop;
set have;
run;

Reeza · Posted 04-04-2019 02:10 PM

@novinosrin I think I understand what you're trying to say and I didn't the first time. I guess unique doesn't equal 'single' to me but that technical definition is correct, records are unique or duplicate and in this case OP wants a list of unique keys. Can you tell I'm spending too much time on writing now?

You can find a list of unique keys using PROC SORT, SQL or Data step can accomplish.

novinosrin · Posted 04-04-2019 02:16 PM

Oh well, it's no biggie and trivial to somebody of your experience and expertise where the communication misunderstanding happens too often as sometimes there is lack of clarity.

I was going at length for the benefit of new folks and certainly not you et al. If I didn't the Gentle giant Ballardw would hahaha. I didn;t want to give him a chance

novinosrin · Posted 04-04-2019 02:44 PM

And mere sort and nodupkey should do just fine


proc sort data=pa.dad_px_all(where=(px in ('I210', 'I211')))  out=icd  nodupkey;
by key_enc ;
run;

proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Re: proc sql to data step

Registration is open

Registration is open

SAS Training: Just a Click Away