BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ranjeeta
Pyrite | Level 9

I am getting an error in the data step but not proc sql what am i doing wrong?

 

proc sql;
create table icd as
select distinct key_enc from
pa.dad_px_all
where key_enc in (select key_enc from pa.dad_px_all
where px in ('I210', 'I211'))
;quit;

 

data icd ;

set pa.dad_px_all;

where px in ('I210', 'I211');

if first.key_enc and last.key_enc;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data icd ;

set pa.dad_px_all;

by key_enc ;

where px in ('I210', 'I211');

if first.key_enc ;

run;

 

See the bold 

 

Make sure your pa.dad_px_all is sorted by by key_enc ;

View solution in original post

11 REPLIES 11
novinosrin
Tourmaline | Level 20

Hi @Ranjeeta   

 

I am afraid if you happen to think 

if first.key_enc and last.key_enc;    /*this picks only the unique occurrence*/

is an equivalent to

 

select distinct key_enc from    /*this picks only one occurrence of each key from multiples or unique*/
pa.dad_px_all

 

your understanding is incorrect. Anyways, can you post a sample of what you HAVE and WANT so that we can provide you with the correct solution

 

 

Ranjeeta
Pyrite | Level 9
Thanks so how would i rewite the above code in the data step
novinosrin
Tourmaline | Level 20

data icd ;

set pa.dad_px_all;

by key_enc ;

where px in ('I210', 'I211');

if first.key_enc ;

run;

 

See the bold 

 

Make sure your pa.dad_px_all is sorted by by key_enc ;

Reeza
Super User

@novinosrin wrote:

Hi @Ranjeeta   

 

I am afraid if you happen to think 

if first.key_enc and last.key_enc;    /*this picks only the unique occurrence*/

is an equivalent to

 

select distinct key_enc from    /*this picks only one occurrence of each key from multiples or unique*/
pa.dad_px_all

 

your understanding is incorrect. Anyways, can you post a sample of what you HAVE and WANT so that we can provide you with the correct solution

 

 


Since the final results is only the ID, it doesn't matter here, but does in general

novinosrin
Tourmaline | Level 20

Then plz explain this difference in the results

 


data have;
input transaction_id;
cards;
1
1
1
1
2
2
3
;

proc sql;
create table want_sql as
select distinct transaction_id
from have;
quit;

data want_datastep;
set have;
by transaction_id;
if first.transaction_id and last.transaction_id;
run;

 

Ranjeeta
Pyrite | Level 9

Thanks This makes it very clear 

novinosrin
Tourmaline | Level 20

Hi @Ranjeeta   I had a similar misunderstanding too. The person who corrected me was @mkeintz  who picked a nit in mine when  my understanding wasn't clear and I posted a rather incorrect solution stressing that was right. Well Mark has helped me correct and taught me tons in the past but he doesn't seem to do that lately. Probably had enough of me  Haha lol 

 

Ok furthermore, 

 

select distinct does an implicit sort and picks one

so using select distinct with an unsorted below


data have;
input transaction_id;
cards;
1
1
1
1
6
6
2
2
3
;

 

would result in sorted output.

 

So basically the SQL processor does an implicit sort and picks each unique key whereas the above sample would require a presort for the datastep compiler to execute. So lazy happy go lucky folks like me prefer proc sql unless a datastep is warranted. 🙂

 

 

 

 

 

 

novinosrin
Tourmaline | Level 20

If you want to avoid presort or implicit sort assuming the output order doesn't matter, hash is a good idea

 


data have;
input transaction_id;
cards;
1
1
1
1
6
6
2
2
3
;

data _null_;
dcl hash h(dataset:'have');
   h.definekey  ("transaction_id") ;
   h.definedata ("transaction_id") ;
   h.definedone () ;
h.output(dataset:'want');
stop;
set have;
run;
Reeza
Super User

@novinosrin  I think I understand what you're trying to say and I didn't the first time. I guess unique doesn't equal 'single' to me but that technical definition is correct, records are unique or duplicate and in this case OP wants a list of unique keys. Can you tell I'm spending too much time on writing now?

 

You can find a list of unique keys using PROC SORT, SQL or Data step can accomplish.

 

 

 

 

novinosrin
Tourmaline | Level 20

Oh well, it's no biggie and trivial to somebody of your experience and expertise where the communication misunderstanding happens too often as sometimes there is lack of clarity.

 

I was going at length for the benefit of new folks and certainly not you et al. If I didn't the Gentle giant Ballardw would hahaha. I didn;t want to give him a chance

novinosrin
Tourmaline | Level 20

And mere sort and nodupkey should do just fine

 


proc sort data=pa.dad_px_all(where=(px in ('I210', 'I211')))  out=icd  nodupkey;
by key_enc ;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2197 views
  • 6 likes
  • 3 in conversation