BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ranjeeta
Pyrite | Level 9
proc sort data=deno_2 out=deno_2sorted;
by HCNE;
run;

data deno_3v1;
set deno_2sorted;
by HCNE;
if first.HCNE and last.HCNE;
run;
/*72,122*/ proc sql; create table deno_3 as select distinct HCNE from deno_2 ; quit;*78,368 , make sure unique patient only;

Hello Can someone advise why the 2 codes above would return different results 

Thanks

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

In the data step you're selecting only rows which are already unique in the source table (first AND last).

The SQL is deduping the rows from the source table so it also returns a unique row where you've got duplicates in source.

 

data deno_2;
  hcne=1; output;
  hcne=2; output;output;
  stop;
run;

proc sort data=deno_2 out=deno_2sorted;
  by HCNE;
run;

/* only pick rows already unique in source */
data ds1_1;
  set deno_2sorted;
  by HCNE;
  if first.HCNE and last.HCNE;
run;

proc sql;
  create table ds1_2 as
    select HCNE
    from deno_2
    group by hcne
    having count(*)=1
  ;
quit;

/* dedup rows from source */
data ds2_1;
  set deno_2sorted;
  by HCNE;
  if first.HCNE;
run;

proc sql;
  create table ds2_2 as
    select distinct HCNE
    from deno_2
  ;
quit;

proc sort data=deno_2 out=ds2_3 nodupkey;
  by hcne;
run;

 

View solution in original post

2 REPLIES 2
novinosrin
Tourmaline | Level 20

Hi @Ranjeeta 

 

1. Datastep picks only unique occurrences of the values, i.e the value that occurs only once the dataset

2. Proc SQL,- sorts, eliminates the dup occurrences from all , limits to one from all occurrences and outputs . So you would indeed have the difference. HTH

 

Also, Select distinct can be considered an equivalent of proc sort nodupkey or if first.key in datastep

Patrick
Opal | Level 21

In the data step you're selecting only rows which are already unique in the source table (first AND last).

The SQL is deduping the rows from the source table so it also returns a unique row where you've got duplicates in source.

 

data deno_2;
  hcne=1; output;
  hcne=2; output;output;
  stop;
run;

proc sort data=deno_2 out=deno_2sorted;
  by HCNE;
run;

/* only pick rows already unique in source */
data ds1_1;
  set deno_2sorted;
  by HCNE;
  if first.HCNE and last.HCNE;
run;

proc sql;
  create table ds1_2 as
    select HCNE
    from deno_2
    group by hcne
    having count(*)=1
  ;
quit;

/* dedup rows from source */
data ds2_1;
  set deno_2sorted;
  by HCNE;
  if first.HCNE;
run;

proc sql;
  create table ds2_2 as
    select distinct HCNE
    from deno_2
  ;
quit;

proc sort data=deno_2 out=ds2_3 nodupkey;
  by hcne;
run;

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 900 views
  • 3 likes
  • 3 in conversation