DATA Step, Macro, Functions and more

Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Reply
New Contributor
Posts: 3

Using the First.with multiple BY variables, find first occurance for unique combo of by variable

How can i get first occurance basis multiple by variable for eg

for below data i need 1 in new column where ever i have unique combination of id1 & id2

id1id2
100110
100110
100111
100110
100212
100212
100213
Super User
Posts: 7,758

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Posted in reply to MohitDamani

Please show which of your example observations should be flagged.

Do you need to preserve the current order?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
PROC Star
Posts: 733

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Posted in reply to KurtBremser

I think this is what you want, but please post your data in the form of a datastep and describe your desired outcome if not

 

data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;

proc sort data = have;
	by id1 id2;
run;

data want;
	set have;
	by id1 id2;
	if first.id2 then first_unique = 1;
	else first_unique = 0;
run;
New Contributor
Posts: 3

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Tried, not working, i need combination of both variables like we do partition by in SQL
New Contributor
Posts: 3

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Posted in reply to KurtBremser
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
Super User
Posts: 7,758

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

[ Edited ]
Posted in reply to MohitDamani

MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0


A slight expansion of @draycut's code shows that it clearly works:

data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;

proc sort data = have;
	by id1 id2;
run;

data want;
	set have;
	by id1 id2;
	if first.id2 then first_unique = 1;
	else first_unique = 0;
run;

proc print data=want noobs;
run;

Result:

                     first_
  id1    id2    n    unique

 1001     10    1       1  
 1001     10    2       0  
 1001     10    4       0  
 1001     11    3       1  
 1002     12    5       1  
 1002     12    6       0  
 1002     13    7       1  

If you need the original order restored, just sort by n.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,942

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Posted in reply to MohitDamani

Post test data in the form of a datastep!!

 

/* assumes sorted */

data want;
  set have;
  by id1 id2;
  if first.id2 then new_var=1;
run;
Frequent Contributor
Posts: 77

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Posted in reply to MohitDamani

data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
RUN;

 

proc sort data = have;
by id1 id2;
run;

 

data want;
set have;
by id1 id2;
if first.id1 then first_unique = 1;
else first_unique = 0;
run;

 

OR 

 

if yo want to extract the unique data then pls use below code ;

 


proc sort data = have OUT= WANT NODUP;BY ID1;

run;

Trusted Advisor
Posts: 1,137

Re: Using the First.with multiple BY variables, find first occurance for unique combo of by variable

Posted in reply to MohitDamani

I am not sure of your expected output, you want the unique records per id1 and id2 without the duplicates I mean if there are a combination of dulicates on id1 and id2 then exclude them from flagging

 


data want;
set have;
by id1 id2;
if first.id2 and last.id2 then flag=1;
else flag=0;
run;
Thanks,
Jag
Ask a Question
Discussion stats
  • 8 replies
  • 161 views
  • 1 like
  • 6 in conversation