Solved: Re: Using the First.with multiple BY variables, find first occurance f...

MohitDamani · Posted 06-12-2017 06:20 AM

How can I get the first occurrence based on multiple BY variables? For example:

For the data below, I need a 1 in a new column wherever a unique combination of id1 & id2 first appears.

id1	id2
1001	10
1001	10
1001	11
1001	10
1002	12
1002	12
1002	13

Kurt_Bremser · Posted 06-12-2017 10:59 AM

@MohitDamani wrote:
I need the combination of both variables, like we do with PARTITION BY in SQL; e.g. rows 1, 3, 5 & 7 should have 1, the rest 0.

A slight expansion of @PeterClemmensen's code shows that it clearly works:

data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;

proc sort data = have;
	by id1 id2;
run;

data want;
	set have;
	by id1 id2;
	if first.id2 then first_unique = 1;
	else first_unique = 0;
run;

proc print data=want noobs;
run;

Result:

                     first_
  id1    id2    n    unique

 1001     10    1       1  
 1001     10    2       0  
 1001     10    4       0  
 1001     11    3       1  
 1002     12    5       1  
 1002     12    6       0  
 1002     13    7       1

If you need the original order restored, just sort by n.
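
As a minimal sketch of that last step, assuming the dataset want and the sequence variable n from the code above, the re-sort could look like this:

```sas
/* Restore the original input order using the sequence number n
   that was stored when the data were read. */
proc sort data = want;
	by n;
run;
```

After this sort, the observations with first_unique = 1 should be the ones with n = 1, 3, 5 and 7, which are the rows asked for.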


Kurt_Bremser · Posted 06-12-2017 06:24 AM

Please show which of your example observations should be flagged.

Do you need to preserve the current order?
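
If the current order does need to be preserved, one possible approach that avoids sorting is a hash object that remembers which id1/id2 combinations have already been seen. This is only a sketch: the hash name seen and the flag name first_unique are illustrative choices, and have is assumed to contain the numeric variables id1 and id2.

```sas
data want;
	set have;
	/* Create the lookup table once, keyed on both variables. */
	if _n_ = 1 then do;
		declare hash seen();
		seen.defineKey('id1', 'id2');
		seen.defineDone();
	end;
	/* check() returns 0 when the current key is already stored. */
	if seen.check() = 0 then first_unique = 0;
	else do;
		first_unique = 1;
		seen.add();
	end;
run;
```

Because the data are read in their stored order, no helper variable and no second sort are needed. The trade-off is that every distinct combination is held in memory.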


PeterClemmensen · Posted 06-12-2017 06:39 AM

I think this is what you want. If not, please post your data in the form of a data step and describe your desired outcome.

data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;

proc sort data = have;
	by id1 id2;
run;

data want;
	set have;
	by id1 id2;
	if first.id2 then first_unique = 1;
	else first_unique = 0;
run;


MohitDamani · Posted 06-12-2017 10:51 AM

Tried it, but it is not working. I need the combination of both variables, like we do with PARTITION BY in SQL.

MohitDamani · Posted 06-12-2017 10:53 AM

I need the combination of both variables, like we do with PARTITION BY in SQL; e.g. rows 1, 3, 5 & 7 should have 1, the rest 0.


RW9 · Posted 06-12-2017 06:28 AM

Post test data in the form of a datastep!!

/* assumes sorted */

data want;
  set have;
  by id1 id2;
  if first.id2 then new_var=1;
run;

TarunKumar · Posted 06-12-2017 07:04 AM

data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
RUN;

proc sort data = have;
by id1 id2;
run;

data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;

OR

If you want to extract the unique data, then please use the code below:

proc sort data = have out = want nodupkey;
by id1 id2;
run;

Jagadishkatam · Posted 06-12-2017 08:24 AM

I am not sure of your expected output. Do you want only the records that are unique per id1 and id2, without the duplicates? I mean, if a combination of id1 and id2 is duplicated, then exclude it from flagging:

data want;
set have;
by id1 id2;
if first.id2 and last.id2 then flag=1;
else flag=0;
run;

Thanks,
Jag

Using the First.with multiple BY variables, find first occurance for unique combo of by variable
