How can i get first occurance basis multiple by variable for eg
for below data i need 1 in new column where ever i have unique combination of id1 & id2
id1 | id2 |
1001 | 10 |
1001 | 10 |
1001 | 11 |
1001 | 10 |
1002 | 12 |
1002 | 12 |
1002 | 13 |
@MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
A slight expansion of @PeterClemmensen's code shows that it clearly works:
data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
proc print data=want noobs;
run;
Result:
first_ id1 id2 n unique 1001 10 1 1 1001 10 2 0 1001 10 4 0 1001 11 3 1 1002 12 5 1 1002 12 6 0 1002 13 7 1
If you need the original order restored, just sort by n.
Please show which of your example observations should be flagged.
Do you need to preserve the current order?
I think this is what you want, but please post your data in the form of a datastep and describe your desired outcome if not
data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
@MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
A slight expansion of @PeterClemmensen's code shows that it clearly works:
data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
proc print data=want noobs;
run;
Result:
first_ id1 id2 n unique 1001 10 1 1 1001 10 2 0 1001 10 4 0 1001 11 3 1 1002 12 5 1 1002 12 6 0 1002 13 7 1
If you need the original order restored, just sort by n.
Post test data in the form of a datastep!!
/* assumes sorted */ data want; set have; by id1 id2; if first.id2 then new_var=1; run;
data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
RUN;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id1 then first_unique = 1;
else first_unique = 0;
run;
OR
if yo want to extract the unique data then pls use below code ;
proc sort data = have OUT= WANT NODUP;BY ID1;
run;
I am not sure of your expected output, you want the unique records per id1 and id2 without the duplicates I mean if there are a combination of dulicates on id1 and id2 then exclude them from flagging
data want;
set have;
by id1 id2;
if first.id2 and last.id2 then flag=1;
else flag=0;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.