- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How can i get first occurance basis multiple by variable for eg
for below data i need 1 in new column where ever i have unique combination of id1 & id2
id1 | id2 |
1001 | 10 |
1001 | 10 |
1001 | 11 |
1001 | 10 |
1002 | 12 |
1002 | 12 |
1002 | 13 |
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
A slight expansion of @PeterClemmensen's code shows that it clearly works:
data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
proc print data=want noobs;
run;
Result:
first_ id1 id2 n unique 1001 10 1 1 1001 10 2 0 1001 10 4 0 1001 11 3 1 1002 12 5 1 1002 12 6 0 1002 13 7 1
If you need the original order restored, just sort by n.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please show which of your example observations should be flagged.
Do you need to preserve the current order?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think this is what you want, but please post your data in the form of a datastep and describe your desired outcome if not
data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
A slight expansion of @PeterClemmensen's code shows that it clearly works:
data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
proc print data=want noobs;
run;
Result:
first_ id1 id2 n unique 1001 10 1 1 1001 10 2 0 1001 10 4 0 1001 11 3 1 1002 12 5 1 1002 12 6 0 1002 13 7 1
If you need the original order restored, just sort by n.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Post test data in the form of a datastep!!
/* assumes sorted */ data want; set have; by id1 id2; if first.id2 then new_var=1; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
RUN;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id1 then first_unique = 1;
else first_unique = 0;
run;
OR
if yo want to extract the unique data then pls use below code ;
proc sort data = have OUT= WANT NODUP;BY ID1;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am not sure of your expected output, you want the unique records per id1 and id2 without the duplicates I mean if there are a combination of dulicates on id1 and id2 then exclude them from flagging
data want;
set have;
by id1 id2;
if first.id2 and last.id2 then flag=1;
else flag=0;
run;
Jag