How can i get first occurance basis multiple by variable for eg
for below data i need 1 in new column where ever i have unique combination of id1 & id2
id1 | id2 |
1001 | 10 |
1001 | 10 |
1001 | 11 |
1001 | 10 |
1002 | 12 |
1002 | 12 |
1002 | 13 |
@MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
A slight expansion of @PeterClemmensen's code shows that it clearly works:
data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
proc print data=want noobs;
run;
Result:
first_ id1 id2 n unique 1001 10 1 1 1001 10 2 0 1001 10 4 0 1001 11 3 1 1002 12 5 1 1002 12 6 0 1002 13 7 1
If you need the original order restored, just sort by n.
Please show which of your example observations should be flagged.
Do you need to preserve the current order?
I think this is what you want, but please post your data in the form of a datastep and describe your desired outcome if not
data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
@MohitDamani wrote:
i need combination of both variables like we do partition by in SQL, for eg row 1,3,5 & 7 should have 1 rest 0
A slight expansion of @PeterClemmensen's code shows that it clearly works:
data have;
input id1 id2;
n = _n_;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
run;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id2 then first_unique = 1;
else first_unique = 0;
run;
proc print data=want noobs;
run;
Result:
first_ id1 id2 n unique 1001 10 1 1 1001 10 2 0 1001 10 4 0 1001 11 3 1 1002 12 5 1 1002 12 6 0 1002 13 7 1
If you need the original order restored, just sort by n.
Post test data in the form of a datastep!!
/* assumes sorted */ data want; set have; by id1 id2; if first.id2 then new_var=1; run;
data have;
input id1 id2;
datalines;
1001 10
1001 10
1001 11
1001 10
1002 12
1002 12
1002 13
;
RUN;
proc sort data = have;
by id1 id2;
run;
data want;
set have;
by id1 id2;
if first.id1 then first_unique = 1;
else first_unique = 0;
run;
OR
if yo want to extract the unique data then pls use below code ;
proc sort data = have OUT= WANT NODUP;BY ID1;
run;
I am not sure of your expected output, you want the unique records per id1 and id2 without the duplicates I mean if there are a combination of dulicates on id1 and id2 then exclude them from flagging
data want;
set have;
by id1 id2;
if first.id2 and last.id2 then flag=1;
else flag=0;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.