My data set has a variable user_id and others such as v1, v2, ..., v5.
Some cases are duplicates that share the same user_id.
Is there a way to label the duplicate cases by adding a new variable, with value 1 for the primary case and value 0 for any further case with the same user_id? That way I do not need to delete the duplicate cases, and during analysis I can select the distinct cases using this new variable.
Thanks in advance.
Sort your data if it's not yet in order:
proc sort data=have;
  by user_id;
run;
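Then it's simple. With a BY statement, first.user_id is 1 on the first observation of each user_id and 0 on the rest:
data want;
  set have;
  by user_id;
  /* 1 = first case for this user_id, 0 = duplicate */
  new_variable = first.user_id;
run;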
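To then analyze only the distinct cases, the flag can be used in a WHERE statement. This is just a minimal sketch: it assumes the data set want and the flag new_variable from the step above, and that v1 to v5 (the names from the question) are numeric. PROC MEANS is only an illustrative choice of procedure.

```sas
/* Sketch: analyze one case per user_id using the flag created above. */
/* Assumes WANT and NEW_VARIABLE exist and v1-v5 are numeric.         */
proc means data=want;
  where new_variable = 1;   /* keep only the first case per user_id */
  var v1-v5;
run;
```

The same WHERE statement should work in most other procedures, so the duplicates stay in the data set but are left out of the analysis.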
Great! This works!
This forum is really good.