Keep first observation without deleting duplicates if first observation is repeated

User
Posts: 1

I have data that looks like this:

 

ID     VAR A
1       2
1       2
1       2
1       4
1       4
1       3
1       3
1       3
1       4
1       4
1       3

 

I want it to look like this:

ID      VAR A_NEW
1        2 4 3 4 3

I would like to keep the first observation of each run of identical VAR A values, without dropping a value that reappears later for the same ID.

Thank you!

PROC Star
Posts: 357

data have;
  input (ID VAR_A) ($);
  datalines;
1       2
1       2
1       2
1       4
1       4
1       3
1       3
1       3
1       4
1       4
1       3
;

data want;
  set have;
  by id var_a notsorted;   /* NOTSORTED: each consecutive run of VAR_A is its own BY group */
  length temp $20;
  retain temp;             /* carry the accumulated string across observations */
  if first.id and first.var_a then do;
    call missing(temp);
    temp=var_a;
  end;
  else if first.var_a then temp=cats(temp,var_a);
  if last.id;              /* keep one observation per ID */
  drop var_a;
run;

Regular Contributor
Posts: 185

My guess is that you don't really want to do that.

 

Data that looks like

 

ID     VAR A
1       2
1       4
1       3
1       4
1       3

would generally be a lot easier to work with than a field that contained "2 4 3 4 3".

 

Perhaps some context?

 

 

Edit:

cats(temp, var_A) will produce 24343

catx(" ", temp, var_A) will produce 2 4 3 4 3
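To see the difference on a single pair of values, here is a minimal sketch. The data step and the variable names squeezed and spaced are made up for illustration and are not part of the earlier code:

```sas
data _null_;
  length temp $20;
  temp = '2';
  squeezed = cats(temp, '4');       /* strips blanks, no delimiter */
  spaced   = catx(' ', temp, '4');  /* strips blanks, joins with one blank */
  put squeezed= spaced=;            /* log shows: squeezed=24 spaced=2 4 */
run;
```

Swapping cats for catx(" ", ...) in the data step is the only change needed to get the blank-separated result.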

 

 

 

 

Super User
Posts: 7,154

It is very unclear what you want.

If you just want to get the first observation of each consecutive ID*VARA group then you can do this.

data want ;
  set have ;
  by id vara notsorted;   /* VARA is not in sorted order within ID */
  if first.vara;
run;

What you posted looks totally different.  Looks like you want to concatenate multiple values into a single variable.

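data want ;
  set have ;
  by id vara notsorted;   /* VARA is not in sorted order within ID */
  length new $200 ;
  retain new ;            /* keep the accumulated string across observations */
  if first.id then new=' ';
  if first.vara then new= catx(' ',new,vara);
  if last.id;
run;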

 

 

Trusted Advisor
Posts: 1,065

You want one observation per ID, with a new variable containing the sequence of VARA values (excluding consecutive duplicates):

 

data want;
  length vara_new $80;

  do until (last.id);
    set have;
    by id vara notsorted;
    if first.vara then vara_new=catx(' ',vara_new,vara);
  end;
  drop vara;
run;

 

 

Frequent Contributor
Posts: 106

Hi.  I took the script from novinosrin from HB and tweaked it a little.

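data want;
  set sample;              /* sample is the input data set (named have earlier in the thread) */
  by id var_a notsorted;
  length new $20;
  retain new;
  if first.id and first.var_a then do;
    call missing(new);
    new=var_a;             /* start the sequence with the first value */
  end;
  else if first.var_a then new=catx(" ",new,var_a);
  if last.id then do;
    var=substr(new,1,1);   /* first character of the sequence */
    output;
  end;
run;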
