Re: Keep first observation without deleteing duplicates if first obser...

hicksv · Posted 09-11-2017 12:39 PM

I have data that looks like this:

ID     VAR A
1       2
1       2
1       2
1       4
1       4
1       3
1       3
1       3
1       4
1       4
1       3

I want it to look like this:

ID      VAR A_NEW
1        2 4 3 4 3

I would like to keep the first. observation of VAR A but without removing any duplicates if the first observation of VAR A is repeated.

Thank you!

novinosrin · Posted 09-11-2017 12:59 PM

data have;

input (ID VAR_A) ($);

datalines;

1 2

1 4

1 3

1 4

1 3

;

data want;

set have;

by id var_a notsorted;

length temp $20;

retain temp;

if first.id and first.var_a then do;

call missing(temp);

temp=var_a;

end;

else if first.var_a then temp=cats(temp,var_a);

if last.id;

drop var_a;

run;

HB · Posted 09-11-2017 01:00 PM

My guess is that you don't really want to do that.

Data that looks like

ID     VAR A
1       2
1       4
1       3
1       4
1       3

Would be a lot easier to work with generally than a field that contaned "2 4 3 4 3"

Perhaps some context?

Edit:

cats(temp, var_A) will produce 24343

catx(" ", temp, var_A) will produce 2 4 3 4 3

Tom · Posted 09-11-2017 01:27 PM

It is very unclear what you want.

If you just want to get the first observation for each ID*VARA group then you can do this.

data want ;
  set have ;
  by id vara;
  if first.vara;
run;

What you posted looks totally different. Looks like you want to concatenate multiple values into a single variable.

data want ;
  set have ;
  by id vara;
  length new $200 ;
  if first.id the new=' ';
  if first.vara then new= catx(' ',new,vara);
  if last.id;
run;

mkeintz · Posted 09-11-2017 07:01 PM

You want one observation per ID, with a new variable containing the sequence of VARA values (excluding consecutive duplicates):

data want;
  length vara_new $80;

  do until (last.id);
    set have;
    by id vara notsorted;
    if first.vara then vara_new=catx(' ',vara_new,vara);
  end;
  drop vara;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

ShiroAmada · Posted 09-12-2017 12:13 AM

Hi. I took the script from novinosrin from HB and made a little tweaking.

data want;

set sample;

by id var_a notsorted;

length new $20;

retain new;

if first.id and first.var_a then do;

call missing(new);

new=var;

end;

else if first.var_a then new=catx("",new,var_a);

if last.id then do;

var=substr(new,1,1);

output;

end;

run;

Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

SAS Innovate 2026 Registration is Open

Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

Re: Keep first observation without deleteing duplicates if first observation is repeated

SAS Innovate 2026 Registration is Open

SAS Training: Just a Click Away