BookmarkSubscribeRSS Feed
hicksv
Calcite | Level 5

I have data that looks like this:

 

ID     VAR A
1       2
1       2
1       2
1       4
1       4
1       3
1       3
1       3
1       4
1       4
1       3

 

I want it to look like this:

  

ID      VAR A_NEW
1        2 4 3 4 3

 

I would like to keep the first. observation of VAR A but without removing any duplicates if the first observation of VAR A is repeated.

 

Thank you!

5 REPLIES 5
novinosrin
Tourmaline | Level 20

data have;

input (ID     VAR_A) ($);

datalines;

1       2

1       2

1       2

1       4

1       4

1       3

1       3

1       3

1       4

1       4

1       3

;

 

data want;

set have;

by id var_a notsorted;

length temp $20;

retain temp;

if first.id and first.var_a then do;

call missing(temp);

temp=var_a;

end;

else if first.var_a then temp=cats(temp,var_a);

if last.id;

drop var_a;

run;

HB
Barite | Level 11 HB
Barite | Level 11

My guess is that you don't really want to do that.

 

Data that looks like

 

ID     VAR A
1       2
1       4
1       3
1       4
1       3

Would be a lot easier to work with generally than a field that contaned "2 4 3 4 3"

 

Perhaps some context?

 

 

Edit:

cats(temp, var_A) will produce 24343

catx(" ", temp, var_A) will produce 2 4 3 4 3

 

 

 

 

Tom
Super User Tom
Super User

It is very unclear what you want.

If you just want to get the first observation for each ID*VARA group then you can do this.

data want ;
  set have ;
  by id vara;
  if first.vara;
run;

What you posted looks totally different.  Looks like you want to concatenate multiple values into a single variable.

data want ;
  set have ;
  by id vara;
  length new $200 ;
  if first.id the new=' ';
  if first.vara then new= catx(' ',new,vara);
  if last.id;
run;

 

 

mkeintz
PROC Star

You want one observation per ID, with a new variable containing the sequence of VARA values (excluding consecutive duplicates):

 

data want;
  length vara_new $80;

  do until (last.id);
    set have;
    by id vara notsorted;
    if first.vara then vara_new=catx(' ',vara_new,vara);
  end;
  drop vara;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ShiroAmada
Lapis Lazuli | Level 10

Hi.  I took the script from novinosrin from HB and made a little tweaking.

data want;

  set sample;

by id var_a  notsorted;

 

length new $20;

 

retain new;

 

if first.id and first.var_a then do;

  call missing(new);

  new=var;

end;

 

else if first.var_a then new=catx("",new,var_a);

if last.id then do;

  var=substr(new,1,1);

  output;

end;

run;

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1648 views
  • 0 likes
  • 6 in conversation