BookmarkSubscribeRSS Feed
hicksv
Calcite | Level 5

I have data that looks like this:

 

ID     VAR A
1       2
1       2
1       2
1       4
1       4
1       3
1       3
1       3
1       4
1       4
1       3

 

I want it to look like this:

  

ID      VAR A_NEW
1        2 4 3 4 3

 

I would like to keep the first. observation of VAR A but without removing any duplicates if the first observation of VAR A is repeated.

 

Thank you!

5 REPLIES 5
novinosrin
Tourmaline | Level 20

data have;

input (ID     VAR_A) ($);

datalines;

1       2

1       2

1       2

1       4

1       4

1       3

1       3

1       3

1       4

1       4

1       3

;

 

data want;

set have;

by id var_a notsorted;

length temp $20;

retain temp;

if first.id and first.var_a then do;

call missing(temp);

temp=var_a;

end;

else if first.var_a then temp=cats(temp,var_a);

if last.id;

drop var_a;

run;

HB
Barite | Level 11 HB
Barite | Level 11

My guess is that you don't really want to do that.

 

Data that looks like

 

ID     VAR A
1       2
1       4
1       3
1       4
1       3

Would be a lot easier to work with generally than a field that contaned "2 4 3 4 3"

 

Perhaps some context?

 

 

Edit:

cats(temp, var_A) will produce 24343

catx(" ", temp, var_A) will produce 2 4 3 4 3

 

 

 

 

Tom
Super User Tom
Super User

It is very unclear what you want.

If you just want to get the first observation for each ID*VARA group then you can do this.

data want ;
  set have ;
  by id vara;
  if first.vara;
run;

What you posted looks totally different.  Looks like you want to concatenate multiple values into a single variable.

data want ;
  set have ;
  by id vara;
  length new $200 ;
  if first.id the new=' ';
  if first.vara then new= catx(' ',new,vara);
  if last.id;
run;

 

 

mkeintz
PROC Star

You want one observation per ID, with a new variable containing the sequence of VARA values (excluding consecutive duplicates):

 

data want;
  length vara_new $80;

  do until (last.id);
    set have;
    by id vara notsorted;
    if first.vara then vara_new=catx(' ',vara_new,vara);
  end;
  drop vara;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ShiroAmada
Lapis Lazuli | Level 10

Hi.  I took the script from novinosrin from HB and made a little tweaking.

data want;

  set sample;

by id var_a  notsorted;

 

length new $20;

 

retain new;

 

if first.id and first.var_a then do;

  call missing(new);

  new=var;

end;

 

else if first.var_a then new=catx("",new,var_a);

if last.id then do;

  var=substr(new,1,1);

  output;

end;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 728 views
  • 0 likes
  • 6 in conversation