BookmarkSubscribeRSS Feed
knveraraju91
Barite | Level 11

Dear ,

I am concatenating variable "value" values by id and seq until 200 characters  in "value1" variable and  next 200-400 charaters in value2. I got the output I need for this simple data set as i know that I need only two variables(value1 and value2).  In my actual data I do not know and may need up to value1-value 6. How to modify my code to get required variables depending on length.

 

eg :

output needed (I put only a sample output for value1 variable). 

 

value1

2001*3005*4001*4013*4015*4017*4019*4021*4023*4027*5033*5039*5041*5047*5051*5055*5059*5063*5067*5071*5075*5081*5091*5095*5101*5105*5109*5115*5127*5131*5133*5137*5143*5147*5151*5155*5159*5169*5173*5177*
2001*3005*4001*4013*4015*4017*4019*4021*4023*4027*5033*5039*5041*5047*5051*5055*5059*5063*5067*5071*5075*5081*5091*5095*5101*5105*5109*5115*5127*5131*5133*5137*5143*5147*5151*5155*5159*5169*5173*5177*

 

 

data one;
input id seq value $;
datalines;
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 9000
1 1 2000
1 1 3000
1 1 4000
1 2 5000
1 2 6000
1 3 7000
;
proc sort data=one;
by id seq value;
run;

data two;
length value1 value2 $200;
do until(last.seq);
set one;
by id seq;
if first.seq then value1=value;
else do;
if length(value1) <200 then do;
value1=catx('*',value1,value);
end;
else if (length(value1) >=200)  and (length(value2) <200) then value2=catx('*',value2,value);
if last.seq then output;
end;
end;
value1='';
value2='';
keep id value1 value2;
run;
5 REPLIES 5
ballardw
Super User

How are those "value" variables to be used? Often when you don't know how many of these you need that is a symptom that the basic approach may have a flaw.

 

Also since your are talking about a maximum of 6 variables each of 200 characters long why wouldn't one variable of up to 1200 characters (give or take) work?

200 used to be the limit of character variables but that limit has increased considerably.

PaigeMiller
Diamond | Level 26

@ballardw wrote:Reply

How are those "value" variables to be used? Often when you don't know how many of these you need that is a symptom that the basic approach may have a flaw.

 

Also since your are talking about a maximum of 6 variables each of 200 characters long why wouldn't one variable of up to 1200 characters (give or take) work?

200 used to be the limit of character variables but that limit has increased considerably.


I was thinking similar things to @ballardw, I can't imagine any usefulness to a long string of 200 (or 1200) characters as described, and there must be a better approach.

--
Paige Miller
art297
Opal | Level 21

One possibility is simply to have SAS do all of the grunt work. e.g.:

proc sort data=one;
  by id seq value;
run;

data _null_;
  set one end=eof;
  by id seq;
  length value1 $200;
  retain value1 maxi;
  if first.seq then do;
    i=1;
    value1=value;
  end;
  else do;
    if length(catx('*',value1,value)) le 200 then
     value1=catx('*',value1,value);
    else do;
      i+1;
      value1=value;
    end;
  end;
  if last.seq then do;
    maxi=max(maxi,i);
    i=0;
  end;
  if eof then do;
    call symput('maxi',catt('value',maxi));
    i=0;
  end;
run;

data two (drop=value);
  array values(*) $200 value1-&maxi.;
  set one;
  by id seq;
  retain values;
  if first.seq then do;
    i=1;
    call missing(of values(*));
    values(i)=value;
  end;
  else do;
    if length(catx('*',values(i),value)) le 200 then
     values(i)=catx('*',values(i),value);
    else do;
      i+1;
      values(i)=value;
    end;
  end;
  if last.seq then output;
run;

Art, CEO, AnalystFinder.com

 

Patrick
Opal | Level 21

@knveraraju91

I'd be really curious WHY you're doing this.

 

Does below code return what you're after?


data two;
  length cat_val $200;
  retain cat_val;
  set one;
  by id seq;

  if lengthn(catx('*',cat_val,value))>200 then
    do;
      output;
      call missing(cat_val);
    end;

  cat_val=catx('*',cat_val,value);

  if last.seq then
    do;
      output;
      call missing(cat_val);
    end;

run;

proc transpose data=two out=want prefix=value_;
  by id seq;
  var cat_val;
run;
knveraraju91
Barite | Level 11

Thank you. The code provided by art297 worked for me. Reason why I had to do is,

 

I am validating an oncology ae data set that has related records whose values I am concatenating  as per specs. Thank you all for the great support.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 811 views
  • 3 likes
  • 5 in conversation