Dear ,
I am concatenating variable "value" values by id and seq until 200 characters in "value1" variable and next 200-400 charaters in value2. I got the output I need for this simple data set as i know that I need only two variables(value1 and value2). In my actual data I do not know and may need up to value1-value 6. How to modify my code to get required variables depending on length.
eg :
output needed (I put only a sample output for value1 variable).
value1
2001*3005*4001*4013*4015*4017*4019*4021*4023*4027*5033*5039*5041*5047*5051*5055*5059*5063*5067*5071*5075*5081*5091*5095*5101*5105*5109*5115*5127*5131*5133*5137*5143*5147*5151*5155*5159*5169*5173*5177*
2001*3005*4001*4013*4015*4017*4019*4021*4023*4027*5033*5039*5041*5047*5051*5055*5059*5063*5067*5071*5075*5081*5091*5095*5101*5105*5109*5115*5127*5131*5133*5137*5143*5147*5151*5155*5159*5169*5173*5177*
data one; input id seq value $; datalines; 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 4000 1 1 2000 1 1 3000 1 1 4000 1 1 4000 1 1 9000 1 1 2000 1 1 3000 1 1 4000 1 2 5000 1 2 6000 1 3 7000 ; proc sort data=one; by id seq value; run; data two; length value1 value2 $200; do until(last.seq); set one; by id seq; if first.seq then value1=value; else do; if length(value1) <200 then do; value1=catx('*',value1,value); end; else if (length(value1) >=200) and (length(value2) <200) then value2=catx('*',value2,value); if last.seq then output; end; end; value1=''; value2=''; keep id value1 value2; run;
How are those "value" variables to be used? Often when you don't know how many of these you need that is a symptom that the basic approach may have a flaw.
Also since your are talking about a maximum of 6 variables each of 200 characters long why wouldn't one variable of up to 1200 characters (give or take) work?
200 used to be the limit of character variables but that limit has increased considerably.
@ballardw wrote:Reply
How are those "value" variables to be used? Often when you don't know how many of these you need that is a symptom that the basic approach may have a flaw.
Also since your are talking about a maximum of 6 variables each of 200 characters long why wouldn't one variable of up to 1200 characters (give or take) work?
200 used to be the limit of character variables but that limit has increased considerably.
I was thinking similar things to @ballardw, I can't imagine any usefulness to a long string of 200 (or 1200) characters as described, and there must be a better approach.
One possibility is simply to have SAS do all of the grunt work. e.g.:
proc sort data=one;
by id seq value;
run;
data _null_;
set one end=eof;
by id seq;
length value1 $200;
retain value1 maxi;
if first.seq then do;
i=1;
value1=value;
end;
else do;
if length(catx('*',value1,value)) le 200 then
value1=catx('*',value1,value);
else do;
i+1;
value1=value;
end;
end;
if last.seq then do;
maxi=max(maxi,i);
i=0;
end;
if eof then do;
call symput('maxi',catt('value',maxi));
i=0;
end;
run;
data two (drop=value);
array values(*) $200 value1-&maxi.;
set one;
by id seq;
retain values;
if first.seq then do;
i=1;
call missing(of values(*));
values(i)=value;
end;
else do;
if length(catx('*',values(i),value)) le 200 then
values(i)=catx('*',values(i),value);
else do;
i+1;
values(i)=value;
end;
end;
if last.seq then output;
run;
Art, CEO, AnalystFinder.com
I'd be really curious WHY you're doing this.
Does below code return what you're after?
data two;
length cat_val $200;
retain cat_val;
set one;
by id seq;
if lengthn(catx('*',cat_val,value))>200 then
do;
output;
call missing(cat_val);
end;
cat_val=catx('*',cat_val,value);
if last.seq then
do;
output;
call missing(cat_val);
end;
run;
proc transpose data=two out=want prefix=value_;
by id seq;
var cat_val;
run;
Thank you. The code provided by art297 worked for me. Reason why I had to do is,
I am validating an oncology ae data set that has related records whose values I am concatenating as per specs. Thank you all for the great support.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.