DATA Step, Macro, Functions and more

Need Help with my Code

Reply
Super Contributor
Posts: 307

Need Help with my Code

Dear ,

I am concatenating variable "value" values by id and seq until 200 characters  in "value1" variable and  next 200-400 charaters in value2. I got the output I need for this simple data set as i know that I need only two variables(value1 and value2).  In my actual data I do not know and may need up to value1-value 6. How to modify my code to get required variables depending on length.

 

eg :

output needed (I put only a sample output for value1 variable). 

 

value1

2001*3005*4001*4013*4015*4017*4019*4021*4023*4027*5033*5039*5041*5047*5051*5055*5059*5063*5067*5071*5075*5081*5091*5095*5101*5105*5109*5115*5127*5131*5133*5137*5143*5147*5151*5155*5159*5169*5173*5177*
2001*3005*4001*4013*4015*4017*4019*4021*4023*4027*5033*5039*5041*5047*5051*5055*5059*5063*5067*5071*5075*5081*5091*5095*5101*5105*5109*5115*5127*5131*5133*5137*5143*5147*5151*5155*5159*5169*5173*5177*

 

 

data one;
input id seq value $;
datalines;
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 2000
1 1 3000
1 1 4000
1 1 4000
1 1 9000
1 1 2000
1 1 3000
1 1 4000
1 2 5000
1 2 6000
1 3 7000
;
proc sort data=one;
by id seq value;
run;

data two;
length value1 value2 $200;
do until(last.seq);
set one;
by id seq;
if first.seq then value1=value;
else do;
if length(value1) <200 then do;
value1=catx('*',value1,value);
end;
else if (length(value1) >=200)  and (length(value2) <200) then value2=catx('*',value2,value);
if last.seq then output;
end;
end;
value1='';
value2='';
keep id value1 value2;
run;
Super User
Posts: 13,006

Re: Need Help with my Code

Posted in reply to knveraraju91

How are those "value" variables to be used? Often when you don't know how many of these you need that is a symptom that the basic approach may have a flaw.

 

Also since your are talking about a maximum of 6 variables each of 200 characters long why wouldn't one variable of up to 1200 characters (give or take) work?

200 used to be the limit of character variables but that limit has increased considerably.

Respected Advisor
Posts: 2,647

Re: Need Help with my Code

[ Edited ]

@ballardw wrote:Reply

How are those "value" variables to be used? Often when you don't know how many of these you need that is a symptom that the basic approach may have a flaw.

 

Also since your are talking about a maximum of 6 variables each of 200 characters long why wouldn't one variable of up to 1200 characters (give or take) work?

200 used to be the limit of character variables but that limit has increased considerably.


I was thinking similar things to @ballardw, I can't imagine any usefulness to a long string of 200 (or 1200) characters as described, and there must be a better approach.

--
Paige Miller
PROC Star
Posts: 8,101

Re: Need Help with my Code

Posted in reply to knveraraju91

One possibility is simply to have SAS do all of the grunt work. e.g.:

proc sort data=one;
  by id seq value;
run;

data _null_;
  set one end=eof;
  by id seq;
  length value1 $200;
  retain value1 maxi;
  if first.seq then do;
    i=1;
    value1=value;
  end;
  else do;
    if length(catx('*',value1,value)) le 200 then
     value1=catx('*',value1,value);
    else do;
      i+1;
      value1=value;
    end;
  end;
  if last.seq then do;
    maxi=max(maxi,i);
    i=0;
  end;
  if eof then do;
    call symput('maxi',catt('value',maxi));
    i=0;
  end;
run;

data two (drop=value);
  array values(*) $200 value1-&maxi.;
  set one;
  by id seq;
  retain values;
  if first.seq then do;
    i=1;
    call missing(of values(*));
    values(i)=value;
  end;
  else do;
    if length(catx('*',values(i),value)) le 200 then
     values(i)=catx('*',values(i),value);
    else do;
      i+1;
      values(i)=value;
    end;
  end;
  if last.seq then output;
run;

Art, CEO, AnalystFinder.com

 

Respected Advisor
Posts: 4,540

Re: Need Help with my Code

Posted in reply to knveraraju91

@knveraraju91

I'd be really curious WHY you're doing this.

 

Does below code return what you're after?


data two;
  length cat_val $200;
  retain cat_val;
  set one;
  by id seq;

  if lengthn(catx('*',cat_val,value))>200 then
    do;
      output;
      call missing(cat_val);
    end;

  cat_val=catx('*',cat_val,value);

  if last.seq then
    do;
      output;
      call missing(cat_val);
    end;

run;

proc transpose data=two out=want prefix=value_;
  by id seq;
  var cat_val;
run;
Super Contributor
Posts: 307

Re: Need Help with my Code

Thank you. The code provided by art297 worked for me. Reason why I had to do is,

 

I am validating an oncology ae data set that has related records whose values I am concatenating  as per specs. Thank you all for the great support.

Ask a Question
Discussion stats
  • 5 replies
  • 168 views
  • 3 likes
  • 5 in conversation