
creating new variables within a range of values of 6 different variables

Contributor
Posts: 33
Hello there. Among other variables, I have a data set with 6 character variables (length 4) like this:

 

Obs Var1 Var2 Var3 Var4 Var5 Var6
1 O809 O809 Z370 Z301 Z390 A20X
2 B171 B172 K746 I10X I519 K546
3 O809 O809 Z370 Z302 Z390  
4 X101 X102 I12X      

 

and would like:

1.- to get rid of the "X", but only for entries with the "X" at the end, so the data would look like:

 

Obs Var1 Var2 Var3 Var4 Var5 Var6
1 O809 O809 Z370 Z301 Z390 A20
2 B171 B172 K746 I10 I519 K546
3 O809 O809 Z370 Z302 Z390  
4 X101 X102 I12      

 

then

2.- I would like to create new variables that flag values within a specific range. For example, a new variable "VariableZ" should be set when any entry of Var1 to Var6 falls between Z301 and Z370. Likewise, "VariableB" should be set when any value of Var1 to Var6 is between B171 and B172, and "VariableX" when any value is between X101 and X102, and so on. The new variables would look like the example below:

 

New data set
Obs Var1 Var2 Var3 Var4 Var5 Var6 VariableZ VariableB VariableX
1   O809 O809 Z370 Z301 Z390 A20  Z301-Z370
2   B171 B172 K746 I10  I519 K546           B171-B172
3   O809 O809 Z370 Z302 Z390      Z301-Z370
4   X101 X102 I12                                     X101-X102

 

all your help will be appreciated

 

Thanks

EHG.


Accepted Solutions
Solution
‎02-22-2017 07:49 AM
Valued Guide
Posts: 797

Re: creating new variables within a range of values of 6 different variables

This is a good task to show off the benefits of the CHAR function (extracts a single character substring), the INPUT function, and especially the SELECT statement:  

 

data have;
  input (Var1-Var6) ($);
  cards;
O809 O809 Z370 Z301 Z390 A20X
B171 B172 K746 I10X I519 K546
O809 O809 Z370 Z302 Z390 . 
X101 X102 I12X  .  .  .
;
data want;
  set have;
  array v {*} var1-var6;
  do I=1 to dim(v);
    /* Blank out a trailing X in the current array element, not only in var6 */
    if char(v{I},4)='X' then substr(v{I},4,1)=' ';
    /* Branch on the first letter, then test the numeric part of the code */
    select (char(v{I},1));
      when ('Z') if 301 <= input(substr(v{I},2),best32.) <= 370 then varz='Z301-Z370';
      when ('B') if 171 <= input(substr(v{I},2),best32.) <= 172 then varb='B171-B172';
      when ('X') if 101 <= input(substr(v{I},2),best32.) <= 102 then varx='X101-X102';
      otherwise ;
    end;
  end;
run;

  

The INPUT function uses the BEST32. informat as an insurance policy. Its width is larger than any value here, so you generally don't have to worry about the length of the character strings being converted.
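A minimal sketch of the point about BEST32., assuming a few made-up codes of different lengths (the variable names `code` and `num` are only for illustration). The same INPUT call converts the digits after the first letter regardless of how many there are:

```sas
data _null_;
  length code $4;
  /* Hypothetical codes with 3, 2 and 1 digits after the letter */
  do code = 'Z301', 'Z37', 'Z3';
    /* Drop the leading letter and read the rest as a number */
    num = input(substr(code, 2), best32.);
    put code= num=;
  end;
run;
/* Log:
   code=Z301 num=301
   code=Z37 num=37
   code=Z3 num=3
*/
```

Trailing blanks in the shorter values do not matter, because the informat only needs to be at least as wide as the text being read.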



All Replies
Super User
Posts: 5,076

Re: creating new variables within a range of values of 6 different variables

Part 1 is pretty easy.  In a DATA step:

 

array var {6};

do i=1 to 6;

   if substr(var{i}, 4, 1) = 'X' then substr(var{i}, 4, 1) = ' ';

end;

 

For part 2, I'm not really sure what you are trying to achieve here.  But I would warn you about making character comparisons.  For example, as character strings, "Z32" falls within the range of "Z301" through "Z370".
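The warning about character comparisons can be checked with a small sketch. It assumes a hypothetical value "Z32" (for example, a code left with three characters) and compares the character range test with a numeric test on the digits; the variable names are only for illustration:

```sas
data _null_;
  code = 'Z32';
  /* Character comparison works position by position:              */
  /* in position 3, '2' sorts after '0' (Z301) and before '7' (Z370) */
  in_char_range = ('Z301' <= code <= 'Z370');
  /* Numeric comparison on the digits after the letter: 32 is not in 301-370 */
  in_num_range = (301 <= input(substr(code, 2), 8.) <= 370);
  put in_char_range= in_num_range=;
run;
/* Log: in_char_range=1 in_num_range=0 */
```

Converting the digits to a number before testing the range avoids the false match.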

PROC Star
Posts: 7,357

Re: creating new variables within a range of values of 6 different variables

I'm sure that someone (Hi @Reeza) will complain about my using DO OVER, but the following is an easy way to accomplish both tasks:

 

data have;
  input (Var1-Var6) ($);
  cards;
O809 O809 Z370 Z301 Z390 A20X
B171 B172 K746 I10X I519 K546
O809 O809 Z370 Z302 Z390 . 
X101 X102 I12X  .  .  .
;
data want;
  set have;
  array stuff var1-var6;
  do over stuff;
    if substr(stuff,length(stuff),1) eq 'X' then
      substr(stuff,length(stuff),1) = '';
    if substr(stuff,1,2) eq 'Z3' and
      1<=input(substr(stuff,3,2),8.)<=70  then variablez='Z301-Z370';
    if stuff in ('B171','B172') then variableb='B171-B172';
  end;
run;

HTH,

Art, CEO, AnalystFinder.com
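For readers who prefer to avoid DO OVER (which relies on the older implicit-array style), here is a sketch of the same logic with an explicit array and index. The VariableX test, the LENGTH statement, the `??` modifier and the DROP statement are additions for illustration and are not part of the code above:

```sas
data have;
  input (Var1-Var6) ($);
  cards;
O809 O809 Z370 Z301 Z390 A20X
B171 B172 K746 I10X I519 K546
O809 O809 Z370 Z302 Z390 .
X101 X102 I12X  .  .  .
;
data want;
  set have;
  array stuff {*} var1-var6;                 /* explicit array over the six code variables */
  length variablez variableb variablex $9;   /* define the flags even if never assigned */
  do i = 1 to dim(stuff);
    /* Blank out an X that is the last non-blank character */
    if substr(stuff{i}, length(stuff{i}), 1) eq 'X' then
      substr(stuff{i}, length(stuff{i}), 1) = ' ';
    /* ?? suppresses invalid-data notes when positions 3-4 are not numeric */
    if substr(stuff{i}, 1, 2) eq 'Z3' and
      1 <= input(substr(stuff{i}, 3, 2), ?? 8.) <= 70 then variablez = 'Z301-Z370';
    if stuff{i} in ('B171', 'B172') then variableb = 'B171-B172';
    if stuff{i} in ('X101', 'X102') then variablex = 'X101-X102';
  end;
  drop i;
run;
```

With the sample data, observations 1 and 3 should get VariableZ, observation 2 VariableB, and observation 4 VariableX.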

 

Contributor
Posts: 33

Re: creating new variables within a range of values of 6 different variables

Thank you everybody,

the code worked just fine

 

your help is very much appreciated.

 

 
