lalohg
Quartz | Level 8

Hello there. Among other variables, I have a data set with 6 character variables (length 4) like this:

 

Obs Var1 Var2 Var3 Var4 Var5 Var6
1 O809 O809 Z370 Z301 Z390 A20X
2 B171 B172 K746 I10X I519 K546
3 O809 O809 Z370 Z302 Z390  
4 X101 X102 I12X      

 

and would like:

1.- to get rid of the "X", but only for entries with the "X" at the end, so the data would look like:

 

Obs Var1 Var2 Var3 Var4 Var5 Var6
1 O809 O809 Z370 Z301 Z390 A20
2 B171 B172 K746 I10 I519 K546
3 O809 O809 Z370 Z302 Z390  
4 X101 X102 I12      

 

then

2.- I would like to create new variables that flag values within a specific range. For example, a new variable "VariableZ" should be set when any entry of Var1 to Var6 is between Z301 and Z370. Likewise, "VariableB" should be set when any value of Var1 to Var6 is between B171 and B172, "VariableX" when any value is between X101 and X102, and so on, as in the example below:

 

New data set
Obs  Var1  Var2  Var3  Var4  Var5  Var6  VariableZ  VariableB  VariableX
1    O809  O809  Z370  Z301  Z390  A20   Z301-Z370
2    B171  B172  K746  I10   I519  K546             B171-B172
3    O809  O829  Z370  Z302  Z390        Z301-Z370
4    X101  D109  I12                                           X101-X102

 

All your help will be appreciated.

 

Thanks

EHG.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

This is a good task to show off the benefits of the CHAR function (extracts a single character substring), the INPUT function, and especially the SELECT statement:  

 

data have;
  input (Var1-Var6) ($);
  cards;
O809 O809 Z370 Z301 Z390 A20X
B171 B172 K746 I10X I519 K546
O809 O809 Z370 Z302 Z390 . 
X101 X102 I12X  .  .  .
;
data want;
  set have;
  array v {*} var1-var6;
  do I=1 to dim(v);
    /* Blank out a trailing 'X' in the current element (every variable, not just var6) */
    if char(v{I},4)='X' then substr(v{I},4,1)=' ';
    /* Branch on the leading letter, then range-check the numeric part */
    select (char(v{I},1));
      when ('Z') if 301 <= input(substr(v{I},2),best32.) <= 370 then varz='Z301-Z370';
      when ('B') if 171 <= input(substr(v{I},2),best32.) <= 172 then varb='B171-B172';
      when ('X') if 101 <= input(substr(v{I},2),best32.) <= 102 then varx='X101-X102';
      otherwise ;
    end;
  end;
run;

  

The INPUT function is set to an informat of BEST32. as an insurance policy.  It means you generally don't have to worry about the length of the character variables being INPUTed. 
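
To illustrate that point, here is a minimal sketch (not part of the original reply) that applies the same INPUT call to codes of different lengths. The sample codes K7 and B17A are made up for illustration, and the ?? modifier is an addition: it asks SAS to return a missing value quietly, without an invalid-data note in the log, when the text after the first character is not numeric.

```sas
data _null_;
  length code $4;
  /* Hypothetical test codes of varying length; B17A has a non-numeric tail */
  do code = 'Z301', 'I10', 'K7', 'B17A';
    /* Read everything after the first character with a wide informat.  */
    /* ?? suppresses the invalid-data note and leaves num missing.      */
    num = input(substr(code,2), ?? 32.);
    put code= num=;
  end;
run;
```

If the codes are known to be clean, the ?? can be dropped so that any unexpected value still shows up in the log.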


4 REPLIES 4
Astounding
PROC Star

Part 1 is pretty easy.  In a DATA step:

 

array var {6};

do i=1 to 6;

   if substr(var{i}, 4, 1) = 'X' then substr(var{i}, 4, 1) = ' ';

end;

 

For part 2, I'm not really sure what you are trying to achieve here.  But I would warn you about making character comparisons.  For example, as character strings, "Z32" falls within the range of "Z301" through "Z370".
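
A small sketch of that warning, assuming a few made-up test codes (Z32 and Z39 are not in the original data). It compares each code as a character string against 'Z301' and 'Z370', and then compares the numeric part against 301 and 370, so the two results can be checked side by side in the log.

```sas
data _null_;
  length code $4;
  /* Hypothetical test codes, including the 3-character code Z32 */
  do code = 'Z32', 'Z301', 'Z370', 'Z39';
    /* Character comparison: evaluated left to right, character by character */
    in_char_range = ('Z301' <= code <= 'Z370');
    /* Numeric comparison on the part after the leading letter */
    num = input(substr(code,2), ?? 32.);
    in_num_range = (301 <= num <= 370);
    put code= in_char_range= num= in_num_range=;
  end;
run;
```

For Z32, the character test should come out true (the third character '2' sorts between '0' and '7'), while the numeric test sees 32 and rejects it.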

art297
Opal | Level 21

I'm sure that someone (Hi @Reeza) will complain about my using DO OVER, but the following is an easy way to accomplish both tasks:

 

data have;
  input (Var1-Var6) ($);
  cards;
O809 O809 Z370 Z301 Z390 A20X
B171 B172 K746 I10X I519 K546
O809 O809 Z370 Z302 Z390 . 
X101 X102 I12X  .  .  .
;
data want;
  set have;
  array stuff var1-var6;
  do over stuff;
    if substr(stuff,length(stuff),1) eq 'X' then
      substr(stuff,length(stuff),1) = '';
    if substr(stuff,1,2) eq 'Z3' and
      1<=input(substr(stuff,3,2),8.)<=70  then variablez='Z301-Z370';
    if stuff in ('B171','B172') then variableb='B171-B172';
    if stuff in ('X101','X102') then variablex='X101-X102';
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

lalohg
Quartz | Level 8

Thank you everybody,

the code worked just fine

 

your help is very much appreciated.

 

 
