SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Total_FREQ
Calcite | Level 5

Hey All!

I have a character variable of varying lengths that I want to separate into separate variables.  The variable always follows the following pattern - AA.AA.AA.AA.AA (2 characters then a period, 2 characters then a period, etc) - and each "AA" signifies a unique code I need to have separated for further analysis. 

Example of what it looks like now:

orig_var
AA.4F.9U.EB
E5.U9
GB.D2.4F.EB.AA.M2
T4
YY.4D.U9

How I'd like for it to look:

orig_varnew_var1new_var2new_var3new_var4new_var5new_var6
AA.4F.9U.EBAA4F9UEB..
E5.U9....
GB.D2.4F.EB.AA.M2GBD24FEBAAM2
T4T4.....
YY.4D.U9YY4DU9...

Any thoughts on how to code this to parse those 2-char codes?

Thanks!

Erice

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Do you know the max number of variables? if so loop using the scan function and . as your delimiter.

data want;

set have;

array parsed_vars(*) new_var1-new_var6;

i=1;

do while(scan(str, i, ".") ne "");

parsed_vars(i) =scan(str, i, ".");

i+1;

end;

run;

View solution in original post

18 REPLIES 18
Fugue
Quartz | Level 8

Specify the period as your delimiter.

Total_FREQ
Calcite | Level 5

Can you be more specific?  Also, note that orig_var is one variable in a SAS dataset that has like 200 other variables.  It's not a raw data file for me to import.

Fugue
Quartz | Level 8

That changes things then. A simple approach would be to use something like the scan function.

newvar1=scan(orig_var, 1, '.');

newvar2=scan(orig_var, 2, '.');

etc

Reeza
Super User

Do you know the max number of variables? if so loop using the scan function and . as your delimiter.

data want;

set have;

array parsed_vars(*) new_var1-new_var6;

i=1;

do while(scan(str, i, ".") ne "");

parsed_vars(i) =scan(str, i, ".");

i+1;

end;

run;

Total_FREQ
Calcite | Level 5

Reeza --

That worked except that there was a numeric vs character mismatch betweent he array variables and the orig_var pieces.  When I added _char_ at the end of the array line, it work PERFECTLY.  I was attempting some sort of array thing but just couldn't get it quite right on my own.

THANKS!!

Eric

Reeza
Super User

I think you should add the dollar sign instead of _char_. That would add in all the character variables to your array and you could potentially overwrite a variable if you had more than 6 variables in your variable that was to be parsed. In fact, it may not be a bad idea to explicitly declare the array length rather than use the * method. 

array parsed_vars(6) $  new_var1-new_var6;

ErikLund_Jensen
Rhodochrosite | Level 12

If you don't know the max. number of elements in orig_var, it can be found and stored in a macro variable by proc sql. The number can then be used to dynamically create the required number of new variables in the output data set:

proc sql noprint;

  select max(count(orig_var,'.'))+1 into :maxelements from have;

quit;

data want (drop=i); set have;

  array parsed_vars $ 2 new_var1-new_var%eval(&maxelements);

  do i = 1 to &maxelements;

  parsed_vars{i} = scan(orig_var,i,'.');

  end;

run;

Haikuo
Onyx | Level 15

Another low-maintenance dynamic option can be Proc Transpose:

data have;

     input var $40.;

     length svar $ 2;

     do i=1 by 1 until (missing(svar));

           svar=scan(var,i,'.');

           if not missing(svar) then

                output;

     end;

     cards;

AA.4F.9U.EB

E5.U9

GB.D2.4F.EB.AA.M2

T4

YY.4D.U9

;

proc sort data=have;

     by var;

run;

proc transpose data=have out=want (drop=_name_) prefix=new_var;

     by var;

     var svar;

run;

Quinn
Fluorite | Level 6

* Nice ! ;
* My minor edits:
* 1. no PRoc sort needed. ;
* 2. use ID statement ;
proc transpose data=have out=want (drop=_name_) prefix=var;
by var notsorted;
id i;
var svar;
run;

CR204
Calcite | Level 5

An alternative to Reeza's solution is to use a for-loop rather than a while (for people that don't trust while loops):

 

data want;
    set have;
    array parsed_vars(*) new_var1-new_var6;

    do i = 1 to 6;
        parsed_vars(i) =scan(str, i, ".");
    end;

run;

The 6 can be redefined as a macro variable. In my code I have a length statement before the array statement, I do not know whether it works for al data without the length statement.

Quentin
Super User

Hi @CR204, curious what you mean by not trusting WHILE loops?  They've always been trustworthy to me.  I suppose one risk they bring is that you can accidentally write an infinite loop.  But to me, they're still worthy of trust and are a useful tool.  Are you thinking of other concerns?

 

The Boston Area SAS Users Group is hosting free webinars!
Next up: Troy Martin Hughes presents Calling Open-Source Python Functions within SAS PROC FCMP: A Google Maps API Geocoding Adventure on Wednesday April 23.
Register now at https://www.basug.org/events.
CR204
Calcite | Level 5

@Quentin While loops can be useful, but I dislike using them when a simple for loop would work. The "don't trust" part should be read with a wink. 😉

mstergia
Calcite | Level 5

Hi there,

 

I used your code which worked well in my project.  Today, however, I am getting an error (Array subscript out of range at line 29, column 1).

 

This is my code:  Not sure why it worked last week and not working today.  Any help would be appreciated!

data Parsed_Topics;

set have;

array PARSED_VARS(*)$100 Topic1-Topic12;

i=1;

do while(scan(TrainingTopic, i, ";") ne "");

 

PARSED_VARS(i) =scan(TrainingTopic, i, ";");

i+1;

end;

run;

 

 

Quinn
Fluorite | Level 6

See if string variable TrainingTopic contains more than 12 delimiters ':' - 12 is the array upbound.   Hop it helps.

L.L 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 92039 views
  • 12 likes
  • 9 in conversation