BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Total_FREQ
Calcite | Level 5

Hey All!

I have a character variable of varying lengths that I want to separate into separate variables.  The variable always follows the following pattern - AA.AA.AA.AA.AA (2 characters then a period, 2 characters then a period, etc) - and each "AA" signifies a unique code I need to have separated for further analysis. 

Example of what it looks like now:

orig_var
AA.4F.9U.EB
E5.U9
GB.D2.4F.EB.AA.M2
T4
YY.4D.U9

How I'd like for it to look:

orig_varnew_var1new_var2new_var3new_var4new_var5new_var6
AA.4F.9U.EBAA4F9UEB..
E5.U9....
GB.D2.4F.EB.AA.M2GBD24FEBAAM2
T4T4.....
YY.4D.U9YY4DU9...

Any thoughts on how to code this to parse those 2-char codes?

Thanks!

Erice

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Do you know the max number of variables? if so loop using the scan function and . as your delimiter.

data want;

set have;

array parsed_vars(*) new_var1-new_var6;

i=1;

do while(scan(str, i, ".") ne "");

parsed_vars(i) =scan(str, i, ".");

i+1;

end;

run;

View solution in original post

18 REPLIES 18
Fugue
Quartz | Level 8

Specify the period as your delimiter.

Total_FREQ
Calcite | Level 5

Can you be more specific?  Also, note that orig_var is one variable in a SAS dataset that has like 200 other variables.  It's not a raw data file for me to import.

Fugue
Quartz | Level 8

That changes things then. A simple approach would be to use something like the scan function.

newvar1=scan(orig_var, 1, '.');

newvar2=scan(orig_var, 2, '.');

etc

Reeza
Super User

Do you know the max number of variables? if so loop using the scan function and . as your delimiter.

data want;

set have;

array parsed_vars(*) new_var1-new_var6;

i=1;

do while(scan(str, i, ".") ne "");

parsed_vars(i) =scan(str, i, ".");

i+1;

end;

run;

Total_FREQ
Calcite | Level 5

Reeza --

That worked except that there was a numeric vs character mismatch betweent he array variables and the orig_var pieces.  When I added _char_ at the end of the array line, it work PERFECTLY.  I was attempting some sort of array thing but just couldn't get it quite right on my own.

THANKS!!

Eric

Reeza
Super User

I think you should add the dollar sign instead of _char_. That would add in all the character variables to your array and you could potentially overwrite a variable if you had more than 6 variables in your variable that was to be parsed. In fact, it may not be a bad idea to explicitly declare the array length rather than use the * method. 

array parsed_vars(6) $  new_var1-new_var6;

ErikLund_Jensen
Rhodochrosite | Level 12

If you don't know the max. number of elements in orig_var, it can be found and stored in a macro variable by proc sql. The number can then be used to dynamically create the required number of new variables in the output data set:

proc sql noprint;

  select max(count(orig_var,'.'))+1 into :maxelements from have;

quit;

data want (drop=i); set have;

  array parsed_vars $ 2 new_var1-new_var%eval(&maxelements);

  do i = 1 to &maxelements;

  parsed_vars{i} = scan(orig_var,i,'.');

  end;

run;

Haikuo
Onyx | Level 15

Another low-maintenance dynamic option can be Proc Transpose:

data have;

     input var $40.;

     length svar $ 2;

     do i=1 by 1 until (missing(svar));

           svar=scan(var,i,'.');

           if not missing(svar) then

                output;

     end;

     cards;

AA.4F.9U.EB

E5.U9

GB.D2.4F.EB.AA.M2

T4

YY.4D.U9

;

proc sort data=have;

     by var;

run;

proc transpose data=have out=want (drop=_name_) prefix=new_var;

     by var;

     var svar;

run;

Quinn
Fluorite | Level 6

* Nice ! ;
* My minor edits:
* 1. no PRoc sort needed. ;
* 2. use ID statement ;
proc transpose data=have out=want (drop=_name_) prefix=var;
by var notsorted;
id i;
var svar;
run;

CR204
Calcite | Level 5

An alternative to Reeza's solution is to use a for-loop rather than a while (for people that don't trust while loops):

 

data want;
    set have;
    array parsed_vars(*) new_var1-new_var6;

    do i = 1 to 6;
        parsed_vars(i) =scan(str, i, ".");
    end;

run;

The 6 can be redefined as a macro variable. In my code I have a length statement before the array statement, I do not know whether it works for al data without the length statement.

Quentin
Super User

Hi @CR204, curious what you mean by not trusting WHILE loops?  They've always been trustworthy to me.  I suppose one risk they bring is that you can accidentally write an infinite loop.  But to me, they're still worthy of trust and are a useful tool.  Are you thinking of other concerns?

 

BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
CR204
Calcite | Level 5

@Quentin While loops can be useful, but I dislike using them when a simple for loop would work. The "don't trust" part should be read with a wink. 😉

mstergia
Calcite | Level 5

Hi there,

 

I used your code which worked well in my project.  Today, however, I am getting an error (Array subscript out of range at line 29, column 1).

 

This is my code:  Not sure why it worked last week and not working today.  Any help would be appreciated!

data Parsed_Topics;

set have;

array PARSED_VARS(*)$100 Topic1-Topic12;

i=1;

do while(scan(TrainingTopic, i, ";") ne "");

 

PARSED_VARS(i) =scan(TrainingTopic, i, ";");

i+1;

end;

run;

 

 

Quinn
Fluorite | Level 6

See if string variable TrainingTopic contains more than 12 delimiters ':' - 12 is the array upbound.   Hop it helps.

L.L 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 89333 views
  • 12 likes
  • 9 in conversation