DATA Step, Macro, Functions and more

Separating a string value into different variables

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 144
Accepted Solution

Separating a string value into different variables

Hi all--

I have a variable titled  Creative_name which contains long sting values. There are sections in the values which are delimited by a _ . Here is an example of a value:

acq_sale_3xconoffer_8999_nacpg_2102014

What I need is break this out into separate variables by the deliminator. So I need the output to look like this:

LOBBundle Creative Concept Price  Sweetener Date
acqsale3xconoffer$89.99nacpg10Feb14

If you notice Price and Date are reformatted as well.  For Price I always have to enter a decimal place before the last two digest so I turn the variables into currency  format.

Any assistance is greatly appreciated!  Thanks!


Accepted Solutions
Solution
‎04-07-2014 05:25 PM
Super User
Posts: 19,772

Re: Separating a string value into different variables

It was supposed to be anydtdte but that didn't work for me either.

I'm sure there's a simpler way than this, and this only accounts for missing a leading zero, assuming it is for the month. If your date has a single digit day then it won't work.

data test;

text="acq_sale_3xconoffer_8999_nacpg_2102014";

date=input(scan(text, 6, "_"), 8.);

date2=input(put(input(scan(text, 6, "_"), 8.), z8.), mmddyy10.);

format date2 date9.;

run;

View solution in original post


All Replies
Super User
Posts: 19,772

Re: Separating a string value into different variables

Use the scan function to parse

Use an input function to get the variable as a number

Use a format to apply the currency format.

Here's an example for Price. Price is a numeric variable and PRICE2 is a character variable.

data test;

    text="acq_sale_3xconoffer_8999_nacpg_2102014";

    price=input(scan(text, 4, "_"), 8.)/100;

    format price dollar8.2;

    price2=put(input(scan(text, 4, "_"), 8.)/100, dollar8.2 -l);

run;

Frequent Contributor
Posts: 144

Re: Separating a string value into different variables

Thanks.  the string values change though.

Could look something like this:

acq_newsale_0_CartChat_0_Flash_1192014

or this:

acq_old_3x_GadgetPotomac_7999_Flash_112014

so $0.00 or $79.99.  The value is not always  $89.99

Super User
Posts: 5,499

Re: Separating a string value into different variables

You might want to add a LENGTH statement before parsing the string.  Otherwise the new variables will be longer than you might expect.

Super User
Posts: 19,772

Re: Separating a string value into different variables

So you're saying the delimiters aren't consistent?

ie in the first example the dollar amount is the 4th parameter

and in the 2 new samples above it's the 5th parameter.

How do you know how many you'll have, or where it will be? You'll have to add rules for that.

We can only provide answers for the information you provide, so please explain your problem thorough Smiley Happy

Frequent Contributor
Posts: 144

Re: Separating a string value into different variables

Thanks Rezza. Sorry its always the 5th parameter. Thanks!

Frequent Contributor
Posts: 144

Re: Separating a string value into different variables

And again, the value can change but its always a two digit two decimal number (so four characters)  or 0.

Super Contributor
Posts: 275

Re: Separating a string value into different variables

data test;
infile cards truncover;
  input text $50.;
  price =(input(prxchange('s/(\w+)_(\d+)_(\w+)/$2/',-1,text),8.))/100;
  format price dollar8.2;
cards;
acq_newsale_0_CartChat_0_Flash_1192014
acq_old_3x_GadgetPotomac_7999_Flash_112014
acq_sale_3xconoffer_8999_nacpg_2102014
;
run;

Respected Advisor
Posts: 3,799

Re: Separating a string value into different variables

data d;
   input string $50.;
   d = input(scan(string,
5,'_'),8.2);
   format d dollar10.2;
cards;
acq_newsale_0_CartChat_0_Flash_1192014
acq_old_3x_GadgetPotomac_7999_Flash_112014
;;;;
   run;
Frequent Contributor
Posts: 144

Re: Separating a string value into different variables

Posted in reply to data_null__

ok I get  Price but I'm having issues pulling out date and formatting it

2102014

data test2;

set data;

    LOB=scan(CreativeName, 1, "_");

  Bundle=scan(CreativeName, 3, "_");

  CreativeConcept=scan(CreativeName, 4, "_");

    price=input(scan(CreativeName, 5, "_"), 8.)/100;

   date=scan(CreativeName, 8, "_");  /*just pulls out the value*/

  date2=input(scan(CreativeName, 8, "_", mmddyyyy );  /*does not work*/

  Sweetner=scan(CreativeName, 6, "_");

    format price dollar8.2;

    price2=put(input(scan(CreativeName, 5, "_"), 8.)/100, dollar8.2 -l);

run;

Super User
Posts: 19,772

Re: Separating a string value into different variables

Try anydtdte format rather than mmddyyyy.

Frequent Contributor
Posts: 144

Re: Separating a string value into different variables

Still does not work Smiley Sad

data test2;

set data;

    LOB=scan(CreativeName, 1, "_");

  Bundle=scan(CreativeName, 3, "_");

  CreativeConcept=scan(CreativeName, 4, "_");

    price=input(scan(CreativeName, 5, "_"), 8.)/100;

  date2=input(scan(CreativeName, 8, "_", 8.), anydate.);

  Sweetner=scan(CreativeName, 6, "_");

  date=scan(CreativeName, 8, "_");

    format price dollar8.2;

    price2=put(input(scan(CreativeName, 5, "_"), 8.)/100, dollar8.2 -l);

run;

Occasional Contributor
Posts: 9

Re: Separating a string value into different variables

Is there any way that you can change the way the raw data are written? The month and the day both need to be two digits. Otherwise, how will you know whether 1122014 is January 12 (1/12) or November 2 (11/2)? A six-digit number is obviously a 1-digit month and 1-digit day and an eight-digit number is obviously a two-digit month and a two-digit year, but a seven-digit date could be a one-digit month and two-digit day or a two-digit month and one-digit day.

liz

Solution
‎04-07-2014 05:25 PM
Super User
Posts: 19,772

Re: Separating a string value into different variables

It was supposed to be anydtdte but that didn't work for me either.

I'm sure there's a simpler way than this, and this only accounts for missing a leading zero, assuming it is for the month. If your date has a single digit day then it won't work.

data test;

text="acq_sale_3xconoffer_8999_nacpg_2102014";

date=input(scan(text, 6, "_"), 8.);

date2=input(put(input(scan(text, 6, "_"), 8.), z8.), mmddyy10.);

format date2 date9.;

run;

Regular Contributor
Posts: 217

Re: Separating a string value into different variables

If the number of underscores varies, how do you know which word in the string is a money value and which is a date value?  First, count the number of underscores by substracting the length of the variable without underscores from the total length of the variable.   length without underscores equals length(compress blanks(translate underscores to blanks)).  Once you know how many words in the variable, use the scan function to parse out all the individual words that are in the variable.  You'll have to figure out from there which word is a dollar value and which is a date.  Always be sure to account for an underscore that could be at the end of the variable.    var1 = scan(variable,1,'_'); etc. 

If I can find my old code, I'll post it.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 405 views
  • 6 likes
  • 7 in conversation