Hi all--
I have a variable titled Creative_name which contains long sting values. There are sections in the values which are delimited by a _ . Here is an example of a value:
acq_sale_3xconoffer_8999_nacpg_2102014
What I need is break this out into separate variables by the deliminator. So I need the output to look like this:
LOB | Bundle | Creative Concept | Price | Sweetener | Date |
acq | sale | 3xconoffer | $89.99 | nacpg | 10Feb14 |
If you notice Price and Date are reformatted as well. For Price I always have to enter a decimal place before the last two digest so I turn the variables into currency format.
Any assistance is greatly appreciated! Thanks!
It was supposed to be anydtdte but that didn't work for me either.
I'm sure there's a simpler way than this, and this only accounts for missing a leading zero, assuming it is for the month. If your date has a single digit day then it won't work.
data test;
text="acq_sale_3xconoffer_8999_nacpg_2102014";
date=input(scan(text, 6, "_"), 8.);
date2=input(put(input(scan(text, 6, "_"), 8.), z8.), mmddyy10.);
format date2 date9.;
run;
Use the scan function to parse
Use an input function to get the variable as a number
Use a format to apply the currency format.
Here's an example for Price. Price is a numeric variable and PRICE2 is a character variable.
data test;
text="acq_sale_3xconoffer_8999_nacpg_2102014";
price=input(scan(text, 4, "_"), 8.)/100;
format price dollar8.2;
price2=put(input(scan(text, 4, "_"), 8.)/100, dollar8.2 -l);
run;
Thanks. the string values change though.
Could look something like this:
acq_newsale_0_CartChat_0_Flash_1192014
or this:
acq_old_3x_GadgetPotomac_7999_Flash_112014
so $0.00 or $79.99. The value is not always $89.99
You might want to add a LENGTH statement before parsing the string. Otherwise the new variables will be longer than you might expect.
So you're saying the delimiters aren't consistent?
ie in the first example the dollar amount is the 4th parameter
and in the 2 new samples above it's the 5th parameter.
How do you know how many you'll have, or where it will be? You'll have to add rules for that.
We can only provide answers for the information you provide, so please explain your problem thorough
Thanks Rezza. Sorry its always the 5th parameter. Thanks!
And again, the value can change but its always a two digit two decimal number (so four characters) or 0.
data test;
infile cards truncover;
input text $50.;
price =(input(prxchange('s/(\w+)_(\d+)_(\w+)/$2/',-1,text),8.))/100;
format price dollar8.2;
cards;
acq_newsale_0_CartChat_0_Flash_1192014
acq_old_3x_GadgetPotomac_7999_Flash_112014
acq_sale_3xconoffer_8999_nacpg_2102014
;
run;
ok I get Price but I'm having issues pulling out date and formatting it
2102014
data test2;
set data;
LOB=scan(CreativeName, 1, "_");
Bundle=scan(CreativeName, 3, "_");
CreativeConcept=scan(CreativeName, 4, "_");
price=input(scan(CreativeName, 5, "_"), 8.)/100;
date=scan(CreativeName, 8, "_"); /*just pulls out the value*/
date2=input(scan(CreativeName, 8, "_", mmddyyyy ); /*does not work*/
Sweetner=scan(CreativeName, 6, "_");
format price dollar8.2;
price2=put(input(scan(CreativeName, 5, "_"), 8.)/100, dollar8.2 -l);
run;
Try anydtdte format rather than mmddyyyy.
Still does not work
data test2;
set data;
LOB=scan(CreativeName, 1, "_");
Bundle=scan(CreativeName, 3, "_");
CreativeConcept=scan(CreativeName, 4, "_");
price=input(scan(CreativeName, 5, "_"), 8.)/100;
date2=input(scan(CreativeName, 8, "_", 8.), anydate.);
Sweetner=scan(CreativeName, 6, "_");
date=scan(CreativeName, 8, "_");
format price dollar8.2;
price2=put(input(scan(CreativeName, 5, "_"), 8.)/100, dollar8.2 -l);
run;
Is there any way that you can change the way the raw data are written? The month and the day both need to be two digits. Otherwise, how will you know whether 1122014 is January 12 (1/12) or November 2 (11/2)? A six-digit number is obviously a 1-digit month and 1-digit day and an eight-digit number is obviously a two-digit month and a two-digit year, but a seven-digit date could be a one-digit month and two-digit day or a two-digit month and one-digit day.
liz
It was supposed to be anydtdte but that didn't work for me either.
I'm sure there's a simpler way than this, and this only accounts for missing a leading zero, assuming it is for the month. If your date has a single digit day then it won't work.
data test;
text="acq_sale_3xconoffer_8999_nacpg_2102014";
date=input(scan(text, 6, "_"), 8.);
date2=input(put(input(scan(text, 6, "_"), 8.), z8.), mmddyy10.);
format date2 date9.;
run;
If the number of underscores varies, how do you know which word in the string is a money value and which is a date value? First, count the number of underscores by substracting the length of the variable without underscores from the total length of the variable. length without underscores equals length(compress blanks(translate underscores to blanks)). Once you know how many words in the variable, use the scan function to parse out all the individual words that are in the variable. You'll have to figure out from there which word is a dollar value and which is a date. Always be sure to account for an underscore that could be at the end of the variable. var1 = scan(variable,1,'_'); etc.
If I can find my old code, I'll post it.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.