Hi All,
Need suggestion on creating new variables from an existing variable which is delimited.
foir example i have values in the variable VAR_A as
&PRODUCT=AXCV12345&sState=ABC&year_det=2014&TELEPH=1234567890.
i need to create new variables from variable "VAR_A".
product State year_det Phone_Number
AXCV12345 ABC 2014 1234567890
the delimiter appears "&,? and =" appear more than once for the variable "VAR_A".
Please suggest.
Here are some questions to consider, before the programming begins.
How do you know which variables are character and which are numeric? For example, Phone Number would normally be stored as character even if it contains only digits.
Should each observation in the data set execute only the VAR_A instructions that appear on that observation? Or does the contents of VAR_A apply to all observations in the data set?
Could variables have different lengths for different values of VAR_A? For example, could one observation contain &STATE=Utah& while another contains &STATE=Minnesota& and another contains &STATE=New York& ? Do you know what the longest required length would be for each possible character variable referenced within VAR_A?
Hi
Thank you for looking into this and posting your vluable comments.
How do you know which variables are character and which are numeric? For example, Phone Number would normally be stored as character even if it contains only digits.
The variables stored as character.
Should each observation in the data set execute only the VAR_A instructions that appear on that observation? Or does the contents of VAR_A apply to all observations in the data set?
VAR_A apply to all observations in the data set.
Could variables have different lengths for different values of VAR_A? For example, could one observation contain &STATE=Utah& while another contains &STATE=Minnesota& and another contains &STATE=New York& ? Do you know what the longest required length would be for each possible character variable referenced within VAR_A?
Yes it can have multiple lengths, i am not sure of the longest length as the
VAR_A looks something like this
&STATE=Minnesota&PRODUCT=AAAXXXCCC&TELEPHONE=1234567890.
so i have defined length of A as 250.
Do the variables ALWAYS appear in the same order and ALWAYS appear in each record?
If so in a data step you could use:
data want;
set have;
/* it is a good idea to set Lengths for the results before use
long enough to hold the longest expected value*/
Length Product $
Product = scan(Var_A,2,'&=?');
State = scan(Var_A,4,'&=?');
Year_det = input(scan(Var_A,6,'&=?'),best.); /* assumes you want this as numeric value*/
Phone_number = scan(Var_A,8,'&=?');
run;
Note that since you don't show use of ? then there could be different behavior.
IF the order may change or some variables could be missing this won't work and you'll have to do some searching of in the variable using FIND or index functions.
Hi Super user,
Thank you for your comments and suggestions.
FIND or index need to be used, i was looking for a function which search the position of "&=,?" and read it into a macro variable and then use the macro variable in a Do loop other than creating the other variables based on the position of the & = , ?.
Kindly suggest
There is absolutely no need for a macro for this.
If the order changes then you have one piece of code to look for each variable that you know of.
This shows one example. I did not assign a length for the variable(s) and hence this example gets the length of the starting X variable.
For your year again use the Input bit for a numeric.
data example;
x='&PRODUCT=AXCV12345&sState=ABC&year_det=2014&TELEPH=1234567890';
/* how to find your product variable if the order changes, or may not be there*/
/* check to see if the variable is there*/
pos = index(x,'PRODUCT=');
if pos>0 then do; /* the variable is there if not in the string this bit is skipped*/
/*POS has the position of the string, so you know the the value you want starts at POS+length of variable and the =,8 in this case*/
/* so you search from that to find the & which appears to be the end of the value*/
endpos = find(x,'&',Pos);
Product = substr(x,Pos+8,endpos-(Pos+8));
end;
run;
You still have not shown how the ? might appear.
Hi Super User,
Thank you so much!!!!
Your comments and suggestions helped me alot to achieve.
@santosh_pat69 if your issue is resolved please take the time to mark the appropriate response as the correc answer.
It looks like you're parsing web queries in some form? Or an HTML address?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.