BookmarkSubscribeRSS Feed
santosh_pat69
Quartz | Level 8

Hi All,

 

Need suggestion on creating new variables from an existing variable which is delimited.

foir example i have values in the variable VAR_A as

&PRODUCT=AXCV12345&sState=ABC&year_det=2014&TELEPH=1234567890.

 

i need to create new variables from variable "VAR_A".

product                 State      year_det   Phone_Number

AXCV12345           ABC          2014      1234567890

 

the delimiter appears "&,? and =" appear more than once for the variable  "VAR_A".

 

Please suggest.

 

 

 

8 REPLIES 8
Astounding
PROC Star

Here are some questions to consider, before the programming begins.

 

How do you know which variables are character and which are numeric?  For example, Phone Number would normally be stored as character even if it contains only digits.

 

Should each observation in the data set execute only the VAR_A instructions that appear on that observation?  Or does the contents of VAR_A apply to all observations in the data set?

 

Could variables have different lengths for different values of VAR_A?  For example, could one observation contain &STATE=Utah& while another contains &STATE=Minnesota& and another contains &STATE=New York& ?  Do you know what the longest required length would be for each possible character variable referenced within VAR_A?

santosh_pat69
Quartz | Level 8

Hi

Thank you for looking into this and posting your vluable comments.

 

How do you know which variables are character and which are numeric?  For example, Phone Number would normally be stored as character even if it contains only digits.

 

The variables stored as character.

 

Should each observation in the data set execute only the VAR_A instructions that appear on that observation?  Or does the contents of VAR_A apply to all observations in the data set?

 

VAR_A apply to all observations in the data set.

 

Could variables have different lengths for different values of VAR_A?  For example, could one observation contain &STATE=Utah& while another contains &STATE=Minnesota& and another contains &STATE=New York& ?  Do you know what the longest required length would be for each possible character variable referenced within VAR_A?

 

Yes it can have multiple lengths, i am not sure of the longest length as the

VAR_A looks something like this

&STATE=Minnesota&PRODUCT=AAAXXXCCC&TELEPHONE=1234567890.

so i have defined length of A as 250.

 

 

ballardw
Super User

Do the variables ALWAYS appear in the same order and ALWAYS appear in each record?

If so in a data step you could use:

data want;
    set have;
    /* it is a good idea to set Lengths for the results before use
       long enough to hold the longest expected value*/
    Length Product $ 
    Product = scan(Var_A,2,'&=?');
    State = scan(Var_A,4,'&=?');
    Year_det = input(scan(Var_A,6,'&=?'),best.); /* assumes you want this as numeric value*/
    Phone_number = scan(Var_A,8,'&=?');
run;

Note that since you don't show use of ? then there could be different behavior.

 

IF the order may change or some variables could be missing this won't work and you'll have to do some searching of in the variable using FIND or index functions.

santosh_pat69
Quartz | Level 8

Hi Super user,

Thank you for your comments and suggestions.

FIND or index need to be used, i was looking for a function which search the position of "&=,?" and read it into a macro variable and then use the macro variable in a Do loop other than creating the other variables based on the position of the & = , ?.

 

 

Kindly suggest

ballardw
Super User

There is absolutely no need for a macro for this.

If the order changes then you have one piece of code to look for each variable that you know of.

This shows one example. I did not assign a length for the variable(s) and hence this example gets the length of the starting X variable.

For your year again use the Input bit for a numeric.

data example;
   x='&PRODUCT=AXCV12345&sState=ABC&year_det=2014&TELEPH​=1234567890';
   /* how to find your product variable if the order changes, or may not be there*/
   /* check to see if the variable is there*/
   pos = index(x,'PRODUCT=');
   if pos>0 then do;  /* the variable is there if not in the string this bit is skipped*/
      /*POS has the position of the string, so you know the the value you want starts at POS+length of variable and the =,8 in this case*/
      /* so you search from that to find the & which appears to be the end of the value*/
      endpos = find(x,'&',Pos);
      Product = substr(x,Pos+8,endpos-(Pos+8));
   end;
run;

 

You still have not shown how the ? might appear.

santosh_pat69
Quartz | Level 8

Hi Super User,

 

 

Thank you so much!!!!

Your comments and suggestions helped me alot to achieve.

 

 

 

 

 

 

Reeza
Super User

@santosh_pat69 if your issue is resolved please take the time to mark the appropriate response as the correc answer. 

Reeza
Super User

It looks like you're parsing web queries in some form? Or an HTML address? 

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 999 views
  • 0 likes
  • 4 in conversation