12-23-2016 12:27 AM
12-23-2016 01:35 AM
What is the purpose? Are you creating this from scratch and need to know how? Do you have a dataset and need to create new variables? Or is this a theoretical question for a course, where you need to know how many different ways there are?
12-23-2016 02:07 AM - edited 12-23-2016 02:12 AM
Providing some sample data and an example of what your desired result looks like would help.
I am going to assume that your dataset looks like this:
data have;
    format brand $15.;
    input brand $ total_bought $;
    datalines;
burberry yes
valentino yes
valentino1 yes
;
And that you want to create the variables burberry_bought and valentino_bought as below:
data want;
    set have;
    if index(brand, 'burberry') > 0 then burberry_bought = 'yes';
    else burberry_bought = 'no';
    if index(brand, 'valentino') > 0 then valentino_bought = 'yes';
    else valentino_bought = 'no';
run;
Hope it helps
12-23-2016 10:29 AM
As a minor change to @draycut's solution I would suggest:
data want;
    set have;
    burberry_bought  = index(UPCASE(brand), 'BURBERRY') > 0;
    valentino_bought = index(UPCASE(brand), 'VALENTINO') > 0;
run;
This will assign values of 1 for true and 0 for false. If you really need to show the text Yes/No, a custom format can be assigned. The 1/0 coding lends itself to summaries much better: the SUM of a bought variable is the total number of times bought, and the MEAN is the proportion bought (a percentage in decimal form). Also, if your actual data has one field with potentially multiple entries, such as "burberry valentino", extending this approach allows you to sum the variables within a record to know how many brands were bought.
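A minimal sketch of the summary and display ideas above, assuming the WANT data set holds the 1/0 variables burberry_bought and valentino_bought. The format name yesno, the data set name want_counts, and the variable brands_bought are hypothetical names chosen for illustration.

```sas
/* Hypothetical custom format: display 1/0 as Yes/No without changing the stored values */
proc format;
    value yesno
        1 = 'Yes'
        0 = 'No';
run;

/* SUM = number of records flagged, MEAN = proportion of records flagged */
proc means data=want n sum mean;
    var burberry_bought valentino_bought;
run;

/* Show the flags as text by attaching the format at print time */
proc print data=want;
    format burberry_bought valentino_bought yesno.;
run;

/* Count how many of the two brands were flagged within each record */
data want_counts;
    set want;
    brands_bought = sum(burberry_bought, valentino_bought);
run;
```

Because the format is only attached in PROC PRINT, the underlying values stay numeric and remain usable in PROC MEANS and in the row-wise SUM.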
Applying UPCASE to brand and writing the search value in upper case will help if your data entry has values like Burberry, burBerry and other variations in letter case. Since your example data had two different values involving VALENTINO, this seems a likely concern.
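As an alternative to UPCASE (not what was posted above, just another option), the FIND function accepts an 'i' modifier that makes the search ignore case. A small sketch, assuming the same HAVE data set:

```sas
data want;
    set have;
    /* 'i' modifier: ignore case, so Burberry, burBerry and BURBERRY all match */
    burberry_bought  = find(brand, 'burberry', 'i') > 0;
    valentino_bought = find(brand, 'valentino', 'i') > 0;
run;
```

Either approach gives the same 1/0 flags; which one reads better is mostly a matter of taste.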