DATA Step, Macro, Functions and more

Choosing variables based on condition

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 107
Accepted Solution

Choosing variables based on condition

Hello SAS gurus,

Could someone please help me with the following query. Thanks in advance


For instance I have a dataset like this


Name Age sex BMW BMW_Premium BMW_Standard AUDI Mercedes Mercedes_card  Vauxhall Toyota Toyota_Rnw
    x       x     x      x              x                         x                   x           x                      x                  x            x            x                            


Now I want to see only BMW data along with demographic data. How do I select only brand specific variables along with Name Age sex using macros or any other method?


for instance user should be able to select what he wants by entering the brand name in %let(Ex: %let brand = BMW)


Accepted Solutions
Solution
‎06-16-2015 10:37 AM
Frequent Contributor
Posts: 107

Re: Choosing variables based on condition

Posted in reply to KiranMaddi


Hi all,

Sorry for the delay in coming back. Was on holiday and just came back. The below code worked for me.

proc sql;
  select  distinct upcase(VALUE)
  into    :KEEP_LIST separated by " "
  from   SASHELP.VMACRO
  where NAME in ('BRAND','BRAND1','BRAND2','BRAND3','BRAND4','BRAND5','BRAND6','BRAND7','BRAND8','BRAND9,''BRAND10','BRAND11','BRAND12','BRAND13');

quit;

I have 13 brands in the prompts to choose. If user chooses only one brand then brand will be &brand. if multiple it will be &brand1 &brand2 and so on.

Thank you all for your help. You all are wonderful

View solution in original post


All Replies
Super User
Super User
Posts: 7,977

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

Hi,

The below uses your columns naming which is not consitent, hence I have to query the dataset first to get variable names, e.g. BMW and BMW_PREMIUM, but no AUDI_PREMIUM etc.  Having consistent column naming would simplify this.

data have;

  name="abcd"; age=20; sex="M"; bmw="3 series"; bmw_premium="Y"; audi="N"; mercedes="N"; output;

  name="efj"; age=20; sex="M"; bmw="3 series"; bmw_premium="Y"; audi="A8"; mercedes="SLR"; output;

run;

%let brand=bmw;

proc sql;

  select  distinct NAME

  into    :KEEP_LIST separated by " "

  from    DICTIONARY.COLUMNS

  where   LIBNAME="WORK"

    and   MEMNAME="HAVE"

    and   (upcase(NAME) in ("NAME","AGE","SEX") or index(upcase(NAME),upcase("&BRAND."))>0);

quit;

data want;

  set have (keep=&KEEP_LIST.);

run;

Frequent Contributor
Posts: 107

Re: Choosing variables based on condition


Hi RW9,

Thank you very much for your quick response.

Please do you mind explain this line of code?

   (upcase(NAME) in ("NAME","AGE","SEX") or index(upcase(NAME),upcase("&BRAND."))>0);

In this example we only have 5 or 6 variables. What if we have like 100 variables? Do we have to manually enter all those variable names?

Message was edited by: Kiran Maddi

Super User
Super User
Posts: 7,977

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

SAS automatically keeps metadata about tables.  These can be seen in SASHELP library, under VTABLE and VCOLUMN.  You can also reference these in SQL as dictionary.tables, and dictionary.columns.  What I am doing in that statement is looking at what columns are in your dataset.  If the name is NAME, AGE, SEX, or has BMW at any point in the variable name, then create a macro variable with these column names.  Then the datastep uses this macro variable to know what columns to keep.

Super User
Posts: 10,041

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

If you have multiple KAY word need to grab ,then use prxmatch().

Frequent Contributor
Posts: 107

Re: Choosing variables based on condition

Hi Xia,

I did not understand that. Please can you be more specific?

Super User
Posts: 10,041

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

If they starts with bmw , Mercedes  ,then use JACK suggested:

set have (keep=name--sex bmw: Mercedes: );

Otherwise, could consider use the following:

(upcase(NAME) in ("NAME","AGE","SEX") or  prxmatch('/bmw|Mercedes|Porche/i',name);


Frequent Contributor
Posts: 107

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

Hi RW9,

That seems to be working. What if I want to select multiple brands? (ex: bmw and Mercedes)

Super User
Super User
Posts: 7,977

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

Well, just update the where and/or the let:

%let brand1=bmw;

%let brand2=mercedes;

    and   (upcase(NAME) in ("NAME","AGE","SEX") or index(upcase(NAME),upcase("&BRAND1."))>0 or index(upcase(NAME),upcase("&BRAND2."))>0);


However JackHamilton raises a great point, if your variables do have the same prefix, then you can use that.  I always forget about this syntax.  So try that first, it will shrink the needed code.

Frequent Contributor
Posts: 107

Re: Choosing variables based on condition

Hi all,

The reason I want  this program is I want to create a prompt for the brand where user selects one or multiple brands from a static list when kicking off the program  and the final dataset should end up with that user chosen brand specific data. I do not want the users to go in the program and edit the data.

Hope I am clear?

Frequent Contributor
Posts: 107

Re: Choosing variables based on condition

Hi RW9

Does this work if user chose to go with only one brand as brand two wont be initialized?

%let brand1=bmw;

%let brand2=mercedes;

    and   (upcase(NAME) in ("NAME","AGE","SEX") or index(upcase(NAME),upcase("&BRAND1."))>0 or index(upcase(NAME),upcase("&BRAND2."))>0);

Super User
Super User
Posts: 7,977

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

You may want to clarify your initial post.  By create a prompt, what software are you using?  Is it stored process or something?

My suggestion would be to change the layout of your dataset to match the functionality you require, so instead of:

Name Age sex BMW BMW_Premium BMW_Standard AUDI Mercedes Mercedes_card  Vauxhall Toyota Toyota_Rnw

    x       x     x      x              x                         x                   x           x                      x                  x            x            x                           


Normalise the data:

NAME     AGE     SEX     BRAND          BRAND_SUBCATEGORY     RESULT

XYZ          123     M          BMW                                                            1

XYZ          123     M          BMW              Premium                                   12

ZYT           10       F           Mercedes                         ...


This will make your life easier in the long run.  You can get a distinct list of BRANDS by select distinct(BRAND) from..., and you can where clauses as many or as few as you need to return, e.g. where brand in (-yourlist-).

If your output needs to be transposed then:

proc sort data=have out=want;

     by name age sex brand; brand_subcategory;

run;

proc tranpose

     by name age sex brand; brand_subcategory;

     var result;

run;

Then your columns will be named col1-colx, for all the subsets you choose.

Frequent Contributor
Posts: 107

Re: Choosing variables based on condition


Hi RW9.

I use SAS enterprise guide 5.1

I actually had your thought before about creating a brand flag in the dataset. Because of some business reasons we wont be able to add that flag into the table. I think the above methods suits me if it allows to choose multiple brands in the prompts?

Frequent Contributor
Posts: 102

Re: Choosing variables based on condition

Posted in reply to KiranMaddi

Why not simply

     set have (keep=name--sex bmwSmiley Happy

Regular Contributor
Posts: 227

Re: Choosing variables based on condition

Posted in reply to JackHamilton

Here is a follow-up on Jack's suggestion about keep=()

I discovered this trick recently in polishing summary parameters

%let keep_ids = name age sex;

%let list_brands = bmw  mercedes;

*note double space, therefore need for cmpres;

%let keep_brands = %sysfunc(tranwrd(%cmpres(&list_brands)%str( )

,%str( ),%str(: )));

%put &=keep_brands;

set have( keep = &keep_ids &keep_brands);

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 28 replies
  • 813 views
  • 14 likes
  • 7 in conversation