- Weight age of each numeric column/ variable and th...

12-17-2012 08:56 AM

Hi there, I am a beginner in SAS. I have attached an Excel file for reference.

Please can anyone tell me the code to:

**1. Calculate the weightage of all the numeric columns in a data set (sum / total sample size) and filter out those columns/variables which have a weightage of less than 2%. The filtered output is stored in a new data set (see the 'step 1' tab in the attached Excel file).**

**2. As per the 'step 2' tab in the attached Excel file, I then want to use proc logistic and get the desired output shown in that tab.**

It would be really kind of you to provide the complete code as I am new to SAS.

Many thanks for your cooperation and time!

Accepted Solutions

Solution


Posted in reply to gaurav21s

12-17-2012 09:50 AM

Not sure that this makes sense from a statistical analysis point of view.

To find the variables with at least 2% positive response rate for binary variables you can just take the MEAN.
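As a quick illustration of why the mean works here: for a variable coded 0/1, the sum counts the positive responses, so sum divided by N (the mean) is the positive response rate. A minimal sketch, using a made-up data set TOY and variable FLAG that are not part of the original question:

```sas
/* Hypothetical toy data: FLAG is a 0/1 indicator with 2 positives in 10 rows */
data toy;
  input flag @@;
  datalines;
1 0 0 0 1 0 0 0 0 0
;
run;

/* SUM counts the 1s and MEAN = SUM / N, so MEAN is the share of positives */
proc means data=toy n sum mean;
  var flag;
run;
```

With this toy input the reported mean is 0.2, i.e. a 20% positive response rate. The shortcut only holds for variables that are strictly 0/1; for other numeric columns the mean is not a rate.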

You do not need to generate a new version of the source data, just the list of variables to include in your analysis.

Here is one way to get the list of variables with at least 2% positive responses into the macro variable VARLIST.

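proc summary data=HAVE;
  var _numeric_;
  /* no statistics are named, so OUT= gets the default N, MIN, MAX, MEAN and STD rows */
  output out=means (drop=_type_ _freq_);
run;

/* flip to one row per variable (_NAME_) with one column per statistic */
proc transpose data=means out=vertical;
  id _stat_;
run;

/* collect the names of the variables whose mean is above 0.02 */
proc sql noprint;
  select _name_ into :varlist separated by ' '
  from vertical
  where mean > 0.02
  ;
quit;

%put varlist=&varlist;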
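If a filtered copy of the data is wanted after all, as the original question asked, the macro variable can be used in a KEEP= data set option. This is only a sketch: WANT is a hypothetical output name, and it assumes the PROC SQL step above has already created &varlist (the macro variable is not created if no variable passes the filter).

```sas
/* Sketch: copy HAVE, keeping only the variables listed in &varlist.  */
/* WANT is a hypothetical name. Character variables are dropped too,  */
/* because &varlist was built from the numeric variables only.        */
data want;
  set HAVE (keep=&varlist);
run;
```

Any ID or character column that should survive would need to be added to the KEEP= list by hand.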

All Replies


12-17-2012 11:32 AM

Dear Tom,

Thanks for your assistance. I will try this code and update you soon with what I intend to do here.

Kind regards


12-18-2012 10:40 AM

Dear Tom,

I successfully ran your code.

I then tried point no. 2 from the initial question: "**to use proc logistic and get the desired output as given in the 'step 2' tab of the excel**".

proc summary data=Experiment;
  var _numeric_;
  output out=means (drop=_type_ _freq_);
run;

proc transpose data=means out=vertical;
  id _Stat_;
run;

proc sql;
  select _name_ into :varlist separated by ' '
  from vertical
  where mean > 0.02
  ;
quit;

%put varlist=&varlist;

PROC LOGISTIC DATA=?? descending;
  MODEL detractor = ?? ;
RUN;

I am trying to figure out how to assign only those independent variables which have a mean value greater than 0.02 and, more importantly, how to get the desired output as shown in the 'step 2' tab of the attached Excel file. I believe there would be some selection (forward, backward, etc.).

Thank you so much for taking the time to read all this.
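One possible way to fill in the two `??` placeholders is sketched below. It assumes the data set VERTICAL from the steps above already exists, that the model is fitted on the same data set (Experiment), and that the response variable `detractor` is numeric, in which case it would also appear in the variable list and has to be excluded from the predictors. The macro variable name XLIST, the choice of stepwise selection, and the 0.05 entry/stay levels are illustrative assumptions; the Excel attachment is not visible in this thread, so the sketch cannot be checked against the desired output.

```sas
/* Build the predictor list: mean above 0.02, response variable left out. */
/* XLIST is a hypothetical macro variable name.                           */
proc sql noprint;
  select _name_ into :xlist separated by ' '
  from vertical
  where mean > 0.02
    and upcase(_name_) ne 'DETRACTOR'
  ;
quit;

%put xlist=&xlist;

/* Fit the model on the original data. DESCENDING models the higher     */
/* response level (1 for a 0/1 variable).                               */
/* SELECTION=, SLENTRY= and SLSTAY= are assumed settings, not taken     */
/* from the thread.                                                     */
proc logistic data=Experiment descending;
  model detractor = &xlist / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

SELECTION=FORWARD or SELECTION=BACKWARD can be substituted in the MODEL statement if one of those methods is what the 'step 2' tab expects.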