Recoding a string variable --> most frequent category has largest value

N/A
Posts: 0

Hi there,

I have a question about coding values of a variable. I have split this into two parts as it would be helpful to me even if I could only get an answer to the first part. Thanks ahead of time for any light anyone is able to shed on this!


/** DATA SAMPLE **/
I have hundreds of string variables, each with over a thousand observations.
Sample data:
VAR1              VAR2
CC  freq = 494    GG  freq = 3000
CT  freq = 29     GT  freq = 185
TT  freq = 1      TT  freq = 39


/** PART A **/
I need to convert the strings to numeric values while assigning the largest value to the largest (most frequent) category. Ultimately, this is so that I can use PROC CATMOD for polytomous regression and get the correct reference categories. (These are all independent variables; I have already been able to set the correct reference category for the dependent variable.)

Is there a function that will do this for me?


/** PART B **/

How can I integrate that into a macro?

Here is the macro I am using at the moment:
%macro bob;
   /* &varlist is assumed to be a global macro variable holding */
   /* the space-separated names of the independent variables    */
   %let varnum = 1;
   %let var = %scan(&varlist, &varnum);
   %do %while (%length(&var) ne 0);

      title ;
      ods listing close;
      proc catmod data=cox2 ;
         direct X ;
         model dep_var = X &var ;
         ods output Estimates = model_&var ;
      run;
      quit;
      ods listing;

      /** I am then using PROC REPORT to recover outputs **/
      /** I am interested in. These lines have been **/
      /** omitted from this example **/

      %let varnum = %eval(&varnum + 1);
      %let var = %scan(&varlist, &varnum);
      %put &varnum &var;
   %end;
%mend;

I have tried the ORDER= option in PROC CATMOD, but haven't found it to be useful in this case.

thank you!

Claudia
Message was edited by: CAS at Oct 31, 2006 3:33 AM -- to make data sample easier to read
SAS Employee
Posts: 174

Re: Recoding a string variable --> most frequent category has largest value

Posted in reply to deleted_user
This question is perhaps best answered by Technical Support. You can submit it online at http://support.sas.com/ctx/supportform/index.jsp .

-- David Kelley, SAS
N/A
Posts: 0

Re: Recoding a string variable --> most frequent category has largest value

Posted in reply to deleted_user
Thank you! I'll post it there.
Claudia
Super Contributor
Posts: 260

Re: Recoding a string variable --> most frequent category has largest value

Posted in reply to deleted_user
Apart from switching from CATMOD to the LOGISTIC procedure, where you may find the ORDER=FREQ option useful, I suggest the following trick:

1) Run PROC FREQ ORDER=FREQ on the chosen variable (the one to recode), then save the results with ODS OUTPUT OneWayFreqs = work.myValues ;
2) Sort MyValues by ascending Frequency, so that the most frequent category comes last, then re-read the MyValues dataset in a DATA step to add a new variable, recoded=_N_, to it ;
3) Merge MyValues with your core data BY the chosen variable (both datasets must be sorted by it) ;
4) Use recoded as input for your modelling procedure.
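
For a single variable, the four steps could look roughly like the sketch below. It assumes the dataset cox2 and the variables VAR1, X and dep_var from the original post; the dataset names work.cox2_sorted and work.cox2_recoded are made up for illustration, and the sort uses ascending Frequency so that the most frequent category receives the largest code.

```sas
/* Sketch only: cox2, VAR1, X and dep_var come from the question;      */
/* work.cox2_sorted and work.cox2_recoded are hypothetical names.      */

/* 1) One-way frequencies for the variable to recode */
ods output OneWayFreqs = work.myValues;
proc freq data=cox2 order=freq;
   tables VAR1;
run;

/* 2) Rarest category first, so the most frequent one gets the largest _N_ */
proc sort data=work.myValues;
   by Frequency;
run;

data work.myValues;
   set work.myValues (keep=VAR1 Frequency);
   recoded = _N_;
run;

/* 3) Merge back onto the core data; both inputs must be sorted by VAR1 */
proc sort data=work.myValues;
   by VAR1;
run;

proc sort data=cox2 out=work.cox2_sorted;
   by VAR1;
run;

data work.cox2_recoded;
   merge work.cox2_sorted (in=inCore)
         work.myValues    (keep=VAR1 recoded);
   by VAR1;
   if inCore;   /* keep every row of the core data */
run;

/* 4) Use the recoded variable in the model */
proc catmod data=work.cox2_recoded;
   direct X;
   model dep_var = X recoded;
run;
quit;
```

Categories with tied frequencies end up in an arbitrary order, and rows where VAR1 is missing get a missing recoded value, since PROC FREQ leaves missing values out of the one-way table by default.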

This can be performed for each variable in your macro, although it may be time-consuming to run PROC FREQ on the whole dataset and then merge the result back at each iteration of the loop.
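
One way to fold the recoding into a macro loop is a small helper macro, sketched below. The macro name recode_by_freq, its parameters, the work._freq scratch dataset and the _n suffix on the new variable are all hypothetical. It also swaps the sort-and-merge for a PROC SQL left join, which avoids sorting the core data on every pass.

```sas
/* Sketch only: recode_by_freq, its parameters, work._freq and the _n  */
/* suffix are hypothetical names chosen for illustration.              */
%macro recode_by_freq(data=, var=, out=);

   /* Frequency table for the variable to recode */
   ods output OneWayFreqs = work._freq;
   proc freq data=&data;
      tables &var;
   run;

   /* Ascending frequency: the most frequent category gets the largest code */
   proc sort data=work._freq;
      by Frequency;
   run;

   data work._freq;
      set work._freq (keep=&var);
      &var._n = _N_;
   run;

   /* Attach the numeric code to every row of the core data */
   proc sql;
      create table &out as
      select a.*, b.&var._n
      from &data as a
           left join work._freq as b
           on a.&var = b.&var;
   quit;

%mend recode_by_freq;

/* Example call for one variable from the sample data */
%recode_by_freq(data=cox2, var=VAR1, out=work.cox2_VAR1)
```

Inside a loop such as the one in the original post, the call would pass var=&var and the CATMOD step would then read the new dataset and use &var._n in the MODEL statement in place of &var.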

Regards.
Olivier