BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JUMMY
Obsidian | Level 7

I have a data set below. I want to format all categorical variables that have numbers (variables starting with "i" are all indicators (1="yes" and 0="no")). The picture below only shows six of those variables (idem, ipsychrx, icnsrx, incoven, iatypical and ibenzo). But I have 47 of them in total. How to I format them without necessarily creating proc format for each variable? And how do I create a new four level variable type from the variables 'iconven' and 'iatypical' that have values 0="neither", 1="conventional", 2="atypical only", 3="both"?

 

Screen Shot 2018-10-02 at 11.03.27 PM.png

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @JUMMY,

 

As has been mentioned, variable lists make it easy to associate a format with several variables. In your case the list i: using the common prefix "i" and the colon operator is ideal. How could there be a shorter way to refer to all variables whose names start with "i"?

 

So, you define your formats (with the FORMAT procedure) once ...

proc format;
value yesno
0='no'
1='yes'
.=' ';

value type
0='neither'
1='conventional'
2='atypical only'
3='both'
.=' ';
run;

... and then (if desired) associate them with variables using the FORMAT statement:

data want;
set have;
if n(iatypical, iconven)=2 then type=2*iatypical+iconven;
format i: yesno. type type.;
run;

I put the FORMAT statement into a data step only because a new variable, TYPE, had to be created (not to be confused with the format of the same name). Otherwise, I would have used PROC DATASETS (with a MODIFY statement and the above FORMAT statement) to avoid unnecessary reading and writing of data if only changes to a dataset's "header information" (such as formats and labels) had been required.

View solution in original post

7 REPLIES 7
Reeza
Super User

@JUMMY wrote:

I have a data set below. I want to format all categorical variables that have numbers (variables starting with "i" are all indicators (1="yes" and 0="no")). The picture below only shows six of those variables (idem, ipsychrx, icnsrx, incoven, iatypical and ibenzo). But I have 47 of them in total. How to I format them without necessarily creating proc format for each variable? And how do I create a new four level variable type from the variables 'iconven' and 'iatypical' that have values 0="neither", 1="conventional", 2="atypical only", 3="both"?

 

Screen Shot 2018-10-02 at 11.03.27 PM.png


 

Applying a format is different than creating a format. You can create a format once and apply that format to multiple variables in various ways. 

 

To apply the format, you can use a variable list with the format. 

 

Format iFIRSTVAR —iLastVar yesNofmt.;

Different ways of referencing a variable list. 

https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

Please post data as text not images, to work with your data we’d have to type it out. 

JUMMY
Obsidian | Level 7
But the data set is not well arranged. So how do I go about it?
Reeza
Super User
What does not well arranged mean? Which of the variable listing options in the link provided did you try that didn’t work - please show your code as well.
Kurt_Bremser
Super User

It is NEVER proper to post data as a picture. Pictures do not reveal

  • variable types
  • length of variables
  • assigned formats
  • raw values that underlie those formats

and they force anyone who wants to help you to type data off the screenshot, which is

  • tedious
  • opens possibilities for mistakes

Please post example data as a data step with datalines. My second footnote leads you to a macro which does the conversion from a dataset to a data step automatically. See my third footnote for advice on posting code in a reliable manner.

FreelanceReinh
Jade | Level 19

Hi @JUMMY,

 

As has been mentioned, variable lists make it easy to associate a format with several variables. In your case the list i: using the common prefix "i" and the colon operator is ideal. How could there be a shorter way to refer to all variables whose names start with "i"?

 

So, you define your formats (with the FORMAT procedure) once ...

proc format;
value yesno
0='no'
1='yes'
.=' ';

value type
0='neither'
1='conventional'
2='atypical only'
3='both'
.=' ';
run;

... and then (if desired) associate them with variables using the FORMAT statement:

data want;
set have;
if n(iatypical, iconven)=2 then type=2*iatypical+iconven;
format i: yesno. type type.;
run;

I put the FORMAT statement into a data step only because a new variable, TYPE, had to be created (not to be confused with the format of the same name). Otherwise, I would have used PROC DATASETS (with a MODIFY statement and the above FORMAT statement) to avoid unnecessary reading and writing of data if only changes to a dataset's "header information" (such as formats and labels) had been required.

JUMMY
Obsidian | Level 7
@FreelanceReinhard, there are four original cost variables: stpamt, sfpamt, sppamt and srpamt (some are not shown in the data above). How do I use arrays and do loops to create two sets of transformed cost variables, the natural log of each of the costs and the square root of each the costs?
FreelanceReinh
Jade | Level 19

Just insert a few lines into the datastep that I suggested:

data want;
set have;
array a[*]   stpamt   sfpamt   sppamt   srpamt;
array l[*] l_stpamt l_sfpamt l_sppamt l_srpamt;
array r[*] r_stpamt r_sfpamt r_sppamt r_srpamt;
do j=1 to dim(a);
  if a[j]> 0 then l[j]=log(a[j]);
  if a[j]>=0 then r[j]=sqrt(a[j]);
end;
if n(iatypical, iconven)=2 then type=2*iatypical+iconven;
format i: yesno. type type.;
drop j;
run;

This assumes that there are no variables aj, l or r in dataset HAVE and also none of the eight newly created variables l_stpamt, ..., r_srpamt. If this assumption is violated, please use different names to avoid name conflicts.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1217 views
  • 3 likes
  • 4 in conversation