BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
aigiss
Obsidian | Level 7

Hello everyone. I have a dataset called combined_dataset. I have a variable called "genderid" with multiple subcategorical variables, see below: 

aigiss_0-1676493123988.png

 

I want to conduct a 2x2 table with each of these subcategorical variables with my disease outcome of interest. However, I need to make each categories into a binary variable. I tried doing female by doing this code: 

proc format;
  value $female
    "female" = "yes"
    other = "no";
run;
data combined_dataset;
  set k.combined_dataset;
  if genderid = "Female" then Female2= "Yes";
  else Female2= "No";
run;

However I looked over the frequency table for that specific variable, I get this result:

aigiss_1-1676493267420.png

What am I doing wrong? I appreciate all the help! 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
option validvarname=v7;
proc format;
value yesno 
1='Yes' 0='No';
run;

proc glmselect data=sashelp.class NOPRINT outdesign(addinputvars)=Want;
   class      sex;   /* list the categorical variables here */
   model weight = sex /  noint selection=none;
run;

proc freq data=want;
table sex_f sex_m;
format sex_: yesno.;
run;

You can run this to see how the dummy variables in the example I linked to previously make this trivial to do. 

View solution in original post

9 REPLIES 9
ballardw
Super User

If you want to use a format make sure the actual values of the variable are used. Your Proc freq output shows "Female" but your format wants to use "female". You also do not use the format $female anywhere so I am not sure why you include that.

I would expect this to yield what I think you want.

proc format;
  value $female
    "Female" = "Yes"
    other = "No";
run;

proc freq data=k.combined_dataset;
   tables genderid;
   format genderid $female. ;
run;

 

Please show the LOG for the Proc Freq that generated the "incorrect" output. If needed re-run the code. Then go to the log, copy from the log the code and all the notes or messages and then on the forum open a text box with the </> icon and paste the text.

 

You may have to re-run the data step showing us the log there as well. I might expect that shown incorrect output if a prior data step had run as and not actually replaced with code that used the "else Female2="No"; "

data combined_dataset;
  set k.combined_dataset;
  if genderid = "Female" then Female2= "Yes";
run;
aigiss
Obsidian | Level 7

This is the result of the last code: 

aigiss_0-1676495522736.png

The log shows no issue of the proposed code. However, the issue is that I cannot conduct other multivariables I need to reformat. This is not what I am looking for. 

ballardw
Super User

@aigiss wrote:

This is the result of the last code: 

aigiss_0-1676495522736.png

The log shows no issue of the proposed code. However, the issue is that I cannot conduct other multivariables I need to reformat. This is not what I am looking for. 


Then show us what your are looking for. Please do not make us guess. Your only example shows one format and one variable with two values. So we have no idea what you may want.

aigiss
Obsidian | Level 7

Hi there I did not mean to make you guess. I assumed the description above is sufficient to know what I wanted. But here is what I want.

 

In the genderid there are multiple subcategories

Genderid = 

Female

Male 

Unknown/Missing

Transgender Male 

etc...

 

What want to do is breakdown each subcategories to have a binary result:

Female Y/N

Male Y/N

Unknown Y/N

 

Please let me know if you need more clarifications. 

Reeza
Super User
This generates a 2x1 table in PROC FREQ not a 2x2 table. If you want a 2x2 table some other variable must be identified.
aigiss
Obsidian | Level 7

Correct. And to do so I would like to create a new variable column that has females with Y/N (observations are string) under the genderid variable.

Reeza
Super User
option validvarname=v7;
proc format;
value yesno 
1='Yes' 0='No';
run;

proc glmselect data=sashelp.class NOPRINT outdesign(addinputvars)=Want;
   class      sex;   /* list the categorical variables here */
   model weight = sex /  noint selection=none;
run;

proc freq data=want;
table sex_f sex_m;
format sex_: yesno.;
run;

You can run this to see how the dummy variables in the example I linked to previously make this trivial to do. 

ballardw
Super User

@aigiss wrote:

Hi there I did not mean to make you guess. I assumed the description above is sufficient to know what I wanted. But here is what I want.

 

In the genderid there are multiple subcategories

Genderid = 

Female

Male 

Unknown/Missing

Transgender Male 

etc...

 

What want to do is breakdown each subcategories to have a binary result:

Female Y/N

Male Y/N

Unknown Y/N

 

Please let me know if you need more clarifications. 


Mud. Clear as.

Provide some data with values of the genderid as data step code or at least plain text pasted into a text box, not a picture (cannot code against pictures) and then show a manually created report /count of what you are expecting from that example data. Make sure the counts are different numbers so we can trace some logic.

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 509 views
  • 0 likes
  • 3 in conversation