Help using Base SAS procedures

help with grouping multiple column

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

help with grouping multiple column

I am having trouble when dealing with multiple column:

column A B C D and each column have value of either 0 or 1. I want to make a new variable to be able to use the PROC FREQ function for the new variable so that I can know the frequency of each individual column. Any thoughts please ??


Accepted Solutions
Solution
‎11-07-2012 10:04 AM
PROC Star
Posts: 7,363

Re: help with grouping multiple column

Does the following accomplish what you want/need?:

data have;

  input ID $ TestA TestB TestC TestD TestE;

  cards;

AB348 0 0 1 0 1

SB876 0 1 1 0 0

VB912 0 0 0 0 0

SK623 1 0 0 1 0

;

data need (keep=id test);

  set have;

  length test $5;

  array _test TestA--TestE;

  do over _test;

    if _test eq 1 then do;

      test=vname(_test);

      output;

    end;

  end;

run;

proc freq data=need;

  table test;

run;

View solution in original post


All Replies
Regular Contributor
Posts: 151

Re: help with grouping multiple column

Could you clarify exactly what you want please.  My reading of the question is that you want to create a new variable that sums across the values of variables A - D so that you can perform a proc freq on that new variable, is that correct?

Occasional Contributor
Posts: 10

Re: help with grouping multiple column

Thanks for your reply.

This is what I am looking for:

AB348 0 0 0 0 1 0

SB876 0 1 0 0 0 0

VB912 0 0 1 0 0

So, I want to name a new variable which consists of the last five variable in the above data set. Then, I want to do the PROC FREQ on the new variable so that I can get the individual freq of each of the five variables.

I hope I am clear this time.

Thanks.

PROC Star
Posts: 7,363

Re: help with grouping multiple column

No, I think you have to be clearer.  In addition to the fact that your example has 2 rows with 7 variables each, and one with 6, what do you want the resulting dataset to look like (please show as a datastep) and what do you want the result of proc freq to look like?

Occasional Contributor
Posts: 10

Re: help with grouping multiple column

I apologize for being not clear.

This is the dataset:

ID $ TestA TestB TestC TestD TestE

AB348 0 0 0 0 1

SB876 0 1 0 0 0

VB912 0 0 1 0 0

SK623 1 0 0 0 0

and so on--------

I want a new variable "Test" on which I should be able to perform the PROC FREQ:

and the output should look like

Test          Frequency   Percent Cumulative_Frequency Cumulative Percent

-----------------------------------------------------------------------------------------------------

TestA

TestB

TestC

TestD

TestE

Hopefully, I am clear now.

Thanks for helping me out.

PROC Star
Posts: 7,363

Re: help with grouping multiple column

Still not quite clear enough.  Can a record be assigned to more than one test or are they mutually exclusive?  And, will there ever be a case where all of the tests have a value of 0?

Occasional Contributor
Posts: 10

Re: help with grouping multiple column

Thanks Arthur for your patience.

Yes, it can be assigned to more than one test.

for eg:

for one of the IDs it can be:

MJ376 0 1 1  0 1

Also, in some cases all the values can be zero.

Solution
‎11-07-2012 10:04 AM
PROC Star
Posts: 7,363

Re: help with grouping multiple column

Does the following accomplish what you want/need?:

data have;

  input ID $ TestA TestB TestC TestD TestE;

  cards;

AB348 0 0 1 0 1

SB876 0 1 1 0 0

VB912 0 0 0 0 0

SK623 1 0 0 1 0

;

data need (keep=id test);

  set have;

  length test $5;

  array _test TestA--TestE;

  do over _test;

    if _test eq 1 then do;

      test=vname(_test);

      output;

    end;

  end;

run;

proc freq data=need;

  table test;

run;

Occasional Contributor
Posts: 10

Re: help with grouping multiple column

WOW...Thanks a lot, Arthur. That was really helpful.

I just have another quick question. How can I combine TestD and TestE and make a column name "Test_Var" and then I need to do the same thing with

the data variables as:

ID $ TestA TestB TestC Test_Var


Thank you very much.

PROC Star
Posts: 7,363

Re: help with grouping multiple column

One way would be to just extend the same logic.  E.g.:

data have;

  input ID $ TestA TestB TestC TestD TestE;

  cards;

AB348 0 0 1 0 1

SB876 0 1 1 0 0

VB912 0 0 0 0 0

SK623 1 0 0 1 0

;

data need (keep=id test);

  set have;

  length test $8;

  array _test TestA--TestE;

  do over _test;

    if _test eq 1 then do;

      if vname(_test) in ('TestD','TestE') then test='Test_Var';

      else test=vname(_test);

      output;

    end;

  end;

run;

proc freq data=need;

  table test;

run;

Occasional Contributor
Posts: 10

Re: help with grouping multiple column

That certainly works. Thank you very much for your help.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 265 views
  • 4 likes
  • 3 in conversation