BookmarkSubscribeRSS Feed
andp
Calcite | Level 5

Hi all,

I'm currently trying to output a table with the basic form of:

Code          Case          Control

1               300               250

2               100               100

3               200               150

etc.

Case is a sum of all the people who have a 1 for Code_1exp and Case. Control is for people with Case = 0.

The data is currently in the format of

ID              Code_1                    Code_2                    Code_3          Case

1001               .022                         0                              0               1

1002                    0                         0                         .010               0     

1003                    0                         0                              0               1

etc.

Code_x are all variables with values GE 0 and Case is a binary variable in this table.

I'm looking for a highly efficient way to code this since I have approximately 180 Codes.

Thanks for your help!

7 REPLIES 7
ballardw
Super User

Do you mean that you have code variables Code_1 to Code_180 or does Code_1 take on potentially 180 different values?

If the first case which values of Code_1 are used to sum over for the Case/control total? You might need to provide some slightly longer example, say 20 records, with the desired output for those 20.

My guess for your 3 lines of example data is:

Code     Case     Control

1           1          0

2           0          0

3           0          1

Am I close?

andp
Calcite | Level 5

For the first question, Code_1 to Code_180, each of the codes have potentially any value above 0. However, I'm mainly looking at ever/never type results so a 1 can be used to replace any value above 0.

Here's a more accurate example table. To get registered into the Code row, they need a nonzero value for Code_x. So for Code_1, there are a total of 7 nonzero ID's, and out of those 7, 3 are cases and 4 are noncases.

Your guess is correct. I hope this clarifies what I'm aiming for.

Thanks again!

Code          Case          Control

1                    3                  4     

2                    2                  1

3                    7                 10

IDCode_1Code_2Code_3Case
10010.02201.21
1002002.20
1003003.20
10040.15304.20
100500.31.21
1006002.20
10070030
10085.2051
10092.3081
10100010
10110030
10120020
10133.130.110
10140031
10150000
10160.006000
10170.13030
10180221
10190001
10200011
ballardw
Super User

Does this work for you? (The window put lots of blank lines in here when it pasted. so the datalines aren't right and I'm too lazy to fix them right now)

 

data base;

input ID Code_1 Code_2 Code_3 Case ;

datalines;

1001 0.022 0 1.2 1

1002 0 0 2.2 0

1003 0 0 3.2 0

1004 0.153 0 4.2 0

1005 0 0.3 1.2 1

1006 0 0 2.2 0

1007 0 0 3 0

1008 5.2 0 5 1

1009 2.3 0 8 1

1010 0 0 1 0

1011 0 0 3 0

1012 0 0 2 0

1013 3.13 0.1 1 0

1014 0 0 3 1

1015 0 0 0 0

1016 0.006 0 0 0

1017 0.13 0 3 0

1018 0 2 2 1

1019 0 0 0 1

1020 0 0 1 1

;

run;

data transposed (keep= id code case);

set base ;

array codes code: ;   /* if your varaiables aren't actually named starting with code or have other variables that start with "code"  then you'll need to work with a more explicit list*/

do i = 1 to dim(codes);

if codes > 0 then code=i;

else code=.;

output;

end;

run;

proc format library=work;

value case 1='Case' 0='Control';

run;

proc tabulate data= transposed ;

class case /order=formatted;

class code;

table code='',case=''*n=''*f=f6.0

/misstext=' ' box='Code';

format case case.;

run;

andp
Calcite | Level 5

Hi,

Thanks for the help ballardw. I'm currently running the code and it's looking like each code gets a new table created, I was wondering is there a way to combine the results of these tables into one master table?

Thanks.

andp
Calcite | Level 5

I'm not using your code exactly as I've made it into a macro rather than an array due to the lack of ordering in the Codes (eg. 7, 32, 49, etc. instead of 1,2,3) and also noticed that the results come out more like

Code 7           Case                Control

1                        42                    9483

I'm trying to output a table into Excel that would show (ideally)

Code          Case          Control

7                    42          9483

32                    20          8203

49                    5          206

etc.

ballardw
Super User

Proc tabulate creates the single output table. There is only one intermediate data set, so only two tables involved anyway. Depending on HOW you try to export to Excel you might get odd results..

What piece of the code makes you think it is generating one table per code???

ballardw
Super User

If your code variables aren't sequentially numbered, not mentioned in your original post, other approaches can be made.

For instance the ARRAY statement could be:

Array codes code_1 - code_180;  /* or whatever your max variable name is */

This will create lots of lines in the transposed dataset with missing values for code but they are ignored in proc tabulate anyway.


suga badge.PNGThe SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment. 

Join SUGA 

Get Started with SAS Information Catalog in SAS Viya

SAS technical trainer Erin Winters shows you how to explore assets, create new data discovery agents, schedule data discovery agents, and much more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1667 views
  • 0 likes
  • 2 in conversation