Re: Proc tabulate question

SJ12 · Posted 10-26-2022 10:42 PM

Hi,

Down below is my dataset and procedure for reference. I need my proc tabulate table to be a dataset so I can calculate further statistics with it and it seems like "out=test1'' is not working. The proc tabulate procedure works, but the output dataset is just all over the place so I'm clearly missing something here. When I want to use that results just to do this for example:

data test2; set test1; run;

It doesn't show me the table I got in proc tabulate initially. Can anyone help? Thanks!

data test;

input id $ foodtype $ Nova $ Subnova $ kcals;

datalines;

001 butter 1 oils 10

001 fish 2 protein 15

001 banana 3 fruit 10

001 cherry 4 fruit 25

002 burger 2 frozen 5

002 pumpkin 2 other 4

002 carrot 3 veg 6

003 apple 1 fruit 100

003 tahini 4 spread 60

005 burger 2 frozen 13

005 ceral 3 grain 56

005 milk 1 prod 35

005 vinegar 4 Other 100

;

run;

proc tabulate data=test out=test1;

class id nova;

var kcals;

table id all='All participants',

(nova all='Total calories per participant')*

kcals=' '*(sum='Total calories' rowpctsum='Percentage')

/ misstext=' ';

run;

ballardw · Posted 10-27-2022 02:33 AM

You should provide an example of what expect the data set to look like. "Not working" is pretty vague. I suspect you actually mean "the output doesn't match what I thought it would".

Remember that SAS cannot have two variables with the same name, so your result column labels must have additional information somewhere.

I'm not even going to guess what you expect for a result.

The output for Proc Tabulate is very specific and expects you to understand what those _type_ and _table_ variables mean. It may mean that you want a different procedure than tabulate and/or pre or post processing to get the desired result.

Cynthia_sas · Posted 10-27-2022 10:36 AM

Hi:

The TABULATE output is not going to be in the same structure as the results you see in the results window but it is very understandable:

The results at the top show the TABULATE output. The PRINT at the bottom show what you get in TEST1. As you can see, the _TYPE_ indicates which categorical variable crossings are represented by the statistics. I color coded the _TYPE_ values with the same cells in the TABULATE screen shot, so you could trace the _TYPE_ and the calculate values back to the TABULATE results. You should be able to work with the TEST1 data for your subsequent calculations, once you understand how the structure is different and which columns you are interested in based on the _TYPE_ value.

Cynthia

SJ12 · Posted 10-27-2022 11:38 AM

Ok I understand the dataset much better, thank you!

However, the difficulty I have here is that, if the total kcals is 0 for one ID, then I want to remove that ID from my database completely. I don't have one here, but in my real database I do (total kcals is 0 for some IDs) and I will have to remove those IDs. But I can't do it if in the dataset, the total can't be differentiated from individual entries. Do you get what I mean? I might have to use another way to get to the wanted result. Thanks

Cynthia_sas · Posted 10-27-2022 02:20 PM

Hi:

I'm not sure whether your KCALS value would be missing (shown in SAS as a . dot or period) or would actually be 0. Depending on what your data is actually like, you may need to use PROC MEANS or another procedure to pre-summarize KCALS by ID and then just make a new dataset prior to the TABULATE by eliminating the rows with the total of 0 or of missing for the KCALS value. Or you could do the summary and eliminate the rows in the data from PROC TABULATE but either way, you'll have to do something separate to eliminate the rows with total KCALS of missing or 0;

Cynthia

SJ12 · Posted 10-27-2022 02:30 PM

Yes they would be dots and zeros. I didn't have any empty ones in this dataset, but I will in my actul database for my projet. But ok that makes sense. I'll try that, thanks for your help!

Reeza · Posted 10-27-2022 03:00 PM

Do want percentages and counts? Does it matter if they're in the order as shown or in side by side essentially (ID1-ID5 TOTAL PCT_ID1-PCT_ID5 TOTAL)?

SJ12 · Posted 10-27-2022 04:44 PM

Counts would be enough for now and order is not important. I can calculate pourcentages later on I guess.

I am just not able to create the two-dimensional table (categorized by id AND nova) I want with other functions like proc sql. I want my dataset to look like the results I get from proc tabulate.

SJ12 · Posted 10-27-2022 02:30 PM

I could have been more clear, yes. I meant that the output dataset it doesn't match the results. Down below is what I would want as a result (in a dataset). But I will have a deep look into the variables to understand how to get the desired result. Thanks

	ID 1	Id 002	ID 003	ID 004	ID 005	TOTAL
Nova 1	kcals	kcals	kcals	kcals	kcals	kcals
Nova 2	kcals	kcals	kcals	kcals	kcals	kcals
Nova 3	kcals	Kcals	kcals	kcals	kcals	kcals
Nova 4	kcals	kcals	kcals	kcals	kcals	kcals
TOTAL kcals from NOVA 1 to 4	kcals	kcals	kcals	kcals	kcals	kcals

Cynthia_sas · Posted 10-27-2022 08:27 PM

Hi:

You can get IDs going across the rows like this using PROC REPORT:

That's a different approach than TABULATE, but if you do not need percents, may be easier to work with.

Cynthia

SJ12 · Posted 11-02-2022 04:52 PM

Hello,

This was very helpful! I am now struggling to get percentages as it seems like the name of my variables don't exist anymore (kcals, nova, etc.).

I first did this to delete people who didn't consume anything (after having used the synthax you provided):

data novanew2;
set novanew;
if ktot=0 then delete;
run;

Now I want to know the % of consumption of the nova categories among all participants (should be 52.8% for nova 4 for example). Could I used proc report again? And add something such as "define pctsum/"Percent of consumption"; "? Thanks.

Edited reply -

I found this way to do it:

proc tabulate data=novanew2 out=novanew3;
class nova;
var ktot;
table all='All IDS',
(nova all='All cals per id')*
ktot=' '*(rowpctsum)
/ misstext=' '
;
run;

ballardw · Posted 10-28-2022 12:00 AM

It may help to provide a starting data set and what you expect.

You have not clearly described what the resulting data set should contain, variable names, types and some specific values.

Or why.

Ksharp · Posted 10-27-2022 07:57 AM

data test;
input id $ foodtype $ Nova $ Subnova $ kcals;
datalines;
001 butter 1 oils 10
001 fish 2 protein 15
001 banana 3 fruit 10
001 cherry 4 fruit 25
002 burger 2 frozen 5
002 pumpkin 2 other 4
002 carrot 3 veg 6
003 apple 1 fruit 100
003 tahini 4   spread   60
005  burger  2  frozen  13
005 ceral  3 grain 56
005 milk  1  prod 35
005 vinegar 4  Other  100
;
run;
 
proc sql;
create table temp as
select distinct 1 as a,id,input(nova,best.) as nova,cats('Total_',nova) as var length=80,sum(kcals) as value
 from test
  group by id,nova 
union all
select distinct 2 ,id,input(nova,best.),cats('Percent_',nova) as var length=80,sum(kcals)/(select sum(kcals) from test where id=a.id ) as value
 from test as a
  group by id,nova 
union all
select distinct 3 as a,id,999,'All_Total' as var length=80,sum(kcals) as value
 from test
  group by id
union all
select distinct 4 ,id,999,'All_Percent' as var length=80,1 as value
 from test
  group by id
union all
select distinct 5 ,'All',input(nova,best.),cats('Total_',nova) as var,sum(kcals) as value
 from test
  group by nova
union all
select distinct 6 ,'All',input(nova,best.),cats('Percent_',nova) as var, sum(kcals)/(select sum(kcals) from test ) as value
 from test as b
  group by nova 
union all
select distinct 7 ,'All',9999,'All_Total' as var length=80,sum(kcals) as value
 from test
union all
select distinct 8 ,'All',9999,'All_Percent' as var length=80,1 as value
 from test
  order by id,nova,a
;
quit; 
proc transpose data=temp out=want(drop=_:);
by id;
id var;
var value;
format percent_: percent8.2;
run;

Reeza · Posted 10-27-2022 10:51 AM

Why do you want a data set in that same format? You can pipe this output directly to PPT, Word, Excel from proc tabulate.
If you're doing a further step, there may be a better way to get/store this data.

SJ12 · Posted 10-27-2022 11:39 AM

That's exactly it, I want to do a further step. Let me know if you have suggestions!

Registration is open