BookmarkSubscribeRSS Feed
SJ12
Calcite | Level 5

Hi, 

Down below is my dataset and procedure for reference. I need my proc tabulate table to be a dataset so I can calculate further statistics with it and it seems like "out=test1'' is not working. The proc tabulate procedure works, but the output dataset is just all over the place so I'm clearly missing something here. When I want to use that results just to do this for example:

data test2; set test1; run; 

It doesn't show me the table I got in proc tabulate initially. Can anyone help? Thanks!

 

data test;
input id $ foodtype $ Nova $ Subnova $ kcals;
datalines;
001 butter 1 oils 10
001 fish 2 protein 15
001 banana 3 fruit 10
001 cherry 4 fruit 25
002 burger 2 frozen 5
002 pumpkin 2 other 4
002 carrot 3 veg 6
003 apple 1 fruit 100
003 tahini 4   spread   60
005  burger  2  frozen  13
005 ceral  3 grain 56
005 milk  1  prod 35
005 vinegar 4  Other  100
;
run;
 
proc tabulate data=test out=test1;
   class id nova;
   var kcals;
   table id all='All participants',
        (nova all='Total calories per participant')*
             kcals=' '*(sum='Total calories' rowpctsum='Percentage')     
             / misstext=' ';
run;
15 REPLIES 15
ballardw
Super User

You should provide an example of what expect the data set to look like. "Not working" is pretty vague. I suspect you actually  mean "the output doesn't match what I thought it would".

Remember that SAS cannot have two variables with the same name, so your result column labels must have additional information somewhere.

 

I'm not even going to guess what you expect for a result.

 

The output for Proc Tabulate is very specific and expects you to understand what those _type_ and _table_ variables mean. It may mean that you want a different procedure than tabulate and/or pre or post processing to get the desired result.

Cynthia_sas
SAS Super FREQ

Hi:

  The TABULATE output is not going to be in the same structure as the results you see in the results window but it is very understandable:

Cynthia_sas_0-1666881079164.png

  The results at the top show the TABULATE output. The PRINT at the bottom show what you get in TEST1. As you can see, the _TYPE_ indicates which categorical  variable crossings are represented by the statistics. I color coded the _TYPE_ values with the same cells in the TABULATE screen shot, so you could trace the _TYPE_ and the calculate values back to the TABULATE results. You should be able to work with the TEST1 data for your subsequent calculations, once you understand how the structure is different and which columns you are interested in based on the _TYPE_ value.

Cynthia

SJ12
Calcite | Level 5

Ok I understand the dataset much better, thank you!

 

However, the difficulty I have here is that, if the total kcals is 0 for one ID, then I want to remove that ID from my database completely. I don't have one here, but in my real database I do (total kcals is 0 for some IDs) and I will have to remove those IDs. But I can't do it if in the dataset, the total can't be differentiated from individual entries. Do you get what I mean? I might have to use another way to get to the wanted result. Thanks

 

 

Cynthia_sas
SAS Super FREQ

Hi:

  I'm not sure whether your KCALS value would be missing (shown in SAS as a . dot or period) or would actually be 0. Depending on what your data is actually like, you may need to use PROC MEANS or another procedure to pre-summarize KCALS by ID and then just make a new dataset prior to the TABULATE by eliminating the rows with the total of 0 or of missing for the KCALS value. Or you could do the summary and eliminate the rows in the data from PROC TABULATE but either way, you'll have to do something separate to eliminate the rows with total KCALS of missing or 0;

Cynthia

SJ12
Calcite | Level 5

Yes they would be dots and zeros. I didn't have any empty ones in this dataset, but I will in my actul database for my projet. But ok that makes sense. I'll try that, thanks for your help! 

Reeza
Super User
Do want percentages and counts? Does it matter if they're in the order as shown or in side by side essentially (ID1-ID5 TOTAL PCT_ID1-PCT_ID5 TOTAL)?

SJ12
Calcite | Level 5

Counts would be enough for now and order is not important. I can calculate pourcentages later on I guess.

I am just not able to create the two-dimensional table (categorized by id AND nova) I want with other functions like proc sql. I want my dataset to look like the results I get from proc tabulate.

SJ12
Calcite | Level 5

I could have been more clear, yes. I meant that the output dataset it doesn't match the results.  Down below is what I would want as a result (in a dataset). But I will have a deep look into the variables to understand how to get the desired result. Thanks

 

 

ID 1

Id 002

ID 003

ID 004

ID 005

TOTAL

Nova 1 

kcals

kcals

kcals

kcals

kcals

kcals

Nova 2

kcals

kcals

kcals

kcals

kcals

kcals

Nova 3

kcals

Kcals

kcals

kcals

kcals

kcals

Nova 4

kcals

kcals

kcals

kcals

kcals

kcals

TOTAL kcals from NOVA 1 to 4

kcals

kcals

kcals

kcals

kcals

kcals

Cynthia_sas
SAS Super FREQ

Hi:

  You can get IDs going across the rows like this using PROC REPORT:

Cynthia_sas_0-1666916784619.png

That's a different approach than TABULATE, but if you do not need percents, may be easier to work with.

 

Cynthia

SJ12
Calcite | Level 5

Hello,

This was very helpful! I am now struggling to get percentages as it seems like the name of my variables don't exist anymore (kcals, nova, etc.).

I first did this to delete people who didn't consume anything (after having used the synthax you provided):

 

data novanew2;
set novanew;
if ktot=0 then delete;
run;

 

Now I want to know the % of consumption of the nova categories among all participants (should be 52.8% for nova 4 for example). Could I used proc report again? And add something such as "define pctsum/"Percent of consumption"; "? Thanks. 

 

Edited reply -

I found this way to do it:

 

proc tabulate data=novanew2 out=novanew3;
class nova;
var ktot;
table all='All IDS',
(nova all='All cals per id')*
ktot=' '*(rowpctsum)
/ misstext=' '
;
run;

ballardw
Super User

It may help to provide a starting data set and what you expect.

 

You have not clearly described what the resulting data set should contain, variable names, types and some specific values.

 

Or why.

Ksharp
Super User
data test;
input id $ foodtype $ Nova $ Subnova $ kcals;
datalines;
001 butter 1 oils 10
001 fish 2 protein 15
001 banana 3 fruit 10
001 cherry 4 fruit 25
002 burger 2 frozen 5
002 pumpkin 2 other 4
002 carrot 3 veg 6
003 apple 1 fruit 100
003 tahini 4   spread   60
005  burger  2  frozen  13
005 ceral  3 grain 56
005 milk  1  prod 35
005 vinegar 4  Other  100
;
run;
 
proc sql;
create table temp as
select distinct 1 as a,id,input(nova,best.) as nova,cats('Total_',nova) as var length=80,sum(kcals) as value
 from test
  group by id,nova 
union all
select distinct 2 ,id,input(nova,best.),cats('Percent_',nova) as var length=80,sum(kcals)/(select sum(kcals) from test where id=a.id ) as value
 from test as a
  group by id,nova 
union all
select distinct 3 as a,id,999,'All_Total' as var length=80,sum(kcals) as value
 from test
  group by id
union all
select distinct 4 ,id,999,'All_Percent' as var length=80,1 as value
 from test
  group by id
union all
select distinct 5 ,'All',input(nova,best.),cats('Total_',nova) as var,sum(kcals) as value
 from test
  group by nova
union all
select distinct 6 ,'All',input(nova,best.),cats('Percent_',nova) as var, sum(kcals)/(select sum(kcals) from test ) as value
 from test as b
  group by nova 
union all
select distinct 7 ,'All',9999,'All_Total' as var length=80,sum(kcals) as value
 from test
union all
select distinct 8 ,'All',9999,'All_Percent' as var length=80,1 as value
 from test
  order by id,nova,a
;
quit; 
proc transpose data=temp out=want(drop=_:);
by id;
id var;
var value;
format percent_: percent8.2;
run;
Reeza
Super User
Why do you want a data set in that same format? You can pipe this output directly to PPT, Word, Excel from proc tabulate.
If you're doing a further step, there may be a better way to get/store this data.

SJ12
Calcite | Level 5

That's exactly it, I want to do a further step. Let me know if you have suggestions! 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 1545 views
  • 0 likes
  • 5 in conversation