Hi,
Down below is my dataset and procedure for reference. I need my proc tabulate table to be a dataset so I can calculate further statistics with it and it seems like "out=test1'' is not working. The proc tabulate procedure works, but the output dataset is just all over the place so I'm clearly missing something here. When I want to use that results just to do this for example:
data test2; set test1; run;
It doesn't show me the table I got in proc tabulate initially. Can anyone help? Thanks!
You should provide an example of what expect the data set to look like. "Not working" is pretty vague. I suspect you actually mean "the output doesn't match what I thought it would".
Remember that SAS cannot have two variables with the same name, so your result column labels must have additional information somewhere.
I'm not even going to guess what you expect for a result.
The output for Proc Tabulate is very specific and expects you to understand what those _type_ and _table_ variables mean. It may mean that you want a different procedure than tabulate and/or pre or post processing to get the desired result.
Hi:
The TABULATE output is not going to be in the same structure as the results you see in the results window but it is very understandable:
The results at the top show the TABULATE output. The PRINT at the bottom show what you get in TEST1. As you can see, the _TYPE_ indicates which categorical variable crossings are represented by the statistics. I color coded the _TYPE_ values with the same cells in the TABULATE screen shot, so you could trace the _TYPE_ and the calculate values back to the TABULATE results. You should be able to work with the TEST1 data for your subsequent calculations, once you understand how the structure is different and which columns you are interested in based on the _TYPE_ value.
Cynthia
Ok I understand the dataset much better, thank you!
However, the difficulty I have here is that, if the total kcals is 0 for one ID, then I want to remove that ID from my database completely. I don't have one here, but in my real database I do (total kcals is 0 for some IDs) and I will have to remove those IDs. But I can't do it if in the dataset, the total can't be differentiated from individual entries. Do you get what I mean? I might have to use another way to get to the wanted result. Thanks
Hi:
I'm not sure whether your KCALS value would be missing (shown in SAS as a . dot or period) or would actually be 0. Depending on what your data is actually like, you may need to use PROC MEANS or another procedure to pre-summarize KCALS by ID and then just make a new dataset prior to the TABULATE by eliminating the rows with the total of 0 or of missing for the KCALS value. Or you could do the summary and eliminate the rows in the data from PROC TABULATE but either way, you'll have to do something separate to eliminate the rows with total KCALS of missing or 0;
Cynthia
Yes they would be dots and zeros. I didn't have any empty ones in this dataset, but I will in my actul database for my projet. But ok that makes sense. I'll try that, thanks for your help!
Counts would be enough for now and order is not important. I can calculate pourcentages later on I guess.
I am just not able to create the two-dimensional table (categorized by id AND nova) I want with other functions like proc sql. I want my dataset to look like the results I get from proc tabulate.
I could have been more clear, yes. I meant that the output dataset it doesn't match the results. Down below is what I would want as a result (in a dataset). But I will have a deep look into the variables to understand how to get the desired result. Thanks
| ID 1 | Id 002 | ID 003 | ID 004 | ID 005 | TOTAL |
Nova 1 | kcals | kcals | kcals | kcals | kcals | kcals |
Nova 2 | kcals | kcals | kcals | kcals | kcals | kcals |
Nova 3 | kcals | Kcals | kcals | kcals | kcals | kcals |
Nova 4 | kcals | kcals | kcals | kcals | kcals | kcals |
TOTAL kcals from NOVA 1 to 4 | kcals | kcals | kcals | kcals | kcals | kcals |
Hi:
You can get IDs going across the rows like this using PROC REPORT:
That's a different approach than TABULATE, but if you do not need percents, may be easier to work with.
Cynthia
Hello,
This was very helpful! I am now struggling to get percentages as it seems like the name of my variables don't exist anymore (kcals, nova, etc.).
I first did this to delete people who didn't consume anything (after having used the synthax you provided):
data novanew2;
set novanew;
if ktot=0 then delete;
run;
Now I want to know the % of consumption of the nova categories among all participants (should be 52.8% for nova 4 for example). Could I used proc report again? And add something such as "define pctsum/"Percent of consumption"; "? Thanks.
Edited reply -
I found this way to do it:
proc tabulate data=novanew2 out=novanew3;
class nova;
var ktot;
table all='All IDS',
(nova all='All cals per id')*
ktot=' '*(rowpctsum)
/ misstext=' '
;
run;
It may help to provide a starting data set and what you expect.
You have not clearly described what the resulting data set should contain, variable names, types and some specific values.
Or why.
data test;
input id $ foodtype $ Nova $ Subnova $ kcals;
datalines;
001 butter 1 oils 10
001 fish 2 protein 15
001 banana 3 fruit 10
001 cherry 4 fruit 25
002 burger 2 frozen 5
002 pumpkin 2 other 4
002 carrot 3 veg 6
003 apple 1 fruit 100
003 tahini 4 spread 60
005 burger 2 frozen 13
005 ceral 3 grain 56
005 milk 1 prod 35
005 vinegar 4 Other 100
;
run;
proc sql;
create table temp as
select distinct 1 as a,id,input(nova,best.) as nova,cats('Total_',nova) as var length=80,sum(kcals) as value
from test
group by id,nova
union all
select distinct 2 ,id,input(nova,best.),cats('Percent_',nova) as var length=80,sum(kcals)/(select sum(kcals) from test where id=a.id ) as value
from test as a
group by id,nova
union all
select distinct 3 as a,id,999,'All_Total' as var length=80,sum(kcals) as value
from test
group by id
union all
select distinct 4 ,id,999,'All_Percent' as var length=80,1 as value
from test
group by id
union all
select distinct 5 ,'All',input(nova,best.),cats('Total_',nova) as var,sum(kcals) as value
from test
group by nova
union all
select distinct 6 ,'All',input(nova,best.),cats('Percent_',nova) as var, sum(kcals)/(select sum(kcals) from test ) as value
from test as b
group by nova
union all
select distinct 7 ,'All',9999,'All_Total' as var length=80,sum(kcals) as value
from test
union all
select distinct 8 ,'All',9999,'All_Percent' as var length=80,1 as value
from test
order by id,nova,a
;
quit;
proc transpose data=temp out=want(drop=_:);
by id;
id var;
var value;
format percent_: percent8.2;
run;
That's exactly it, I want to do a further step. Let me know if you have suggestions!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.