BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MFraga
Quartz | Level 8

Hello,

I would like to use "proc freq" for descriptive analysis. My data is organized longitudinally so I would like to have the number of observations for "var1" by "id". Here an example of my data:

data have;
input

id time1 event1 weight var1;
datalines;
1 0 0 0.89595 0
1 1 0 0.89595 0
1 2 0 0.89595 0
1 3 0 0.89595 0
1 4 0 0.89595 0
1 5 0 0.89595 0
1 6 0 0.89595 0
1 7 0 0.89595 0
1 8 0 0.89595 0
1 9 0 0.89595 0
1 10 0 0.89595 0
1 11 0 0.89595 0
1 12 0 0.89595 0
1 13 0 0.89595 0
2 0 0 1.124591 1
2 1 1 1.124591 1
2 2 . 1.124591 1
3 0 0 1.15698 1
3 1 0 1.15698 1
3 2 1 1.15698 1
3 3 . 1.15698 1
4 0 1 0.987654 2
4 1 . 0.987654 2
4 2 . 0.987654 2
4 3 . 0.987654 2
4 4 . 0.987654 2
5 0 0 1.13999 2
6 0 0 1.0503 0
6 1 0 1.0503 0
6 2 0 1.0503 0
6 3 0 1.0503 0
6 4 0 1.0503 0
6 5 1 1.0503 0
6 6 . 1.0503 0
6 7 . 1.0503 0
6 8 . 1.0503 0

;

run;

The table that I want would give me the number of observations for variable "var1". It would be 2 observations for the level "0", and 2 observations for the level and 2 observations for the level "2" of the variable "var1". Any idea ?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
proc sql;
create table want as
select var1, count(distinct ID) as N_ID
from have
group by var1;
quit;

Use SQL to do this then. See above.

View solution in original post

10 REPLIES 10
Reeza
Super User
Did you try anything so far?
MFraga
Quartz | Level 8

Yes, I have tried a data step and I have the good result. This is what I have coded that gives me the good answer:

 

data want;

set have;

by id;

if first.id then number_obs=1;

else number_obs+1;

if last.id then output;

run;

 

proc freq data=want;

tables number_obs;

run;

 

The product is a table with the number of observations that each "id" has (how many times data has been collected for this person) and in the last line of the accumulated frequencies I do know how many observations I really have for each "id". The inconvenient is that I have to use a data step that creates a wide table. I was wondering if there is something more straightforward with "proc freq" (something to use with the "have" table).

 

Reeza
Super User
The data step does not create a wide table. It only counts the number of observations by ID. And this doesn't use VAR1 which you indicated in your original question so now I'm really confused as to what you're trying to do here, especially if this is the result you want.
MFraga
Quartz | Level 8

Sorry not to be clear. I will redo the question.

ballardw
Super User

@MFraga wrote:

Yes, I have tried a data step and I have the good result. This is what I have coded that gives me the good answer:

 

data want;

set have;

by id;

if first.id then number_obs=1;

else number_obs+1;

if last.id then output;

run;

 

proc freq data=want;

tables number_obs;

run;

 

The product is a table with the number of observations that each "id" has (how many times data has been collected for this person) and in the last line of the accumulated frequencies I do know how many observations I really have for each "id". The inconvenient is that I have to use a data step that creates a wide table. I was wondering if there is something more straightforward with "proc freq" (something to use with the "have" table).

 


It does not help when place a requirement such as "level 0" but do not mention which variable you are using per the example data.

 

proc freq data=have;
tables id*event1* var1
       id* event1 id*var1 
       event1*var1
      /list;
run;

provides a number of possible counts, there are others. The * operator creates a cross tabulation of the variables. The LIST option says to show the results on a single line

 

MFraga
Quartz | Level 8

Let me start again. I want to know how many observations I have for my categorical variable "var1". But I just want to count "id" once so I can know how many individuas are in my dataset for each category of "var1". What I produce with a data step is good and I have my answer doing this:

 

data want;

set have;

by id;

if first.id then number_obs=1;

else number_obs+1;

if last.id then output;

run;

 

proc freq data=want;

tables var1*number_obs;

run;

 

 

However, I was wondering if there is a way to produce an exit without using a data step first. An example of what I want to do:

 

proc freq data=have;

tables var1;

where first.id;

run;

 

I just need to know the number of individuals (how many different "IDs") I have in my dataset for each level of my categorical variable "var1". Is it clear now ?

 

Thanks for the help!

Reeza
Super User
proc sql;
create table want as
select var1, count(distinct ID) as N_ID
from have
group by var1;
quit;

Use SQL to do this then. See above.
MFraga
Quartz | Level 8

Thanks again!

MFraga
Quartz | Level 8
Thanks for the help. It is not the soluction that I am looking for. I think I did not make a clear question at the beginning.
ballardw
Super User

@MFraga wrote:
Thanks for the help. It is not the soluction that I am looking for. I think I did not make a clear question at the beginning.

One thing that helps is to take example data such as you provided and show the expected result as another data step program to show expected result if you want a data set as a result. Or show an example of a report table as built from the example data.

 

Even just clarifying whether you want report tables, read by people, or data sets is important.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 3050 views
  • 2 likes
  • 3 in conversation