BookmarkSubscribeRSS Feed
drchiragvyas
Calcite | Level 5

Hi,

 

I have 8 observations per subject ID. All subjects were asked to complete 8 tasks (i.e., 8 observations) in the same session, i.e., each ID should have an identical session time. However, in my dataset, for ID #5, 2 observations had the same session time, and 6 observations had the same session time. Therefore, I would like to exclude ID#5 from this example dataset. I have multiple IDs in my extended dataset. I would appreciate if someone could provide an example code to remove IDs with different session times.

 

Example dataset:

 

IDSessionTimev1v2v3
118.2236
118.2378
118.24612
118.26810
118.28922
118.291025
118.2101230
116.712174
216.7376
216.741110
216.75108
216.771220
216.791323
216.7101428
216.711162
216.713214
318.24118
318.25156
318.261418
318.281621
318.2101726
318.211180
318.212202
318.214256
419.35154
419.361916
419.371819
419.392024
419.311217
419.312220
419.313244
419.315292
520.361914
520.372317
528.482222
528.410245
528.412255
528.413262
528.414280
528.4163312

 

Thank you

4 REPLIES 4
AndreaVianello
Obsidian | Level 7

data Session;
 input ID SessionTime  v1 v2 v3 ;
cards;
1 18.2 2 3 6
1 18.2 3 7 8
1 18.2 4 6 12
1 18.2 6 8 10
1 18.2 8 9 22
1 18.2 9 10 25
1 18.2 10 12 30
1 16.7 12 17 4
2 16.7 3 7 6
2 16.7 4 11 10
2 16.7 5 10 8
2 16.7 7 12 20
2 16.7 9 13 23
2 16.7 10 14 28
2 16.7 11 16 2
2 16.7 13 21 4
3 18.2 4 11 8
3 18.2 5 15 6
3 18.2 6 14 18
3 18.2 8 16 21
3 18.2 10 17 26
3 18.2 11 18 0
3 18.2 12 20 2
3 18.2 14 25 6
4 19.3 5 15 4
4 19.3 6 19 16
4 19.3 7 18 19
4 19.3 9 20 24
4 19.3 11 21 7
4 19.3 12 22 0
4 19.3 13 24 4
4 19.3 15 29 2
5 20.3 6 19 14
5 20.3 7 23 17
5 28.4 8 22 22
5 28.4 10 24 5
5 28.4 12 25 5
5 28.4 13 26 2
5 28.4 14 28 0
5 28.4 16 33 12
;;; run;


proc sql ;
 create table chk as 
 select id, sessiontime, count(*) as frq
 from Session
 group by id, sessiontime
  having frq=8
 order by id, sessiontime
; quit;

proc sql; 
 create table session_ok as 
  select s.* 
   from session s 
    where id in (select id from  chk);
quit;


Also ID=1 have 18.2 and 16.7 different SessionTime.

PaigeMiller
Diamond | Level 26
proc summary data=have nway;
    class id;
    var sessiontime;
    output out=_stats_ range=sessiontime_range;
run;
data want;
    merge have _stats_;
    by id;
    if sessiontime_range=0;
    drop _type_ _freq_;
run;

 

 

In this code, we determine the mathematical range of the sessiontime values for each ID, then keep only those IDs for which the sessiontime range is zero.

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hi @drchiragvyas and welcome to the SAS Support Communities!

 

So you want to select entire "BY groups" (here: observations with a common ID) based on a condition involving more than just the first observation per group. You can do this in two steps (first, determine the IDs to keep or delete and then merge them back with the original data; examples have already been provided by AndreaVianello and PaigeMiller). But there are also several techniques which combine the two parts of the task in one step:

 

PROC SQL (unlike the two DATA step techniques) tends to change the order of observations (within a group), unless the order is uniquely determined by an ORDER BY clause. In your example data, the combination if ID and v1 could serve as a sort key:

proc sql;
create table want as
select * from have
group by id
having count(distinct sessiontime)=1
order by id, v1;
quit;

 

The two DATA step techniques read each BY group twice. In the first pass, the first value of SessionTime is stored in a temporary variable (named _st in the examples below) and a numeric flag variable (_del) is set to a positive value if a different value of SessionTime is encountered in the same BY group. In the second pass, only those BY groups are written to the output dataset for which the _del flag has not been set.

 

Double DOW loop:

data want(drop=_:);
do until(last.id);
  set have;
  by id;
  if first.id then _st=sessiontime;
  else if sessiontime ne _st then _del=1;
end;
do until(last.id);
  set have;
  by id;
  if not _del then output;
end;
run;

 

Self-interleaving:

data want(drop=_:);
set have have(in=a);
by id;
if first.id then do;
  _st=sessiontime;
  _del=0;
end;
else _del+(sessiontime ne _st);
if a and not _del;
retain _:;
run;

 

Note that the PROC SQL step would create a different result than the two DATA steps if there were missing values of SessionTime: This is because count(distinct sessiontime) counts only non-missing values, whereas the DATA steps handle missing values like any other values. So, an ID with only (identical) missing SessionTime values would be selected by the DATA steps, but not by the PROC SQL step. However, an ID with one or more missing SessionTime values together with a constant non-missing SessionTime value would be selected by the PROC SQL step, but not by the DATA steps. Of course, all three techniques could be adapted to any requirements regarding missing values.

Ksharp
Super User
data have;
infile cards truncover expandtabs;
input ID SessionTime v1 v2 v3;
cards;
1 18.2 2 3 6
1 18.2 3 7 8
1 18.2 4 6 12
1 18.2 6 8 10
1 18.2 8 9 22
1 18.2 9 10 25
1 18.2 10 12 30
2 16.7 12 17 4
2 16.7 3 7 6
2 16.7 4 11 10
2 16.7 5 10 8
2 16.7 7 12 20
2 16.7 9 13 23
2 16.7 10 14 28
2 16.7 11 16 2
2 16.7 13 21 4
3 18.2 4 11 8
3 18.2 5 15 6
3 18.2 6 14 18
3 18.2 8 16 21
3 18.2 10 17 26
3 18.2 11 18 0
3 18.2 12 20 2
3 18.2 14 25 6
4 19.3 5 15 4
4 19.3 6 19 16
4 19.3 7 18 19
4 19.3 9 20 24
4 19.3 11 21 7
4 19.3 12 22 0
4 19.3 13 24 4
4 19.3 15 29 2
5 20.3 6 19 14
5 20.3 7 23 17
5 28.4 8 22 22
5 28.4 10 24 5
5 28.4 12 25 5
5 28.4 13 26 2
5 28.4 14 28 0
5 28.4 16 33 12
;

proc sql;
create table want as
select *
 from have
  group by id
   having range(SessionTime)=0;
quit;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1001 views
  • 0 likes
  • 5 in conversation