How can I subset a dataset according to the value of a variable?

superbibi · Posted 12-02-2021 12:15 PM

Hi Friends, I want to subset my dataset when the type variable starts from Xin and ends with Xin or blank value. How should I write the code? Thanks.

data test; 
input id visit type $ ; 
datalines; 
1 1 xin 
1 2 . 
1 3 xin 
2 1 other 
2 2 xin 
2 3 . 
2 4 other 
3 1 . 
3 2 xin 
3 3 . 
3 4 .
 ; 
run; 

data want; 
input id visit type $ ; 
datalines; 
1 1 xin 
1 2 . 
1 3 xin 
2 2 xin 
2 3 . 
3 2 xin 
3 3 . 
3 4 . 
; 
run;

AMSAS · Posted 12-02-2021 12:29 PM

Not certain I understand your request fully. You seem to indicate you want all observations between and including the obs where type="xin", but then for ID=2 you don't pick up observation 2 4 other, yet there is an opening xin (2 2 xin) and no closing xin.

data have ; 
	input id visit type $ ; 
datalines; 
1 1 xin 
1 2 . 
1 3 xin 
2 1 other 
2 2 xin 
2 3 . 
2 4 other 
3 1 . 
3 2 xin 
3 3 . 
3 4 .
 ; 
run; 

data want ;
	drop flag ;
	retain flag 0 ;
	set have ;
	by id ;
	/* reset flag if this is the first occurrence of ID */
	if first.id then
		flag=0 ;
	/* if type="xin" flip flag (0|1) and output */
	if type="xin" then do ;
		flag=mod(flag+1,2) ;
		output ;
	end ;
	/* if type ne "xin" and flag=1 and output */
	else if flag=1 then 
		output ;
run ;

Amir · Posted 12-02-2021 12:31 PM

@superbibi, I'm not sure that I fully understand your requirement "starts from Xin and ends with Xin".

I have assumed you only want a type of "xin" (lower case) or missing.

data test; 
input id visit type : $ ; 
datalines; 
1 1 xin 
1 2 . 
1 3 xin 
2 1 other 
2 2 xin 
2 3 . 
2 4 other 
3 1 . 
3 2 xin 
3 3 . 
3 4 .
; 
run; 


data want;
   set test;
   where missing(type) or type eq 'xin';
run;

Kind regards,

Amir.

r_behata · Posted 12-02-2021 04:12 PM

Not sure if I understood your requirement. But, for what it is worth -


data want;
	array k[10] $ k1-k10;
	do _n_=1 by 1 until(last.id);
		set test;
		by id;
		k[_n_]=type;
	end;

		st=whichc('xin',of k1-k10);

	do _n_=1 to _n_ ;
		set test;	
		if visit >= st	then do;
			if type in ('xin','') then output;
		end;
	end;

	drop k: st;
run;

superbibi · Posted 12-02-2021 09:07 PM

Thank you very much for the help. It works well in this sample dataset, I will see if it goes the same in the full dataset.

superbibi · Posted 12-06-2021 10:01 AM

Thank you again for the solution. I just realized there is another situation for id2, the "type" variable might have "other xin blank other blank " sequence and I do not want any records after the second "other" as shown in the new code. It means for id 2, visit before the first "xin" and all visits equal to or larger than 4 are supposed to be dropped. What is your suggestion? Thanks.

data test;
input id  visit   type $   ;
datalines;
1 1 xin
1 2 .
1 3 xin
2 1 other
2 2 xin
2 3 .
2 4 other
2 5 .
3 1 .
3 2 xin
3 3 .
3 4 .
;
run;


data want;
input id  visit   type $   ;
datalines;
1 1 xin
1 2 .
1 3 xin

2 2 xin
2 3 .

3 2 xin
3 3 .
3 4 .
;
run;

"

Ksharp · Posted 12-03-2021 07:43 AM

data test; 
input id visit type $ ; 
datalines; 
1 1 xin 
1 2 . 
1 3 xin 
2 1 other 
2 2 xin 
2 3 . 
2 4 other 
3 1 . 
3 2 xin 
3 3 . 
3 4 .
 ; 
run; 


data temp;
 set test;
 by id;
 n+1;
 if type='xin' or  missing(type);
run;
data temp1;
 set temp;
 by id;
 if type='xin' or  (last.id and missing(type));
run;
data temp2;
 set temp1;
 by id;
 if first.id then seq=0;
 seq+1;
 if mod(seq,2)=1 then group+1;
run;
data temp3;
 merge temp2 temp2(keep=group n rename=(group=_group n=_n) firstobs=2);
 output;
 if group=_group then do;
   do i=n+1 to _n-1;
    n=i;output;
   end;
 end;
keep n;
run;

data want;
 if _n_=1 then do;
  if 0 then set temp3;
  declare hash h(dataset:'temp3');
  h.definekey('n');
  h.definedone();
 end;
set test;
if h.check(key:_n_)=0;
drop n;
run;

How can I subset a dataset according to the value of a variable?

Re: How can I subset a dataset according to the value of a variable?

Re: How can I subset a dataset according to the value of a variable?

Re: How can I subset a dataset according to the value of a variable?

Re: How can I subset a dataset according to the value of a variable?

Re: How can I subset a dataset according to the value of a variable?

Re: How can I subset a dataset according to the value of a variable?

SAS Innovate 2025: Call for Content

Classroom Training Available!