BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Barite | Level 11

Hi Everyone,

My data has id and value.

For each row[i], I want to check the window from [i] to [i+5]. I want to find the first non-missing value, say 1, and count how many 1 before -1 shows up. 

If the first non-missing value is -1, I want to count how many -1 before 1 show up.

To distinguish between 1 series and -1 series, the count should have + sign for 1 and - sign for -1.

 

After a lot of work, I get the code run correctly but Separately for +1 and -1 and then combine the 2 counts.

The logic is simply counting and very straight forward as you can see.

 

My issue is how to combine the 2 parts in 1 do loop.

 

Can you please help?

Many thanks,

HHC

data have;
  input id value;
  datalines;
1 1
1 .
1 1
1 .
1 1
1 .
1 .
1 .
1 .
1 .
1 .
1 -1
1 .
1 -1
1 .
1 .
1 1
1 1
2 -1
2 .
2 1
2 .
2 1
;
/*Working on positive value*/
data want; 
set have nobs=nobs;
drop i j_: out_:;
out_i=0;
pos_count=0;
i+1;
DO j=i to i+5 until (j=nobs or out_j=1);
	set have (keep = id value rename=(id=j_id value=j_value)) point=j;
		if id^=j_id then leave;
		if j_value=1  	then pos_count=pos_count+1;
		else
		if j_value=-1	then leave;
END;
run;

/*Working on Negative value*/
data want; 
set want nobs=nobs;
drop i j_: out_:;
out_i=0;
neg_count=0;
i+1;
DO j=i to i+5 until (j=nobs or out_j=1);
	set want (keep = id value rename=(id=j_id value=j_value)) point=j;
		if id^=j_id then leave;
		if j_value=-1  	then neg_count=neg_count-1;
		else
		if j_value=1	then leave;
END;
run;

/*Combine*/
data want; set want;
final_count =  pos_count+neg_count;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
hhchenfx
Barite | Level 11

Finally, I got it!!!

The 2 issues I fixed which is quite difficult to see is:

1- To get data for all rows, I need a quite "stupid" and "redundant" step: Do j=i;

2- Once value=1 is found, put an out statement so that the outer loop will not run again (to avoid the double/triple counts)

 

If you could, please help to make my code more efficient or suggest other solution.

Many many thanks.

HHC


data want; 
set have nobs=nobs;
drop i j k l k_: l_:  out_:;
i+1;

DO J=i;
	out_k=0;
	count=0;
	Do k=j to j+5 until (k=nobs or out_k=1);
	set have (keep = id value rename=(id=k_id value=k_value)) point=k;

		IF id^=k_id then leave;

		/*positive section----*/
		ELSE
		IF k_value=1  	then do; 
			out_k=1;/*critical step: once k_value=1, it is the only iteration for k (no more run on k)*/
			DO l=k to j+5 until (l=nobs);
			set have (keep = id value rename=(id=l_id value=l_value)) point=l;
				if id^=l_id then leave;
				else
				if l_value>0 then count=count+1;
				else
				if l_value=-1	then leave;
			End;
			End;

		/*negative section----*/
		ELSE
		IF k_value=-1  	then do; 
			out_k=1;/*critical step: once k_value=1, it is the only iteration for k (no more run on k)*/
			DO l=k to j+5 until (l=nobs);
			set have (keep = id value rename=(id=l_id value=l_value)) point=l;
				if id^=l_id then leave;
				else
				if l_value=-1 then count=count-1;
				else
				if l_value=1	then leave;
			End;
			End;
	end;
END;
run;

View solution in original post

6 REPLIES 6
hhchenfx
Barite | Level 11

Finally, I got it!!!

The 2 issues I fixed which is quite difficult to see is:

1- To get data for all rows, I need a quite "stupid" and "redundant" step: Do j=i;

2- Once value=1 is found, put an out statement so that the outer loop will not run again (to avoid the double/triple counts)

 

If you could, please help to make my code more efficient or suggest other solution.

Many many thanks.

HHC


data want; 
set have nobs=nobs;
drop i j k l k_: l_:  out_:;
i+1;

DO J=i;
	out_k=0;
	count=0;
	Do k=j to j+5 until (k=nobs or out_k=1);
	set have (keep = id value rename=(id=k_id value=k_value)) point=k;

		IF id^=k_id then leave;

		/*positive section----*/
		ELSE
		IF k_value=1  	then do; 
			out_k=1;/*critical step: once k_value=1, it is the only iteration for k (no more run on k)*/
			DO l=k to j+5 until (l=nobs);
			set have (keep = id value rename=(id=l_id value=l_value)) point=l;
				if id^=l_id then leave;
				else
				if l_value>0 then count=count+1;
				else
				if l_value=-1	then leave;
			End;
			End;

		/*negative section----*/
		ELSE
		IF k_value=-1  	then do; 
			out_k=1;/*critical step: once k_value=1, it is the only iteration for k (no more run on k)*/
			DO l=k to j+5 until (l=nobs);
			set have (keep = id value rename=(id=l_id value=l_value)) point=l;
				if id^=l_id then leave;
				else
				if l_value=-1 then count=count-1;
				else
				if l_value=1	then leave;
			End;
			End;
	end;
END;
run;
mkeintz
PROC Star

I'm posting the same response I gave to your virtually identical topic.  It does not require separate sets of codes for -1 vs +1:

 

data have;
  input id value @@;
  datalines;
1 1    1 .    1 1    1 .     1 1    1 .     1 .    1 .    
1 .    1 .    1 .    1 -1    1 .    1 -1    1 .    1 .    
1 1    1 1    
2 -1    2 .    2 1    2 .    2 1    
;

data want (drop=_: i);
  merge have (firstobs=1)
        have (firstobs=2 keep=id value rename=(id=_id2 value=_val2))
        have (firstobs=3 keep=id value rename=(id=_id3 value=_val3))
        have (firstobs=4 keep=id value rename=(id=_id4 value=_val4))
        have (firstobs=5 keep=id value rename=(id=_id5 value=_val5))
        have (firstobs=6 keep=id value rename=(id=_id6 value=_val6)) ;

  array _x {*} id     _id:  ;
  array _v {*} value  _val: ;

  count=0;
  do i=1 to dim(_x) while(id=_x{i});
    if _nonmiss1=. then _nonmiss1=_v{i};
    else if _v{i}=. then continue;
    else if _v{i}=_nonmiss1 then count=count+sign(_nonmiss1);
    else if sign(_v{i})=-1*sign(_nonmiss1) then leave;
  end;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hhchenfx
Barite | Level 11

Hi mkeitz,

I run your code and for row 4 and 5, the count is 0. They suppose to be 1.

Since I will have multiple windows to check, such as 5, 50 or 60. I am not sure if this code is easy to change.

Thank you so much for helping.

HHC

Tom
Super User Tom
Super User

Use the POINT= option to "lookahead".

You will need to know how many observations there are for each ID to make sure you don't look TOO far.

So given this data:

data have;
  input id value @@;
datalines;
1 1    1 .    1 1    1 .     1 1    1 .     1 .    1 .    
1 .    1 .    1 .    1 -1    1 .    1 -1    1 .    1 .    
1 1    1 1    
2 -1    2 .    2 1    2 .    2 1    
;

With a window of 5 observations (counting the current non-missing value).

%let window=5;
data want;
  length id value nfound firstobs nobs 8;
  keep id -- nobs;
  do nobs=1 by 1 until(last.id);
    set have;
    by id;
  end;
  do row=1 by 1 until(last.id);
    set have;
    by id;
    obsnum+1;
    if row=1 then firstobs=obsnum;
    if not missing(value) then do;
       nfound=1;
       do p=obsnum+1 to min(obsnum+&window-1,firstobs+nobs-1) until(newvalue=-value);
         set have(keep=value rename=(value=newvalue)) point=p;
         nfound + (value=newvalue);
       end;
       output;
    end;
  end;
run;

You get these counts.

Obs    id    value    nfound    firstobs    nobs

  1     1       1        3          1        18
  2     1       1        2          1        18
  3     1       1        1          1        18
  4     1      -1        2          1        18
  5     1      -1        1          1        18
  6     1       1        2          1        18
  7     1       1        1          1        18
  8     2      -1        1         19         5
  9     2       1        2         19         5
 10     2       1        1         19         5

hhchenfx
Barite | Level 11

Thank you as always, Tom.

HHC

ahuige
Obsidian | Level 7

Have not reviewed/replied SAS programming threads for fun in recent years. This one is an interesting fun topic  to practice all basic DATA step statement elements.

 

data have;
  input id value;
  datalines;
1 1
1 .
1 1
1 .
1 1
1 .
1 .
1 .
1 .
1 .
1 .
1 -1
1 .
1 -1
1 .
1 .
1 1
1 1
2 -1
2 .
2 1
2 .
2 1
;

data ahuige(keep=id value count); 
  do obsnum=1 to last;
     set have nobs=last;
     count=sum(0,sign(value));
     over=0;
     do point=obsnum+1 to obsnum+5;
     if point<=last and not over then 
       do;
       set have(rename=(id=nowId value=nowValue))  point=point; 
       over=((id ne nowId) or sign(NowValue)*sign(count)=-1 );
       if not over then count=sum(count,sign(NowValue)) ; 
       end;        
     end;
     output;
  end;stop;
run;
proc print;run;;










sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 910 views
  • 1 like
  • 4 in conversation