BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bhupeshpanwar
Fluorite | Level 6

Hi,

 

Can someone please help me with the following dataset processing:

 

time price action;

1     5     buy

2     2     buy

3     8     

4     10   sell 

5     12   buy

6     5     sell

7     2     buy

8     8     

9     10   sell 

10    12   buy

11     8     

12    10   sell 

13    12   buy

 

I would like to pickup the first buy signal and then write 'hold' for four observations below it irrespective of the following signals. Desired output should look like this:

time price action;

1     5     buy

2     2     hold

3     8     hold

4     10    hold

5     12    hold

6     5     sell

7     2     buy

8     8     hold

9     10    hold

10    12  hold

11     8    hold

12    10   sell 

13    12   buy

 

I am having trouble with this kind of vertical processing in SAS. I guess an alternate way would be to transpose the data to wide and work with it. I would love to see if this can be achieved without data transformation using loops?

 

Thanks a lot,

 

Bhupesh 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Like this?

 data HAVE;
  input TIME PRICE ACTION $; 
  retain BUYOBS; 
  if ^BUYOBS & ACTION='buy' then BUYOBS = _N_ ; 
  if BUYOBS < _N_ < BUYOBS+5 then ACTION='hold';
  if _N_ = BUYOBS+4 then BUYOBS=0;
cards;
1 5 buy
2 2 buy
3 8 c
4 10 sell 
5 12 buy
6 5 sell
7 2 buy
8 8 c
9 10 sell 
10 12 buy
11 8 c
12 10 sell 
13 12 buy
run;
Obs TIME PRICE ACTION
1 1 5 buy
2 2 2 hold
3 3 8 hold
4 4 10 hold
5 5 12 hold
6 6 5 sell
7 7 2 buy
8 8 8 hold
9 9 10 hold
10 10 12 hold
11 11 8 hold
12 12 10 sell
13 13 12 buy

 

 

 

View solution in original post

10 REPLIES 10
ChrisNZ
Tourmaline | Level 20

Like this?

 data HAVE;
  input TIME PRICE ACTION $; 
  retain BUYOBS; 
  if ^BUYOBS & ACTION='buy' then BUYOBS = _N_ ; 
  if BUYOBS < _N_ < BUYOBS+5 then ACTION='hold';
  if _N_ = BUYOBS+4 then BUYOBS=0;
cards;
1 5 buy
2 2 buy
3 8 c
4 10 sell 
5 12 buy
6 5 sell
7 2 buy
8 8 c
9 10 sell 
10 12 buy
11 8 c
12 10 sell 
13 12 buy
run;
Obs TIME PRICE ACTION
1 1 5 buy
2 2 2 hold
3 3 8 hold
4 4 10 hold
5 5 12 hold
6 6 5 sell
7 7 2 buy
8 8 8 hold
9 9 10 hold
10 10 12 hold
11 11 8 hold
12 12 10 sell
13 13 12 buy

 

 

 

bhupeshpanwar
Fluorite | Level 6
Hi Chris,

This is an excellent solution.
Will this code work in a similar fashion with slight modification?

retain BUYOBS 0;
if BUYOBS = 0 & ACTION='buy' then BUYOBS = _N_ ;
if BUYOBS < _N_ < BUYOBS+5 then ACTION='hold';
if _N_ = BUYOBS+4 then BUYOBS=0;

Thanks,

Bhupesh
ChrisNZ
Tourmaline | Level 20

>Will this code work in a similar fashion with slight modification?

Yes it should.

jimbarbour
Meteorite | Level 14

@bhupeshpanwar,

 

One way to be absolutely sure is to first run Chris' version of the code, then run your version, and then to run a Proc Compare on your results.  If the results are a 100% match you have a pretty good assurance that in fact the new version of the code is working the same as the first version.

 

If we name the dataset from Chris' version "Have" and the dataset from your version "Have2," a simple Proc Compare would look like this:

PROC	COMPARE	BASE	=	Have
				COMPARE	=	Have2;
RUN;

The results after running look like this:

Observation Summary                                                                                                          
                                                                                                                             
Observation      Base  Compare                                                                                               
                                                                                                                             
First Obs           1        1                                                                                               
Last  Obs          13       13                                                                                               
                                                                                                                             
Number of Observations in Common: 13.                                                                                        
Total Number of Observations Read from WORK.HAVE: 13.                                                                        
Total Number of Observations Read from WORK.HAVE2: 13.                                                                       
                                                                                                                             
Number of Observations with Some Compared Variables Unequal: 0.                                                              
Number of Observations with All Compared Variables Equal: 13.                                                                
                                                                                                                             
NOTE: No unequal values were found. All values compared are exactly equal.  

The key phrase I look for is in the Observation Summary:  "All values compared are exactly equal."

 

Jim

mkeintz
PROC Star

You need to do 2 things:

  1. maintain a countdown starting at 4 whenever you detect a "first buy".
  2. detect "first buy" as any 'buy' following something other than a buy - the lag function is the thing here.
data have;
  infile datalines truncover;
  input time price action $; 
cards;
1 5 buy
2 2 buy
3 8 
4 10 sell 
5 12 buy
6 5 sell
7 2 buy
8 8 
9 10 sell 
10 12 buy
11 8 
12 10 sell 
13 12 buy
run;

data want (drop=_:);
  set have;
  retain _countdown 0;
  if _countdown>0 then do;
    action='hold';
    _countdown+(-1);
  end;
    if action='buy' and lag(action)^='buy' and not (_countdown>0) then _countdown=4;
run;

 

edit note: removed the erroneous:

  else if action='buy' and lag(action)^='buy' then _countdown=4;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
bhupeshpanwar
Fluorite | Level 6
Hi,

your solution works well even without the lag(action)^='buy' comparison. Why do we need that? Can you please check the code below as it works as intended for me? Thanks for your help.

data want (drop=_:);
set have;
retain _countdown 0;
if _countdown>0 then do;
action='hold';
_countdown+(-1);
end;
if action='buy' and countdown=0 then _countdown=4;
run;
mkeintz
PROC Star

@bhupeshpanwar 

 

You're right - the use of lag is superfluous, and using countdown works fine, and is less busy.

 

Question:  What do you want to do if you have a sequence of, say, 7 buys. Your (and my) program would convert

           buy, buy, buy, buy, buy, buy, buy

into

           buy, hold,hold,hold,hold,buy, hold

The sixth element is a buy, but it is not the first buy.  Do you really want to treat it as the start of another buy-and-hold series?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
bhupeshpanwar
Fluorite | Level 6
Thanks for the confirmation.
Yes, this is the correct output. The sixth element in your case should be the start of a new buy hold series.

Bhupesh
Ksharp
Super User
data have;
  infile datalines truncover;
  input time price action $; 
cards;
1 5 buy
2 2 buy
3 8 
4 10 sell 
5 12 buy
6 5 sell
7 2 buy
8 8 
9 10 sell 
10 12 buy
11 8 
12 10 sell 
13 12 buy
;
run;

data temp;
 set have;
 if action='buy';
 keep time action;
run;
data temp1;
 set temp;
 retain t;
 if _n_=1 then do;t=time;output;end;
 if time>t+4 then do;t=time;output;end;
 keep t;
run;
data temp2;
 set temp1;
 do time=t+1 to t+4;
  output;
 end;
 keep time;
run;
data want;
 if _n_=1 then do;
  declare hash h(dataset:'temp2');
  h.definekey('time');
  h.definedone();
 end;
set have;
if h.check()=0 then action='hold';
run;
bhupeshpanwar
Fluorite | Level 6

Thank you for all the solutions. All three solutions posted work really well. I will accept Chris as a solution since he was the one to post it first. I leaned a lot from all of you. Thanks.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1265 views
  • 3 likes
  • 5 in conversation