BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aminkarimid
Lapis Lazuli | Level 10

 Hello every body.

I use this code for formulating volume of trades. I need sorting my data by date (TRD_EVENT_DT var.) and then use the formula for calcuating the volume of trades of each stock (based on TRD_STCK_CD var. (text var.)).

/*Sort by TRD_STCK_CD and temporal variables.*/
PROC SORT DATA=Sampledata87_RT OUT=Sampledata87_SumVol;
    BY  TRD_EVENT_DT;
RUN;

/*Sum VOLUME until the last of each TRD_STCK_CD is reached.*/
DATA Sampledata87_SumVolSo;
    SET Sampledata87_SumVol;
    BY  TRD_EVENT_DT TRD_EVENT_DT;
	FORMAT TRD_STCK_CD  $5.;
	INFORMAT TRD_STCK_CD  $5.;
    RETAIN tmp_volume_sum;
    tmp_volume_sum + VOLUME;
    IF last.TRD_STCK_CD THEN DO;
        VOLUME_SUM = tmp_volume_sum;
        CALL missing(tmp_volume_sum);
    END;
    DROP tmp_:;
RUN;

But the result is:

 

NOTE: Variable last.TRD_STCK_CD is uninitialized.
NOTE: There were 42948 observations read from the data set WORK.SAMPLEDATA87_SUMVOL.
NOTE: The data set WORK.SAMPLEDATA87_SUMVOLSO has 42948 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.04 seconds


 

Thanks for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

@RW9 He doesn't understand the code because he didn't write it. 

 

 @aminkarimid Please review how BY groups work. If you want to use FIRST/LAST on a variable that variable MUST be in your BY statement. 

Your BY Statement is what allows you to uniquely identify groups. 

 

You've changed the code in the solution, removing BY variables. Note that you should also update your comments to reflect your new algorithm. I believe the code on SO is correct and answers your question. 

 

http://stackoverflow.com/questions/44031272/how-can-i-formulate-and-select-one-value-from-a-series-o...

View solution in original post

9 REPLIES 9
RW9
Diamond | Level 26 RW9
Diamond | Level 26

So what is the question?  You appear to have said you have some code and presented the code, not asked a question.  Do look at the guidance also when posting a question, give example test data in the form of a datastep, and show what you would like the output to look like.  Then explain the logic between the two.  Also, please avoid coding all in captials it makes your code so much harder to read.

Finally your topic doesn't appear to have anything to do with the post?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, now you provide further info.  Does this line:

    BY  TRD_EVENT_DT TRD_EVENT_DT;

Contain the variable mentioned in the note:

NOTE: Variable last.TRD_STCK_CD is uninitialized.

 This is why you have the problem.

aminkarimid
Lapis Lazuli | Level 10
Thank you RW9. But I don't want to sort my data by TRD_STCK_CD. First I want to sort my data by date, and second I want to formulate data based on name (TRD_STCK_CD). But I don't want sort them by TRD_STCK_CD var.
Kurt_Bremser
Super User

Well, you must bring your dataset int the order necessary for by-group processing, so your by variables will form the groups in succession as you need them. If you have a kind of order where you need to get a by-group for a variable that is not in a strict ascending or descending sequence, you can use the notsorted option on this variable in the by statement.

 

For detailed help, you will need to supply example data (in a data step), and an example of the expected result.

Reeza
Super User

@RW9 He doesn't understand the code because he didn't write it. 

 

 @aminkarimid Please review how BY groups work. If you want to use FIRST/LAST on a variable that variable MUST be in your BY statement. 

Your BY Statement is what allows you to uniquely identify groups. 

 

You've changed the code in the solution, removing BY variables. Note that you should also update your comments to reflect your new algorithm. I believe the code on SO is correct and answers your question. 

 

http://stackoverflow.com/questions/44031272/how-can-i-formulate-and-select-one-value-from-a-series-o...

aminkarimid
Lapis Lazuli | Level 10

Let me show you an example:

Here is a first data set:

TRD_EVENT_DT  TRD_EVENT_TM  TRD_STCK_CD  TRD_PR  TRD_TUROVR  TRD_EVENT_ROUFOR  VOLUME
3/24/2008     13:13:44      ALBZ1        1567    10          13:00             15670 
3/24/2008     12:20:38      AZAB1        683     10000       12:30             6830000 
3/24/2008     13:13:44      AZAB1        695     10          13:00             6950 
3/24/2008      9:14:57      BALI1        850     9260         9:00             7871000 
3/24/2008      9:15:06      BALI1        850     2000         9:30             1700000 
3/24/2008      9:15:14      BALI1        850     10000        9:30             8500000 
3/24/2008      9:15:24      BALI1        850     6000         9:30             5100000 
3/24/2008      9:29:27      BALI1        850     10000        9:30             8500000 
3/24/2008     12:28:00      BALIl        850     10000       12:30             8500000 
3/24/2008     12:28:07      BALIl        850     10000       12:30             8500000 
3/24/2008     13:13:44      BALI1        865     10          13:00             8650 
4/24/2008     12:28:01      ALBZ1        1537    10000       12:30             15370000 
4/24/2008     13:13:44      ALBZ1        1567    10          13:00             15670 
4/24/2008     12:20:38      AZAB1        683     10000       12:30             6830000 
4/24/2008     13:13:44      AZAB1        695     10          13:00             6950 
4/24/2008      9:14:57      BALI1        850     9260         9:00             7871000 
4/24/2008      9:15:06      BALI1        850     2000         9:30             1700000 
4/24/2008      9:15:14      BALI1        850     10000        9:30             8500000 
4/24/2008      9:15:24      BALI1        850     6000         9:30             5100000 
4/24/2008      9:29:27      BALI1        850     10000        9:30             8500000 
4/24/2008     12:28:00      BALIl        850     10000       12:30             8500000 
4/24/2008     12:28:07      BALIl        850     10000       12:30             8500000 
4/24/2008     13:13:44      BALI1        865     10          13:00             8650 

 And I need this table:

 

 

TRD_EVENT_DT  TRD_EVENT_TM  TRD_STCK_CD  TRD_PR  TRD_TUROVR  TRD_EVENT_ROUFOR  VOLUME         VOLUME_SUM
3/24/2008     13:13:44      ALBZ1        1567    10          13:00             15670            15385670
3/24/2008     12:20:38      AZAB1        683     10000       12:30             6830000 
3/24/2008     13:13:44      AZAB1        695     10          13:00             6950              6836950
3/24/2008      9:14:57      BALI1        850     9260         9:00             7871000 
3/24/2008      9:15:06      BALI1        850     2000         9:30             1700000 
3/24/2008      9:15:14      BALI1        850     10000        9:30             8500000 
3/24/2008      9:15:24      BALI1        850     6000         9:30             5100000 
3/24/2008      9:29:27      BALI1        850     10000        9:30             8500000 
3/24/2008     12:28:00      BALIl        850     10000       12:30             8500000 
3/24/2008     12:28:07      BALIl        850     10000       12:30             8500000 
3/24/2008     13:13:44      BALI1        865     10          13:00             8650             48679650
4/24/2008     12:28:01      ALBZ1        1537    10000       12:30             15370000 
4/24/2008     13:13:44      ALBZ1        1567    10          13:00             15670             15385670 
4/24/2008     12:20:38      AZAB1        683     10000       12:30             6830000 
4/24/2008     13:13:44      AZAB1        695     10          13:00             6950             6836950
4/24/2008      9:14:57      BALI1        850     9260         9:00             7871000 
4/24/2008      9:15:06      BALI1        850     2000         9:30             1700000 
4/24/2008      9:15:14      BALI1        850     10000        9:30             8500000 
4/24/2008      9:15:24      BALI1        850     6000         9:30             5100000 
4/24/2008      9:29:27      BALI1        850     10000        9:30             8500000 
4/24/2008     12:28:00      BALIl        850     10000       12:30             8500000 
4/24/2008     12:28:07      BALIl        850     10000       12:30             8500000 
4/24/2008     13:13:44      BALI1        865     10          13:00             8650             48679650

 

 Please pay attention to last col. It has been generated by summing volumes that have just same TRD_STCK_CD var. So each TRD_STCK_CD obs. has just one Volume_Sum data.

aminkarimid
Lapis Lazuli | Level 10
The solution is:
DATA Sampledata87_SumVolSo;
SET Sampledata87_SumVol;
BY TRD_EVENT_DT
TRD_STCK_CD notsorted;
FORMAT TRD_STCK_CD $5.;
INFORMAT TRD_STCK_CD $5.;
RETAIN tmp_volume_sum;
tmp_volume_sum + VOLUME;
IF last.TRD_STCK_CD THEN DO;
VOLUME_SUM = tmp_volume_sum;
CALL missing(tmp_volume_sum);
END;
DROP tmp_:;
RUN;
ballardw
Super User

@aminkarimid wrote:
The solution is:
DATA Sampledata87_SumVolSo;
SET Sampledata87_SumVol;
BY TRD_EVENT_DT
TRD_STCK_CD notsorted;
FORMAT TRD_STCK_CD $5.;
INFORMAT TRD_STCK_CD $5.;
RETAIN tmp_volume_sum;
tmp_volume_sum + VOLUME;
IF last.TRD_STCK_CD THEN DO;
VOLUME_SUM = tmp_volume_sum;
CALL missing(tmp_volume_sum);
END;
DROP tmp_:;
RUN;

Please use the code boxes for posting code such as the forum {i} menu icon opens. The main message windows reformat code by removing leading spaces.

And also use indenting to make code more legible. Doing so consistently and with a good pattern will make debugging and understanding code much easier.

Reeza
Super User

I don't think that's correct. The BY statement should be reversed, especially since for your next questions. And don't use NOTSORTED, it has its uses, but not your case. Using it here will likely return incorrect results. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 22474 views
  • 1 like
  • 5 in conversation