Help using Base SAS procedures

Controlling multiple level in the by sort

Reply
Occasional Contributor
Posts: 8

Controlling multiple level in the by sort

CountshoecustomerDateStart
1AJoey23/07/2013
2AJoey23/10/2013
1BJoey23/10/2013
2BJoey28/01/2014
1AJoey

5/05/2014

Hi guys,

Based on the data above and by using the customer, dateStrt and shoe, I am trying to create a count variable to calculate the number of times customer is wearing the shoes. Notice the customers will wear shoe A for a period of time and switch to different and can always switch back to the original shoe. The values in count variable is what I wish to achieve.

I have tried several methods such as splitting the datasets out into 2 different datasets based on shoe variable and have them process separately but that will cause the last row to have a value of 3 in count because it didn't detect it was B in previous row. Notice how the dateStrt play a important role here hence the sorting order would be customer, dateStrt and shoe.

Below is that latest piece of code i have created which in theory should work it can't seem to get inside the ELSE DO part hence it never increment. Any idea why?

DATA abc1;

  SET abc;

  by customer dateStrt shoe;

  ATTRIB count LENGTH = 5.;

  RETAIN count ;

  IF FIRST.shoe THEN DO;

  count = 1;

  END;

  ELSE DO;

  count = count + 1;

  END;

RUN;

Thanks

Contributor
Posts: 22

Re: Controlling multiple level in the by sort

Posted in reply to JasonBCoding

Dear

It is normal, it is link to your sort.

Joey 19562 A --> it is the first shoe of the Datastart

Joey 19654 A -> it is too the first shoe of the Datastart

Joey 19654 B

Joey 19751 B

Joey 19848 A

data abc;

length DateStart 8;

shoe='A'; customer="Joey";DateStart='23Jul2013'd;output;

shoe='A'; customer="Joey";DateStart="23Oct2013"d;output;

shoe='B'; customer="Joey";DateStart="23Oct2013"d;output;

shoe='B'; customer="Joey";DateStart="28JAN2014"d;output;

shoe='A'; customer="Joey";DateStart="05MAy2014"d;output;

run;

proc sort data=abc; by customer  shoe datestart;run;

data abc_count (keep= shoe customer count); set abc; by customer  shoe datestart;

retain count 0;

if first.shoe then count=1;

else count+1;

if last.shoe then output;

run;

Shoe Customer Count

A      Joey               3

B      Joey               2

Occasional Contributor
Posts: 8

Re: Controlling multiple level in the by sort

dateStrt was meant to be between the customer and shoe because I want to count the frequency of the shoe was worn on a given period, instead of overall. Notice the sample provided above, has the count of the last row reset back to 1 because once the customer switches from A to B and B back to A, it should be reset from the start.

I suspect I will need to have either N level or N -1 level of nested IF loop with each part, i.e. something looks like this:

IF FIRST.customer THEN DO;

     IF FIRST.dateStrt THEN DO;

          IF FIRST.shoe THEN DO;

          END;

     END;

END;

ELSE DO;

     IF FIRST.dateStrt THEN DO;

          IF FIRST.shoe THEN DO;

          END;

     END;

END;

This is probably not complete but I will just have to give it a try.

Another testing that I thought it would be worthwhile checking is to give each row a value of one then do a proc summary with these three variables in the by or group and see how it results....

Note: the example given has only 3 levels but i just remember the real data actually has 4 levels for what i am trying to achieve but the overall algorithm should be the same anyway....

Cheers,

Super User
Posts: 7,762

Re: Controlling multiple level in the by sort

Posted in reply to JasonBCoding

By having dateStrt in between customer and shoe you basically guarantee that there can only be one shoe instance within a given date, so you always have first.shoe resolve as true.

Do you try to count how often a shoe was worn on a given date, or how often a shoe was worn overall?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 5,497

Re: Controlling multiple level in the by sort

Posted in reply to JasonBCoding

I'm not sure if I follow  your description 100%, but here's a solution that matches your desired outcome.

Don't change your data (which means keep it in its current order).

Change one line in your original DATA step.  The BY statement should become:

by customer shoe notsorted;

Good luck.

Respected Advisor
Posts: 3,156

Re: Controlling multiple level in the by sort

Posted in reply to JasonBCoding

Like others on the board, I am not sure if I have understand you correctly, but DOW seems to get the output you are asking:

data have;

input (shoe customer) (:$) DateStart ddmmyy10.;

format datestart ddmmyy10.;

cards;

A Joey 23/07/2013

A Joey 23/10/2013

B Joey 23/10/2013

B Joey 28/01/2014

A Joey 5/05/2014

;

data want;

  do count=1 by 1 until (last.shoe);

  set have;

by customer shoe notsorted;

  output;

  end;

run;

Regards,

Haikuo

Super User
Posts: 19,770

Re: Controlling multiple level in the by sort

The key is the notsorted option on the BY statement after you've accomplished the sort you want.

You don't need a DOW loop, though you can clearly use it.

data want;

  set have;

retain count;

by customer shoe notsorted;

if first.shoe then count=1;

else count+1;

run;

Occasional Contributor
Posts: 8

Re: Controlling multiple level in the by sort

Never realised about the NOTSORTED option before and it works like magic! Learnt something new today!

It looks like it is now working and it seem to work too even I add more level into the by group. I will probably need to generate more records and find some really tricky examples which switch around at all time to test if it works perfectly. Nevertheless, thanks very much for your suggestion! Smiley Happy

Ask a Question
Discussion stats
  • 7 replies
  • 261 views
  • 6 likes
  • 6 in conversation