BookmarkSubscribeRSS Feed
JasonBCoding
Calcite | Level 5
CountshoecustomerDateStart
1AJoey23/07/2013
2AJoey23/10/2013
1BJoey23/10/2013
2BJoey28/01/2014
1AJoey

5/05/2014

Hi guys,

Based on the data above and by using the customer, dateStrt and shoe, I am trying to create a count variable to calculate the number of times customer is wearing the shoes. Notice the customers will wear shoe A for a period of time and switch to different and can always switch back to the original shoe. The values in count variable is what I wish to achieve.

I have tried several methods such as splitting the datasets out into 2 different datasets based on shoe variable and have them process separately but that will cause the last row to have a value of 3 in count because it didn't detect it was B in previous row. Notice how the dateStrt play a important role here hence the sorting order would be customer, dateStrt and shoe.

Below is that latest piece of code i have created which in theory should work it can't seem to get inside the ELSE DO part hence it never increment. Any idea why?

DATA abc1;

  SET abc;

  by customer dateStrt shoe;

  ATTRIB count LENGTH = 5.;

  RETAIN count ;

  IF FIRST.shoe THEN DO;

  count = 1;

  END;

  ELSE DO;

  count = count + 1;

  END;

RUN;

Thanks

7 REPLIES 7
lacrefa
Calcite | Level 5

Dear

It is normal, it is link to your sort.

Joey 19562 A --> it is the first shoe of the Datastart

Joey 19654 A -> it is too the first shoe of the Datastart

Joey 19654 B

Joey 19751 B

Joey 19848 A

data abc;

length DateStart 8;

shoe='A'; customer="Joey";DateStart='23Jul2013'd;output;

shoe='A'; customer="Joey";DateStart="23Oct2013"d;output;

shoe='B'; customer="Joey";DateStart="23Oct2013"d;output;

shoe='B'; customer="Joey";DateStart="28JAN2014"d;output;

shoe='A'; customer="Joey";DateStart="05MAy2014"d;output;

run;

proc sort data=abc; by customer  shoe datestart;run;

data abc_count (keep= shoe customer count); set abc; by customer  shoe datestart;

retain count 0;

if first.shoe then count=1;

else count+1;

if last.shoe then output;

run;

Shoe Customer Count

A      Joey               3

B      Joey               2

JasonBCoding
Calcite | Level 5

dateStrt was meant to be between the customer and shoe because I want to count the frequency of the shoe was worn on a given period, instead of overall. Notice the sample provided above, has the count of the last row reset back to 1 because once the customer switches from A to B and B back to A, it should be reset from the start.

I suspect I will need to have either N level or N -1 level of nested IF loop with each part, i.e. something looks like this:

IF FIRST.customer THEN DO;

     IF FIRST.dateStrt THEN DO;

          IF FIRST.shoe THEN DO;

          END;

     END;

END;

ELSE DO;

     IF FIRST.dateStrt THEN DO;

          IF FIRST.shoe THEN DO;

          END;

     END;

END;

This is probably not complete but I will just have to give it a try.

Another testing that I thought it would be worthwhile checking is to give each row a value of one then do a proc summary with these three variables in the by or group and see how it results....

Note: the example given has only 3 levels but i just remember the real data actually has 4 levels for what i am trying to achieve but the overall algorithm should be the same anyway....

Cheers,

Kurt_Bremser
Super User

By having dateStrt in between customer and shoe you basically guarantee that there can only be one shoe instance within a given date, so you always have first.shoe resolve as true.

Do you try to count how often a shoe was worn on a given date, or how often a shoe was worn overall?

Astounding
PROC Star

I'm not sure if I follow  your description 100%, but here's a solution that matches your desired outcome.

Don't change your data (which means keep it in its current order).

Change one line in your original DATA step.  The BY statement should become:

by customer shoe notsorted;

Good luck.

Haikuo
Onyx | Level 15

Like others on the board, I am not sure if I have understand you correctly, but DOW seems to get the output you are asking:

data have;

input (shoe customer) (:$) DateStart ddmmyy10.;

format datestart ddmmyy10.;

cards;

A Joey 23/07/2013

A Joey 23/10/2013

B Joey 23/10/2013

B Joey 28/01/2014

A Joey 5/05/2014

;

data want;

  do count=1 by 1 until (last.shoe);

  set have;

by customer shoe notsorted;

  output;

  end;

run;

Regards,

Haikuo

Reeza
Super User

The key is the notsorted option on the BY statement after you've accomplished the sort you want.

You don't need a DOW loop, though you can clearly use it.

data want;

  set have;

retain count;

by customer shoe notsorted;

if first.shoe then count=1;

else count+1;

run;

JasonBCoding
Calcite | Level 5

Never realised about the NOTSORTED option before and it works like magic! Learnt something new today!

It looks like it is now working and it seem to work too even I add more level into the by group. I will probably need to generate more records and find some really tricky examples which switch around at all time to test if it works perfectly. Nevertheless, thanks very much for your suggestion! Smiley Happy

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1035 views
  • 6 likes
  • 6 in conversation