BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ANKH1
Pyrite | Level 9

We have this dataset. Each ID is an item sold at a store. OLD_NEW is whether the version of the item is old or new. Months is time it took for the item to sell.  Type1 is equal to A if it got sold before 12 months.

IDSTOREOLD_NEWMONTHStype1
1001OLD34.
1011NEW11A
1022OLD45.
1032NEW15.
1043NEW6A
1053OLD34.

We would like to create a variable type2 that will only be filled out if the other item in the same store had type1=A. Like the following:

IDSTOREOLD_NEWMONTHStype1type2
1001OLD34.B
1011NEW11A.
1022OLD45..
1032NEW15..
1043NEW6A.
1053OLD34.B
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Why did you show the observations in random order?  It would be much easier if they were in order, then you can just remember if there have been any with months < 12.

 

I find binary flags much easier to deal with.

data have;
  input ID STORE OLD_NEW $ MONTHS type1 $ type2 $;
cards;
100 1 OLD 34 . B
101 1 NEW 11 A .
102 2 OLD 45 . .
103 2 NEW 15 . .
104 3 NEW 6 A .
105 3 OLD 34 . B
;

proc sort data=have ;
  by store months ;
run;

data want;
  set have;
  by store ;
  if first.store then n_type1=0;
  new_type1=(months < 12);
  new_type2=(n_type1>0);
  if new_type1 then n_type1+1;
run;
                                                                          new_     new_
OBS     ID    STORE    OLD_NEW    MONTHS    type1    type2    n_type1    type1    type2

 1     101      1        NEW        11        A                  1         1        0
 2     100      1        OLD        34                 B         1         0        1
 3     103      2        NEW        15                           0         0        0
 4     102      2        OLD        45                           0         0        0
 5     104      3        NEW         6        A                  1         1        0
 6     105      3        OLD        34                 B         1         0        1

 

 

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

Why did you show the observations in random order?  It would be much easier if they were in order, then you can just remember if there have been any with months < 12.

 

I find binary flags much easier to deal with.

data have;
  input ID STORE OLD_NEW $ MONTHS type1 $ type2 $;
cards;
100 1 OLD 34 . B
101 1 NEW 11 A .
102 2 OLD 45 . .
103 2 NEW 15 . .
104 3 NEW 6 A .
105 3 OLD 34 . B
;

proc sort data=have ;
  by store months ;
run;

data want;
  set have;
  by store ;
  if first.store then n_type1=0;
  new_type1=(months < 12);
  new_type2=(n_type1>0);
  if new_type1 then n_type1+1;
run;
                                                                          new_     new_
OBS     ID    STORE    OLD_NEW    MONTHS    type1    type2    n_type1    type1    type2

 1     101      1        NEW        11        A                  1         1        0
 2     100      1        OLD        34                 B         1         0        1
 3     103      2        NEW        15                           0         0        0
 4     102      2        OLD        45                           0         0        0
 5     104      3        NEW         6        A                  1         1        0
 6     105      3        OLD        34                 B         1         0        1

 

 

ANKH1
Pyrite | Level 9

Thank you! It works perfectly.

ANKH1
Pyrite | Level 9
Could you please explain why n_type1 turns into 1 for the second observation by adding the last line of code if it is missing up until that line?
Tom
Super User Tom
Super User

It is never going to be missing with that data step.

 

The SUM STATEMENT means that the variable will be initialized to zero.  Plus for the first observation of each BY group it is set to zero.

ANKH1
Pyrite | Level 9
Thank you! Sorry, I am new with SAS and I want to understand the code instead of only using. What is the function of the n_type1? And why for the observation 2 is 1 if you have this line " if new_type1 then n_type1+1;" Doesn't it mean that it is 1 if new_type1 is real? Why line is 0 for n_type1?
Tom
Super User Tom
Super User

Read the documentation on the SUM STATEMENT that I linked before. Also read about the RETAIN statement, using a SUM statement implies that the variable that is being summed into is retained.  Also read about difference in how missing values are handled by normal addition operator (A+B) and the SUM() function, sum(a,b).  The SUM STATEMENT handles missing values of its two arguments in the same way as the SUM() function does.

 

Also understand about how Boolean expressions work in SAS. When you use a number, like NEW_TYPE1, in a boolean expression then SAS will treat a value that is 0 or missing as FALSE and any other value as TRUE.  When you use the result of a Boolean expression, such as (months<12), as a number then TRUE expression result in a 1 and FALSE expressions result in a 0.

 

So

if new_type1 then n_type1+1;

means.

When NEW_TYPE1 has a value that is neither zero or missing then add 1 to the current value on N_TYPE1 (and if N_TYPE1 was somehow set to missing the result is 1 instead of missing).  Or more clearly:

Increment n_type1 by 1 when new_type1 is true.

ANKH1
Pyrite | Level 9
Thank you, this is very helpful.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1095 views
  • 0 likes
  • 2 in conversation