BookmarkSubscribeRSS Feed
adventuresinsas
Fluorite | Level 6

Hi all,

I just can't seem to figure this out so hoping that someone here can help.

 

I have this dataset, where Var1 is an indicator for a particular event:

ID  Month  Var1

A   Jan       1

A   Jan       1

A   Feb       .

B   Jan       1

B   Jan       1

B   Feb      1

B   Mar      1

 

I want to create a new dataset that contains a new count variable that will count the number in Var1 for each ID/month group so that it looks like this:

ID  Month  Var1      Count

A   Jan       1            1

A   Jan       1            2

A   Feb       .            .

B   Jan       1            1

B   Jan       1            2

B   Feb      1            1

B   Mar      1             1

 

How do I do this with a retain statement? This is what I tried:

data want;
	set have;
	where var1=1;
	by id month;
	if first.month then Count=0;
		Count+1;

run;

 

(Side note: I know this probably won't give me the row where Var1=. but that's okay.)

 

10 REPLIES 10
novinosrin
Tourmaline | Level 20
data have;
input ID $  Month $  Var1;
cards;
A   Jan       1
A   Jan       1
A   Feb       .
B   Jan       1
B   Jan       1
B   Feb      1
B   Mar      1
;

data want;
 set have;
 by id month notsorted;
 if first.month and var1 then count=1;
 else if var1 then count+1;
 else count=.;
run;

 /*Or just*/

data want;
 set have;
 by id month notsorted;
 if first.month then count=.;
 if var1 then count+1;
run;

/*Or*/

data want;
 set have;
 by id month var1 notsorted;
 if first.var1 then count=.;
 if var1 then count+1;
run;

 

ed_sas_member
Meteorite | Level 14

Hi @adventuresinsas 

 

You need to specify "notsorted" in the by statement as the value are considered as character values (not a format of a numeric variable for the month 1-12). So SAS consider they are not sorted properly (Feb should occur before Jan etc. alphabetically)

data want;
 set have;
 by id month notsorted;
 if first.month and var1 then count=1;
 else if var1 then count+1;
 else count=.;
run;
adventuresinsas
Fluorite | Level 6

Oh, okay. So I'm sorry, I should have specified but month is actually in YYMM7. format (so it displays as 2018M10, for instance). I did do a proc sort by ID and month. But I will give this notsorted option a try as well.

Tom
Super User Tom
Super User

So what was wrong with the results you got from running your code?

 

adventuresinsas
Fluorite | Level 6

I just ran it again. I have another variable in my dataset for exact date and it looks like my code only counts observations occurring on the same date (as though the bygroup is date). Which doesn't make sense to me because I don't list the date variable anywhere in my code for proc sort or the data step. So in the end, I could have some observations occurring in the same month but the count gets reset whenever the actual date changes, not the month.

Tom
Super User Tom
Super User

When you attach a format to a variable it changes how SAS displays the value, but it does not change the actual value.  So probably to get the behavior you are seeing the two variables have the same values, but one is displaying just the month and the other displays the day of the month also.

 

Some procedures, like PROC FREQ and PROC MEANS, will use the formatted value when aggregating by a variable.  But the BY processing of the DATA step looks at the actual values to see when they change, not the formatted (display) values.

 

You could fix that by converting the variable that you are using to show the month to use the same day of the month every time.

 

So something like this will make a copy of a date variable named DATE into another date valued variable named MONTH, but force the values to the first day of the month.

data want;
  set have ;
  month =intnx('month',date,0,'b');
  format month yymm7. date date9.;
run;

 

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @adventuresinsas 

 

In this case it is easier to use Lag functions instead of by variables with notsorted.

 

data have;
input ID $  Month $  Var1;
cards;
A   Jan       1
A   Jan       1
A   Feb       .
B   Jan       1
B   Jan       1
B   Feb      1
B   Mar      1
;

data want; set have;
	if lag(ID)=ID and lag(Month)=Month then count + 1; 
	else count=Var1;
run;

Result:

 

 

count.gif

 

adventuresinsas
Fluorite | Level 6
thank you! I'm not familiar with lag functions at all so will check this out.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @adventuresinsas 

 

So the Month variable is a SAS date. It is formatted to appear as a month value, but it contains a numeric value counting days from jan 1, 1960. And the value is different, if the days are different, even if they are in the same month. Here is a modified code to handle that:

 

data have;
input ID$ Month date9. Var1;
format Month YYMM7.;
cards;
A 12jan2019 1
A 20jan2019 1
A 16feb2019 .
B 01jan2019 1
B 17jan2019 1
B 24feb2019 1
B 07mar2019 1
;

data want; set have;
	if lag(ID)=ID and month(lag(Month))=month(Month) then count + 1; 
	else count=Var1;
run;

Result - same as previous mail.

 

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @adventuresinsas 

 

I realize it cannot handle a situation, where two consecutive records for the same ID has the same month, but in different years. They will be seen as the same month. This should be failsafe:

 


data want; set have;
	if lag(ID)=ID and put(lag(Month),YYMM7.)=put(Month,YYMM7.) then count + 1; 
	else count=Var1;
run;

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 6195 views
  • 1 like
  • 5 in conversation