Nasser_DRMCP
Lapis Lazuli | Level 10

data have;
   infile datalines dsd;
   input score dossier : $2. yearmonth flag $1.  ;
   datalines;
0,c1,201801,N
0,c1,201802,N
0,c1,201803,N
0,c1,201804,N
0,c1,201805,N
0,c1,201806,N
0,c1,201807,N
0,c1,201808,N
1,c1,201809,Y
1,c1,201810,N
2,c1,201811,Y
2,c1,201812,N
1,c2,201801,Y
1,c2,201802,N
1,c2,201803,N
1,c2,201804,N
2,c2,201805,Y
3,c2,201806,Y
3,c2,201807,Y
3,c2,201808,N
3,c2,201809,N
3,c2,201810,N
2,c2,201811,N
1,c2,201812,N
; 
run ;

Hello,

This is a dummy table to illustrate my need.

For each row (each month), I need to count the number of Y flags among the last 6 periods.

The expected result is the column score.

Thanks in advance for your help.

regards

Nasser

5 REPLIES 5
ballardw
Super User

Some questions to clarify. By "last 6 periods" do you mean that the record for 201807 has the count for 201801 through 201806? Or would 201806 have the count for 201801 through 201806?

What would the value of the count be for 201804? Since there is not a complete "last 6 periods", should the count be missing, the number from the previous 3 (or 4) periods, or zero?

 

What role does dossier play in this count? Is the count supposed to be computed only within the same value of dossier? (This is NOT mentioned in the requirement.)

 

And why are the flags Y and N? Numeric 1 and 0 are much easier to manipulate. For instance, the sum of 1/0 coded values is the count of 1s, and the mean is the proportion of 1s.
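
As a small illustration of that point, here is a minimal sketch. The data set name flags and its five values are made up for the example and are not taken from the question.

```sas
/* Hypothetical toy data: flag coded 1 = yes, 0 = no */
data flags;
   input flag @@;
   datalines;
1 0 0 1 1
;
run;

/* SUM is the number of 1s (3 here), MEAN is the proportion of 1s (0.6 here) */
proc means data=flags sum mean;
   var flag;
run;
```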

 

 

Nasser_DRMCP
Lapis Lazuli | Level 10

Thanks, ballardw, for your quick response.

1°) The record for 201806 has the count for 201801 through 201806.

2°) For 201804, not 0 but the number of Y flags from 201801 to 201804.

3°) I need to count the number of Y flags in the last 6 periods for each dossier.

4°) OK, I am going to transform Y/N to 1/0.

 

 

 

ballardw
Super User

See if this gets close to what you want.

proc format library=work;
invalue yn
'Y'=1
'N'=0
other=.
;
run;
data have;
   infile datalines dsd;
   input score dossier : $2. yearmonth :yymmn6. flag :YN.  ;
   format yearmonth yymmn6.;
   datalines;
0,c1,201801,N
0,c1,201802,N
0,c1,201803,N
0,c1,201804,N
0,c1,201805,N
0,c1,201806,N
0,c1,201807,N
0,c1,201808,N
1,c1,201809,Y
1,c1,201810,N
2,c1,201811,Y
2,c1,201812,N
1,c2,201801,Y
1,c2,201802,N
1,c2,201803,N
1,c2,201804,N
2,c2,201805,Y
3,c2,201806,Y
3,c2,201807,Y
3,c2,201808,N
3,c2,201809,N
3,c2,201810,N
2,c2,201811,N
1,c2,201812,N
; 
run ;
data want;
   set have;
   /* current flag plus the flags of the five previous rows */
   l1=flag;
   l2=lag1(flag);
   l3=lag2(flag);
   l4=lag3(flag);
   l5=lag4(flag);
   l6=lag5(flag);
   
   /* early in the year, sum only the months seen so far, so that
      lagged values from before January are left out */
   select (month(yearmonth));
      when (1) count = l1;
      when (2) count = sum(of l1-l2);
      when (3) count = sum(of l1-l3);
      when (4) count = sum(of l1-l4);
      when (5) count = sum(of l1-l5);
      when (6,7,8,9,10,11,12) count = sum(of l1-l6);
      otherwise;
   end;
   drop l1-l6;
run;

The custom INFORMAT reads Y/N data as 1/0. If you already have a character flag, you could create a numeric flag variable using:

 

NumFlag = input(flag,yn.);

 

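I turned your yearmonth into actual SAS date values so I could pull the month easily. If that's not acceptable and yearmonth is stored as character, you could use

input(substr(yearmonth,5,2),f2.) instead of month(yearmonth)

(for a numeric yearmonth, mod(yearmonth,100) returns the same month number).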

This assumes that your data only represents at most a single year within a Dossier group. If that is not the case then we need examples of the data crossing a year boundary.

 

This is not particularly elegant, but I think the approach is relatively easy to understand.
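
The two conversions mentioned above (Y/N to 1/0, and a numeric yyyymm value to a SAS date or a month number) can be sketched as follows. The data set names raw and converted, the variables ym_date and mon, and the two sample rows are hypothetical; the yn informat is the one defined earlier in this reply, repeated here so the step runs on its own.

```sas
proc format library=work;
   invalue yn 'Y'=1 'N'=0 other=.;
run;

/* Hypothetical two-row sample with a character flag and a numeric yearmonth */
data raw;
   infile datalines dsd;
   input dossier : $2. yearmonth flag : $1.;
   datalines;
c1,201809,Y
c1,201810,N
;
run;

data converted;
   set raw;
   NumFlag = input(flag, yn.);                    /* 'Y' -> 1, 'N' -> 0 */
   ym_date = input(put(yearmonth, 6.), yymmn6.);  /* 201809 -> first day of Sep 2018 */
   mon     = mod(yearmonth, 100);                 /* month number without a date value */
   format ym_date yymmn6.;
run;
```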

Nasser_DRMCP
Lapis Lazuli | Level 10

Hello ballardw,

Many thanks, very interesting.

I am sorry, I realized that my need was not clear enough. Take this case, for example:

3,c2,201809,N
3,c2,201810,N
2,c2,201811,Y
1,c2,201812,N
1,c2,201901,Y
1,c2,201903,N

If we look at 201903, the number of Y in the last 6 periods should be 2, even though those periods span two different years.

So I tried this code:

data want;
set haven ;

l1=flag_imp;
l2=lag1(flag_imp);
l3=lag2(flag_imp);
l4=lag3(flag_imp);
l5=lag4(flag_imp);
l6=lag5(flag_imp);

nbr_imp_6mois = sum(of l1-l6);
run ;

 

 

 

score  dossier  yearmonth  flg_ipy  flag_imp  l1  l2  l3  l4  l5  l6  count
1      c1       201809     Y        1         1   0   0   0   0   0   1
1      c1       201810     N        0         0   1   0   0   0   0   1
2      c1       201811     Y        1         1   0   1   0   0   0   2
2      c1       201812     N        0         0   1   0   1   0   0   2
1      c2       201801     Y        1         1   0   1   0   1   0   3
1      c2       201802     N        0         0   1   0   1   0   1   3
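
In the table above, the first c2 row gets a count of 3 because its l3 and l5 values are still flags from c1: LAG does not know about dossier boundaries. One possible way around this, offered only as a sketch and not as the method used elsewhere in this thread, is to keep the last 6 flags in a temporary array and clear it at the start of each dossier. The data set sample, the variable flag_num, and the counter n are hypothetical names; the rows are a subset of the data from the first post.

```sas
/* Subset of the original data: end of c1, start of c2 */
data sample;
   infile datalines dsd;
   input score dossier : $2. yearmonth flag : $1.;
   datalines;
1,c1,201809,Y
1,c1,201810,N
2,c1,201811,Y
2,c1,201812,N
1,c2,201801,Y
1,c2,201802,N
1,c2,201803,N
1,c2,201804,N
2,c2,201805,Y
;
run;

proc sort data=sample;
   by dossier yearmonth;
run;

data want;
   set sample;
   by dossier;
   array win[0:5] _temporary_;     /* circular buffer holding the last 6 flags */
   flag_num = (flag = 'Y');        /* 1 for Y, 0 otherwise */
   if first.dossier then do;
      call missing(of win[*]);     /* forget the previous dossier */
      n = 0;
   end;
   n + 1;                          /* row number within the dossier */
   win[mod(n, 6)] = flag_num;      /* overwrite the oldest slot */
   count = sum(of win[*], 0);      /* missing slots are ignored by SUM */
   drop n;
run;
```

On these rows, count should agree with the score column. Note that the window is 6 rows, not 6 calendar months, so a missing month (such as 201902 in the example above) is simply skipped.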

 

 

 

Nasser_DRMCP
Lapis Lazuli | Level 10

Hello ballardw,

 

I managed to get the expected result.

First step => I create a row number within each dossier.

proc sort data=have ; by dossier yearmonth; run ;

data have;
set have ;
by dossier ;
numero + 1 ;
if first.dossier then numero=1 ;
run ;

 

Second step => I reset lags 2 to 6 to zero depending on numero, like this:

data have;
set have ;
if numero = 5 then l6=0 ;
if numero = 4 then do ; l6=0 ; l5=0; end ;
if numero = 3 then do ; l6=0 ; l5=0; l4=0 ; end ;
if numero = 2 then do ; l6=0 ; l5=0; l4=0 ; l3=0; end ;
if numero = 1 then do ; l6=0 ; l5=0; l4=0 ; l3=0; l2=0; end ;
run ;

 

Third step => I can count.

Thanks a lot for your help.
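
For reference, the three steps can probably be folded into a single DATA step. This is a minimal sketch, assuming the HAVE data set from the first post (character flag, variables dossier and yearmonth); flag_num, the array name l, and the loop index i are names introduced only for the example.

```sas
proc sort data=have;
   by dossier yearmonth;
run;

data want;
   set have;
   by dossier;
   flag_num = (flag = 'Y');        /* 1 for Y, 0 otherwise */

   /* LAG must be called on every row, so these stay outside any IF */
   l1 = flag_num;
   l2 = lag1(flag_num);
   l3 = lag2(flag_num);
   l4 = lag3(flag_num);
   l5 = lag4(flag_num);
   l6 = lag5(flag_num);
   array l[6] l1-l6;

   /* step 1: row number within the dossier */
   if first.dossier then numero = 0;
   numero + 1;

   /* step 2: zero the lags that reach back into the previous dossier */
   do i = numero + 1 to 6;
      l[i] = 0;
   end;

   /* step 3: count the Y flags in the window */
   count = sum(of l1-l6);
   drop i l1-l6;
run;
```

From the sixth row of a dossier onward the loop does nothing, so all six lags are kept.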
