BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kishoresamineni
Quartz | Level 8

excel formula is doing the same thing

=IF(G4<3,IFERROR(INDEX($B$1:$F$1,MATCH(3,B4:F4,0)),IF(G4<2,IFERROR(INDEX($B$1:$D$1,MATCH(2,B4:E4,0)),""),"")),"")

PeterClemmensen
Tourmaline | Level 20

Is this correct:

 

Considering only months may to september. If all states are constant 1 1 1 1 1 or 2 2 2 2 2, then the month should be blank.

 

If not, then the month should be the month where the state has the highest value. If there are multiple states that has the same value, month should be the first month where this value occurs. 

 

Correct?

kishoresamineni
Quartz | Level 8

yes you are right..

PeterClemmensen
Tourmaline | Level 20

This implements that logic

 

data have;
input id Stage Date :date9.;
format Date date9.;
datalines;
22313 2 31MAY2019
22314 1 31MAY2019
22315 3 31MAY2019
22316 3 31MAY2019
22317 1 31MAY2019
22318 2 31MAY2019
22313 2 30JUN2019
22314 1 30JUN2019
22315 3 30JUN2019
22316 3 30JUN2019
22317 1 30JUN2019
22318 2 30JUN2019
22313 2 31JUL2019
22314 2 31JUL2019
22315 3 31JUL2019
22316 2 31JUL2019
22317 1 31JUL2019
22318 2 31JUL2019
22313 1 31AUG2019
22314 2 31AUG2019
22315 2 31AUG2019
22316 2 31AUG2019
22317 1 31AUG2019
22318 2 31AUG2019
22313 1 30SEP2019
22314 1 30SEP2019
22315 2 30SEP2019
22316 2 30SEP2019
22317 1 30SEP2019
22318 2 30SEP2019
22313 1 31OCT2019
22314 1 31OCT2019
22315 2 31OCT2019
22316 2 31OCT2019
22317 2 31OCT2019
22318 2 31OCT2019
;

data temp;
    set have;
    month_name=put(Date, monname. -l);
run;

proc sort data=temp;
    by id;
run;

proc transpose data=temp out=temp2(drop=_:);
    by id;
    id month_name;
    var Stage;
run;

data want;
    set temp2;
    array m{*} may--September;
    month=vname(m[whichn(max(of m[*]), of m[*])]);
    if range(of m[*])=0 & m[1] in (1, 2) then call missing(month);
run;

 

Result

 

 

id       May  June  July  August  September  October  month
22313    2    2     2     1       1          1        May
22314    1    1     2     2       1          1        July
22315    3    3     3     2       2          2        May
22316    3    3     2     2       2          2        May
22317    1    1     1     1       1          2        
22318    2    2     2     2       2          2        

 

kishoresamineni
Quartz | Level 8

last piece of code i got below error.

 

ERROR: Array subscript out of range at line 27 column 17.
id=xxxxx May=. June=. July=. August=. September=. October=1 month= _ERROR_=1 _N_=1380

kishoresamineni
Quartz | Level 8

i could see from the source data...this particular ID started from 31OCT2019 itself.

could you please help how to overcome such scenarios

PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input id Stage Date :date9.;
format Date date9.;
datalines;
22313 2 31MAY2019
22314 1 31MAY2019
22315 3 31MAY2019
22316 3 31MAY2019
22317 1 31MAY2019
22318 2 31MAY2019
22313 2 30JUN2019
22314 1 30JUN2019
22315 3 30JUN2019
22316 3 30JUN2019
22317 1 30JUN2019
22318 2 30JUN2019
22313 2 31JUL2019
22314 2 31JUL2019
22315 3 31JUL2019
22316 2 31JUL2019
22317 1 31JUL2019
22318 2 31JUL2019
22313 1 31AUG2019
22314 2 31AUG2019
22315 2 31AUG2019
22316 2 31AUG2019
22317 1 31AUG2019
22318 2 31AUG2019
22313 1 30SEP2019
22314 1 30SEP2019
22315 2 30SEP2019
22316 2 30SEP2019
22317 1 30SEP2019
22318 2 30SEP2019
22313 1 31OCT2019
22314 1 31OCT2019
22315 2 31OCT2019
22316 2 31OCT2019
22317 2 31OCT2019
22318 2 31OCT2019
22319 1 31OCT2019
;

data temp;
    set have;
    month_name=put(Date, monname. -l);
run;

proc sort data=temp;
    by id;
run;

proc transpose data=temp out=temp2(drop=_:);
    by id;
    id month_name;
    var Stage;
run;

data want;
    set temp2;
    array m{*} may--September;
    if nmiss(of m[*])=dim(m) then call missing (month);
    else do;
    month=vname(m[whichn(max(of m[*]), of m[*])]);
    if range(of m[*])=0 & m[1] in (1, 2) then call missing(month);
    end;
run;
kishoresamineni
Quartz | Level 8

i am sorry no luck

 

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
29:11
NOTE: Invalid numeric data, 'May' , at line 29 column 11.
id=xxxxx May=1 June=1 July=1 August=1 September=1 October=1 month=. _ERROR_=1 _N_=1

Kurt_Bremser
Super User

@kishoresamineni wrote:

i am sorry no luck

 

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
29:11
NOTE: Invalid numeric data, 'May' , at line 29 column 11.
id=xxxxx May=1 June=1 July=1 August=1 September=1 October=1 month=. _ERROR_=1 _N_=1


Please post the whole log of that step, at least from the data statement up to first the line with the variable listing.

Use the {i} button for posting logs. This is mandatory.

kishoresamineni
Quartz | Level 8
32         data getmonth;
33         set have;
34         by id;
35         retain change month;
36         length month $9;
ERROR: Character length cannot be used with numeric variable Month.
37         l_stage = lag(stage);
38         if first.id
39         then do;
40           change = 0;
41           month = '';
42         end;
43         else if l_stage ne stage then change = 1;
44         if stage in (2,3) and not last.id and (l_stage not in (2,3) or first.id) then month = idvar;
45         if last.id;
46         if change = 0 then month = '';
47         keep id month;
48         run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      41:11   46:28   
WARNING: The variable id in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.GETMONTH may be incomplete.  When this step was stopped there were 0 observations and 1 variables.
WARNING: Data set WORK.GETMONTH was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              604.40k
      OS Memory           22364.00k
      Timestamp           20/11/2019 03:20:42 PM
      Step Count                        178  Switch Count  0
Kurt_Bremser
Super User

Did you put a variable called month into the "have" dataset?

Because my code as posted works, see this log:

73         data have;
 74         input id :$5. stage date :date9.;
 75         format date yymmddd10.;
 76         idvar = put(date,monname.);
 77         datalines;
 
 NOTE: The data set WORK.HAVE has 36 observations and 4 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.00 seconds
       
 
 114        ;
 115        
 116        proc sort data=have;
 117        by id date;
 118        run;
 
 NOTE: There were 36 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.HAVE has 36 observations and 4 variables.
 NOTE:  Verwendet wurde: PROZEDUR SORT - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.00 seconds
       
 
 119        
 120        data getmonth;
 121        set have;
 122        by id;
 123        retain change month;
 124        length month $9;
 125        l_stage = lag(stage);
 126        if first.id
 127        then do;
 128          change = 0;
 129          month = '';
 130        end;
 131        else if l_stage ne stage then change = 1;
 132        if stage in (2,3) and not last.id and (l_stage not in (2,3) or first.id) then month = idvar;
 133        if last.id;
 134        if change = 0 then month = '';
 135        keep id month;
 136        run;
 
 NOTE: There were 36 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.GETMONTH has 6 observations and 2 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.01 seconds
       
 
 137        
 138        proc transpose data=have out=trans (drop=_name_);
 139        by id;
 140        var stage;
 141        id idvar;
 142        run;
 
 NOTE: There were 36 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.TRANS has 6 observations and 7 variables.
 NOTE:  Verwendet wurde: PROZEDUR TRANSPOSE - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.01 seconds
       
 
 143        
 144        data want;
 145        merge
 146          trans
 147          getmonth
 148        ;
 149        by id;
 150        run;
 
 NOTE: There were 6 observations read from the data set WORK.TRANS.
 NOTE: There were 6 observations read from the data set WORK.GETMONTH.
 NOTE: The data set WORK.WANT has 6 observations and 8 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.00 seconds
       
 
 151        
 152        proc print data=want noobs;
 153        run;
 
 NOTE: There were 6 observations read from the data set WORK.WANT.
 NOTE:  Verwendet wurde: PROZEDUR PRINT - (Gesamtverarbeitungszeit):
       real time           0.04 seconds
       cpu time            0.03 seconds
kishoresamineni
Quartz | Level 8

sorry i modified the variable names now you code is working fine, unfortunately it is giving me the same results what i got earlier

results what i got from you program

e.g., 1 2 2 2 3 2 June (this should be September Month)

1 1 1 1 2 2 September (Month should be blank but we are getting September)

even this should be blank but we got June month

1 2 2 2 2 2 June

kishoresamineni
Quartz | Level 8

Hi KurtBremser,

 

thanks for helping me out in sorting out this issue.

below is a sample list that i extracted for validation most of them reported correctly after using your code but some of them

thrown errors. might be because of a messy formula =IF(G311<3,IFERROR(INDEX($B$1:$F$1,MATCH(3,B311:F311,0)),IF(G311<2,IFERROR(INDEX($B$1:$D$1,MATCH(2,B311:E311,0)),""),"")),"") business is using (B-F column names in Excel are May to September) 'G' Column October and 'H' is the Month column we are looking for:

 

could you please consider my request in fixing this issues

 

e.g.,

1 2 2 2 3 2 June (but it should be september Month, stage 3 is the
hightest one last 5 months out of six months)
3 3 2 2 2 2 May (reported correctly)
2 2 2 1 1 1 May (reported correctly)
3 3 3 2 2 2 May (reported correctly)
1 2 2 2 2 2 June (should be blank, may be because excel <2 condition)
3 2 2 2 2 2 May (reported correctly)
2 2 1 1 1 1 May (reported correctly)
2 3 3 3 3 3 May (should be blank, beacuse May-Sep ❤️ and
May - July < 2 this is what my understanding from excel formula)
3 3 2 2 2 2 May (reported correctly)
3 3 3 1 2 2 September (should be May, higher stage is in May)
2 2 2 2 3 3 May (should be blank)
2 2 2 2 1 1 May (reported correctly)
1 1 2 2 2 2 July (should be blank)
1 2 1 1 1 1 June (reported correctly)
1 1 2 1 2 2 September (should be blank)
3 3 3 3 3 2 May (reported correctly)
2 2 1 1 1 1 May (reported correctly)
2 2 3 2 3 2 May (should be July, higher stage started in July)
1 1 2 2 2 2 July (should be blank)
2 2 2 1 1 1 May (reported correctly)

 

 

 

 

 

 

 

kishoresamineni
Quartz | Level 8

Hi,

 

and one more request, every month i don't want to change the Month names in Array.

I am looking for something like automatic process where it picks last six months data and enables the conditions for that Monhs

Infact, i was using the macros to pull the last six months of data. Only issue is with the Month conditions as we discussing.

And i have seen scenarios where the ID might have started last month or two months before. 

ballardw
Super User

@kishoresamineni wrote:

Hi,

 

and one more request, every month i don't want to change the Month names in Array.

I am looking for something like automatic process where it picks last six months data and enables the conditions for that Monhs

Infact, i was using the macros to pull the last six months of data. Only issue is with the Month conditions as we discussing.

And i have seen scenarios where the ID might have started last month or two months before. 


Share the structure if not some example data of what you have before you turn macros loose on the data to "pick the last 6 months".

 

It may well be that part of your problem is that step.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 45 replies
  • 1404 views
  • 5 likes
  • 5 in conversation