BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Moksha
Pyrite | Level 9

I have the following input data :

SUBJECT    CMENDTC          ENDT

101                 2011-09-08        08SEP2011

101                 2011-09-08        08SEP2011

101                 2011-09-08         08SEP2011

102                2011-11-21              21NOV2011

103                                                                      

103

103

103               2011-10-29              29OCT2011

105                

106               2012-01-09              29OCT2012

107               2011-09-23               23SEP2011

107              2011-09-25                25SEP2011

107              2011-10-15                  15OCT2011

107              2011-09-26                 26SEP2011

108              2011-10-15                   15OCT2011

108               2011-11-14                   14NOV2011

109               

109               2011-09-20                20SEP2011

109               2011-09-25                 25SEP2011

109                2011-09-13                 13SEP2011

109

109                 2011-09-25                25SEP2011

109                 2011-09-25                 25SEP2011

109

110       

110                  2011-10-12                   12OCT2011

110                   2011-10-23                  23OCT2011

111                    2011-10-21                    21OCT2011

112 

113

113

114                  2011-10-05                     05OCT2011

114                   2011-10-02                    02OCT2011

114                   2011-10-02                    02OCT2011

 

CMENDTC is character datatype and ENDT is numeric.

 

Need the following output:  we need to check CMENDTC and if it is missing then ENDT is the latest date of CMENDTC for that subject. 

If a subject has records with only missing CMEMDTC and ENDT is missing.

 

SUBJECT    CMENDTC          ENDT

101                 2011-09-08        08SEP2011

101                 2011-09-08        08SEP2011

101                 2011-09-08         08SEP2011

102                2011-11-21              21NOV2011

103                                                 29OCT2011                     

103                                                  29OCT2011

103                                                  29OCT2011

103               2011-10-29              29OCT2011

105                

106               2012-01-09              29OCT2012

107               2011-09-23               23SEP2011

107              2011-09-25                25SEP2011

107              2011-10-15                  15OCT2011

107              2011-09-26                 26SEP2011

108              2011-10-15                   15OCT2011

108               2011-11-14                   14NOV2011

109                                                     25SEP2011

109               2011-09-20                20SEP2011

109               2011-09-25                 25SEP2011

109                2011-09-13                 13SEP2011

109                                                      25SEP2011

109                 2011-09-25                25SEP2011

109                 2011-09-25                 25SEP2011

109                                                       25SEP2011

110                                                        23OCT2011

110                  2011-10-12                   12OCT2011

110                   2011-10-23                  23OCT2011

111                    2011-10-21                    21OCT2011

112 

113

113

114                  2011-10-05                     05OCT2011

114                   2011-10-02                    02OCT2011

114                   2011-10-02                    02OCT2011

115                    2011-08                         31AUG2011

115                    2011-06                          30JUN2011

115                                                              31DEC2011

115                     2011                                 31DEC2011

115                                                               31DEC2011

 

Please, help.

 

           

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

There was an error in my PROC SUMMARY. This is the correct code

 

proc summary data=have1 nway;
    class subject;
    var date;
    output out=maxx max=max_endt;
run;
--
Paige Miller

View solution in original post

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Please explain the logic by which 109 has values 25SEP2011 filled in (and not some other value) when CMENDTC is missing. Don't make us try to figure out what you want from an example — show us the example and TELL US the logic to get what you want.

--
Paige Miller
Moksha
Pyrite | Level 9

Sorry, I missed this in the initial post and hence edited the post. But, I am still adding more details. Here, for 109 has the 25SEP2011 as it is the latest date value for that subject. 

PaigeMiller
Diamond | Level 26

Thank you. Try this:

 

data have1;
    set have;
    date=input(cmendtc,yymmdd10.);
run;
proc summary data=have1 nway;
    class subject;
    var endt;
    output out=maxx max=max_endt;
run;
data want;
    merge have maxx;
    by subject;
    if missing(endt) then endt1=max_endt;
    else endt1=endt;
    format endt1 date9.;
run;
    
--
Paige Miller
Moksha
Pyrite | Level 9

Thank you, but getting the following error:

 

 proc summary data=have nway;
628      class subject;
629      var endt;
ERROR: Variable ENDT in list does not match type prescribed for this list.
630      output out=maxx max=max_endt;
631  run;

Actually, as date variable from have1 dataset is not being used anywhere and getting error as cmendtc has values of the form yyyy and yyyy-mm which requires imputation, hence I have modified the code as follows and ran but the getting the above error.

/*data have1;
    set have;
	date=input(cmendtc,yymmdd10.);
	end;
run;*/
/*proc summary data=have1 nway;*/
proc summary data=have nway;
    class subject;
    var endt;
    output out=maxx max=max_endt;
run;
data want;
/*    merge have maxx;*/
    set maxx;
    by subject;
    if missing(endt) then endt1=max_endt;
    else endt1=endt;
    format endt1 date9.;
run;

proc print data=want;run;

Can you please help. 

Tom
Super User Tom
Super User

Something is wrong here.  You said that ENDDT was a DATE variable.  So the PROC SUMMARY should work fine.

 

Did you somehow pass it a dataset where ENDDT was instead a character variable?

mkeintz
PROC Star
data need (keep=subject max_enddt) / view=need;
  do until (last.subject);
    set have (where=(enddt^=.));
    by subject;
    max_enddt=max(max_enddt,enddt);
  end;
run;


data want (drop=max_enddt);
   merge have  need;
   by subject;
   cmenddt=coalesce(cmenddt,max_enddt);
run;

It's programmed as two steps, but because the first step is a data set view it's only activated when the view (named NEED) is called for in the second step.  Reduces disk activity.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Moksha
Pyrite | Level 9

Thank you, but getting error. Since, have dataset has ENDT variable, I have modified the first part of the code as 

follows and ran:

data need (keep=subject max_enddt) / view=need;
  do until (last.subject);
/*    set have (where=(enddt^=.));*/
      set have (where=(endt^=.));
    by subject;
/*    max_enddt=max(max_enddt,enddt);*/
      max_enddt=max(max_enddt,endt);
  end;
run;

But, getting the following error:

1156  data need (keep=subject max_enddt) / view=need;
1157    do until (last.subject);
1158  /*    set have (where=(enddt^=.));*/
1159        set have (where=(endt^=.));
ERROR: WHERE clause operator requires compatible variables.
1160      by subject;
1161  /*    max_enddt=max(max_enddt,enddt);*/
1162        max_enddt=max(max_enddt,endt);
1163    end;
1164  run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

Please, help

Tom
Super User Tom
Super User

Again, that error means that ENDT is NOT a date variable.  SAS stores date variables as NUMERIC.  But the error message is saying that the where clause is mixing types. Since . is the NUMERIC missing value that means that ENDT is in fact a CHARACTER variable and not a DATE at all.

 

If you have ENDT as character in the style you showed then you can make a new NUMERIC variable by using the INPUT() function with the DATE informat.  Make sure to attach the DATE format to the new variable so it prints the dates in a style that humans can recognize.

 

Moksha
Pyrite | Level 9

Thank you. I have corrected the ENDT variable from character to numeric format:-

data have;
    infile datalines truncover;
    input SUBJECT : $3. CMENDTC : $20. ENDT : anydtdte.;
	format ENDT date9.;
    datalines;
101 2011-09-08 08SEP2011
101 2011-09-08 08SEP2011
101 2011-09-08 08SEP2011
102 2011-11-21 21NOV2011
103 . .
103 . .
103 . .
103 2011-10-29 29OCT2011
105 . .
106 2012-01-09 29OCT2012
107 2011-09-23 23SEP2011
107 2011-09-25 25SEP2011
107 2011-10-15 15OCT2011
107 2011-09-26 26SEP2011
108 2011-10-15 15OCT2011
108 2011-11-14 14NOV2011
109 . .
109 2011-09-20 20SEP2011
109 2011-09-25 25SEP2011
109 2011-09-13 13SEP2011
109 . .
109 2011-09-25 25SEP2011
109 2011-09-25 25SEP2011
109 . .
110 . .
110 2011-10-12 12OCT2011
110 2011-10-23 23OCT2011
111 2011-10-21 21OCT2011
112 . .
113 . .
113 . .
114 2011-10-05 05OCT2011
114 2011-10-02 02OCT2011
114 2011-10-02 02OCT2011
115 2011-08 31AUG2011
115 2011-06 30JUN2011
115 . 31DEC2011
115 2011 31DEC2011
115 . 31DEC2011
;
run;

But, the output is not as expected except for the subjects 112, 113 and 115: Attaching the screenshots:

Moksha_0-1748012843058.png

Moksha_1-1748012961809.png

 

Please, help.

PaigeMiller
Diamond | Level 26

There was an error in my PROC SUMMARY. This is the correct code

 

proc summary data=have1 nway;
    class subject;
    var date;
    output out=maxx max=max_endt;
run;
--
Paige Miller
Moksha
Pyrite | Level 9

Thank you very very much PaigeMiller. I have tried with this new proc summary and got the required output. Once again thank you very much and I thank Tom and mkeintz and entire community for extending their helping hands. This SAS community is a wonderful community.

Moksha
Pyrite | Level 9

Yes, Tom, by mistake I was passing ENDT as character variable. Now, with the following input dataset, I ran the code given by PaigeMiller but output is not as required.

Input dataset:

data have;
    infile datalines truncover;
    input SUBJECT : $3. CMENDTC : $20. ENDT : anydtdte.;
	format ENDT date9.;
    datalines;
101 2011-09-08 08SEP2011
101 2011-09-08 08SEP2011
101 2011-09-08 08SEP2011
102 2011-11-21 21NOV2011
103 . .
103 . .
103 . .
103 2011-10-29 29OCT2011
105 . .
106 2012-01-09 29OCT2012
107 2011-09-23 23SEP2011
107 2011-09-25 25SEP2011
107 2011-10-15 15OCT2011
107 2011-09-26 26SEP2011
108 2011-10-15 15OCT2011
108 2011-11-14 14NOV2011
109 . .
109 2011-09-20 20SEP2011
109 2011-09-25 25SEP2011
109 2011-09-13 13SEP2011
109 . .
109 2011-09-25 25SEP2011
109 2011-09-25 25SEP2011
109 . .
110 . .
110 2011-10-12 12OCT2011
110 2011-10-23 23OCT2011
111 2011-10-21 21OCT2011
112 . .
113 . .
113 . .
114 2011-10-05 05OCT2011
114 2011-10-02 02OCT2011
114 2011-10-02 02OCT2011
115 2011-08 31AUG2011
115 2011-06 30JUN2011
115 . 31DEC2011
115 2011 31DEC2011
115 . 31DEC2011
;
run;
data have1;
    set have;
	date=input(cmendtc,yymmdd10.);
	end;
run;
proc summary data=have1 nway;
    class subject;
    var endt;
    output out=maxx max=max_endt;
run;
data want;
    merge have maxx;
    by subject;
    if missing(endt) then endt1=max_endt;
    else endt1=endt;
    format endt1 date9.;
run;
proc print data=want;run;

Output screenshots:

Moksha_0-1748013998372.png

Moksha_1-1748014029893.png

Please, help

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 1572 views
  • 2 likes
  • 4 in conversation