BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasuser123
Calcite | Level 5

Update to the logic

data value ;
input Case Type1 $ Value1 Date1 mmddyy10. Type2 $ Value2 Date2 mmddyy10. Type3 $ Value3 Date3 mmddyy10.;
cards;
1 V 7 12/14/2021 C 12 11/13/2020 V 24 06/05/2019
2 C 8 12/15/2021 B 13 11/14/2020 C 9 06/06/2019
3 V 7 12/14/2021 V 7 11/13/2020   V 7 12/16/2021
4 C 8 12/15/2021 C 8 11/14/2020  C 8 06/06/2019
5 . 6 12/15/2021  B 13 11/14/2020 B 8 06/06/2019
;
/*REVISED LOGIC
CASE#1 Regardless of other Types, select V with highest Value and associated Date
CASE#2 If V type doesn't exist, then pick C type with the lowest Value and associated Date
CASE#3 and CASE#4 If same types exist with same values (for example only C or only V WITH VALUE OF 7), then pick the Value with oldest Date
CASE#5 If V or C don't exist but type is blank or B, then pick the lowest Value and associated Date*/;
run;

Orange colored ones need to be in Value_New and Date_New

 

 

 

 

_________________________________________________________________________________________________________________

 

Using SAS arrays and do loop, I am trying to create new two columns Value_New and Date_New. The logic : 

If type is V, then pick  the highest value and Date associated with that V
If only type C exists, than take the lowest value and Date associated with that C
For any type, if there is only one type (only V or only C), and if values are equal, than take the one with earliest date.

 

CaseType1Value1Date1Type2Value2Date2Type3Value3Date3Value_NewDate_New
1V712/14/2021C1211/13/2020V246/5/2019246/5/2019
2C812/15/2021V1311/14/2020C86/6/201986/6/2019
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Technique when using a long dataset:

data value;
input Case Type1 $ Value1 Date1 mmddyy10. Type2 $ Value2 Date2 mmddyy10. Type3 $ Value3 Date3 mmddyy10.;
cards;
1 V 7 12/14/2021 C 12 11/13/2020 V 24 06/05/2019
2 C 8 12/15/2021 B 13 11/14/2020 C 9 06/06/2019
3 V 7 12/14/2021 V 7 11/13/2020   V 7 12/16/2021
4 C 8 12/15/2021 C 8 11/14/2020  C 8 06/06/2019
5 . 6 12/15/2021  B 13 11/14/2020 B 8 06/06/2019
;

data long;
set value;
array t {*} type:;
array v {*} value:;
array d {*} date:;
do i = 1 to dim(t);
  type = t{i};
  value = v{i};
  date = d{i};
  output;
end;
format date yymmdd10.;
keep case type value date;
run;

proc sort data=long;
by case date;
run;

data want;
do until (last.case);
  set long;
  by case;
  select (type);
    when ('V') do;
      if _type ne 'V' or (_type = 'V' and _value lt value)
      then do;
        _type = 'V';
        _value = value;
        _date = date;
      end;
    end;
    when ('C') do;
      if _type in ('B','') or (_type = 'C' and (_value gt value or _value = .))
      then do;
        _type = 'C';
        _value = value;
        _date = date;
      end;
    end;
    when ('B','') do;
      if _type in ('B','') and (_value gt value or _value = .)
      then do;
        _type = type;
        _value = value;
        _date = date;
      end;
    end;
  end;
end;
format _date yymmdd10.;
drop type value date;
rename
  _type = type
  _value = value
  _date = date
;
run;

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

I don't think your rules are complete.

What do want as the answer when there are both V and C types?

 

Wouldn't this whole operation be a lot easier if you had a separated observation for each TYPE/VALUE/DATE combination?

Reeza
Super User

Your rules and data don't align. 

 


@sasuser123 wrote:

Using SAS arrays and do loop, I am trying to create new two columns Value_New and Date_New. The logic : 

If type is V, then pick  the highest value and Date associated with that V
If only type C exists, than take the lowest value and Date associated with that C
For any type, if there is only one type (only V or only C), and if values are equal, than take the one with earliest date.

 

Case Type1 Value1 Date1 Type2 Value2 Date2 Type3 Value3 Date3 Value_New Date_New
1 V 7 12/14/2021 C 12 11/13/2020 V 24 6/5/2019 24 6/5/2019
2 C 8 12/15/2021 V 13 11/14/2020 C 8 6/6/2019 8 <- not max? 6/6/2019

Both rows have a V but you take the minimum in the second case? I agree with others that a long format is more usuable here rather than a wide format but we do need clarification on the logic first.

 

Here's a rough idea to get you started but I wouldn't recommend this approach. 

data have (drop = value_new date_new);
infile cards dlm='09'x;
informat case 8. type1 - type3 $8. Value1-Value3 Value_new 8. Date1-Date3 Date_new mmddyy10. ;
format case 8. type1 - type3 $8. Value1-Value3 Value_new 8. Date1-Date3 Date_new mmddyy10. ;
input Case	Type1	Value1	Date1	Type2	Value2	Date2	Type3	Value3	Date3	Value_New	Date_New;
cards;
1	V	7	12/14/2021	C	12	11/13/2020	V	24	6/5/2019	24	6/5/2019
2	C	8	12/15/2021	V	13	11/14/2020	C	8	6/6/2019	8	6/6/2019
3	V	7	12/14/2021	V	12	11/13/2020	V	24	6/5/2019	24	6/5/2019
4	C	8	12/15/2021	C	13	11/14/2020	C	8	6/6/2019	8	6/6/2019
;;;;
run;

proc print data=have;run;

data want;
set have;
array _val(3) value1-value3;
array _date(3) date1-date3;
array _type(3) type1-type3;

C_Found = 0; V_Found=0;
if whichc('V', of _type(*))>0 then V_Found = 1;
if whichc('C', of _type(*))>0 then C_Found = 1;


if V_FOUND then do;
value_new = max(of _val(*));
date_new = _date(whichn(value_new, of _val(*)));
end;
*have both V/C;
else do;
value_new = min(of _val(*));
date_new = _date(whichn(value_new, of _val(*)));
end;

format date_new mmddyys10.;

run;

proc print data=want;
run;

 

 

sasuser123
Calcite | Level 5

Thanks, this code is very helpful, I run it and identified additional info. I  realized that there are blank and B types that I didn't include in the original post.  V type is priority and need to have the highest value of V type.  If V type doesn't exist, than I need C type with the lowest value of type C.  When C or V don't exist then I can take any types with lowest value.  When types are same and values are same, such as all types are C and values=8, then I need the one with the oldest date.  I hope this makes sense.

 

Reeza
Super User
Have you tried modifying the code to account for those?
sasuser123
Calcite | Level 5

Yes, I modified the code but it stopped picking the highest value for V and doesn't pick the oldest date if types and values are same.

 

data want;
set value;
array _val(3) value1-value3;
array _date(3) date1-date3;
array _type(3) type1-type3;

C_Found = 0; V_Found=0;b_Found = 0;
if whichc('V', of _type(*))>0 then V_Found = 1;
if whichc('C', of _type(*))>0 then C_Found = 1;
if whichc(' ', of _type(*))>0 then B_Found = 1;
if whichc('B', of _type(*))>0 then B_Found = 1;


if V_FOUND=1 then do;
value_new = max(of _val(*));
date_new = _date(whichn(value_new, of _val(*)));
end;


if C_FOUND=1 then do;
value_new = min(of _val(*));
date_new = _date(whichn(value_new, of _val(*)));
end;

else do;
value_new = min(of _val(*));
date_new = _date(whichn(value_new, of _val(*)));
end;

format date_new mmddyys10.;

run;

Kurt_Bremser
Super User

Technique when using a long dataset:

data value;
input Case Type1 $ Value1 Date1 mmddyy10. Type2 $ Value2 Date2 mmddyy10. Type3 $ Value3 Date3 mmddyy10.;
cards;
1 V 7 12/14/2021 C 12 11/13/2020 V 24 06/05/2019
2 C 8 12/15/2021 B 13 11/14/2020 C 9 06/06/2019
3 V 7 12/14/2021 V 7 11/13/2020   V 7 12/16/2021
4 C 8 12/15/2021 C 8 11/14/2020  C 8 06/06/2019
5 . 6 12/15/2021  B 13 11/14/2020 B 8 06/06/2019
;

data long;
set value;
array t {*} type:;
array v {*} value:;
array d {*} date:;
do i = 1 to dim(t);
  type = t{i};
  value = v{i};
  date = d{i};
  output;
end;
format date yymmdd10.;
keep case type value date;
run;

proc sort data=long;
by case date;
run;

data want;
do until (last.case);
  set long;
  by case;
  select (type);
    when ('V') do;
      if _type ne 'V' or (_type = 'V' and _value lt value)
      then do;
        _type = 'V';
        _value = value;
        _date = date;
      end;
    end;
    when ('C') do;
      if _type in ('B','') or (_type = 'C' and (_value gt value or _value = .))
      then do;
        _type = 'C';
        _value = value;
        _date = date;
      end;
    end;
    when ('B','') do;
      if _type in ('B','') and (_value gt value or _value = .)
      then do;
        _type = type;
        _value = value;
        _date = date;
      end;
    end;
  end;
end;
format _date yymmdd10.;
drop type value date;
rename
  _type = type
  _value = value
  _date = date
;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1281 views
  • 3 likes
  • 4 in conversation