BookmarkSubscribeRSS Feed
bknitch
Quartz | Level 8

I have a dataset where i'm trying to determine fallout/ outliers from a previous year. Here is an example of the table below: 

Member YR A1 A2 A3
123 2019 1 0 1
123 2020 1 0 1
124 2019 1 1 0
124 2020 0 1 0
125 2019 0 1 1
125 2020 1 1 1
126 2019 0 1 1
126 2020 0 0 1

The A1-A3 are hierarchy based, A1 being most important and A3 being least. I'm trying to identify members and their corresponding A1 - A3 that didn't match from the previous year. So an expected output would be something like below: 

Member YR A1 A2 A3
124 2020 Missing Found NoMatch
126 2020 NoMatch Missing Found

 

The goal is to find the A1-A3 variable assigned to a member that's missing, matched or has No Match on the previous year. In this case I'm looking at 2019 and matching it to 2020, so I only need 2020 rows/columns back. 

 

I've tried building arrays and transposing the data with incld and excld columns for each year to find the missing or Nomatches and i cant seem to get the desired output. Any help would be appreciated. 

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Please explain in more detail the criteria that lead to "Missing","Found" or "NoMatch"

--
Paige Miller
bknitch
Quartz | Level 8
@PaigeMiller in this case it would be Member= 124 had an A1= 1 in 2019 but an A1=0 in 2020 this qualifies as 'Missing', A2=1 in 2019 and A2=1 in 2020 so this qualifies as 'Match' and A3=0 for 2019 and A3=0 for 2020 so this qualifies as 'NoMatch'.

So if a member had an A1 - A3 populated in 2019 and its not populated in 2020 this would be a 'Missing'. If a Member had an A1-A3 populated in 2019 and 2020 then this is a Match and if a Member had an A1-A3 not populated for 2019 and 2020 this is a 'Nomatch'.
PaigeMiller
Diamond | Level 26

Okay, thanks, but you also said

 

The A1-A3 are hierarchy based, A1 being most important and A3 being least.

I don't see how that works into the logic here.

--
Paige Miller
bknitch
Quartz | Level 8
So for that I'm only concerned where if Missing is higher in the hierarchy. Meaning, if an A3 was populated in 2019 and not populated in 2020 but an A2 was populated in 2020 and not in 2019 I would exclude this/ or not look at this record. Hope that makes sense...
maguiremq
SAS Super FREQ

Where does 'found' come in? 

 

Edit: nevermind, I see.

maguiremq
SAS Super FREQ

This way isn't terribly concise. Someone may have a better method, but it seems to match your output.

proc format;
	value $matches
		"1,1" = "Found"
		"0,0" = "NoMatch"
		"1,0" = "Missing"
		"0,1" = "NoMatch";
run;

data want (drop = a: output_flag i);
	merge 	have (in = a where = (yr = 2019) rename = (a1 = a1_19 a2 = a2_19 a3 = a3_19))
		  	have (in = b where = (yr = 2020) rename = (a1 = a1_20 a2 = a2_20 a3 = a3_20));
	by		member;

			_a1_1920 = put(catx(",", a1_19, a1_20), $matches.);
			_a2_1920 = put(catx(",", a2_19, a2_20), $matches.);
			_a3_1920 = put(catx(",", a3_19, a3_20), $matches.);

	array checkmiss [*] _a:;

	do i = 1 to dim(checkmiss);
		if checkmiss[i] = "Missing" then output_flag = "1";
	end;

	if output_flag = "1" then output;

run;

 

Obs Member YR _a1_1920 _a2_1920 _a3_1920 
1 124 2020 Missing Found NoMatch 
2 126 2020 NoMatch Missing Found 

 

PhilC
Rhodochrosite | Level 12
data have;
  infile datalines;
  input member year A1 - A3;
datalines;
123 2019 1 0 1
123 2020 1 0 1
124 2019 1 1 0
124 2020 0 1 0
125 2019 0 1 1
125 2020 1 1 1
126 2019 0 1 1
126 2020 0 0 1
;;;;;;
run;

proc format;
  value FMnM /*Found-Missing-noMatch*/
  1="Missing"	/*1,0*/
  2="New?"   /*0,1*/
  3="Found"  /*1,1*/
  0="NoMatch" /*0,0*/
  ;
run;

proc sql;
  create table want as
    select this.member, this.year,
           this.A1*2+last.A1 format=FMnM. as A1,
           this.A2*2+last.A2 format=FMnM. as A2,
           this.A3*2+last.A3 format=FMnM. as A3
    from have as this, have as last
    where this.year=last.year+1 
      AND this.Member=last.member;
quit;

 

bknitch
Quartz | Level 8

@maguiremq this worked quite well. Thank you for this. I modified it slightly to include hierarchy. Thank you very much. Both you and another use provided solutions.  

PaigeMiller
Diamond | Level 26

@bknitch wrote:
So for that I'm only concerned where if Missing is higher in the hierarchy. Meaning, if an A3 was populated in 2019 and not populated in 2020 but an A2 was populated in 2020 and not in 2019 I would exclude this/ or not look at this record. Hope that makes sense...

Sorry, no I still don't understand. What does "exclude" mean in the context of your earlier description of how to assign the "Missing" "Found" "NoMatch" to each cell?

 

But I think we have gone far enough down this path. If the code provided so far meets your need, fine, problem solved. If not, I am asking you to re-write the requirements from scratch, to address all of these issues, so that the explanation is clear and in one description of the logic, so we don't have to scroll up and down and re-read earlier comments to put it all together and understand.

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 773 views
  • 2 likes
  • 4 in conversation