BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Etoo12121
Obsidian | Level 7

Hi all,

 

I'm stumped on something and hoping to get some help.

I have data that has claims and multiple claim lines. What I want to do is keep only the claims where BOTH code 29 and 54 appear on the claim. So in the sample below, I would want only claim 2339 and 5498. I created new variables  (CountA and CountB ) and thinking of using 

if first.claim = last.claim and first.CountA ne last.CountA then output; else delete; but I'm not sure that would work

 

ClaimLineCodeCountACountB
233912910
233925401
297012910
718215401
767012910
293815401
293825401
549812910
549825401
411812910
411822910

 

any ideas how I can tackle this?

1 ACCEPTED SOLUTION

Accepted Solutions
jimbarbour
Meteorite | Level 14

I think you could do a sub-select type join where you join the data to itself selecting only records that have code 29 and also exist in a sub-select that only has records with 54.

 

Jim

 

This code:

Data	Have;
	LENGTH
		Claim	$4
		Line	$1
		Code	$2
		CountA	$1
		CountB	$1
		;

	INFILE	DATALINES DSD DLM='09'X	MISSOVER;

	INPUT
		Claim	$
		Line	$
		Code	$
		CountA	$
		CountB	$
		;

DATALINES;
2339	1	29	1	0 
2339	2	54	0	1 
2970	1	29	1	0 
7182	1	54	0	1 
7670	1	29	1	0 
2938	1	54	0	1 
2938	2	54	0	1 
5498	1	29	1	0 
5498	2	54	0	1 
4118	1	29	1	0 
4118	2	29	1	0 
;
RUN;

PROC	SQL	NOPRINT;
	CREATE	TABLE	Want	AS
		SELECT	*	FROM	Have
			WHERE	Code	=	'29'
				AND	Claim	IN	(
				SELECT	Claim	FROM	Have
					WHERE	Code	=	'54'
					);
QUIT;

Yields:

jimbarbour_0-1623296631536.png

 

View solution in original post

6 REPLIES 6
jimbarbour
Meteorite | Level 14

I think you could do a sub-select type join where you join the data to itself selecting only records that have code 29 and also exist in a sub-select that only has records with 54.

 

Jim

 

This code:

Data	Have;
	LENGTH
		Claim	$4
		Line	$1
		Code	$2
		CountA	$1
		CountB	$1
		;

	INFILE	DATALINES DSD DLM='09'X	MISSOVER;

	INPUT
		Claim	$
		Line	$
		Code	$
		CountA	$
		CountB	$
		;

DATALINES;
2339	1	29	1	0 
2339	2	54	0	1 
2970	1	29	1	0 
7182	1	54	0	1 
7670	1	29	1	0 
2938	1	54	0	1 
2938	2	54	0	1 
5498	1	29	1	0 
5498	2	54	0	1 
4118	1	29	1	0 
4118	2	29	1	0 
;
RUN;

PROC	SQL	NOPRINT;
	CREATE	TABLE	Want	AS
		SELECT	*	FROM	Have
			WHERE	Code	=	'29'
				AND	Claim	IN	(
				SELECT	Claim	FROM	Have
					WHERE	Code	=	'54'
					);
QUIT;

Yields:

jimbarbour_0-1623296631536.png

 

Etoo12121
Obsidian | Level 7
Thank you @jimbarbour. worked like a charm
jimbarbour
Meteorite | Level 14

You're welcome.  Glad I was able to help.

 

Jim

andreas_lds
Jade | Level 19

Not sure, what you expect as result exactly. I removed the count-variables.

data have;
   length
      Claim $4
      Line 8
      Code $2
   ;

   input Claim Line Code;

   datalines;
2339  1  29
2339  2  54
2970  1  29
7182  1  54
7670  1  29
2938  1  54
2938  2  54
5498  1  29
5498  2  54
4118  1  29
4118  2  29
;

proc sort data=have;
   by claim line;
run;

data interesting;
   set have;
   by claim;
   
   lastCode = lag(Code);
   
   if first.Claim and last.Claim then delete;
   
   if first.Claim then lastCode = ' ';
   
   if last.Claim then do;
      if lastCode = '29' and Code = '54' or lastCode = '54' and Code = '29' then output;
   end;
   
   keep claim;
run;

data want;
   merge have interesting(in= isInteresting);
   by Claim;
   
   if isInteresting;
run;
Kurt_Bremser
Super User

Use a double DO loop:

data want;
has_29 = 0;
has_54 = 0;
do until (last.claim);
  set have;
  by claim notsorted;
  if code = "29" then has_29 = 1;
  if code = "54" then has_54 = 1;
end;
do until (last.claim);
  set have;
  by claim notsorted;
  if has_29 and has_54 then output;
end;
drop has_29 has_54;
run;
tarheel13
Rhodochrosite | Level 12
data have;
   length
      Claim $4
      Line 8
      Code $2
   ;

   input Claim Line Code;

   datalines;
2339  1  29
2339  2  54
2970  1  29
7182  1  54
7670  1  29
2938  1  54
2938  2  54
5498  1  29
5498  2  54
4118  1  29
4118  2  29
;
run;

data claims;
	set have;
	has29=code=29;
	has54=code=54;
run;

proc sql;
	create table claims2 as select claim,max(has29) as has29, max(has54) as has54
		from claims
		group by claim;
quit;

data claims3;
	set claims2;
	if has29=1 and has54=1;
run;

This will keep only claim 2339 and 5498. First, I create boolean variables and then select the max of them and group by claim to collapse down to 1 row per claim. In another data step, I use a subsetting if to keep only where the booleans are both equal to 1 (has code = 29 and has code = 54) within the same claim. 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1560 views
  • 1 like
  • 5 in conversation