BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
JH74
Obsidian | Level 7

Hi.  I have a data set that contains the fields claim_number claim_line_number, and dialysis.  I need to create a new varible named remove_clm and this variable should be populated with a "Y" for each claim line for the claim number, if ANY line has dialysis = "Y".  

 

Data Example:

Claim_Number Claim_Line_Number Dialysis
1234567 1 Y
1234567 2 Y
1234567 3 N
1234567 4 N
7654321 1 N
7654321 2 N
7654321 3 N
7654321 4 N
7654321 5 N
7654321 6 Y
2468100 1 Y
2468100 2 Y
2468100 3 Y
3456789 1 Y
3456789 2 N
3456789 3 Y
3456789 4 Y
3456789 5 N

 

I have tried this code:

proc sort data=save.dataset out=dataset_srt;
by claim_number dialysis;
run;
 
data test;
set dataset_srt;
by claim_number dialysis;
retain remove_claim;
if first.claim_number then remove_claim='N';
if dialysis = "Y" then remove_claim = 'Y';
run;
 
But, the above code is not giving me a "Y" for each line of the claim, only the line where Dialysis = "Y".  Like this:
Claim_Number Claim_Line_Number Dialysis Remove_Claim
1234567 1 Y Y
1234567 2 Y Y
1234567 3 N N
1234567 4 N N
7654321 1 N N
7654321 2 N N
7654321 3 N N
7654321 4 N N
7654321 5 N N
7654321 6 Y Y
2468100 1 Y Y
2468100 2 Y Y
2468100 3 Y Y
3456789 1 Y Y
3456789 2 N N
3456789 3 Y Y
3456789 4 Y Y
3456789 5 N N

 

The output I am trying to get is this:

Claim_Number Claim_Line_Number Dialysis Remove_Claim
1234567 1 Y Y
1234567 2 Y Y
1234567 3 N Y
1234567 4 N Y
7654321 1 N Y
7654321 2 N Y
7654321 3 N Y
7654321 4 N Y
7654321 5 N Y
7654321 6 Y Y
2468100 1 Y Y
2468100 2 Y Y
2468100 3 Y Y
3456789 1 Y Y
3456789 2 N Y
3456789 3 Y Y
3456789 4 Y Y
3456789 5 N Y

 

I appreciate any help.  Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards expandtabs;
input Claim_Number	Claim_Line_Number	Dialysis $;
cards;
1234567	1	Y
1234567	2	Y
1234567	3	N
1234567	4	N
7654321	1	N
7654321	2	N
7654321	3	N
7654321	4	N
7654321	5	N
7654321	6	Y
2468100	1	Y
2468100	2	Y
2468100	3	Y
3456789	1	Y
3456789	2	N
3456789	3	Y
3456789	4	Y
3456789	5	N
;
proc sql;
create table want as
select *,max(Dialysis) as Remove_Claim from have group by Claim_Number;
quit;

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

If the data is already sorted by claim_number, there is no need to sort it further.

 

You can, for each claim number do a self-merge, of all the lines with dialysis='Y', merge with all lines for the claim number, as below (untested, in the absence of a data sample in the form of a working data step:

 

data want;
  merge save_dataset (where=(dialysis='Y') in=found_a_yes)
        save_dataset ;
  by claim_number;
  remove_claim=ifc(found_a_yes=1,'Y','N');
run;

 

If there is even a single observations with dialysis='Y' input by the merge statement, then the dummy variable found_a_yes will equal one for all merged observations for that claim_number.   And since the subset of dialysis=1 is listed first in the merge statement, its values will be overwritten by the values taken from the second target of the merge statement, i.e. the complete list of observations, in original order.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User
data want;
merge
  have
  have (
    keep=claim_number dialysis
    rename=(dialysis=remove_claim)
    where=(remove_claim = "Y")
  )
;
by claim_number;
run;

Since dialysis already has the same content as your wanted variable, you can use it directly.

Ksharp
Super User
data have;
infile cards expandtabs;
input Claim_Number	Claim_Line_Number	Dialysis $;
cards;
1234567	1	Y
1234567	2	Y
1234567	3	N
1234567	4	N
7654321	1	N
7654321	2	N
7654321	3	N
7654321	4	N
7654321	5	N
7654321	6	Y
2468100	1	Y
2468100	2	Y
2468100	3	Y
3456789	1	Y
3456789	2	N
3456789	3	Y
3456789	4	Y
3456789	5	N
;
proc sql;
create table want as
select *,max(Dialysis) as Remove_Claim from have group by Claim_Number;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 390 views
  • 5 likes
  • 4 in conversation