BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
arzon
Calcite | Level 5

Hi Dears,

 

I would greatly appreciate your help to delete some matched observations from a dataset. Briefly, I have a dataset in which cases are matched with controls and then a variable (exclusion_variable) is used to exclude ID that met the criterion of exclusion. As presented in the table below, I would like to delete the corresponding controls as the case (ID=1x) is to be excluded. 

 

ID    matching_ID   case   exclusion_variable

1x     aa                  1         1

2x     aa                  0         0

3x     aa                  0         0

 

Thank you very match for your valuable help!

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Why do your refer BOTH to variable CASE and variable EXCLUSION_VAR?  Do you just mean to filter out all matching_id's for which any record has an EXCLUSION_VAR=1?

 

Edited addition:

 

If so, then this will do:

 

data have;
  input id :$2. matching_id :$2. case exclusion_variable;
datalines;
1x     aa                  1         1
2x     aa                  0         0
3x     aa                  0         0
4X     bb                  0         0
run;

data want (drop=rc);
  set have;
  if _n_=1 then do;
    declare hash exclude_id (dataset:'have (keep=matching_id exclusion_variable where=(exclusion_variable=1))');
    exclude_id.definekey('matching_id');
    exclude_id.definedone();
  end;
  rc=exclude_id.check();
  if rc^=0;
run;
        

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
Reeza
Super User

@arzon welcome to the SAS forums. 

Unfortunately your question is unclear. Can you please clarify what you have and what you want as output?

 


@arzon wrote:

Hi Dears,

 

I would greatly appreciate your help to delete some matched observations from a dataset. Briefly, I have a dataset in which cases are matched with controls and then a variable (exclusion_variable) is used to exclude ID that met the criterion of exclusion. As presented in the table below, I would like to delete the corresponding controls as the case (ID=1x) is to be excluded. 

 

ID    matching_ID   case   exclusion_variable

1x     aa                  1         1

2x     aa                  0         0

3x     aa                  0         0

 

Thank you very match for your valuable help!


 

 

arzon
Calcite | Level 5

Hi Reeza,

Thanks for your reply and sorry for being unclear (I am not fully fluent in English and that may explain why)

To clarify, lets consider the 'exclusion_variable' as 'death' and that the ID=1x  is dead (case=1) and is to be excluded from the dataset (if death=1 then delete). As the ID 2x, and  3x are matched to the ID 1x (i.e. same matching_ID) I would like to delete these ID from the dataset to preserve the matching nature of the overall dataset.  So, How can I can suppress these matched ID?

Hope this clarifies.

 

Thank you very much

 

 

Reeza
Super User

So based on your sample data what do you expect as output?

arzon
Calcite | Level 5

ID    matching_ID   case   exclusion_variable

1x     aa                  1         1

2x     aa                  0         0

3x     aa                  0         0

4x     bb                  1         0

5x     bb                  0         0

6x     bb                  0         0

 

From the above table, my output will be:

 

ID    matching_ID   case   exclusion_variable

 

4x     bb                  1         0

5x     bb                  0         0

6x     bb                  0         0

 

 

mkeintz
PROC Star

Why do your refer BOTH to variable CASE and variable EXCLUSION_VAR?  Do you just mean to filter out all matching_id's for which any record has an EXCLUSION_VAR=1?

 

Edited addition:

 

If so, then this will do:

 

data have;
  input id :$2. matching_id :$2. case exclusion_variable;
datalines;
1x     aa                  1         1
2x     aa                  0         0
3x     aa                  0         0
4X     bb                  0         0
run;

data want (drop=rc);
  set have;
  if _n_=1 then do;
    declare hash exclude_id (dataset:'have (keep=matching_id exclusion_variable where=(exclusion_variable=1))');
    exclude_id.definekey('matching_id');
    exclude_id.definedone();
  end;
  rc=exclude_id.check();
  if rc^=0;
run;
        

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
arzon
Calcite | Level 5

Hi  mkeintz

 

 

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1906 views
  • 0 likes
  • 3 in conversation