How do I use check the value of variables in one dataset by using another dataset?

Reply
Occasional Contributor
Posts: 8

How do I use check the value of variables in one dataset by using another dataset?

I have two datasets

1. Contains records for all the surgeries/procedures I want to identify (variable of interest CODE, there are 40+ observations/records = 40+ different procedure codes)

2. Adminstrative records, where for each record/observation there are up to 50 procedure variables (eg PROCEDURE1-PROCEDURE50).

 

Objective - For each record in dataset 2 (admin records), I want to check all the non-missing values for the procedure variables with the procedure codes from dataset 1 to see if the admin record contains a procedure of interest.

 

Example - DATATSET 1

 

CODE CODE_NODASH
14215-00 1421500
14215-01 1421501
30511-00 3051100
30511-01 3051101
30511-02 3051102
30511-03 3051103
30511-04 3051104
30511-05 3051105
30511-06 3051106
30511-07 3051107
30511-08 3051108

 

Example - DATASET 2

(I have made up this dataset - it contains 13 records with only 10 procedure codes. There should be 3 records in this dataset that contain at least one of the procedure codes listed in the example dataset above, IDs 141781, 143255, 259028)

SAS Output

ID DT_PROCEDURE PROCEDURE1 PROCEDURE2 PROCEDURE3 PROCEDURE4 PROCEDURE5 PROCEDURE6 PROCEDURE7 PROCEDURE8 PROCEDURE9 PROCEDURE10
127280 28DEC2001 95550-03                  
127282 12NOV2002 48239-00 47726-00 92514-99 95550-03            
127284 26DEC2006 90465-01                  
127284 25NOV2011 90465-05                  
127286 21DEC2016 35608-00 92514-29                
127288 11JAN2007 90468-01 90472-00 92507-99              
127302 23AUG2012 92508-99 95550-01                
127303 30JUL2002 97324-00 92514-99                
127303 28AUG2008 35703-00 35630-00 92514-19              
127303 19FEB2010 90465-05 92507-99                
141781 18MAY2015 14215-01 92515-99                
143255 05DEC2013 30511-06 92514-49 95550-03              
259028 01NOV2007 14215-00 92515-29                

 

I hope this makes sense! I realise i could use an array for procedures1-procedures10 and then cut-n-paste the desired procedures code into a if-in statement , such as

array codes [*] procedure_code: ;
do i=1 to dim(codes); if codes[i] in ('14215-00' '14215-01' '30511-00' '30511-01' '30511-02' '30511-03' '30511-04' '30511-05' '30511-06' '30511-07' '30511-08' ) then surgery=1; end;

But sometimes the codes are of interest will change. Currently, I can pull the codes from an existing dataset.

 

I was wondering if there is anyway this can be coded so that I don't have to manually list the codes of interest.

 

thanks

(and apologies for the centre-justified text. I don't know how to correct it!)

 

Super User
Super User
Posts: 9,599

Re: How do I use check the value of variables in one dataset by using another dataset?

First, post test data in the form of a datastep, not here to type that in.

Second, it is never really, from a programming point of view, a good idea to have transposed data.  For instance, if the number of procedures changes, then you need to re-do your array and loop, or write code to try to work out how many.  Normalised is a far better structure as the structure does not change with more variables, but observation counts increase.  So:

ID DT_PROCEDURE PROCEDURE1

  

Could become:

ID   DT_PROCEDURE  PROCEDURE_NO  PROCEDURE

 

In this way you can still transpose if you need to for an output.  You can minimise the data stored, as you don't need the obs*variables total, only the variables*actual data (i.e. row one of your test data, you have 49 blank data items which are totally irrelevant and only there due to the choice in structure), can simplify your issue here because you could then merge data together or use loopkups, and you will simplify your code a lot.  There really are no benefits to the Excel (transposed) way of thinking other than as a report for someone to look at - which you can do at the end when all processing is done.

Occasional Contributor
Posts: 8

Re: How do I use check the value of variables in one dataset by using another dataset?

Ah, sorry about the wrong posting of the data. I will remember for next time.
And I should have realised that a 'long' data structure would be better for the admin data! Thanks!
Ask a Question
Discussion stats
  • 2 replies
  • 86 views
  • 0 likes
  • 2 in conversation