I have a dataset containing several observations. I am interested in just 6 subjects, and I want to see how my results change if I replace a subject's Variable1 value with that subject's Variable2 value. I want to change one subject at a time (6 passes), then 2 subjects at a time (15 passes), then 3 subjects at a time (20 passes), etc., up to changing all 6 subjects. I know the combinations that the subjects make (a total of 63 passes), but it is not as easy as setting up 6 nested do-loops because that would run far too many iterations. I hope this makes sense. Any suggestions on how to go about this?
Here is a very small example. Suppose I have 5 subjects, and I am only interested in doing what I described above to 3 of them. The 6 passes below change one or two subjects at a time (a seventh pass would change all three). In each pass I want to change the subject's value of Var1 to the value in Var2.
Original Data:
Subject | Var1 | Var2 |
---|---|---|
1 | 7 | 4 |
2 | 10 | 15 |
3 | 3 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
Change 1 subject at a time:
Change subject 1
Subject | Var1 | Var2 |
---|---|---|
1 | 4 | 4 |
2 | 10 | 15 |
3 | 3 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
Change subject 2
Subject | Var1 | Var2 |
---|---|---|
1 | 7 | 4 |
2 | 15 | 15 |
3 | 3 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
Change subject 3
Subject | Var1 | Var2 |
---|---|---|
1 | 7 | 4 |
2 | 10 | 15 |
3 | 9 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
Change 2 subjects at a time:
Change subjects 1&2
Subject | Var1 | Var2 |
---|---|---|
1 | 4 | 4 |
2 | 15 | 15 |
3 | 3 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
Change subjects 1&3
Subject | Var1 | Var2 |
---|---|---|
1 | 4 | 4 |
2 | 10 | 15 |
3 | 9 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
Change subjects 2&3
Subject | Var1 | Var2 |
---|---|---|
1 | 7 | 4 |
2 | 15 | 15 |
3 | 9 | 9 |
4 | 2 | 2 |
5 | 9 | 9 |
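The passes above can be generated mechanically rather than written out by hand. The following is a minimal, language-neutral sketch in Python (not SAS), assuming the five-row example data and subjects 1 to 3 as the ones of interest; the names `original`, `subjects_of_interest`, and `apply_change` are made up for illustration. The same logic could be ported to a DATA step or macro loop.

```python
from itertools import combinations

# Example data from the post: subject -> (Var1, Var2)
original = {1: (7, 4), 2: (10, 15), 3: (3, 9), 4: (2, 2), 5: (9, 9)}
subjects_of_interest = [1, 2, 3]  # assumption: the subjects changed in the tables above


def apply_change(data, changed):
    """Return a copy of data with Var1 replaced by Var2 for the subjects in `changed`."""
    return {
        s: ((v2, v2) if s in changed else (v1, v2))
        for s, (v1, v2) in data.items()
    }


# One pass per non-empty subset, smallest subsets first
passes = []
for k in range(1, len(subjects_of_interest) + 1):
    for subset in combinations(subjects_of_interest, k):
        passes.append((subset, apply_change(original, subset)))

for subset, data in passes:
    print("Changed:", subset, "Var1:", [data[s][0] for s in sorted(data)])
```

With three subjects of interest this yields 7 passes: the six shown above plus the one that changes subjects 1, 2, and 3 together.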
Just a couple of questions before we look at how to program it.
Would it be acceptable to add 63 variables to each observation? It sounds like you are dealing with a small enough data set, but I just wanted to check.
Can we refer to the key SUBJECT values as 1, 2, 3, 4, 5, 6, or do you need to go through this exercise several times with several variables and several sets of subjects?
For your first question, I have a total of 149 observations. I don't understand the point of adding 63 variables to each observation. Do you intend to keep the 63 variables unchanged for the 143 observations that are not of interest and have a different value for the 6 other observations?
For the second question, I believe you could refer to the subjects as 1, 2, 3, 4, 5, 6. Are you wondering if maybe in the future I will only be interested in 5 subjects or maybe 15 subjects? A program may be easy to adjust if that is the case.
Yes, that's exactly where I'm headed. For example, create variables like:
change_100000 = contains a copy of var2 for subject 1, a copy of var1 for all other subjects
change_101000 = contains a copy of var2 for subjects 1 and 3, a copy of var1 for all other subjects
That way, you get to save the possible combinations permanently, and play with them to your heart's content.
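As a rough illustration of the `change_` naming idea, here is a small Python sketch (not SAS) that builds one column per combination by treating each name suffix as a bit pattern. It assumes the five-row example data from the original post and only three key subjects, so the suffixes have three digits (`change_100`, `change_101`) rather than six; `key_subjects` and `change_columns` are hypothetical names.

```python
# Example data from the original post: subject -> (Var1, Var2)
original = {1: (7, 4), 2: (10, 15), 3: (3, 9), 4: (2, 2), 5: (9, 9)}
key_subjects = [1, 2, 3]  # assumption: three key subjects instead of six


def change_columns(data, key_subjects):
    """Build one 'change_<bits>' column per non-empty combination of key subjects."""
    n = len(key_subjects)
    columns = {}
    for mask in range(1, 2 ** n):
        # bits[i] == '1' means key_subjects[i] takes its Var2 value
        bits = format(mask, f"0{n}b")
        changed = {s for s, b in zip(key_subjects, bits) if b == "1"}
        columns["change_" + bits] = {
            s: (v2 if s in changed else v1) for s, (v1, v2) in data.items()
        }
    return columns


columns = change_columns(original, key_subjects)
print(sorted(columns))
print(columns["change_101"])
```

With six key subjects the same loop would produce the 63 six-digit names described above.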
If you're talking about 15 subjects, though, this isn't such a good way to go since the number of added variables becomes unmanageable.
Hi:
This might not be the right forum for posting your question. Do you want an output report? Do you want an output table or data set? I suspect that the answer will be for you to use some form of SAS Macro processing and DATA step processing and/or those combined with hash tables, but the code inside the macro program will depend on how you envision getting your results. Also, if you want a report, which ODS destination do you want: HTML, RTF or PDF?
You say that "I am interested in looking at just 6 subjects to see how they influence the results of a Variable1 if I change it to be the value in Variable2." What procedures are you using to analyze how the results are influenced? What procedures have you tried? Will you be passing the changed data to another procedure? Or do you just need to see the changed tables? Or do you need to generate graphs to look at the influence?
I'd recommend reposting this question in the SAS Macro Facility, Data Step and SAS Language Elements forum, with a bit more information about the code and your process and your results.
If your question involves using PROC REPORT, PROC PRINT or PROC TABULATE, and the Output Delivery System, then this would be the appropriate forum. But to do the kind of looping and changing that you describe, it sounds to me like you will need DATA step and Macro techniques.
cynthia
I am using PROC LIFETEST and PROC PHREG to do some survival analysis. I am changing the survival time of a patient to either an earlier or later time and seeing what happens to the hazard ratio and p-value. My output (I don't care if it is HTML, RTF or PDF) is just a simple table that shows 3 variables: Subject(s) changed, new hazard ratio (and p-value), and the difference between the original hazard ratio and the newly derived hazard ratio.
I agree that this will require some macro programming. I was attempting that. I can repost there.
Thanks.
By the way, I have a macro that does it, but I had to call the macro 63 times by passing the list of subjects that I wanted to change. My main purpose in this forum discussion is to see if there is an easier way to input a list of values (or subjects) and quickly output all the 63 possible combinations of subjects. It would be similar to the var1|var2|var3|var4|var5|var6 format in something like the PROC GLM model statement. This notation is the same as writing var1 var2 var1*var2 var3 var1*var3 var2*var3....
Is there a way to do this in a datastep? I doubt there is a function for this, but maybe an easy way to program it with a macro or something.
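The ordering described above (var1 var2 var1*var2 var3 var1*var3 var2*var3 ...) can be reproduced with a short loop: for each new item, add the item itself, then the item joined to every term built so far. The sketch below is in Python (not SAS) and only illustrates that logic; `bar_expand` is a made-up helper, not a SAS function.

```python
def bar_expand(items, sep="*"):
    """List every non-empty combination of items in bar-notation order."""
    terms = []
    for item in items:
        # the new item alone, then the new item appended to each earlier term
        terms += [str(item)] + [t + sep + str(item) for t in terms]
    return terms


print(bar_expand([1, 2, 3]))
# ['1', '2', '1*2', '3', '1*3', '2*3', '1*2*3']

print(len(bar_expand([1, 2, 3, 4, 5, 6])))
# 63
```

Each resulting string could then be passed as the subject list to an existing macro, replacing the 63 hand-written calls with a single loop.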
That's really a question to post in the DATA step and Macro forum. Or, since it involves PHREG and LIFETEST, in the Statistical procedures forum, since it's likely that others using these procedures have wanted to do the same thing.
cynthia
I believe that this paper will help answer my own question:
http://www2.sas.com/proceedings/sugi23/Posters/p177.pdf
I will just need to list all the possible combinations for 1 at a time, then 2 at a time, etc. up to choosing all 6. Then I can run a macro loop to store the combined subject numbers and then use a data step to change the subjects in that list. It will take a bit more looping than that, but it should do the trick.
I wasn't going to post again, since you already found the answer to your question. But I had nagging concerns about one item ... the effects of increasing the number of subjects from 6 to 15. If you do that, you will have to iterate over 30,000 times. I just wondered if you were prepared for that. How do you make sense of that much output? How do you determine if a result is statistically significant when you run 30,000+ tests on a data set with 159 subjects? You might very well have thought of these issues. But if not, I just wanted to raise them before you let macro loops run loose.
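For reference, the pass counts follow from the binomial coefficients: choosing k of n subjects gives C(n, k) passes, and summing over k = 1..n gives 2^n - 1. A quick check in Python (the helper name `passes_by_size` is made up for illustration):

```python
from math import comb


def passes_by_size(n):
    """Number of passes when changing k = 1..n of n subjects at a time."""
    return [comb(n, k) for k in range(1, n + 1)]


print(passes_by_size(6), sum(passes_by_size(6)))
# [6, 15, 20, 15, 6, 1] 63

print(sum(passes_by_size(15)))
# 32767
```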
That is a good point. I don't know that I will get up to that, but I will keep it in mind in case I am asked to look at more subjects at one time. Thank you for pointing that out!