Calculating Difference Scores

joebacon · Posted 04-02-2020 12:20 PM

Hi all,

I got asked what I thought was a relatively easy question today, but after getting into it, I couldn't answer it on my own. I am turning to you all as I am sure that someone else has dealt with something similar. I am trying to create difference scores, but all I have to go off of is a screenshot of an SPSS dataset as they say they can't share the whole set with me. I do recognize that I am in a SAS forum, but it's mostly about the logic behind the problem. I also recognize that we do not generally like pictures but rather the data itself. Again, I apologize but am working with what I have.

I have edited the picture for clarity. I'm trying to create a new variable that takes a participant's value for Stimulation at time 0 from the alcohol session minus the value for Stimulation at time 0 from the placebo session (i.e., AmPstim: Alcohol minus Placebo for stimulation; the two cells highlighted in red in the picture above). However, because the two values are on different rows, it won't let me create a difference score. My suggestion was to switch to wide format, create the difference score, and then switch back, but I was wondering if anyone knew a way to do this with syntax while keeping the data in long format (SPSS would be preferred, but I can pretty easily translate SAS to SPSS). Alternatively, I would love any lessons on how others would approach this type of problem in the future, or on why one way would be better than another, as I cannot wrap my head around the logic needed to program it.

Thank you in advance and I apologize for not adhering to all of the rules.

joebacon · Posted 04-02-2020 12:24 PM

By "know of a way to do this with syntax", I simply mean "to keep it in long format and still do the same thing". I find the logic quite tricky, but it seems like it would need a complex If-Then or a "when" statement. I defer to you experts.

Reeza · Posted 04-02-2020 12:40 PM

I may be missing it in your message, but what do you want as output?

joebacon · Posted 04-02-2020 12:43 PM

The difference score of Astimulation_tot (when order= alcohol) - Pstimulation_tot (when order= placebo) for each of the assessment_r that they share for each participant.

I hope that clarifies it a bit more.

Reeza · Posted 04-02-2020 01:31 PM

Where would that go in the screenshot?

joebacon · Posted 04-02-2020 01:38 PM

It would be a new variable called "AmPStim".

Reeza · Posted 04-02-2020 01:40 PM

In SAS I would merge the data with itself, rather than make it a single row and then do the subtraction. You can do something similar in SPSS: merge, do the subtraction to create a new variable, and keep only the results.
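As a rough illustration of what "merging the data with itself" could look like, here is a minimal sketch. It assumes the long data set is called HAVE, that study_id identifies the participant, that assessment_r is the time point, and that order is a character variable holding 'alcohol' or 'placebo'. None of that is confirmed by the screenshot, so adjust the names and values to match the real file (in SPSS, order may well be a numeric code with value labels).

```sas
/* Assumed names: HAVE, study_id, assessment_r, order,             */
/* Astimulation_tot, Pstimulation_tot. ALC, PLC, DIFF, HAVE_SORTED  */
/* and WANT are hypothetical work data sets.                        */

/* Split the file into one alcohol copy and one placebo copy. */
proc sort data=have out=alc (keep=study_id assessment_r Astimulation_tot);
   where order = 'alcohol';
   by study_id assessment_r;
run;

proc sort data=have out=plc (keep=study_id assessment_r Pstimulation_tot);
   where order = 'placebo';
   by study_id assessment_r;
run;

/* Merge the two copies so both values sit on one row, then subtract. */
data diff;
   merge alc (in=in_alc) plc (in=in_plc);
   by study_id assessment_r;
   if in_alc and in_plc;   /* keep only time points present in both sessions */
   AmPStim = Astimulation_tot - Pstimulation_tot;
   keep study_id assessment_r AmPStim;
run;

/* Optional: attach the difference back onto the long file. */
proc sort data=have out=have_sorted;
   by study_id assessment_r;
run;

data want;
   merge have_sorted diff;
   by study_id assessment_r;
run;
```

The last step is a one-to-many merge, so the same AmPStim value lands on both the alcohol row and the placebo row for a given participant and time point. If only the difference scores are needed, DIFF is already the result. In SPSS the same idea would presumably be expressed with MATCH FILES and a BY subcommand.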

joebacon · Posted 04-02-2020 01:44 PM

Can you explain what "merge the data with itself" means? I am not familiar with that concept. I was under the impression you could only merge two (or more) different datasets.

ballardw · Posted 04-02-2020 01:36 PM

From the values in the data set, how do we determine which value is the "value for Stimulation at time 0 from the alcohol session" and which is the "Stimulation at time 0 from the placebo session" (not the order in a picture, but a rule, a set of variable values, or similar)?

You do not have any variable labeled "time" so this is somewhat important.

How do we identify "participant"? What role does drinking_session play? (I suspect that this has some correspondence to time 0 but you didn't tell us that).

Second, what does the desired output look like?

This is some pseudo code that places a difference on what I think is the row with the placebo value.

data want;
   set have;
   by study_id drinking_session;
   /* carry the alcohol value forward across rows */
   retain alc;
   /* first row of each participant: assumed to be the alcohol session */
   if first.study_id then alc = youralcoholvariable;
   /* first row of session 2: assumed to hold the placebo value */
   if first.drinking_session and drinking_session = 2
      then difference = alc - yourplacebovariable;
run;

This assumes the data is sorted by study_id and drinking_session, and that the values of interest appear on the first record of each drinking session.

It has been a long time since I did anything with SPSS and do not remember an equivalent to the SAS First and Last processing. Good luck.

joebacon · Posted 04-02-2020 01:54 PM

I apologize. I was evidently not very clear judging by the confusion regarding this. Still, thank you for your help!

To answer some of your questions, in case someone else comes along and wants this information:

@ballardw wrote:
From values in the data set how do we determine which value is the "value for Stimulation at time 0 from the alcohol session" and "Stimulation at time 0 from the placebo session" (not order in a picture but a rule, set of variable values or similar)

These should be the variables highlighted in the red boxes. The output that I desire is Astimulation_tot (when order= alcohol) - Pstimulation_tot (when order= placebo) for each of the assessment_r that they share for each participant.

@ballardw wrote:
You do not have any variable labeled "time" so this is somewhat important.
How do we identify "participant"? What role does drinking_session play? (I suspect that this has some correspondence to time 0 but you didn't tell us that).

That was one of my comments as well: "time" is really important here. I believe the "time" variable is "assessment_r", as it ranges from 0 to 11 and repeats this pattern for order=placebo and order=alcohol.

The participant is based on the "study_id" which is unique for each participant. Drinking_session is only ever 1 or 2. Some participants did alcohol first and some did placebo. Again, this is how I interpreted it as I got roughly the same description you got, but I do have the ability to ask him some questions.

@ballardw wrote:
Second is what does the desired output look like.

This is some pseudo code that places a difference on what I think is the row with the placebo value.
data want;
   set have;
   retain alc ;
   by study_id drinking_session;
   if first.study_id then alc=youralcoholvariable;
   if first.drinking_session and drinking_session=2 
      then difference = alc -yourplacebovariable;
run;
This assumes the data is sorted by study_id and drinking_session, and that the values of interest appear on the first record of each drinking session.

It has been a long time since I did anything with SPSS and do not remember an equivalent to the SAS First and Last processing. Good luck.

The desired output would be a new variable called "AmPStim" that would hold the difference scores explained above. For instance, the first few would be 8 - 0 = 8; 54 - 3 = 51; 36 - 2 = 34; 22 - 0 = 22.

The data is sorted by study_id.

I appreciate you taking the time to write all this out for me! I can look up the first and last processing for SPSS as it has been a while for me as well.
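For anyone who wants to stay in long format, the first few differences listed above can be reproduced with BY-group processing. This is only a sketch under stated assumptions: the toy rows reuse the four value pairs from this post, but the layout is guessed (assessment_r 0 to 3, a single study_id of 1, order stored as the text 'alcohol'/'placebo', and each score missing on the other session's rows). The work variables alc and plc and the data set names are made up for illustration.

```sas
/* Toy data: the values 8/54/36/22 and 0/3/2/0 come from the post; */
/* everything else about the layout is an assumption.              */
data have;
   input study_id drinking_session order :$8. assessment_r
         Astimulation_tot Pstimulation_tot;
   datalines;
1 1 alcohol 0 8 .
1 1 alcohol 1 54 .
1 1 alcohol 2 36 .
1 1 alcohol 3 22 .
1 2 placebo 0 . 0
1 2 placebo 1 . 3
1 2 placebo 2 . 2
1 2 placebo 3 . 0
;
run;

/* Put the two sessions for each time point next to each other. */
proc sort data=have out=sorted;
   by study_id assessment_r;
run;

data want;
   /* Pass 1: collect both values for this participant and time point. */
   do until (last.assessment_r);
      set sorted;
      by study_id assessment_r;
      if order = 'alcohol' then alc = Astimulation_tot;
      else if order = 'placebo' then plc = Pstimulation_tot;
   end;
   AmPStim = alc - plc;
   /* Pass 2: re-read the same rows and write them out with AmPStim. */
   do until (last.assessment_r);
      set sorted;
      by study_id assessment_r;
      output;
   end;
   drop alc plc;
run;

proc print data=want noobs;
run;
```

Because the values are matched on order rather than on drinking_session, the sketch should not depend on which session a participant completed first. With the toy rows, AmPStim comes out as 8, 51, 34 and 22 for assessment_r 0 through 3, repeated on both the alcohol and the placebo row.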
