joebacon
Pyrite | Level 9

Hi all,

I got asked what I thought was a relatively easy question today, but after getting into it, I couldn't answer it on my own. I am turning to you all as I am sure that someone else has dealt with something similar. I am trying to create difference scores, but all I have to go off of is a screenshot of an SPSS dataset as they say they can't share the whole set with me. I do recognize that I am in a SAS forum, but it's mostly about the logic behind the problem. I also recognize that we do not generally like pictures but rather the data itself. Again, I apologize but am working with what I have.

 

I have edited the picture for clarity. I'm trying to create a new variable that takes a participant's value for Stimulation at time 0 from the alcohol session minus the value for Stimulation at time 0 from the placebo session (i.e., AmPstim; Alcohol minus Placebo for stimulation; two cells highlighted red in picture above). However, because each value is on a different row, it won't let me create a difference score. My suggestion was to switch to wide format to create the difference score and then switch back, but I was wondering if anyone knew a way to do this with syntax while keeping it in long format (SPSS would be preferred, but I can pretty easily translate SAS to SPSS). Alternatively, I would love any lessons on how others would proceed with this type of problem in the future, or any explanation of why one way would be better, as I cannot wrap my head around the logic to program it.


Thank you in advance and I apologize for not adhering to all of the rules.

SPSS_EDGE.png

joebacon
Pyrite | Level 9
By "know of a way to do this with syntax", I simply mean "to keep it in long format and still do the same thing". I find the logic quite tricky, but it seems like a complex If-Then or a "when" statement. I defer to you experts.
Reeza
Super User

I may be missing it in your message, but what do you want as output?

joebacon
Pyrite | Level 9
The difference score of Astimulation_tot (when order= alcohol) - Pstimulation_tot (when order= placebo) for each of the assessment_r that they share for each participant.

I hope that clarifies it a bit more.
Reeza
Super User

Where would that go in the screenshot?

joebacon
Pyrite | Level 9
It would be a new variable called "AmPStim".
Reeza
Super User

In SAS I would merge the data with itself, rather than make it a single row and then do the subtraction. You can do something similar in SPSS: merge, create the new variable by subtraction, and keep only the results.
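A minimal sketch of what "merging the data with itself" could look like in SAS, written as a PROC SQL self-join. It rests on assumptions not confirmed in the thread: the input table is called have, order is a character variable holding 'alcohol' and 'placebo', assessment_r is the time point, and there is exactly one row per study_id, order, and assessment_r. The output name want and the aliases h, a, and p are hypothetical.

```sas
/* Sketch only: variable names follow the thread, but the values of */
/* ORDER ('alcohol'/'placebo') and the table name HAVE are assumed. */
proc sql;
   create table want as
   select h.*,
          /* alcohol value minus placebo value at the same time point */
          a.Astimulation_tot - p.Pstimulation_tot as AmPStim
   from have as h
        /* second copy of the same table: alcohol rows only */
        left join have(where=(order='alcohol')) as a
           on  h.study_id     = a.study_id
           and h.assessment_r = a.assessment_r
        /* third copy of the same table: placebo rows only */
        left join have(where=(order='placebo')) as p
           on  h.study_id     = p.study_id
           and h.assessment_r = p.assessment_r
   order by h.study_id, h.drinking_session, h.assessment_r;
quit;
```

The data stay in long format, and every row for a participant and time point receives the same AmPStim value. If either session is missing for a time point, AmPStim is missing for that time point.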

 

 

joebacon
Pyrite | Level 9
Can you explain what "merge the data with itself" means? I am not familiar with that concept. I was under the impression you could only merge two (or more) different datasets.
ballardw
Super User

From values in the data set, how do we determine which value is the "value for Stimulation at time 0 from the alcohol session" and which is the "Stimulation at time 0 from the placebo session" (not order in a picture but a rule, set of variable values or similar)?

 

You do not have any variable labeled "time" so this is somewhat important.

How do we identify "participant"? What role does drinking_session play? (I suspect that this has some correspondence to time 0 but you didn't tell us that).

 

Second, what does the desired output look like?

 

This is some pseudo code that places a difference on what I think is the row with the placebo value.

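data want;
   set have;
   /* keep the alcohol value across rows until it is needed */
   retain alc ;
   /* BY processing requires data sorted by study_id drinking_session */
   by study_id drinking_session;
   /* youralcoholvariable and yourplacebovariable are placeholders */
   if first.study_id then alc=youralcoholvariable;
   /* on the first row of session 2, subtract the placebo value */
   if first.drinking_session and drinking_session=2 
      then difference = alc -yourplacebovariable;
run;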

This assumes the data is sorted by study_id drinking_session and that the values of the variables of interest appear on the first record of each drinking session.
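A hedged variant of the same RETAIN idea that does not depend on which session came first and that covers every time point rather than only the first record. It is a sketch under assumptions: order is a character variable with the values 'alcohol' and 'placebo', assessment_r identifies the time point, and each participant has at most one row per order and assessment_r. The data set names sorted and want and the variable alc are hypothetical.

```sas
/* Sort so that, within each participant and time point, the alcohol */
/* row precedes the placebo row ('alcohol' sorts before 'placebo').  */
proc sort data=have out=sorted;
   by study_id assessment_r order;
run;

data want;
   set sorted;
   by study_id assessment_r;
   retain alc;
   /* reset at each new time point so values never carry over */
   if first.assessment_r then alc = .;
   /* remember the alcohol value for this time point */
   if order = 'alcohol' then alc = Astimulation_tot;
   /* on the placebo row, compute alcohol minus placebo */
   else if order = 'placebo' then AmPStim = alc - Pstimulation_tot;
   drop alc;
run;
```

As in the pseudo code above, the difference lands on the placebo row only. AmPStim stays missing on alcohol rows and on placebo rows with no matching alcohol row. The output is ordered by study_id, assessment_r, and order, so re-sort if the original row order matters.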

 

It has been a long time since I did anything with SPSS and do not remember an equivalent to the SAS First and Last processing. Good luck.

joebacon
Pyrite | Level 9

I apologize. I was evidently not very clear judging by the confusion regarding this. Still, thank you for your help!

 

To answer some of your questions, in case someone else comes along and wants this information:


@ballardw wrote:

From values in the data set, how do we determine which value is the "value for Stimulation at time 0 from the alcohol session" and which is the "Stimulation at time 0 from the placebo session" (not order in a picture but a rule, set of variable values or similar)?

 


These should be the variables highlighted in the red boxes. The output that I want is Astimulation_tot (when order = alcohol) minus Pstimulation_tot (when order = placebo), matched on assessment_r within each participant.

 


@ballardw wrote:

You do not have any variable labeled "time" so this is somewhat important.

How do we identify "participant"? What role does drinking_session play? (I suspect that this has some correspondence to time 0 but you didn't tell us that).

 


That was one of the comments I had as well: "time" is really important here. I believe the "time" variable is "assessment_r", as it ranges from 0 to 11 and repeats this pattern when order=placebo and when order=alcohol.

 

The participant is based on the "study_id" which is unique for each participant. Drinking_session is only ever 1 or 2. Some participants did alcohol first and some did placebo. Again, this is how I interpreted it as I got roughly the same description you got, but I do have the ability to ask him some questions. 

 


@ballardw wrote:

Second, what does the desired output look like?

 

This is some pseudo code that places a difference on what I think is the row with the placebo value.

data want;
   set have;
   retain alc ;
   by study_id drinking_session;
   if first.study_id then alc=youralcoholvariable;
   if first.drinking_session and drinking_session=2 
      then difference = alc -yourplacebovariable;
run;

This assumes the data is sorted by study_id drinking_session and that the values of the variables of interest appear on the first record of each drinking session.

 

It has been a long time since I did anything with SPSS and do not remember an equivalent to the SAS First and Last processing. Good luck.


The desired output would be a new variable called "AmPStim" that would be the difference scores explained above. For instance, the first few would be 8 - 0 = 8; 54 - 3 = 51; 36 - 2 = 34; 22 - 0 = 22.

 

The data is sorted by study_id.

 

I appreciate you taking the time to write all this out for me! I can look up the first and last processing for SPSS as it has been a while for me as well.

