BookmarkSubscribeRSS Feed
CathyVI
Pyrite | Level 9

I created the following synthetic data.

data test;

input Name $ run squat jump pushup;

cards;

Bob 1 0 0 1

Mat 0 1 0 1

Cat 1 0 1 1

Dog 1 0 0 0

Fred 0 1 1 1

Sam 0 0 1 0

Pat 0 0 0 1

Jack 1 1 1 1

Bat 0 0 0 0

;

run;

 

I have the following table to fill

 One exerciseMore than one exercise
Run  
Pushup  
Jump  
Squat  

I want to know within my data when someone only do anyone of the exercise or if they did multiple exrcise. For example: When bob ran&pushup- he cannot be in one exercise column but dog can because he only ran. After which I will find the sum for each exercise.

I thought maybe I need to change my data from long to wide but have no idea of what to do next.

 

9 REPLIES 9
Reeza
Super User
This would be infinitely easier if you show a fully worked example. I have a strong suspicion of what you want but I'd be guessing. Please provide sample data and sample output that aligns, preferably using the same terms here. So given the sample data, what would you expect to see in the output? And is this representative of your data or do you also need to handle multiple days?
CathyVI
Pyrite | Level 9

@Reeza If I understand you correctly the dataset I created represented how my real data looks-like. The output should be in percentage, so my output should be similar to this table below;

 One exercise (%)Multiple exercises(%)
jump  100 (20.00)800(79.11)
run  212(24.26)534(66.74

If you look at my previous dataset, in some observation two exercises e.g. jump and run have values whereas only jump has a value in another observation. I want to know if jump was the only exercise done or if there were two exercises-what is the counts. One thing am not sure about is if the two output will be mutually exclusive or not. I hope this is helpful.

Reeza
Super User
How did you get those numbers from the table above? It's hard to try and align the example to the output when the numbers don't align.
CathyVI
Pyrite | Level 9

@Reeza @PhilC 

The numbers in the output are made up. I tried usgin this code but my output is not what I wanted. My desired output is when fent is 1 and other drugs are 0, and  when fent is 1 and other drugs is 1.

data test2;

set test;

where fent=1;

if fent=1 then do;

if oxy=0 or hydro=0 or codeine =0 then single=1;

else if oxy=1 or hydro=1 or codeine =1 then single =2;

end;

run;

PhilC
Rhodochrosite | Level 12

I have can produce this using 1 Data step, two PROC TRANSPOSEs, and a PROC TABULATE.  This seems too hard.  Why is your synthetic data structured this way?  Can you choose another structure?

PhilC
Rhodochrosite | Level 12
Proc format;
  value OneExercise
    0 = "More than one exercise"
    1 = "One exercise"
    ;
data test;
 input Name $ run squat jump pushup;
 array exercise[4] run--pushup;
 OneExercise=sum(of run--pushup)=1;
 format OneExercise OneExercise.;
cards;
Bob 1 0 0 1
Mat 0 1 0 1
Cat 1 0 1 1
Dog 1 0 0 0
Fred 0 1 1 1
Sam 0 0 1 0
Pat 0 0 0 1
Jack 1 1 1 1
Bat 0 0 0 0
;
PROC SORT	;
	BY OneExercise;
RUN;
PROC TRANSPOSE DATA=test 
	OUT=Transpose1
	PREFIX=Count
	NAME=Exercise
;
	BY OneExercise;
	VAR run squat jump pushup;
PROC SORT data=Transpose1	;
	BY OneExercise Exercise;
PROC TRANSPOSE DATA=Transpose1
	OUT=Transpose2 (drop=_NAME_)
	PREFIX=Count
;
	BY OneExercise Exercise;
	VAR Count:;
PROC TABULATE DATA=Transpose2	 	;
	VAR Count1;
	CLASS OneExercise /	ORDER=UNFORMATTED MISSING;
	CLASS Exercise /	ORDER=UNFORMATTED MISSING;
	TABLE /* Row Dimension */
        Exercise =" ",
        /* Column Dimension */
        OneExercise=" "*
          Count1=" "*
            Sum=" "*f=3.0		;
RUN;
  More than one exercise One exercise
jump 3 1
pushup 5 1
run 3 1
squat 3 0

 

 

Tom
Super User Tom
Super User

How could you have a count of 100 from only 9 input observations.  Show the report you want for the example data you posted.

CathyVI
Pyrite | Level 9

@Tom Please ignore the initial output table. I made a mistake.

Using the data above, I trried the following code it seems to work but I need a second opnion to see if it work on a large dataset;

data test2;

set test;

where fent=1;

if fent=1 then do;

if sum(oxy,hydro,codeine)=0 then F_drug=1;

else if sum(oxy,hydro,codeine)>0 then F_drug =2;

end;

run;

proc freq data=test2;

table F_drug;

run;

Tom
Super User Tom
Super User

That doesn't seem to have anything to do with your original suggested output.  Looks like you are just classifying each observation as to whether any of the list of variables is 1 or not.  It is probably easier to just code that new variable using the same 0/1 pattern as your used for the original variables.  

data test1;
  set test;
  any = max(of  run squat jump pushup) ;
run;
proc freq ;
 tables any;
run;

Use the OF keyword in the MAX() function (or any function that accepts multiple inputs) will allow you to use variable lists.  So this would also work for your example since those are the only numeric variables in the dataset.

any = max(of _numeric_) ;

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1242 views
  • 0 likes
  • 4 in conversation