Hello,
I have a dataset with repeating IDs and I want to keep them repeating under a certain condition.
So my table looks like this:
SAMPLEID | ResponseNumber | Item | Amount | Kcal. |
---|---|---|---|---|
1 | 1 | A | 55 | 100 |
1 | 1 | B | 66 | 200 |
1 | 1 | C | 77 | 300 |
1 | 2 | A | 55 | 150 |
1 | 2 | B | 66 | 250 |
1 | 2 | C | 77 | 350 |
1 | 2 | D | 88 | 450 |
2 | 1 | A | 33 | 60 |
2 | 1 | B | 44 | 70 |
2 | 1 | C | 55 | 80 |
2 | 2 | A | 22 | 50 |
2 | 2 | B | 33 | 100 |
2 | 2 | C | 44 | 150 |
And I would like it to be like this:
SAMPLEID | ResponseNumber | ItemA | itemB | ItemC | ItemD | Kcal |
---|---|---|---|---|---|---|
1 | 1 | 55 | 66 | 77 | 0 | 600 |
1 | 2 | 55 | 66 | 77 | 88 | 1200 |
2 | 1 | 33 | 44 | 55 | 0 | 210 |
2 | 2 | 22 | 33 | 44 | 0 | 300 |
etc... |
I used
proc freq data=old;
tables SAMPLEID*Item out=new;
weight Amount / zeros;
run;
(I used proc freq so that if a respondent got the same Item a few times, I get a sum of its amount.)
proc transpose data=new prefix=item out=transposed;
by SAMPLEID;
id Amount;
var count;
run;
What i got was transposed table that summarized all amounts for each item and dropped KCAL.. I think it gets summarized at the point of proc freq... that's why using LET after proc transpose will not change anything i guess..
What should I add/change to have the IDs repeating if ResponseNumber changes from 1 to 2..? Is there any way to mark/rename all the repeating IDs ?.. E.g. If ResponseNumber = 1 then ID changes from 1 to 1-1 , from 2 to 2-2 and so on... and if ResponseNumber = 2, then 1-2, 2-2, 3-2 and so on..
Thank you very much!
Best,
Anastasia
Why not just do the SUM and the TRANSPOSE in two different steps and merge the results?
data have ;
input SAMPLEID ResponseNumber Item $ Amount Kcal;
cards;
1 1 A 55 100
1 1 B 66 200
1 1 C 77 300
1 2 A 55 150
1 2 B 66 250
1 2 C 77 350
1 2 D 88 450
2 1 A 33 60
2 1 B 44 70
2 1 C 55 80
2 2 A 22 50
2 2 B 33 100
2 2 C 44 150
;;;;
proc transpose data=have prefix=item out = t1 ;
by sampleid responsenumber ;
id item ;
var amount ;
run;
proc summary data=have ;
by sampleid responsenumber ;
var kcal ;
output out=t2 sum=kcal ;
run;
data want ;
merge t1 t2 ;
drop _name_ _freq_ _type_;
by sampleid responsenumber ;
run;
Add responseNumber into your BY Statement.
proc transpose data=new prefix=item out=transposed;
by SAMPLEID ResponseNumber;
id Amount;
var count;
run;
Since its not quite a 'straight' transpose you may want to transpose using a data step instead:
SAS Learning Module: Reshaping data long to wide using the data step
a solution:
proc transpose data=t_a prefix=item out=t_b1;
by SAMPLEID ResponseNumber;
id item;
var amount;
run;
data t_b2(keep= SAMPLEID ResponseNumber zSum rename=(zSum=Kcal));
set t_a;
by SAMPLEID ResponseNumber;
if first.ResponseNumber then zSum=0;
zSum+kcal;
if last.ResponseNumber then output;
run;
/** if you want the null values to become zeros', use coalesce() function **/
data t_want(drop=i _name_);
merge t_b1(in=a) t_b2(in=b);
by SAMPLEID ResponseNumber;
ARRAY NN(*) _NUMERIC_;
DO I = 1 TO DIM(NN);
NN(I) = COALESCE(NN(I),0);
END;
run;
Why not just do the SUM and the TRANSPOSE in two different steps and merge the results?
data have ;
input SAMPLEID ResponseNumber Item $ Amount Kcal;
cards;
1 1 A 55 100
1 1 B 66 200
1 1 C 77 300
1 2 A 55 150
1 2 B 66 250
1 2 C 77 350
1 2 D 88 450
2 1 A 33 60
2 1 B 44 70
2 1 C 55 80
2 2 A 22 50
2 2 B 33 100
2 2 C 44 150
;;;;
proc transpose data=have prefix=item out = t1 ;
by sampleid responsenumber ;
id item ;
var amount ;
run;
proc summary data=have ;
by sampleid responsenumber ;
var kcal ;
output out=t2 sum=kcal ;
run;
data want ;
merge t1 t2 ;
drop _name_ _freq_ _type_;
by sampleid responsenumber ;
run;
Maybe a solution without transpose is worth a thought:
data have ;
input SAMPLEID ResponseNumber Item $ Amount Kcal;
cards;
1 1 A 55 100
1 1 B 66 200
1 1 C 77 300
1 2 A 55 150
1 2 B 66 250
1 2 C 77 350
1 2 D 88 450
2 1 A 33 60
2 1 B 44 70
2 1 C 55 80
2 2 A 22 50
2 2 B 33 100
2 2 C 44 150
;;;;
Data Want (Keep=Item: Total SAMPLEID ResponseNumber) ;
Array Arr[4] ItemA ItemB ItemC ItemD;
Do i=1 By 1 Until (Last.ResponseNumber);
Set have;
By SAMPLEID ResponseNumber;
Arr=Amount;
End;
Do j=1 To i;
Set Have;
By SAMPLEID ResponseNumber;
Total=Sum(Kcal,Total);
End;
Run;
Thanks everyone! Tom's code worked with a few modifications.. I had to use proc sort to make sure the program can read all the entries, proc summary (for items & amounts and another one for items & kcal), merge two summaries and then transpose!
Thanks a lot!
Tom, thank you very much!
Is there any way to mark/rename all the repeating IDs ?.. E.g. If ResponseNumber = 2 then ID changes from 1 to 1_2 and so on... ?
Thanks,
Anastasia
Do you mean just use both variables when generating the column name?
proc transpose data=have prefix=Response_ delim=_ out = t1 ;
by sampleid ;
id item responsenumber ;
var amount ;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.