Hello,
I have a dataset with repeating IDs and I want to keep them repeating under a certain condition.
So my table looks like this:
SAMPLEID | ResponseNumber | Item | Amount | Kcal. |
---|---|---|---|---|
1 | 1 | A | 55 | 100 |
1 | 1 | B | 66 | 200 |
1 | 1 | C | 77 | 300 |
1 | 2 | A | 55 | 150 |
1 | 2 | B | 66 | 250 |
1 | 2 | C | 77 | 350 |
1 | 2 | D | 88 | 450 |
2 | 1 | A | 33 | 60 |
2 | 1 | B | 44 | 70 |
2 | 1 | C | 55 | 80 |
2 | 2 | A | 22 | 50 |
2 | 2 | B | 33 | 100 |
2 | 2 | C | 44 | 150 |
And I would like it to be like this:
SAMPLEID | ResponseNumber | ItemA | itemB | ItemC | ItemD | Kcal |
---|---|---|---|---|---|---|
1 | 1 | 55 | 66 | 77 | 0 | 600 |
1 | 2 | 55 | 66 | 77 | 88 | 1200 |
2 | 1 | 33 | 44 | 55 | 0 | 210 |
2 | 2 | 22 | 33 | 44 | 0 | 300 |
etc... |
I used
proc freq data=old;
tables SAMPLEID*Item out=new;
weight Amount / zeros;
run;
(I used proc freq so that if a respondent got the same Item a few times, I get a sum of its amount.)
proc transpose data=new prefix=item out=transposed;
by SAMPLEID;
id Amount;
var count;
run;
What i got was transposed table that summarized all amounts for each item and dropped KCAL.. I think it gets summarized at the point of proc freq... that's why using LET after proc transpose will not change anything i guess..
What should I add/change to have the IDs repeating if ResponseNumber changes from 1 to 2..? Is there any way to mark/rename all the repeating IDs ?.. E.g. If ResponseNumber = 1 then ID changes from 1 to 1-1 , from 2 to 2-2 and so on... and if ResponseNumber = 2, then 1-2, 2-2, 3-2 and so on..
Thank you very much!
Best,
Anastasia
Why not just do the SUM and the TRANSPOSE in two different steps and merge the results?
data have ;
input SAMPLEID ResponseNumber Item $ Amount Kcal;
cards;
1 1 A 55 100
1 1 B 66 200
1 1 C 77 300
1 2 A 55 150
1 2 B 66 250
1 2 C 77 350
1 2 D 88 450
2 1 A 33 60
2 1 B 44 70
2 1 C 55 80
2 2 A 22 50
2 2 B 33 100
2 2 C 44 150
;;;;
proc transpose data=have prefix=item out = t1 ;
by sampleid responsenumber ;
id item ;
var amount ;
run;
proc summary data=have ;
by sampleid responsenumber ;
var kcal ;
output out=t2 sum=kcal ;
run;
data want ;
merge t1 t2 ;
drop _name_ _freq_ _type_;
by sampleid responsenumber ;
run;
Add responseNumber into your BY Statement.
proc transpose data=new prefix=item out=transposed;
by SAMPLEID ResponseNumber;
id Amount;
var count;
run;
Since its not quite a 'straight' transpose you may want to transpose using a data step instead:
SAS Learning Module: Reshaping data long to wide using the data step
a solution:
proc transpose data=t_a prefix=item out=t_b1;
by SAMPLEID ResponseNumber;
id item;
var amount;
run;
data t_b2(keep= SAMPLEID ResponseNumber zSum rename=(zSum=Kcal));
set t_a;
by SAMPLEID ResponseNumber;
if first.ResponseNumber then zSum=0;
zSum+kcal;
if last.ResponseNumber then output;
run;
/** if you want the null values to become zeros', use coalesce() function **/
data t_want(drop=i _name_);
merge t_b1(in=a) t_b2(in=b);
by SAMPLEID ResponseNumber;
ARRAY NN(*) _NUMERIC_;
DO I = 1 TO DIM(NN);
NN(I) = COALESCE(NN(I),0);
END;
run;
Why not just do the SUM and the TRANSPOSE in two different steps and merge the results?
data have ;
input SAMPLEID ResponseNumber Item $ Amount Kcal;
cards;
1 1 A 55 100
1 1 B 66 200
1 1 C 77 300
1 2 A 55 150
1 2 B 66 250
1 2 C 77 350
1 2 D 88 450
2 1 A 33 60
2 1 B 44 70
2 1 C 55 80
2 2 A 22 50
2 2 B 33 100
2 2 C 44 150
;;;;
proc transpose data=have prefix=item out = t1 ;
by sampleid responsenumber ;
id item ;
var amount ;
run;
proc summary data=have ;
by sampleid responsenumber ;
var kcal ;
output out=t2 sum=kcal ;
run;
data want ;
merge t1 t2 ;
drop _name_ _freq_ _type_;
by sampleid responsenumber ;
run;
Maybe a solution without transpose is worth a thought:
data have ;
input SAMPLEID ResponseNumber Item $ Amount Kcal;
cards;
1 1 A 55 100
1 1 B 66 200
1 1 C 77 300
1 2 A 55 150
1 2 B 66 250
1 2 C 77 350
1 2 D 88 450
2 1 A 33 60
2 1 B 44 70
2 1 C 55 80
2 2 A 22 50
2 2 B 33 100
2 2 C 44 150
;;;;
Data Want (Keep=Item: Total SAMPLEID ResponseNumber) ;
Array Arr[4] ItemA ItemB ItemC ItemD;
Do i=1 By 1 Until (Last.ResponseNumber);
Set have;
By SAMPLEID ResponseNumber;
Arr=Amount;
End;
Do j=1 To i;
Set Have;
By SAMPLEID ResponseNumber;
Total=Sum(Kcal,Total);
End;
Run;
Thanks everyone! Tom's code worked with a few modifications.. I had to use proc sort to make sure the program can read all the entries, proc summary (for items & amounts and another one for items & kcal), merge two summaries and then transpose!
Thanks a lot!
Tom, thank you very much!
Is there any way to mark/rename all the repeating IDs ?.. E.g. If ResponseNumber = 2 then ID changes from 1 to 1_2 and so on... ?
Thanks,
Anastasia
Do you mean just use both variables when generating the column name?
proc transpose data=have prefix=Response_ delim=_ out = t1 ;
by sampleid ;
id item responsenumber ;
var amount ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.