Hi All,
First off, if this has already been answered, I apologize. I've been looking around the forums and have found a bunch of answers that are close but don't quite work for what I'm looking for. Here's what I've got:
data have;
input studyID FSA $ Exposure ED_binary ;
cards;
1 B2Y 384 1
1 B2Y 384 0
1 B2y 384 1
2 BgT 1000 0
3 M6D 400 1
3 M6D 400 1
3 M6D 400 1
run;
Essentially, what I am looking to do is get one record per person. I need to collapse by studyID, FSA, and Exposure, and then sum the ED_binary variable, so the resulting dataset would look like:
data want;
input studyID FSA $ Exposure ED_binary ;
cards;
1 B2Y 384 2
2 BgT 1000 0
3 M6D 400 3
run;
The closest I've come is with PROC SQL, but that doesn't give me a dataset, just the report. I've tried PROC MEANS as well, but my system just keeps hanging (I have a fairly large dataset).
As always, any thoughts would be much appreciated.
Thanks so much,
Rightcoast
Accepted Solutions
data have;
input studyID FSA $3. Exposure ED_binary ;
cards;
1 B2Y 384 1
1 B2Y 384 0
1 B2Y 384 1 was B2y
2 BgT 1000 0
3 M6D 400 1
3 M6D 400 1
3 M6D 400 1
run;
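data want (drop=cumsum);
  set have;
  by studyid fsa exposure;
  /* sum statement: cumsum is retained from row to row */
  cumsum+ed_binary;
  /* within a group, carry the running total; on the first row, restart it */
  if not first.exposure then ed_binary=cumsum;
  else cumsum=ed_binary;
  /* keep only the last row of each group, which now holds the group total */
  if last.exposure;
run;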
This assumes data are sorted by studyid fsa exposure. Note I corrected the value of FSA in the 3rd record.
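If the input is not already in that order, the BY statement in the DATA step will stop with an error, so a sort is needed first. A minimal sketch, assuming the unsorted data are in have; the output name have_sorted is only an illustrative choice, and the DATA step would then read from it with set have_sorted:

```sas
/* Sort by the same keys used in the BY statement of the DATA step. */
/* have_sorted is a hypothetical name; any output dataset name works. */
proc sort data=have out=have_sorted;
  by studyid fsa exposure;
run;
```

Character comparisons are case-sensitive, so B2Y and B2y would sort into separate groups, which is why the FSA value in the 3rd record matters.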
data have;
infile cards truncover;
input studyID FSA $ Exposure ED_binary ;
cards;
1 B2Y 384 1
1 B2Y 384 0
1 B2Y 384 1
2 BgT 1000 0
3 M6D 400 1
3 M6D 400 1
3 M6D 400 1
;
run;
proc sql;
create table want as
select studyid, fsa, exposure, sum(ed_binary) as ed_binary
from have
group by studyid, fsa, exposure;
quit;
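For comparison, the same grouping and sum can be sketched with PROC SUMMARY, which writes a dataset rather than a report. This is only an illustration under the assumption that the grouping keys have no missing values (CLASS drops those rows by default); the output name want_summary is hypothetical:

```sas
/* NWAY keeps only the rows for the full studyid*fsa*exposure combination. */
proc summary data=have nway;
  class studyid fsa exposure;
  var ed_binary;
  /* SUM= names the output variable; _type_ and _freq_ are dropped as unneeded. */
  output out=want_summary (drop=_type_ _freq_) sum=ed_binary;
run;
```

On a large dataset that is already sorted by the three keys, replacing the CLASS statement with a BY statement may use less memory, since each group is then processed on its own.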
Thanks so much, this one worked great as well!
Much appreciated.
Mike
This did the trick perfectly, and actually helped me to locate another little quirk in my dataset that needs to be fixed, so you were even more helpful than you thought!
Thanks so much
Mike