Hello, I am trying to create a new column of a count based on multiple conditions.
Current Code:
libname Example "~/my_courses/Homework/FinalHomework";
data Sugar;
length DistrictGroup $ 30;
infile '~/my_courses/Homework/FinalHomework/CaneData2.csv/' dsd firstobs=2;
input District $ DistrictGroup $ DistrictPosition $ SoilID SoilName $ Area Variety $ Ratoon $ Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97 Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
;
run;
data SugarLong;
set Sugar;
array mon{*} _numeric_;
do _n_=9 to dim(mon);
Month=vname(mon[_n_]);
Count=mon[_n_];
output;
end;
drop Month District DistrictPosition SoilID SoilName Area Variety Ratoon Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97
Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
run;
data SugarLongResult;
set SugarLong;
select;
when (Count > 0) Result='Yes';
otherwise Result='No';
end;
Proc print data=SugarLongResult (obs=10);
print;
My current output: DistrictGroup Count Result
123456789101112131415161718
| Cairns/Mulgrave(dry) | 0.0 | No |
| Cairns/Mulgrave(dry) | 6.0 | Yes |
| Cairns/Mulgrave(dry) | 0.0 | No |
| Cairns/Mulgrave(dry) | 308.0 | Yes |
| Cairns/Mulgrave(dry) | 35.0 | Yes |
| Cairns/Mulgrave(dry) | 353.5 | Yes |
| Cairns/Mulgrave(dry) | 505.0 | Yes |
| Cairns/Mulgrave(dry) | 476.0 | Yes |
| Cairns/Mulgrave(dry) | 453.0 | Yes |
| Cairns/Mulgrave(dry) | 0.0 | No |
| Cairns/Mulgrave(dry) | 220.0 | Yes |
| Cairns/Mulgrave(dry) | 127.0 | Yes |
| Cairns/Mulgrave(dry) | 47.0 | Yes |
| Cairns/Mulgrave(dry) | 48.0 | Yes |
| Cairns/Mulgrave(dry) | 73.0 | Yes |
| Cairns/Mulgrave(dry) | 66.0 | Yes |
| Cairns/Mulgrave(dry) | 198.0 | Yes |
| Cairns/Mulgrave(dry) | 222.0 | Yes |
What I'm looking to do is to get the data grouped by district group and yes/no (From above Data):
District Group Result CountFinal
Cains/Mulgrave(Dry) Yes 15
Cains/Mulgrave(Dry) No 3
All help is appreciated (:!
Proc Freq Code for your reference :
proc freq data=SugarLongResult noprint;
tables DistrictGroup * Result / out= want(keep=DistrictGroup Result count) ;
run;
Try a PROC FREQ instead of SQL for your last step.
@cassylovescats wrote:
Hello, I am trying to create a new column of a count based on multiple conditions.
Current Code:
libname Example "~/my_courses/Homework/FinalHomework";
data Sugar;
length DistrictGroup $ 30;
infile '~/my_courses/Homework/FinalHomework/CaneData2.csv/' dsd firstobs=2;
input District $ DistrictGroup $ DistrictPosition $ SoilID SoilName $ Area Variety $ Ratoon $ Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97 Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
;
run;data SugarLong;
set Sugar;
array mon{*} _numeric_;
do _n_=9 to dim(mon);
Month=vname(mon[_n_]);
Count=mon[_n_];
output;
end;
drop Month District DistrictPosition SoilID SoilName Area Variety Ratoon Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97
Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
run;data SugarLongResult;
set SugarLong;
select;
when (Count > 0) Result='Yes';
otherwise Result='No';
end;Proc print data=SugarLongResult (obs=10);
print;
My current output: DistrictGroup Count Result
123456789101112131415161718
Cairns/Mulgrave(dry) 0.0 No Cairns/Mulgrave(dry) 6.0 Yes Cairns/Mulgrave(dry) 0.0 No Cairns/Mulgrave(dry) 308.0 Yes Cairns/Mulgrave(dry) 35.0 Yes Cairns/Mulgrave(dry) 353.5 Yes Cairns/Mulgrave(dry) 505.0 Yes Cairns/Mulgrave(dry) 476.0 Yes Cairns/Mulgrave(dry) 453.0 Yes Cairns/Mulgrave(dry) 0.0 No Cairns/Mulgrave(dry) 220.0 Yes Cairns/Mulgrave(dry) 127.0 Yes Cairns/Mulgrave(dry) 47.0 Yes Cairns/Mulgrave(dry) 48.0 Yes Cairns/Mulgrave(dry) 73.0 Yes Cairns/Mulgrave(dry) 66.0 Yes Cairns/Mulgrave(dry) 198.0 Yes Cairns/Mulgrave(dry) 222.0 Yes What I'm looking to do is to get the data grouped by district group and yes/no (From above Data):
District Group Result CountFinal
Cains/Mulgrave(Dry) Yes 15
Cains/Mulgrave(Dry) No 3
All help is appreciated (:!
Proc Freq Code for your reference :
proc freq data=SugarLongResult noprint;
tables DistrictGroup * Result / out= want(keep=DistrictGroup Result count) ;
run;
Would I be able to run chi-square off this approach?
Oh my goodness, sorry, my brain is fried, I've been working on this code for about 6 hours today.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Get started using SAS Studio to write, run and debug your SAS programs.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.