Hello, I am trying to create a new column of a count based on multiple conditions.
Current Code:
libname Example "~/my_courses/Homework/FinalHomework";
data Sugar;
length DistrictGroup $ 30;
infile '~/my_courses/Homework/FinalHomework/CaneData2.csv/' dsd firstobs=2;
input District $ DistrictGroup $ DistrictPosition $ SoilID SoilName $ Area Variety $ Ratoon $ Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97 Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
;
run;
data SugarLong;
set Sugar;
array mon{*} _numeric_;
do _n_=9 to dim(mon);
Month=vname(mon[_n_]);
Count=mon[_n_];
output;
end;
drop Month District DistrictPosition SoilID SoilName Area Variety Ratoon Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97
Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
run;
data SugarLongResult;
set SugarLong;
select;
when (Count > 0) Result='Yes';
otherwise Result='No';
end;
Proc print data=SugarLongResult (obs=10);
print;
My current output: DistrictGroup Count Result
123456789101112131415161718
Cairns/Mulgrave(dry) | 0.0 | No |
Cairns/Mulgrave(dry) | 6.0 | Yes |
Cairns/Mulgrave(dry) | 0.0 | No |
Cairns/Mulgrave(dry) | 308.0 | Yes |
Cairns/Mulgrave(dry) | 35.0 | Yes |
Cairns/Mulgrave(dry) | 353.5 | Yes |
Cairns/Mulgrave(dry) | 505.0 | Yes |
Cairns/Mulgrave(dry) | 476.0 | Yes |
Cairns/Mulgrave(dry) | 453.0 | Yes |
Cairns/Mulgrave(dry) | 0.0 | No |
Cairns/Mulgrave(dry) | 220.0 | Yes |
Cairns/Mulgrave(dry) | 127.0 | Yes |
Cairns/Mulgrave(dry) | 47.0 | Yes |
Cairns/Mulgrave(dry) | 48.0 | Yes |
Cairns/Mulgrave(dry) | 73.0 | Yes |
Cairns/Mulgrave(dry) | 66.0 | Yes |
Cairns/Mulgrave(dry) | 198.0 | Yes |
Cairns/Mulgrave(dry) | 222.0 | Yes |
What I'm looking to do is to get the data grouped by district group and yes/no (From above Data):
District Group Result CountFinal
Cains/Mulgrave(Dry) Yes 15
Cains/Mulgrave(Dry) No 3
All help is appreciated (:!
Proc Freq Code for your reference :
proc freq data=SugarLongResult noprint;
tables DistrictGroup * Result / out= want(keep=DistrictGroup Result count) ;
run;
Try a PROC FREQ instead of SQL for your last step.
@cassylovescats wrote:
Hello, I am trying to create a new column of a count based on multiple conditions.
Current Code:
libname Example "~/my_courses/Homework/FinalHomework";
data Sugar;
length DistrictGroup $ 30;
infile '~/my_courses/Homework/FinalHomework/CaneData2.csv/' dsd firstobs=2;
input District $ DistrictGroup $ DistrictPosition $ SoilID SoilName $ Area Variety $ Ratoon $ Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97 Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
;
run;data SugarLong;
set Sugar;
array mon{*} _numeric_;
do _n_=9 to dim(mon);
Month=vname(mon[_n_]);
Count=mon[_n_];
output;
end;
drop Month District DistrictPosition SoilID SoilName Area Variety Ratoon Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97
Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
run;data SugarLongResult;
set SugarLong;
select;
when (Count > 0) Result='Yes';
otherwise Result='No';
end;Proc print data=SugarLongResult (obs=10);
print;
My current output: DistrictGroup Count Result
123456789101112131415161718
Cairns/Mulgrave(dry) 0.0 No Cairns/Mulgrave(dry) 6.0 Yes Cairns/Mulgrave(dry) 0.0 No Cairns/Mulgrave(dry) 308.0 Yes Cairns/Mulgrave(dry) 35.0 Yes Cairns/Mulgrave(dry) 353.5 Yes Cairns/Mulgrave(dry) 505.0 Yes Cairns/Mulgrave(dry) 476.0 Yes Cairns/Mulgrave(dry) 453.0 Yes Cairns/Mulgrave(dry) 0.0 No Cairns/Mulgrave(dry) 220.0 Yes Cairns/Mulgrave(dry) 127.0 Yes Cairns/Mulgrave(dry) 47.0 Yes Cairns/Mulgrave(dry) 48.0 Yes Cairns/Mulgrave(dry) 73.0 Yes Cairns/Mulgrave(dry) 66.0 Yes Cairns/Mulgrave(dry) 198.0 Yes Cairns/Mulgrave(dry) 222.0 Yes What I'm looking to do is to get the data grouped by district group and yes/no (From above Data):
District Group Result CountFinal
Cains/Mulgrave(Dry) Yes 15
Cains/Mulgrave(Dry) No 3
All help is appreciated (:!
Proc Freq Code for your reference :
proc freq data=SugarLongResult noprint;
tables DistrictGroup * Result / out= want(keep=DistrictGroup Result count) ;
run;
Would I be able to run chi-square off this approach?
Oh my goodness, sorry, my brain is fried, I've been working on this code for about 6 hours today.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.