Hello,
This is my current code:
libname Example "~/my_courses/Homework/FinalHomework";
data Sugar;
length DistrictGroup $ 30;
infile '~/my_courses/Homework/FinalHomework/CaneData2.csv/' dsd firstobs=2;
input District $ DistrictGroup $ DistrictPosition $ SoilID SoilName $ Area Variety $ Ratoon $ Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97 Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
;
run;
data SugarLong;
set Sugar;
array mon{*} _numeric_;
do _n_=9 to dim(mon);
Month=vname(mon[_n_]);
Count=mon[_n_];
output;
end;
drop Month DistrictGroup SoilID SoilName Area Variety Ratoon Age HarvestMonth HarvestDuration TonnHect Fibre Sugar Jul96 Aug96 Sep96 Oct96 Nov96 Dec96 Jan97 Feb97 Mar97 Apr97 May97
Jun97 Jul97 Aug97 Sep97 Oct97 Nov97 Dec97;
run;
data SugarLongResult;
set SugarLong;
select;
when (Count > 0) Result='Yes';
otherwise Result='No';
end;
Proc print data=SugarLongResult (obs=50);
var District DistrictPosition Result;
run;
proc sort data=SugarLongResult out=SugarLongResultDupe NODUPKEY;
by District DistrictPosition Result;
run;
proc freq data=SugarLongResultDupe;
tables DistrictPosition* Result / out=SugarLongResultFinal;
run;
proc print data=SugarLongResultFinal;
run;
There are only 15 District options, but the counts in the output currently add up to 20. The problem seems to be with S, W, and C: it looks like the count of No is being included in both No and Yes.
The correct numbers are below:
DistrictPosition | Result | Count |
N | Yes | 2 |
N | No | 0 |
E | Yes | 2 |
E | No | 0 |
S | Yes | 0 |
S | No | 2 |
W | Yes | 4 |
W | No | 2 |
C | Yes | 2 |
C | No | 1 |
This is the current output which is almost correct:
You should really show the code with messages from the log when getting unexpected output.
Apologies.
I believe this is where the error is. (Observations = 20)
Then you likely need to look very closely at your data before the Proc Sort step and afterwards.
Or share your full data set SugarLong.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code. You can paste that code into a forum code box using the {i} icon, or attach it as text, to show exactly what you have and to give us something to test code against.
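One way to look closely at the data before the Proc Sort step is to list every District that has both a Yes and a No row. This is only a sketch: it assumes the data set SugarLongResult from the original post, and the column name NResults is made up for illustration.

```sas
/* Sketch only: list districts that carry more than one Result value. */
/* Assumes SugarLongResult exists as built in the original post.      */
proc sql;
  select District,
         DistrictPosition,
         count(distinct Result) as NResults  /* hypothetical column name */
  from SugarLongResult
  group by District, DistrictPosition
  having count(distinct Result) > 1;
quit;
```

Any district listed here survives NODUPKEY twice, once with Yes and once with No, which would push the total above 15.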
To properly diagnose this, we would need to see Data set SugarLongResultDupe.
You show us some data, but it's not clear what data you are showing us.
Why are you making an array named MON that includes variables like SoilID, Area, Age, HarvestMonth, HarvestDuration, TonnHect, Fibre, and Sugar?
It looks to me like the reason you are seeing 20 instead of 15 is because for some DISTRICTPOSITION you have some observations with YES and some with NO. Is it possible that the same DISTRICTPOSITION value appears in more than one DISTRICT value?
Do you want to calculate the YES/NO rule so that the values are the same for all observations from the same district? Perhaps you want to SUM the COUNT variable over all of the months, or take the MAX over all of the months? If so, then there was no need to transpose it at all.
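A rough sketch of the MAX idea, without transposing. It assumes the Sugar data set from the original post, and it assumes that "Yes" should mean at least one positive monthly value anywhere in the district, which may not be the rule you actually want. The names SugarMax, RowMax, and DistrictResult are made up for illustration.

```sas
/* Sketch only. Assumes data set Sugar from the original post and   */
/* assumes Yes = at least one positive monthly value in a district. */
data SugarMax;
  set Sugar;
  /* Largest monthly value on this row. Jul96--Dec97 relies on the */
  /* month columns being adjacent, as in the INPUT statement.      */
  RowMax = max(of Jul96--Dec97);
  keep District DistrictPosition RowMax;
run;

proc sql;
  /* One row per District and DistrictPosition combination. */
  create table DistrictResult as
  select District,
         DistrictPosition,
         case when max(RowMax) > 0 then 'Yes' else 'No' end as Result
  from SugarMax
  group by District, DistrictPosition;
quit;

/* Cross-tabulation with the chi-square test requested in the thread. */
proc freq data=DistrictResult;
  tables DistrictPosition*Result / chisq;
run;
```

If DistrictResult still has more than 15 rows, then some District is listed under more than one DistrictPosition, which is the other possibility raised above.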
That is exactly the issue I am finding.
I needed to transpose because I need to run a chi-square test. I have the mon array because I copied the code from another forum and it worked. I am awful at SAS and just need to do this for a final presentation.
I wrote these steps one by one until I got to this result, and I am not really sure how to backtrack at this point.
Post the first file you used, namely '~/my_courses/Homework/FinalHomework/CaneData2.csv/'.
Art, CEO, AnalystFinder.com