I am learning SAS programming and am trying to create a normal probability plot of the residuals with PROC REG. My data set is attached. This is my code:

DATA census;
  INFILE "/folders/myfolders/census.txt" DLM=',' FIRSTOBS=2 DSD MISSOVER;
  INPUT ClassGrade Ageyears Height_cm Footlength_cm Armspan_cm SchoolSleep NonSchoolSleep TextSent TextReceived;
RUN;
PROC PRINT DATA= census;
RUN;
PROC REG DATA=census;
  MODEL TextSent = TextReceived;
RUN;
ODS GRAPHICS ON;
PROC REG DATA=census PLOTS TextSent*TextReceived npp.*r.;
  MODEL TextSent = TextReceived;
  title 'Normal Probability Plot On The Residuals';
RUN;

This code is not working. How can I generate a normal probability plot of the residuals?
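One possible approach, sketched under the assumption that ODS Graphics is available (as it is in SAS University Edition): the `PLOTS` option in the PROC REG statement takes `=` followed by a plot request such as `QQPLOT`, whereas `npp.*r.` is syntax for the older `PLOT` statement. The residuals can also be saved and passed to PROC UNIVARIATE. The data set name `resid` and the variable name `Residual` are made up for illustration.

```sas
ODS GRAPHICS ON;

/* Request only the normal quantile (Q-Q) plot of the residuals */
PROC REG DATA=census PLOTS(ONLY)=QQPLOT;
    MODEL TextSent = TextReceived;
    /* Save residuals; "resid" and "Residual" are illustrative names */
    OUTPUT OUT=resid R=Residual;
    TITLE 'Normal Probability Plot of the Residuals';
RUN;
QUIT;

/* Alternative: normal probability plot from the saved residuals */
PROC UNIVARIATE DATA=resid;
    PROBPLOT Residual / NORMAL(MU=EST SIGMA=EST);
RUN;
```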
Yes. It is not generating the output that I need. I need to determine whether the mean Potassium differs between cereals manufactured by General Mills and those manufactured by Kelloggs, at a significance level of 0.01.
I am trying to perform a hypothesis test to determine whether more than 10% of cereals are manufactured by Post, at a significance level of 0.10. Here is my code and the data set:

DATA cereal;
  INFILE "/folders/myfolders/cereal.txt" DLM=',' FIRSTOBS=2 DSD MISSOVER;
  INPUT Name :$50. Manufacturer $ Type $ Calories Protein Sodium Fiber Carbohydrates Sugars Potassium Vitamins Weight Cups;
RUN;
PROC PRINT DATA=cereal;
RUN;
PROC TTEST DATA= Cereal;
  VAR Manufacturer;
RUN;
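A hedged sketch of one way to approach this: PROC TTEST expects a numeric analysis variable, while Manufacturer is character, and a claim about a proportion is commonly tested with the `BINOMIAL` option of PROC FREQ instead. The value `'Post'` is an assumption about how the manufacturer is spelled in the file.

```sas
/* One-sample test of a proportion: H0: p = 0.10 vs Ha: p > 0.10 */
PROC FREQ DATA=cereal;
    /* LEVEL= names the category treated as the "event";
       'Post' is an assumed spelling of the value in the file */
    TABLES Manufacturer / BINOMIAL(P=0.10 LEVEL='Post') ALPHA=0.10;
RUN;
```

The binomial test table reports both a one-sided and a two-sided p-value; for a "more than 10%" claim, the one-sided value is the one to compare with 0.10.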
I am trying to determine whether the mean Potassium differs between cereals manufactured by General Mills and those manufactured by Kelloggs, at a significance level of 0.01. I am having trouble selecting the right observations, and therefore using PROC TTEST for the hypothesis test. Here is my code and the data set:

DATA cereal;
  INFILE "/folders/myfolders/cereal.txt" DLM=',' FIRSTOBS=2 DSD MISSOVER;
  INPUT Name :$50. Manufacturer $ Type $ Calories Protein Sodium Fiber Carbohydrates Sugars Potassium Vitamins Weight Cups;
RUN;
PROC PRINT DATA=cereal;
RUN;
DATA cereal;
  SET Cereal;
  IF Manufacturer='General Mills' | Manufacturer='Kelloggs';
RUN;
PROC PRINT DATA=Cereal;
RUN;
PROC TTEST DATA=Cereal ALPHA=0.01;
  CLASS Manufacturer;
  VAR Potassium;
RUN;
PROC PRINT DATA=Cereal;
RUN;
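A likely cause, assuming the file spells the names as 'General Mills' and 'Kelloggs': with list input, `Manufacturer $` creates an 8-character variable, so 'General Mills' is stored truncated and the comparison never matches. The sketch below reads the variable with a wider informat (`:$20.` is an assumed width) and subsets with a `WHERE` statement so the original data set is not overwritten.

```sas
DATA cereal;
    INFILE "/folders/myfolders/cereal.txt" DLM=',' FIRSTOBS=2 DSD MISSOVER;
    /* :$20. is an assumed width, wide enough for 'General Mills' */
    INPUT Name :$50. Manufacturer :$20. Type $ Calories Protein Sodium Fiber
          Carbohydrates Sugars Potassium Vitamins Weight Cups;
RUN;

PROC TTEST DATA=cereal ALPHA=0.01;
    /* Keep only the two groups; the original data set is left intact */
    WHERE Manufacturer IN ('General Mills', 'Kelloggs');
    CLASS Manufacturer;
    VAR Potassium;
RUN;
```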
I am trying to create a new variable that has a value of 'A' for the first 30 observations (1 to 30) and a value of 'B' for the remaining 30 observations (31 to 60). Here is my code and the data set:

DATA draft;
  INFILE "/folders/myfolders/2017NBADraft2.txt" DLM=',' FIRSTOBS=2 DSD MISSOVER;
  INPUT LastName $ FirstName $ Team $ Position $ Birthdate :ANYDTDTE10. Height Wingspan Weight College $ Year $;
RUN;
PROC PRINT DATA=draft;
RUN;
DATA draft;
  SET draft;
  IF Obs < 31 then New = 'A';
  ELSE IF Obs > 30 then New = 'B';
RUN;
PROC PRINT DATA=draft;
RUN;

This assigns 'A' to every observation and creates a new variable 'Obs'. How can I change this code to generate the desired output?
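A minimal sketch of one fix: `Obs` is only a column that PROC PRINT displays, not a variable in the data set, so referencing it creates a new variable whose value is always missing, and a missing value compares as less than 31. The automatic variable `_N_` can serve as the row counter here, assuming the data set is read sequentially from the first row.

```sas
DATA draft;
    SET draft;
    LENGTH New $1;
    /* _N_ counts DATA step iterations, i.e. the row being read */
    IF _N_ <= 30 THEN New = 'A';
    ELSE New = 'B';
RUN;
```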
I am trying to label the point in the scatter plot with the smallest carbohydrate content, but so far I have only managed to place some text in the bottom left of the graph. Here is my code and the data set:

DATA work.cereal;
  INFILE "/folders/myfolders/cereal.txt" DLM=',' FIRSTOBS=2 DSD MISSOVER;
  INPUT Name :$50. Manufacturer $ Type $ Calories Protein Sodium Fiber Carbohydrates Sugars Potassium Vitamins Weight Cups;
RUN;
PROC PRINT DATA=work.cereal;
RUN;
ODS GRAPHICS ON;
PROC SGPLOT DATA=cereal;
  SCATTER X=Sugars Y=Carbohydrates;
  XAXIS GRID VALUES=(0 TO 15 BY 3);
  YAXIS GRID;
  INSET 'Quaker Oatmeal' / POSITION= BOTTOMLEFT;
  TITLE 'Sugars vs Carbohydrates of Cereals';
RUN;
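One way this could be done, as a sketch: build a label variable that is blank everywhere except on the row with the minimum Carbohydrates value, then pass it to the `DATALABEL=` option of the `SCATTER` statement. The names `cereal_lbl` and `PointLabel` are hypothetical, and the sketch assumes missing carbohydrate values are stored as true missing values rather than a sentinel such as -1.

```sas
PROC SQL;
    CREATE TABLE cereal_lbl AS
    SELECT *,
           /* MIN() is remerged onto every row; SAS notes this in the log */
           CASE WHEN Carbohydrates = MIN(Carbohydrates) THEN Name
                ELSE ' '
           END AS PointLabel LENGTH=50
    FROM cereal;
QUIT;

PROC SGPLOT DATA=cereal_lbl;
    /* Only the row(s) with a non-blank PointLabel get a label */
    SCATTER X=Sugars Y=Carbohydrates / DATALABEL=PointLabel;
    XAXIS GRID VALUES=(0 TO 15 BY 3);
    YAXIS GRID;
    TITLE 'Sugars vs Carbohydrates of Cereals';
RUN;
```

If several cereals tie for the minimum, each of them is labeled.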
I am trying to import this data set into SAS University Edition, but I am getting an error suggesting that my file is not saved in the myfolders folder, even though it is saved there under the same name. This is the error:

ERROR: Physical file does not exist, /folders/myfolders/2017NBADraft2.csv.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DRAFT may be incomplete. When this step was stopped there were 0 observations and 10 variables.
WARNING: Data set WORK.DRAFT was not replaced because this step was stopped.

And this is my code:

DATA work.draft;
  INFILE "/folders/myfolders/2017NBADraft2.csv" DLM=',' FIRSTOBS=2 DSD MISSOVER;
  INPUT LastName $ FirstName $ Team $ Position $ Birthdate :ANYDTDTE10. Height Wingspan Weight College $ Year $;
RUN;
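A small diagnostic sketch that may help narrow this down. Paths under /folders/myfolders are case-sensitive, and the extension must match the file on disk exactly. The `.txt` variant checked below is only a guess at a possible mismatch; `csv_found` and `txt_found` are illustrative variable names.

```sas
DATA _NULL_;
    /* FILEEXIST returns 1 if the physical file exists, 0 otherwise */
    csv_found = FILEEXIST("/folders/myfolders/2017NBADraft2.csv");
    txt_found = FILEEXIST("/folders/myfolders/2017NBADraft2.txt");
    /* Writes both results to the log */
    PUT csv_found= txt_found=;
RUN;
```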
I am having trouble importing this data set into SAS. It is a tab-delimited file that also contains " characters in the column names, as well as date values. How can I fix it? Here is my code, followed by the error message that I am getting:

proc import datafile="/folders/myfolders/KCHomes.txt" REPLACE;
  DELIMITER='09'x;
  GETNAMES=YES;
run;
DATA KCHomes;
RUN;

ERROR: The table "WORK.KCHOMES" cannot be opened because it does not contain any columns
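A sketch of what the import step might look like: PROC IMPORT needs an `OUT=` data set, and for a delimited text file a `DBMS=` value such as `DLM`. The `DATA KCHomes; RUN;` step, which has no SET or INPUT statement, creates a data set with no variables, which is consistent with the error shown. How the quoted column names come through is something to check in the printed output.

```sas
PROC IMPORT DATAFILE="/folders/myfolders/KCHomes.txt"
            OUT=work.KCHomes
            DBMS=DLM
            REPLACE;
    DELIMITER='09'x;   /* tab character */
    GETNAMES=YES;
RUN;

/* Print the first rows to check names, dates, and types */
PROC PRINT DATA=work.KCHomes (OBS=10);
RUN;
```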
I am trying to change how the column Gross is read. It contains values like $50,319,942, and I want my program to treat this variable as numeric. This is my code, but it is not displaying the values of this column:

DATA movies;
  INFILE "/folders/myfolders/GrossDays.txt" DLM='09'x MISSOVER FIRSTOBS=2 DSD;
  INPUT Title :$100. Date :ANYDTDTE20. Gross Theaters Distributor $;
RUN;
PROC PRINT DATA=movies;
  FORMAT Date DATE9.;
RUN;
PROC PRINT DATA = music;
  FORMAT Date DATE. ;
  FORMAT Gross 10. ;
RUN;
DATA movies;
  SET movies;
  TheaterAvg = (Gross/Theaters);
RUN;
PROC PRINT DATA = movies;
RUN;
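A minimal sketch, assuming only Gross carries the dollar sign and commas: reading it with the `COMMA` informat removes those characters so the value is stored as a number, and a `DOLLAR` format can restore the display. The width 15 is an assumption.

```sas
DATA movies;
    INFILE "/folders/myfolders/GrossDays.txt" DLM='09'x MISSOVER FIRSTOBS=2 DSD;
    /* COMMA15. strips the dollar sign and commas; the width 15 is an assumption */
    INPUT Title :$100. Date :ANYDTDTE20. Gross :COMMA15. Theaters Distributor $;
    TheaterAvg = Gross / Theaters;
    FORMAT Date DATE9. Gross DOLLAR15.;
RUN;

PROC PRINT DATA=movies;
RUN;
```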
I am working with the data set TopGrossingAlbumsR.txt, trying to create a new character variable that indicates the decade in which the album was released. I am not sure about the right way to approach this. How should I initialize the character variable? How can I specify the date format in the IF statement? And finally, how can I assign the decade value to my new variable? The values of 'ReleaseDate' are written in the format 30NOV1982, and I want my new character variable 'Decade' to indicate 1980 for this example. Here is my code; I have attached the data set below:

DATA music;
  INFILE '/folders/myfolders/TopGrossingAlbumsR.txt' dsd firstobs=2;
  LENGTH Album $80. Artist $50. Genre $20.;
  INFORMAT ReleaseDate ANYDTDTE.;
  FORMAT ReleaseDate DATE9.;
  INPUT Album Artist ReleaseDate TotalCertifiedCopies ClaimedSales Genre;
RUN;
PROC PRINT DATA = music;
  FORMAT ReleaseDate DATE. ;
RUN;
DATA music;
  SET music;
  Decade = ' ';
  IF ReleaseDate = ' ' THEN Decade = YYYY;
  IF Genre = 'Metal' THEN Genre = 'Rock';
  IF Genre = 'Grunge' THEN Genre = 'Rock';
  IF Genre = 'Soundtrack' THEN Genre = 'Other';
  IF Genre = 'Country' THEN Genre = 'Other';
RUN;
PROC PRINT DATA = music;
RUN;
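One possible approach, as a sketch: because ReleaseDate is read with a date informat, it is stored as a numeric SAS date, so no date format is needed in the IF statement. `YEAR()` extracts the year, rounding down to a multiple of 10 gives the decade, and `PUT` converts the number to a character value. The length `$4` is an assumption that fits four-digit decades.

```sas
DATA music;
    SET music;
    LENGTH Decade $4;
    /* ReleaseDate is a numeric SAS date; YEAR() extracts e.g. 1982 */
    IF NOT MISSING(ReleaseDate) THEN
        Decade = PUT(FLOOR(YEAR(ReleaseDate) / 10) * 10, 4.);
    /* Collapse genres as in the original step */
    IF Genre IN ('Metal', 'Grunge') THEN Genre = 'Rock';
    ELSE IF Genre IN ('Soundtrack', 'Country') THEN Genre = 'Other';
RUN;
```

For a release date of 30NOV1982, `YEAR` returns 1982, the rounding yields 1980, and Decade holds the text 1980.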
PROC IMPORT OUT= work.capture DATAFILE= "/folders/myfolders/Capture.txt" DBMS=CSV REPLACE;
  DELIMITER='09'x;
  GETNAMES=YES;
RUN;
PROC PRINT DATA=work.capture;
RUN;
DATA capture;
  SET capture;
  NumCap = 0;
  IF (Cap1 == 'yes') THEN DO; NumCap +1;
  IF (Cap2 == 'yes') THEN DO; NumCap +1;
  IF (Cap3 == 'yes') THEN DO; NumCap +1;
  IF (Cap4 == 'yes') THEN DO; NumCap +1;
  IF (Cap5 == 'yes') THEN DO; NumCap +1;
  IF (Cap6 == 'yes') THEN DO; NumCap +1;
RUN;
DATA capture;
  SET capture;
  IF (AgeGroup == 'semi-adult') THEN DO; AgeGroup = 'Adult';
RUN;
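A hedged sketch of how the counting step might be written: the DATA step compares with `=` (or `EQ`) rather than `==`, and every `DO` needs a matching `END`, which a single-statement `IF ... THEN` does not require. The array name `caps` and the index `i` are illustrative, and the sketch assumes Cap1 through Cap6 were imported as character variables whose values are exactly 'yes' in lowercase, since comparisons are case-sensitive.

```sas
DATA capture;
    SET capture;
    /* Assumes Cap1-Cap6 are character variables holding 'yes' or 'no' */
    ARRAY caps{6} Cap1-Cap6;
    NumCap = 0;
    DO i = 1 TO 6;
        IF caps{i} = 'yes' THEN NumCap = NumCap + 1;
    END;
    DROP i;
    /* Recode the age group in the same step */
    IF AgeGroup = 'semi-adult' THEN AgeGroup = 'Adult';
RUN;
```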