Background info on the assignment:
STAT 325 HOMEWORK 6
sql
Data description: Kroger warehouse, which has two hubs, supplies 5 supermarkets in local San Diego.
You may want to analyze the sale and logistic record of 2014 by using “proc sql” in SAS.
market.csv:
Column 1: Item ID;
Column 2: Item Name;
Column 3: Unit Price;
Column 4: Sale Amount;
Column 5: Store Branch;
warehouse.csv:
Column 1: Item ID;
Column 2: Item Name;
Column 3: Warehouse Hub Name;
Column 4: Store Branch;
Code:
/*Question 1*/
data market;
length item $20. branch $20. id $6. price $2. amount $15.;
infile '/home/u62108616/sasuser.v94/S325/market.csv' dsd truncover;
format price DOLLAR1.2 amount COMMA15.0;
run;
proc sql;
select*
from market;
quit;
Error Messages/Warnings:
NOTE 484-185: Format $DOLLAR was not found or could not be loaded.
ERROR 230-185: Invalid character format/informat decimal specified.
73 ! format price DOLLAR1.2 amount COMMA15.0;
_________
484
NOTE 484-185: Format $COMMA was not found or could not be loaded.
74 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MARKET may be incomplete. When this step was stopped there were 0 observations and 5 variables.
WARNING: Data set WORK.MARKET was not replaced because this step was stopped.
Question 1: Read in the market.csv data file, apply the dollar w.d format and comma separated format to the variables
of price and amount
What am I doing wrong here and how should I fix it?
The LENGTH statement has NOTHING to do with FORMATS (of INFORMATS). It is for telling SAS what type of variable you want to create and how much space it takes to store it in the dataset.
Do not add periods to the lengths. Do not tell SAS to store only the first 3 bytes of the 8 bytes needed to represent a floating point number. Remember that SAS only has two types of variables: fixed length character strings and 64bit floating point numbers.
length item $20 branch $20 id $6 price amount 8;
It also looks like your input statement does not match the data in the text file.
It could be the CSV is corrupt or mall formed. Why would the ID values have periods in them?
As to why the AMOUNT variable is missing we probably need to see the LOG (please post the text of the log and not another photograph) and perhaps also the text of some of the lines from the CSV file. If you cannot figure out how to view the text of the CSV file then you can add the LIST statement to your SAS data step and SAS will display the lines read from the CSV file into the SAS log.
You have created variable AMOUNT as a character variable. It must be numeric in order to use the COMMA format.
By the way, there's really no reason for AMOUNT to be character, it really is numeric.
Your LENGTH statement made AMOUNT a character variable. Change its length to 8 (no dollar sign). Same applies to other variables such as PRICE.
Also note, you didn't input any variables from the data. You should probably add an INPUT statement to the first DATA step.
Current Output vs Intended Output:
Updated Code:
/*Question 1*/
data market;
length item $20. branch $20. id $6. price 3. amount 8.;
infile '/home/u62108616/sasuser.v94/S325/market.csv' dsd truncover;
input item $ branch $ id $ price amount;
format price DOLLAR10.2 amount COMMA15.0;
run;
proc sql;
select*
from market;
quit;
I know question 1 doesnt correlate to any of the tables produced in the intended output, however, it seems like my variables are still a bit messed up
The LENGTH statement sets the storage length of variables. For numeric variables, leave it at 8, or you will lose precision unnecessarily.
Also, your INPUT statement does not reflect the order of variables in the assignment. id should be last.
The LENGTH statement has NOTHING to do with FORMATS (of INFORMATS). It is for telling SAS what type of variable you want to create and how much space it takes to store it in the dataset.
Do not add periods to the lengths. Do not tell SAS to store only the first 3 bytes of the 8 bytes needed to represent a floating point number. Remember that SAS only has two types of variables: fixed length character strings and 64bit floating point numbers.
length item $20 branch $20 id $6 price amount 8;
It also looks like your input statement does not match the data in the text file.
It could be the CSV is corrupt or mall formed. Why would the ID values have periods in them?
As to why the AMOUNT variable is missing we probably need to see the LOG (please post the text of the log and not another photograph) and perhaps also the text of some of the lines from the CSV file. If you cannot figure out how to view the text of the CSV file then you can add the LIST statement to your SAS data step and SAS will display the lines read from the CSV file into the SAS log.
It is good to use the LENGTH statement to explicitly define the variables. But you should not define the numeric variables as character. Also you do not need to include a decimal point when defining the length. The length can only be an integer value so no decimal point is needed.
length item $20 branch $20 id $6 price amount 8;
You still need to tell SAS to actually read the variables from the lines of the CSV file by adding an INPUT statement.
You cannot have a display width that has more decimal places than the total width. Fix the width used on the DOLLAR format specification. Make sure to take into account room for the decimal point (if requested), the $ character and any thousands block separators (commas).
data market;
length item $20 branch $20 id $6 price amount 8;
infile '/home/u62108616/sasuser.v94/S325/market.csv' dsd truncover;
input item branch id price amount;
format price DOLLAR12.2 amount COMMA15.0;
run;
No need to use SQL to print data. Just use PROC PRINT.
proc print data=market;
run;
Also does the CSV files have a header line or not? If so you will want to add the FIRSTOBS=2 option to the INFILE statement to tell it to skip the header line.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.