Write a SAS program to access the Microsoft Excel spreadsheet.
Create a comma-delimited raw data file to store information that includes sex, categories, and number of individuals in each category.
Please post the code you already have; post the log if you have problems, and describe where your results do not meet your expectations.
I am new to the SAS and dont know much about it. i had the one assignment was really simple but then this that i dont know much about. If you can help me that about be appreciated. I could come up with this but it is not working and i have to use the excel file and output should be in .CSV file. Right now the output is the whole excel file. I have attached the file.
So I saved the Excel to a csv file and copied that to my SAS server. It's much easier to get the right types and contents if you don't force SAS to make guesses (what you have to do with the xlsx format). On top of that, I don't have ACCESS to PC Files licensed.
Excel used a semicolon as delimiter (although csv explicitly means comma seperated volume), so I had to adapt my input data step.
data in (keep=sex category ny kc);
infile
'$HOME/sascommunity/nyfile.csv' /* filename on my server */
dlm=';' /* use semicolon as delimiter */
dsd /* in case a delimiter or an empty cell appears somewhere */
firstobs=4 /* start in line 4, everything before that is irrelevant */
;
length /* set lengths for the variables needed */
sex $6
category $50
ny 8
var3 $10 /* var3 is just a dummy, as we only need the population estimates */
kc 8
;
format ny kc comma12.; /* Nice format for the numbers */
label /* Nice names for the csv file */
sex = 'Sex'
category = 'Category'
ny = 'Pop. Estimate New York'
kc = 'Pop. Estimate Kings County'
;
retain sex '' flag 0; /* keep these variables across observations */
input
category
ny : comma12.
var3
kc : comma12.
;
/* determine the sex: */
if category in ('Male:','Female:') then sex = scan(category,1,':');
/* check the category fulltime or not: */
if substr(category,1,6) = 'Worked' then flag = 1;
if substr(category,1,5) = 'Other' then flag = 0;
if flag and category in ( /* only if we are in the right category and sub-category */
'$55,000 to $64,999',
'$65,000 to $74,999',
'$75,000 to $99,999',
'$100,000 or more'
); /* this is called a subsetting if */
run;
/* check the dataset */
proc print data=in noobs;
run;
/* the easiest part: create the csv */
proc export data=in file='$HOME/sascommunity/want.csv' dbms=csv replace label;
run;
Output of the proc print:
sex category ny kc Male $55,000 to $64,999 244,273 16,330 Male $65,000 to $74,999 199,338 14,708 Male $75,000 to $99,999 311,572 25,495 Male $100,000 or more 462,618 59,399 Female $55,000 to $64,999 176,444 13,596 Female $65,000 to $74,999 130,768 11,639 Female $75,000 to $99,999 184,576 18,605 Female $100,000 or more 180,911 29,356
The csv file:
"Sex","Category","Pop. Estimate New York","Pop. Estimate Kings County" Male,"$55,000 to $64,999","244,273","16,330" Male,"$65,000 to $74,999","199,338","14,708" Male,"$75,000 to $99,999","311,572","25,495" Male,"$100,000 or more","462,618","59,399" Female,"$55,000 to $64,999","176,444","13,596" Female,"$65,000 to $74,999","130,768","11,639" Female,"$75,000 to $99,999","184,576","18,605" Female,"$100,000 or more","180,911","29,356"
thanks for quick reply. I tried that but only title's are showing up in the output.
I looked at the logs closer and it is not reading the number.
NOTE: Invalid data for ny in line 5 1-102.
NOTE: Invalid data for kc in line 7 1-63.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
7 With income:,"2,488,381","+/-15,432","180,569","+/-3,300" 63
NOTE: Invalid data errors for file ''/home/myfolder/nyfile.csv'' occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
sex= category=Male:,"5,148,815","+/-3,463","326,951",+/-302 ny=. var3=No income, kc=. flag=0 _ERROR_=1 _N_=1
NOTE: Invalid data for ny in line 9 1-52.
NOTE: Invalid data for kc in line 11 1-57.
Does anyone know how to format those numbers?
thank you so much @KurtBremser. I think i got it what was wrong with the code.
Hi @Kurt_Bremser! Looks like you've helped to solve this problem before -- maybe just the County name has changed 😉
First, save the whole shambles to a csv file, so that you can take control over the import process (especially where you start to read, and how you name the columns). Read that with a suitable data step.
Then, in a further data step, retain a variable with which you keep the sex, and use a subsetting if to only keep observations that meet your criteria.
For the last part, you can once again write a data step, or use proc export.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.