04-13-2016 05:31 AM
data sid8;
  infile datalines missover;
  input name $ sex $ age city $ company $ ;
datalines;
Rahul 22 Banglore TCS
Mahesh M 26 Mumbai AXIS
Kiran 24 Pune CYTEL
swati F 26 Mumbai KOTAK
Mahesh M 23 Mumbai AXIS
;
run;
It is not assigning a missing value (blank) to the variable sex when the value is absent.
04-13-2016 05:54 AM
That's not how missover works, especially since you don't have a delimiter to identify missing values.
SAS really can't know that 22 isn't a valid sex.
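To make the shift concrete, here is a minimal sketch of what list input does with two of the posted rows. The data set name demo_shift is only a placeholder, and the values in the comments are what list input rules imply rather than a captured log.

```sas
/* Sketch: same INFILE/INPUT as the question, with a PUT to show the result. */
/* For the "Rahul" row, list input simply takes the fields in order:         */
/*   name=Rahul, sex=22, age=. (Banglore is not numeric, so SAS writes an    */
/*   "Invalid data" note), city=TCS, company=blank (MISSOVER sets it missing */
/*   instead of reading the next line).                                      */
/* The "Mahesh" row has all five fields, so it reads as intended.            */
data demo_shift;
    infile datalines missover;
    input name $ sex $ age city $ company $;
    put name= sex= age= city= company=;
datalines;
Rahul 22 Banglore TCS
Mahesh M 26 Mumbai AXIS
;
run;
```

MISSOVER only affects what happens when the INPUT statement runs out of fields at the end of a line; it does not detect a gap in the middle.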
04-13-2016 06:00 AM
This is a well-covered topic if you search here or on the Internet.
If your data doesn't have specific delimiters besides a space, you're going to have a hard time reading in your file.
04-13-2016 06:12 AM
It is hard to believe you don't have at least 2 consecutive delimiters between your variables when, for example, sex is missing.
If that is the case, adding dsd delimiter=' ' to your infile statement will do it.
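A minimal sketch of that suggestion, assuming the raw data really does contain two consecutive spaces where sex is missing. The rows below are the posted ones with a double space added by hand for illustration, and the data set name sid8_dsd is hypothetical.

```sas
/* DSD treats two consecutive delimiters as a missing value.            */
/* With dlm=' ' this means a double space marks an empty field.         */
/* Assumption: the "Rahul" and "Kiran" rows have two spaces after name. */
data sid8_dsd;
    infile datalines dsd dlm=' ' missover;
    input name $ sex $ age city $ company $;
datalines;
Rahul  22 Banglore TCS
Mahesh M 26 Mumbai AXIS
Kiran  24 Pune CYTEL
swati F 26 Mumbai KOTAK
;
run;
```

The catch is that every double space now counts as an empty field, so this only works if the file uses exactly one space between values everywhere else.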
04-13-2016 06:17 AM
You would need to do this yourself. To SAS, "22" is just as valid a value for sex as "M" or "F", so there is no logical way for it to tell them apart. Consider using delimited data in future.
Post process example:
data sid8 (keep=name sex age city company);
  length name sex age city company str $40;
  infile datalines dlm="¬";
  input str $;
  name=scan(str,1," ");
  next=2;
  if scan(str,2," ") in ("M","F") then do;
    sex=scan(str,2," ");
    next=3;
  end;
  age=scan(str,next," ");
  next=next+1;
  city=scan(str,next," ");
  next=next+1;
  company=scan(str,next," ");
datalines;
Rahul 22 Banglore TCS
Mahesh M 26 Mumbai AXIS
Kiran 24 Pune CYTEL
swati F 26 Mumbai KOTAK
Mahesh M 23 Mumbai AXIS
;
run;
04-13-2016 10:55 AM
What do you want the values of your other variables to be when "missing" the sex value?
Your current code gets a missing value for age because it tries to read the city as age, and city then gets the value of company.
If you actually have a lot of data of this poor quality you are going to have to parse the input string and treat the case of missing sex differently.
data sid8;
  infile datalines missover;
  /* load the record and hold it so it can be inspected before reading */
  input @ ;
  /* if the second word is M or F, sex is present; otherwise skip it */
  if upcase(scan(_infile_,2)) in ('M','F') then
    input name $ sex $ age city $ company $ ;
  else
    input name $ age city $ company $ ;
datalines;
Rahul 22 Banglore TCS
Mahesh M 26 Mumbai AXIS
Kiran 24 Pune CYTEL
swati F 26 Mumbai KOTAK
Mahesh M 23 Mumbai AXIS
;
run;
However, it would be much preferable to get the data layout in a better form, such as suggested by @Loko or @RW9, as the number of exceptions is likely to increase: embedded spaces in names of people, cities or companies, and a varying number of name parts for people.
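As a rough sketch of what a better layout could look like, here is the same kind of data as comma-delimited text. The rows, the multi-word values, the variable lengths and the data set name sid8_csv are all made up for illustration; they are not from the original file.

```sas
/* DSD with its default comma delimiter: an empty field between two commas */
/* is read as missing, and spaces inside a value are kept as part of it.   */
data sid8_csv;
    /* lengths are assumptions, chosen to fit the sample values */
    length name $20 sex $1 city $20 company $30;
    infile datalines dsd missover;
    input name $ sex $ age city $ company $;
datalines;
Rahul,,22,Banglore,TCS
Mahesh,M,26,Navi Mumbai,AXIS
Kiran,,24,Pune,Tata Consultancy Services
;
run;
```

With an explicit delimiter, a missing sex and a two-word city no longer need any special-case parsing.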