03-16-2018 03:28 PM
Ok, I know this is pretty basic. But I need to get this right.
When I already have a data set and know the variable names, I can use the SCAN function. My problem comes when reading the original data with list input.
A Location such as New Jersey contains a space, so I am using the & modifier with a $15. informat. For shorter location names, though, the numbers that follow end up in the name field. I could manually add spaces after each name to make sure no numbers land in the name, but there must be a better way to do this. Any thoughts would be appreciated.
Libname SasData '/folders/myfolders/SASData' ;
Data HIV_Surveillance ;
Input Residence_Location & $15.
Reported_Cases : 8.
All_Cases : 8. ;
Datalines ;
Atlanta 6355 14776
Baltimore 3490 10176
Boston 2769 9334
New Jersey 4893 9457
;
Above, the 6355 ends up in the name along with Atlanta. I could just add spaces after Atlanta to fill out the 15 characters, but is there a better way to do this? Thanks!
03-16-2018 04:05 PM
Here are two different ways. The first requires the DSD option and quoting any value that has a space in the middle. The second has TWO spaces (no need to pad to 15) after the first variable on every record.
Data HIV_Surveillance ;
infile datalines dsd dlm=' ';
informat Residence_Location $15. Reported_Cases 8. All_Cases 8. ;
Input Residence_Location Reported_Cases All_Cases ;
Datalines ;
Atlanta 6355 14776
Baltimore 3490 10176
Boston 2769 9334
"New Jersey" 4893 9457
;

Data HIV_Surveillance ;
input Residence_Location & :$15.
Reported_Cases :8.
All_Cases :8. ;
Datalines ;
Atlanta  6355 14776
Baltimore  3490 10176
Boston  2769 9334
New Jersey  4893 9457
;
run;
Better might be to use another delimiter, such as , or ; instead of the default space.
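As a rough sketch of that suggestion, the same four records could be read with a comma delimiter, so the blank inside New Jersey needs no quoting or double spacing. The data set name HIV_Surveillance_csv and the comma-separated layout of the records are assumptions made for illustration; the original data is space-separated.

```sas
/* Sketch only: assumes the records can be supplied comma-delimited. */
data HIV_Surveillance_csv;               /* hypothetical data set name */
  infile datalines dsd dlm=',' truncover; /* comma is the only delimiter */
  length Residence_Location $15;          /* character length set up front */
  input Residence_Location $ Reported_Cases All_Cases;
datalines;
Atlanta,6355,14776
Baltimore,3490,10176
Boston,2769,9334
New Jersey,4893,9457
;
run;

proc print data=HIV_Surveillance_csv;
run;
```

Because blanks are no longer delimiters here, a name with embedded spaces is read as one value up to the next comma.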
03-17-2018 09:57 AM
Data HIV_Surveillance ;
input;
p=anydigit(_infile_);
Residence_Location=substr(_infile_,1,p-1);
Reported_Cases=scan(substr(_infile_,p),1);
All_Cases=scan(substr(_infile_,p),2);
drop p;
Datalines ;
Atlanta 6355 14776
Baltimore 3490 10176
Boston 2769 9334
New Jersey 4893 9457
;
proc print;run;
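One thing to watch with this approach: SCAN returns character values, so Reported_Cases and All_Cases come out as character variables rather than the numeric 8. the original post asked for. A possible variation, offered only as a sketch, wraps each SCAN in INPUT() and declares the name length explicitly. The data set name HIV_Surveillance_typed is made up for illustration, and the step assumes every record contains at least one digit and that no location name contains one.

```sas
/* Sketch only: same ANYDIGIT idea, but with numeric counts and a $15 name. */
data HIV_Surveillance_typed;              /* hypothetical data set name */
  length Residence_Location $15;          /* fix the name length explicitly */
  input;                                  /* load the record into _infile_ */
  p = anydigit(_infile_);                 /* position of the first digit */
  Residence_Location = substr(_infile_, 1, p - 1);
  /* convert the two space-separated tokens after the name to numbers */
  Reported_Cases = input(scan(substr(_infile_, p), 1, ' '), 8.);
  All_Cases      = input(scan(substr(_infile_, p), 2, ' '), 8.);
  drop p;
datalines;
Atlanta 6355 14776
Baltimore 3490 10176
Boston 2769 9334
New Jersey 4893 9457
;
run;

proc print data=HIV_Surveillance_typed;
run;
```

Running PROC CONTENTS on the result is a quick way to confirm the variable types and lengths.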