I have a variable with categorical observation.
eg
id eth
1 1
2 1,2,3
3 2,5
4 4
and so on.....
The data is present in a csv format file. When i import the data, SAS does not recognize the comma between the observations. My proc freq appears something like this:
eth frequency
1 1258
2 456
13 445
14 454
16 23
Instead of 13, 14, 16, I want sas to recognize they are separate option like 1 and 3, 1 and 4. My goal is to count the total 1s,2s,3s and so on. But here instead of 1and 3, the observation appears as 13.
Please show us the SAS code you are using.
libname ab'path'
data new;
set ab.pre;
run;
I just used the import command from the file menu and imported the CSV file(on my local drive) and named it pre. It read all the variables and observations perfectly fine.
then I used the proc freq for the variable eth which had the above-mentioned issue.
proc freq data = new;
table eth;
run;
I guess that when you read
id eth
1 1
2 1,2,3
3 2,5
4 4
you expect the second line to become three observations. The Import proc will not do that. You will need a more refined input step to read that data the way you intend.
If you have a dataset like that:
data have;
input id eth $30.;
cards;
1 1
2 1,2,3
3 2,5
4 4
;
The you will get a frequency count like:
Cumulative Cumulative eth Frequency Percent Frequency Percent ------------------------------------------------------------------------------ 1 1 25.00 1 25.00 1,2,3 1 25.00 2 50.00 2,5 1 25.00 3 75.00 4 1 25.00 4 100.00
If you want treat those comma separated values as separate records/response then first convert the data.
data want ;
set have;
do i=1 to max(1,countw(eth,','));
ethnum = scan(eth,i,',');
output;
end;
run;
The FREQ Procedure Cumulative Cumulative ethnum Frequency Percent Frequency Percent ---------------------------------------------------------------------------- 1 2 28.57 2 28.57 2 2 28.57 4 57.14 3 1 14.29 5 71.43 4 1 14.29 6 85.71 5 1 14.29 7 100.00
@vijay24 wrote:
I have a variable with categorical observation.
eg
id eth
1 1
2 1,2,3
3 2,5
4 4
and so on.....
The data is present in a csv format file. When i import the data, SAS does not recognize the comma between the observations. My proc freq appears something like this:
In a CSV file if values separated by commas are supposed to appear as a single value then the file should have the text appearing as "1,2,3" , within quotes to indicate such. If your text file does not have the quotes then it is not complying with the rules for a CSV. Or possible you want them as a single value when you shouldn't. Generally such as you show is very difficult to work with.
And depending on how you read the data there could be other issues if the values did appear that way. If you use proc import to read a file then SAS makes guesses as to column types and lengths and by default does not look at very many rows. Potentially if the first 20 rows did not have any of the grouped values then SAS may have set the field to be numeric and then such values would be considered invalid data as 1,2,3 is not a valid numeric value.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.