Hello friends.
I have a problem creating a data set with a long character variable.
The longest value of var_name is 10 characters.
What is the correct way to solve the problem?
Below are two versions of the data step: one where the values are separated by commas and one where the values are separated by tabs.
Data a;
infile cards truncover;
informat var_name $10.;
input pop var_name $10. val;
cards;
1 nr_laks 28520
1 nr_laks_11 1347
1 pct_11 0.0472
1 pct_11_P18 0.7728
1 geni 0.7803
2 nr_laks 32315
2 nr_laks_11 490
2 pct_11 0.01516
2 pct_11_P18 0.7693
2 geni 0.7714
;
run;
Data a;
infile datalines dlm="," dsd;
input pop var_name $ val;
cards;
1,nr_laks,28520
1,nr_laks_11,1347
1,pct_11,0.0472
1,pct_11_P18,0.7728
1,geni,0.7803
2,nr_laks,32315
2,nr_laks_11,490
2,pct_11,0.01516
2,pct_11_P18,0.7693
2,geni,0.7714
;
run;
You want to do this without comma separation, correct?
What is the correct way to solve the problem?
What is the problem? You haven't explained.
If you specify an INFORMAT statement, do not also place the informat on the INPUT statement. An informat on the INPUT statement overrides the behavior set by the INFORMAT statement.
Data a;
infile cards truncover;
informat var_name $10.;
input pop var_name val;
cards;
1 nr_laks 28520
1 nr_laks_11 1347
1 pct_11 0.0472
1 pct_11_P18 0.7728
1 geni 0.7803
2 nr_laks 32315
2 nr_laks_11 490
2 pct_11 0.01516
2 pct_11_P18 0.7693
2 geni 0.7714
;
run;
When you place an informat like $10. on the INPUT statement, SAS reads exactly 10 characters, including delimiters and characters that belong to the following variable. You can change this behavior by adding the colon (:) informat modifier:
Data a;
infile cards truncover;
input pop var_name :$10. val;
cards;
1 nr_laks 28520
1 nr_laks_11 1347
1 pct_11 0.0472
1 pct_11_P18 0.7728
1 geni 0.7803
2 nr_laks 32315
2 nr_laks_11 490
2 pct_11 0.01516
2 pct_11_P18 0.7693
2 geni 0.7714
;
run;
The actual problem is not the length of the variable but the way you asked the INPUT statement to read from the line of text. By using formatted input mode you are telling INPUT to read exactly 10 characters. So when the value is short it reads the space and some of the characters from the next field on the line.
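To see this for yourself, here is a minimal sketch that reruns the original INPUT statement on two of the posted lines and writes the values to the log. The data set name demo_formatted is just a placeholder I made up, and the sketch assumes the values are separated by single blanks as shown in the post.

```sas
/* Sketch only: demo_formatted is a hypothetical name.            */
/* Formatted input ($10.) reads a fixed 10 columns, so a short     */
/* value should pull in the blank and part of the next field.      */
data demo_formatted;
  infile cards truncover;
  input pop var_name $10. val;
  /* Write what was actually read to the log for inspection */
  put _n_= pop= var_name= val=;
cards;
1 nr_laks 28520
1 nr_laks_11 1347
;
run;
```

Compare the var_name and val values in the log for the short value (nr_laks) against the 10-character value (nr_laks_11).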
Also, there is no need to attach the $ informat to character variables. You appear to be trying to use the INFORMAT statement to define the variable. That works because SAS sets the length of a variable when it first sees the variable used. So if you use it first in an INFORMAT statement, SAS will guess that you want the length of the variable to match the width of the informat specification you attached to it.
So you could use:
length var_name $10;
input pop var_name val;
But note that this will make VAR_NAME the first variable in the dataset. You could expand the LENGTH statement to define all of your variables (it is generally good practice to define your variables before using them, even in languages like the SAS data step that don't require them to be defined).
length pop 8 var_name $10 val 8;
input pop var_name val;
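Put together as a complete step, the LENGTH approach might look like the sketch below. The data set name a_length is a placeholder, and only a few of the posted lines are included to keep it short.

```sas
/* Sketch only: a_length is a hypothetical name.                 */
/* LENGTH defines all three variables up front, in this order,   */
/* so plain list input can be used on the INPUT statement.       */
data a_length;
  length pop 8 var_name $10 val 8;
  infile cards truncover;
  input pop var_name val;
cards;
1 nr_laks 28520
1 nr_laks_11 1347
1 pct_11_P18 0.7728
;
run;
```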
But you could also just add the colon modifier in front of the in-line informat in your INPUT statement. That will allow you to specify an informat in the INPUT statement, but still just read the next field from the line (called LIST MODE input). Then there is no need for the LENGTH statement (any new variable will be defined as numeric if there is nothing to indicate otherwise).
input pop var_name :$10. val;
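The comma-separated version from the question needs the same treatment: with a bare $ the variable gets the default length of 8, so a 10-character value such as nr_laks_11 would be cut short. A minimal sketch of that step with the colon modifier follows. The data set name a_csv is a placeholder, and adding TRUNCOVER is my own choice to match the first step rather than something the question's comma version used.

```sas
/* Sketch only: a_csv is a hypothetical name.                     */
/* The colon modifier keeps list-mode reading (stop at the comma)  */
/* while the $10. informat sets var_name to length 10.             */
data a_csv;
  infile datalines dlm=',' dsd truncover;
  input pop var_name :$10. val;
datalines;
1,nr_laks,28520
1,nr_laks_11,1347
1,pct_11_P18,0.7728
;
run;
```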