Can you please help me with to get all character fields value’s length in new fields then use all new fields in PROC MEANS to get the statistics details as mentioned below.
Below are sample inputs and output details which I am expecting.
Input:
Ques | Status | Interval | Num1 | Mode | address |
How are you? | Accepted | Monthly | 234 | one-way ticket | Hyatt Regency Louisville |
where are you going? | declined | hourly | 431 | Two-way ticket | Louisville Marriott Downtown |
What can you tell me about yourself? | declined | weekly | 755 | Car allowance | Mayo Clinic Sports Medicine |
Can you list your strengths? | Not applicable | by weekly | 753 | Car allowance | DoubleTree Suites |
What weaknesses do you have? | Accepted | hourly | 941 | one-way ticket | Sheboygan Falls |
Why should I consider hiring you? | Accepted | weekly | 526 | Two-way ticket | Chevy Chase Pavilion |
Where do you see yourself five years from now? | Not applicable | Monthly | 621 | Car allowance | Fox River Mall |
Why do you want to work here? | Accepted | weekly | 834 | one-way ticket | Merchandise Mart Plaza |
Output 1
Num1 | New_Ques | new_ Status | new_ Interval | new_ Mode | new_ address |
234 | 13 | 8 | 7 | 14 | 24 |
431 | 20 | 8 | 6 | 14 | 28 |
755 | 36 | 8 | 6 | 13 | 27 |
753 | 28 | 14 | 9 | 13 | 17 |
941 | 28 | 8 | 6 | 14 | 15 |
526 | 33 | 8 | 6 | 14 | 20 |
621 | 46 | 14 | 7 | 13 | 14 |
834 | 29 | 8 | 6 | 14 | 22 |
Expected Output by using Output 1
Var | Mean | Std | Min | Max | P5 | P25 | P50 | P75 |
Num1 | 636.875 | 231.7717 | 234 | 941 | 234 | 478.5 | 234 | 794.5 |
New_Ques | 29.125 | 9.948977 | 13 | 46 | 13 | 24 | 13 | 34.5 |
new_ Status | 9.5 | 2.77746 | 8 | 14 | 8 | 8 | 8 | 11 |
new_ Interval | 6.625 | 1.06066 | 6 | 9 | 6 | 6 | 6 | 7 |
new_ Mode | 13.625 | 0.517549 | 13 | 14 | 13 | 13 | 13 | 14 |
new_ address | 20.875 | 5.303301 | 14 | 28 | 14 | 16 | 14 | 25.5 |
Important Note: the original table has more than 100 fields and 2+ billion records. So the code should be more efficiency and performance.
It is more like a homework .
Does your teacher ask to that ?
data class;
set sashelp.class;
run;
data _null_;
set sashelp.vcolumn(where=(libname='WORK' and memname='CLASS' and type='char')) end=last;
if _n_=1 then call execute('proc sql;create table want as select ');
call execute(cat('lengthn(',name,') as ',name));
if last then call execute('from class;quit;');
else call execute(',');
run;
ods output summary=final_want;
proc means data=want Mean Std Min Max P5 P25 P50 P75 stackods;
var _numeric_;
run;
How are you going from the input to the output table? Your logic for that step is unclear to me.
Also, what have you tried so far?
@ez123 wrote:
Can you please help me with to get all character fields value’s length in new fields then use all new fields in PROC MEANS to get the statistics details as mentioned below.
Below are sample inputs and output details which I am expecting.
Input:
Ques
Status
Interval
Num1
Mode
address
How are you?
Accepted
Monthly
234
one-way ticket
Hyatt Regency Louisville
where are you going?
declined
hourly
431
Two-way ticket
Louisville Marriott Downtown
What can you tell me about yourself?
declined
weekly
755
Car allowance
Mayo Clinic Sports Medicine
Can you list your strengths?
Not applicable
by weekly
753
Car allowance
DoubleTree Suites
What weaknesses do you have?
Accepted
hourly
941
one-way ticket
Sheboygan Falls
Why should I consider hiring you?
Accepted
weekly
526
Two-way ticket
Chevy Chase Pavilion
Where do you see yourself five years from now?
Not applicable
Monthly
621
Car allowance
Fox River Mall
Why do you want to work here?
Accepted
weekly
834
one-way ticket
Merchandise Mart Plaza
Output 1
Num1
New_Ques
new_ Status
new_ Interval
new_ Mode
new_ address
234
13
8
7
14
24
431
20
8
6
14
28
755
36
8
6
13
27
753
28
14
9
13
17
941
28
8
6
14
15
526
33
8
6
14
20
621
46
14
7
13
14
834
29
8
6
14
22
Expected Output by using Output 1
Var
Mean
Std
Min
Max
P5
P25
P50
P75
Num1
636.875
231.7717
234
941
234
478.5
234
794.5
New_Ques
29.125
9.948977
13
46
13
24
13
34.5
new_ Status
9.5
2.77746
8
14
8
8
8
11
new_ Interval
6.625
1.06066
6
9
6
6
6
7
new_ Mode
13.625
0.517549
13
14
13
13
13
14
new_ address
20.875
5.303301
14
28
14
16
14
25.5
Important Note: the original table has more than 100 fields and 2+ billion records. So the code should be more efficiency and performance.
proc means data=means Mean Std Min Max P5 P25 P50 P75;
run;
Try running the simple proc means and check if there is any performance issue.
Did the approach and code I outlined not illustrate that?
Did it not work? OTher than explicitly listing out your variables, which I mentioned you needed to do, it should work. If it doesn’t, post the log and indicate any errors please.
@ez123 wrote:
I need help for following steps.
1. get all character fields value’s length in new fields (new fileds name should ne "New_&oldfield" (Just add prefix "New_").
2. Use PROC MEANS to get the statistics details for _numeric_ fileds.
nvm.
1. Create an array of all character variables you want to analyze. If all, use _character_ declaration instead.
2. Loop through the array and create a new set of variables with the length
3. Run a proc means on the result.
Make sure your variable names line up between the two arrays to have the names you want and the rest is relatively trivial.
data output1;
set have;
array _var(20) _character_;
array _len(20) len1-len20;
do i=1 to dim(_var);
_len(i) = length(_var(i));
end;
run;
proc means data=output1;
var len1-len20;
run;
@ez123 wrote:
Can you please help me with to get all character fields value’s length in new fields then use all new fields in PROC MEANS to get the statistics details as mentioned below.
Below are sample inputs and output details which I am expecting.
Input:
Ques
Status
Interval
Num1
Mode
address
How are you?
Accepted
Monthly
234
one-way ticket
Hyatt Regency Louisville
where are you going?
declined
hourly
431
Two-way ticket
Louisville Marriott Downtown
What can you tell me about yourself?
declined
weekly
755
Car allowance
Mayo Clinic Sports Medicine
Can you list your strengths?
Not applicable
by weekly
753
Car allowance
DoubleTree Suites
What weaknesses do you have?
Accepted
hourly
941
one-way ticket
Sheboygan Falls
Why should I consider hiring you?
Accepted
weekly
526
Two-way ticket
Chevy Chase Pavilion
Where do you see yourself five years from now?
Not applicable
Monthly
621
Car allowance
Fox River Mall
Why do you want to work here?
Accepted
weekly
834
one-way ticket
Merchandise Mart Plaza
Output 1
Num1
New_Ques
new_ Status
new_ Interval
new_ Mode
new_ address
234
13
8
7
14
24
431
20
8
6
14
28
755
36
8
6
13
27
753
28
14
9
13
17
941
28
8
6
14
15
526
33
8
6
14
20
621
46
14
7
13
14
834
29
8
6
14
22
Expected Output by using Output 1
Var
Mean
Std
Min
Max
P5
P25
P50
P75
Num1
636.875
231.7717
234
941
234
478.5
234
794.5
New_Ques
29.125
9.948977
13
46
13
24
13
34.5
new_ Status
9.5
2.77746
8
14
8
8
8
11
new_ Interval
6.625
1.06066
6
9
6
6
6
7
new_ Mode
13.625
0.517549
13
14
13
13
13
14
new_ address
20.875
5.303301
14
28
14
16
14
25.5
Important Note: the original table has more than 100 fields and 2+ billion records. So the code should be more efficiency and performance.
data class;
set sashelp.class;
run;
data _null_;
set sashelp.vcolumn(where=(libname='WORK' and memname='CLASS' and type='char')) end=last;
if _n_=1 then call execute('proc sql;create table want as select ');
call execute(cat('lengthn(',name,') as ',name));
if last then call execute('from class;quit;');
else call execute(',');
run;
Hi Ksharp, This is awesome solution to find the length of values for all fields. Can you please tell me how to get the PROC mean reprts in output dataset?
1. Use PROC MEANS to get the statistics details from want dataset for _numeric_ fileds.
2. Write the output in SAS dataset.
It is more like a homework .
Does your teacher ask to that ?
data class;
set sashelp.class;
run;
data _null_;
set sashelp.vcolumn(where=(libname='WORK' and memname='CLASS' and type='char')) end=last;
if _n_=1 then call execute('proc sql;create table want as select ');
call execute(cat('lengthn(',name,') as ',name));
if last then call execute('from class;quit;');
else call execute(',');
run;
ods output summary=final_want;
proc means data=want Mean Std Min Max P5 P25 P50 P75 stackods;
var _numeric_;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.