- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
my dataset has some numeric and some character variables
company_name | esg2021 | esg2020 | esg2019 | esg2018 |
Talwalkars Better Value Fitness Ltd | ||||
Tarapur Transformers Ltd | ||||
GB Global Ltd | ||||
Jaypee Infratech Ltd | ||||
SJVN Ltd | ||||
Rainbow Papers Ltd | ||||
Sundaram Multi Pap Ltd | ||||
Pioneer Distilleries Ltd | ||||
Technofab Engineering Ltd | ||||
Denso India Pvt Ltd | ||||
Manappuram Finance Ltd | 53.26 | 56.28 | ||
Lakshmi Finance and Industrial Corp Ltd | ||||
Hindustan Media Ventures Ltd | ||||
Tilaknagar Industries Ltd | ||||
Indian Metals and Ferro Alloys Ltd | ||||
Bliss GVS Pharma Ltd | ||||
Emami Realty Ltd | ||||
Bajaj Consumer Care Ltd | ||||
Prakash Steelage Ltd | ||||
Asian Hotels (West) Ltd | ||||
Sakthi Finance Ltd | ||||
Asian Hotels East Ltd | ||||
Industrial Investment Trust Ltd |
the data is relating to esg of the companies from esg2021 to esg2001(ie.,year 2021 to 2001)
in this except, company_name, all other variables should be in numeric form. but, some of the esg variables are in character and some are in numeric form.
how to convert all the esg variables in to numeric
thanks in advance
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
you must put a calculated column like for example:
esg2021_new = esg2021 * 1;
so this new variable becomes numerical
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
WHY are some of the esg... variables character? Are you importing from an Excel?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
NA
>99
<1
SAS didn't want to make those variables character. It was forced to do so because some of the values contain characters.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@srikanthyadav44 wrote:
my dataset has some numeric and some character variables
in this except, company_name, all other variables should be in numeric form. but, some of the esg variables are in character and some are in numeric form.
....
how to convert all the esg variables in to numeric
By fixing the step reading the file. If the source is an excel-file, convert it to csv and write a data step to read it. Anything else will fail sooner or later.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
you must put a calculated column like for example:
esg2021_new = esg2021 * 1;
so this new variable becomes numerical
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@robsonandrade0 wrote:
you must put a calculated column like for example:
esg2021_new = esg2021 * 1;
so this new variable becomes numerical
Bad idea, using this code notes (that should be at least warnings) are written to the log about automatic type conversation. Most likely causing errors if esg2021 has non-numeric content.
esg_2021_new = input(esg2012, ?? best.);
The ?? take care of values that aren't numeric, the input-function returns a missing value in those cases.
The data seems to be in spreadsheet-style, easy to read for humans, difficult to work with in code. So i would do something like this
/* untested code */
data transposed;
set have;
length year esg 8;
array values[2001:2021] esg2001-esg2021;
do year = lbound(values) to hbound(values);
esg = input(values[year], ?? best.);
if missing(esg) and not missing(values[year]) then do;
/* maybe write a note to the log if conversion failed */
end;
output;
end;
drop i esg2001-esg2021;
run;
Maybe using proc transpose before converting the esg-variables is even better, but i am to lazy to change the code now 😉
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
your technique to convert the character variable to numeric is working.
can you please help me out in writing the SAS CODE with array function for the same purpose, because i have large number of variables to convert them into numeric.
i have some numeric and character variables in the file. i think it is better to multiple all the variables with 1 and create a new variable and them remove the old one.
it can be done easy, if i use array function. please help me out in this regard.
i have to do the following functions with the code
1. except date, i have to multiple all variables from Y1 to Y100 with 1 and create a new variable
2. remove the old variable
3. assign old variable name to new variable
the model of my input file is as follows.
Date | Y1 | Y2 | Y3 | Y4 | Y5 |
1-Apr-09 | 1.85 | 0.8 | 4.62 | 5.46 | |
2-Apr-09 | 2.12 | 9.8 | 7.09 | 1.61 | |
6-Apr-09 | -3.14 | 2.5 | 11.63 | 10.48 | |
8-Apr-09 | 1.76 | 4.39 | -0.97 | -4.23 | |
9-Apr-09 | -0.68 | 11.09 | 2.22 | 2.69 | |
13-Apr-09 | 0.75 | 2.63 | 1.76 | 1.34 | |
15-Apr-09 | 3.85 | 4.22 | 7.12 | 4.25 | |
16-Apr-09 | 0.36 | -5.08 | -5.17 | -4.33 | |
17-Apr-09 | -0.9 | 2.74 | 0.17 | -1.18 | |
20-Apr-09 | 1.25 | -0.53 | 0.17 | 0.71 | |
21-Apr-09 | -0.82 | 3.74 | 2.78 | 0.24 | |
22-Apr-09 | 5.26 | -3.52 | 1.03 | 2.4 | |
23-Apr-09 | 0.3 | -0.15 | 6.44 | 0.6 | |
24-Apr-09 | 4.88 | 0 | 14.55 | 3.36 | |
27-Apr-09 | -2.98 | 0.77 | -4.29 | -3.61 | |
28-Apr-09 | -2.32 | -3.92 | -7.17 | -5.24 | |
29-Apr-09 | 1.44 | 2.95 | 6.54 | 4.61 | |
4-May-09 | 0.38 | 3.22 | 9.99 | 4.88 | |
5-May-09 | -0.21 | -0.8 | -2.37 | 1.53 | |
6-May-09 | -2.09 | 0.09 | -1.31 | -2.1 | |
7-May-09 | -2.44 | 0.09 | 1.03 | 2 | |
8-May-09 | -3.55 | -2.33 | -4.47 | 0.77 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data exemplo;
input Date $ 1-7 Y1 $ 8-13 Y2 $ 14-19 Y3 $ 20-25 Y4 $ 26-26 Y5 $ 27-33;
datalines;
1Apr09 1.85 0.8 4.62 5.46
2Apr09 2.12 9.8 7.09 1.61
6Apr09 -3.14 2.5 11.63 10.48
8Apr09 1.76 4.39 -0.97 -4.23
9Apr09 -0.68 11.09 2.22 2.69
13Apr09 0.75 2.63 1.76 1.34
15Apr09 3.85 4.22 7.12 4.25
16Apr09 0.36 -5.08 -5.17 -4.33
17Apr09 -0.9 2.74 0.17 -1.18
20Apr09 1.25 -0.53 0.17 0.71
21Apr09 -0.82 3.74 2.78 0.24
22Apr09 5.26 -3.52 1.03 2.4
23Apr09 0.3 -0.15 6.44 0.6
24Apr09 4.88 0 14.55 3.36
27Apr09 -2.98 0.77 -4.29 -3.61
28Apr09 -2.32 -3.92 -7.17 -5.24
29Apr09 1.44 2.95 6.54 4.61
4May09 0.38 3.22 9.99 4.88
5May09 -0.21 -0.8 -2.37 1.53
6May09 -2.09 0.09 -1.31 -2.1
7May09 -2.44 0.09 1.03 2
8May09 -3.55 -2.33 -4.47 0.77
;
data teste (drop= i Y1-Y5);
set work.exemplo;
array X{5} X1-X5;
array Y{5} Y1-Y5;
do i=1 to 5;
X{i}=Y{i}*1;
end;
RUN;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data exemplo;
input Date $ 1-7 Y1 $ 8-13 Y2 $ 14-19 Y3 $ 20-25 Y4 $ 26-26 Y5 $ 27-33;
datalines;
1Apr09 1.85 0.8 4.62 5.46
2Apr09 2.12 9.8 7.09 1.61
6Apr09 -3.14 2.5 11.63 10.48
8Apr09 1.76 4.39 -0.97 -4.23
9Apr09 -0.68 11.09 2.22 2.69
13Apr09 0.75 2.63 1.76 1.34
15Apr09 3.85 4.22 7.12 4.25
16Apr09 0.36 -5.08 -5.17 -4.33
17Apr09 -0.9 2.74 0.17 -1.18
20Apr09 1.25 -0.53 0.17 0.71
21Apr09 -0.82 3.74 2.78 0.24
22Apr09 5.26 -3.52 1.03 2.4
23Apr09 0.3 -0.15 6.44 0.6
24Apr09 4.88 0 14.55 3.36
27Apr09 -2.98 0.77 -4.29 -3.61
28Apr09 -2.32 -3.92 -7.17 -5.24
29Apr09 1.44 2.95 6.54 4.61
4May09 0.38 3.22 9.99 4.88
5May09 -0.21 -0.8 -2.37 1.53
6May09 -2.09 0.09 -1.31 -2.1
7May09 -2.44 0.09 1.03 2
8May09 -3.55 -2.33 -4.47 0.77
;
data teste (drop= i Y1-Y5);
set work.exemplo;
array X{5} X1-X5;
array Y{5} Y1-Y5;
do i=1 to 5;
X{i}=Y{i}*1;
end;
RUN;