BookmarkSubscribeRSS Feed
ashwinangiah
Fluorite | Level 6

Select a real-time dataset of your choice.

Perform the following steps:

a) Data Trimming 

b) Data Cleaning

c) Understand data behavior with help of graphs and charts

d) Understand the data with various statistical tools.

Submit individual project: code along with interpretation.

Embed SAS code and output screenshots within the same word document.

9 REPLIES 9
PaigeMiller
Diamond | Level 26

What is your question?

 

--
Paige Miller
ashwinangiah
Fluorite | Level 6

using the attached excel sheet how to perform the following steps:

a) Data Trimming 

b) Data Cleaning

c) Understand data behavior with help of graphs and charts

d) Understand the data with various statistical tools.

please explain with code along with interpretation.

Embed SAS code and output screenshots within the same document to better understand.

PaigeMiller
Diamond | Level 26

Well, honestly, it sounds like you want someone to do your homework for you. And I am not (and many others are not) willing to do that.

 

However, if you write some SAS code, and it's not working, then show us the LOG, I (and many others) will be happy to provide help.

--
Paige Miller
ashwinangiah
Fluorite | Level 6

1. code for data trimming 

 

PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
OUT=Sales
dbms=xlsx;
Run;

data work.strip_trimleft_trimnleft;
set work.cities;
string = "*" || city || "*";
strip = "*" || strip(city) || "*";
trim_left = "*" || trim(left(city)) || "*";
trimn_left = "*" || trimn(left(city)) || "*";

drop city;
run;

 

2. code for data cleaning :

PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
OUT=Sales
dbms=xlsx;
Run;

data work.strip_trimleft_trimnleft;
set work.cities;
string = "*" || city || "*";
strip = "*" || strip(city) || "*";
trim_left = "*" || trim(left(city)) || "*";
trimn_left = "*" || trimn(left(city)) || "*";

drop city;
run;

data work.compress_compbl;
set work.cities;
string = "*" || city || "*";
compress = "*" || compress(city) || "*";
compbl = "*" || compbl(city) || "*";

drop city;
run;

 

but I am getting too many errors....

please help me with this sas programming.

ashwinangiah
Fluorite | Level 6
my log says:
WARNING: The data set WORK STRIP TRIMLEFT TRIMNLEFT may be incomplete. When this step was stopped there were 0 observations and 4

WARNING. The data WORK COMPRESS COMPBL may be complete. When this step was stopped there were 0 observations and 3 variables. and ERROR: File WORK.CITIES.DATA does not exist.
PaigeMiller
Diamond | Level 26

In my opinion, data trimming is not removing blanks from the city name. I don't know what is expected of "data trimming" in this homework assignment, but I doubt that's it. You need to make sure you are doing the "data trimming" that is required. Similarly, "data cleaning" is not whatever it is you show under your "data cleaning" code. I think you have gone down the wrong path. You might be wise to discuss these with your professor.

 

So you need to straighten out those items with your professor and make sure your are attempting the right things. We cannot help here until you know (and then let us know) what is needed.

 

I want you to go down the RIGHT path here, and so there's no point in fixing code that takes you down the wrong path.

 

In the future:

 

If you are having errors, then please show us the LOG. We need to see the ENTIRE log, that's 100%, every single character, so we can see your code as it appears in the log, and we can see the NOTEs, WARNINGs and ERRORs. Copy the log as text and paste it into the window that appears when you click on the </> icon. PLEASE FOLLOW THESE INSTRUCTIONS. Failure to follow the instruction only slows you down, as we will again ask for you to follow these instructions.

 

Next, I (and many people) do not download Excel files, as they are a security threat. The proper way to provide us data for use is via these instructions (and not via an Excel file or a screen capture of an Excel file).

--
Paige Miller
ashwinangiah
Fluorite | Level 6

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
74 OUT=Sales
75 dbms=xlsx;
76 Run;

NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with
options MSGLEVEL=I.
NOTE: The import data set has 700 observations and 16 variables.
NOTE: WORK.SALES data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.11 seconds
user cpu time 0.11 seconds
system cpu time 0.00 seconds
memory 2809.40k
OS Memory 31400.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 24 Switch Count 2
Page Faults 0
Page Reclaims 818
Page Swaps 0
Voluntary Context Switches 19
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264


77
78 data work.strip_trimleft_trimnleft;
79 set work.cities;
ERROR: File WORK.CITIES.DATA does not exist.
80 string = "*" || city || "*";
81 strip = "*" || strip(city) || "*";
82 trim_left = "*" || trim(left(city)) || "*";
83 trimn_left = "*" || trimn(left(city)) || "*";
84
85 drop city;
86 run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
80:21 81:26 82:34 83:36
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.STRIP_TRIMLEFT_TRIMNLEFT may be incomplete. When this step was stopped there were 0 observations and 4
variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 632.96k
OS Memory 29864.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 25 Switch Count 2
Page Faults 0
Page Reclaims 116
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264


87
88 data work.compress_compbl;
89 set work.cities;
ERROR: File WORK.CITIES.DATA does not exist.
90 string = "*" || city || "*";
91 compress = "*" || compress(city) || "*";
92 compbl = "*" || compbl(city) || "*";
93
94 drop city;
95 run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
90:21 91:32 92:28
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.COMPRESS_COMPBL may be incomplete. When this step was stopped there were 0 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 631.96k
OS Memory 29864.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 26 Switch Count 2
Page Faults 0
Page Reclaims 89
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264


96
97 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
109
Reeza
Super User

You're using the wrong definition of TRIM. To trim statistics usually means to remove outliers/edge case data. You're assuming it applies to character variables. 

 

https://hollyemblem.medium.com/when-to-use-a-trimmed-mean-fd6aab347e46

 


@ashwinangiah wrote:

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
74 OUT=Sales
75 dbms=xlsx;
76 Run;

NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with
options MSGLEVEL=I.
NOTE: The import data set has 700 observations and 16 variables.
NOTE: WORK.SALES data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.11 seconds
user cpu time 0.11 seconds
system cpu time 0.00 seconds
memory 2809.40k
OS Memory 31400.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 24 Switch Count 2
Page Faults 0
Page Reclaims 818
Page Swaps 0
Voluntary Context Switches 19
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264


77
78 data work.strip_trimleft_trimnleft;
79 set work.cities;
ERROR: File WORK.CITIES.DATA does not exist.
80 string = "*" || city || "*";
81 strip = "*" || strip(city) || "*";
82 trim_left = "*" || trim(left(city)) || "*";
83 trimn_left = "*" || trimn(left(city)) || "*";
84
85 drop city;
86 run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
80:21 81:26 82:34 83:36
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.STRIP_TRIMLEFT_TRIMNLEFT may be incomplete. When this step was stopped there were 0 observations and 4
variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 632.96k
OS Memory 29864.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 25 Switch Count 2
Page Faults 0
Page Reclaims 116
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264


87
88 data work.compress_compbl;
89 set work.cities;
ERROR: File WORK.CITIES.DATA does not exist.
90 string = "*" || city || "*";
91 compress = "*" || compress(city) || "*";
92 compbl = "*" || compbl(city) || "*";
93
94 drop city;
95 run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
90:21 91:32 92:28
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.COMPRESS_COMPBL may be incomplete. When this step was stopped there were 0 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 631.96k
OS Memory 29864.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 26 Switch Count 2
Page Faults 0
Page Reclaims 89
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264


96
97 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
109

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 985 views
  • 2 likes
  • 4 in conversation