Select a real-time dataset of your choice.
Perform the following steps:
a) Data Trimming
b) Data Cleaning
c) Understand data behavior with help of graphs and charts
d) Understand the data with various statistical tools.
Submit individual project: code along with interpretation.
Embed SAS code and output screenshots within the same word document.
What is your question?
using the attached excel sheet how to perform the following steps:
a) Data Trimming
b) Data Cleaning
c) Understand data behavior with help of graphs and charts
d) Understand the data with various statistical tools.
please explain with code along with interpretation.
Embed SAS code and output screenshots within the same document to better understand.
Well, honestly, it sounds like you want someone to do your homework for you. And I am not (and many others are not) willing to do that.
However, if you write some SAS code, and it's not working, then show us the LOG, I (and many others) will be happy to provide help.
1. code for data trimming
PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
OUT=Sales
dbms=xlsx;
Run;
data work.strip_trimleft_trimnleft;
set work.cities;
string = "*" || city || "*";
strip = "*" || strip(city) || "*";
trim_left = "*" || trim(left(city)) || "*";
trimn_left = "*" || trimn(left(city)) || "*";
drop city;
run;
2. code for data cleaning :
PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
OUT=Sales
dbms=xlsx;
Run;
data work.strip_trimleft_trimnleft;
set work.cities;
string = "*" || city || "*";
strip = "*" || strip(city) || "*";
trim_left = "*" || trim(left(city)) || "*";
trimn_left = "*" || trimn(left(city)) || "*";
drop city;
run;
data work.compress_compbl;
set work.cities;
string = "*" || city || "*";
compress = "*" || compress(city) || "*";
compbl = "*" || compbl(city) || "*";
drop city;
run;
but I am getting too many errors....
please help me with this sas programming.
In my opinion, data trimming is not removing blanks from the city name. I don't know what is expected of "data trimming" in this homework assignment, but I doubt that's it. You need to make sure you are doing the "data trimming" that is required. Similarly, "data cleaning" is not whatever it is you show under your "data cleaning" code. I think you have gone down the wrong path. You might be wise to discuss these with your professor.
So you need to straighten out those items with your professor and make sure your are attempting the right things. We cannot help here until you know (and then let us know) what is needed.
I want you to go down the RIGHT path here, and so there's no point in fixing code that takes you down the wrong path.
In the future:
If you are having errors, then please show us the LOG. We need to see the ENTIRE log, that's 100%, every single character, so we can see your code as it appears in the log, and we can see the NOTEs, WARNINGs and ERRORs. Copy the log as text and paste it into the window that appears when you click on the </> icon. PLEASE FOLLOW THESE INSTRUCTIONS. Failure to follow the instruction only slows you down, as we will again ask for you to follow these instructions.
Next, I (and many people) do not download Excel files, as they are a security threat. The proper way to provide us data for use is via these instructions (and not via an Excel file or a screen capture of an Excel file).
From which fact do you assume that there is a dataset CITIES in WORK?
You imported a dataset SALES.
You're using the wrong definition of TRIM. To trim statistics usually means to remove outliers/edge case data. You're assuming it applies to character variables.
https://hollyemblem.medium.com/when-to-use-a-trimmed-mean-fd6aab347e46
@ashwinangiah wrote:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 PROC IMPORT DATAFILE="/home/u58805020/BI/Financial Sample.xlsx"
74 OUT=Sales
75 dbms=xlsx;
76 Run;
NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with
options MSGLEVEL=I.
NOTE: The import data set has 700 observations and 16 variables.
NOTE: WORK.SALES data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.11 seconds
user cpu time 0.11 seconds
system cpu time 0.00 seconds
memory 2809.40k
OS Memory 31400.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 24 Switch Count 2
Page Faults 0
Page Reclaims 818
Page Swaps 0
Voluntary Context Switches 19
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
77
78 data work.strip_trimleft_trimnleft;
79 set work.cities;
ERROR: File WORK.CITIES.DATA does not exist.
80 string = "*" || city || "*";
81 strip = "*" || strip(city) || "*";
82 trim_left = "*" || trim(left(city)) || "*";
83 trimn_left = "*" || trimn(left(city)) || "*";
84
85 drop city;
86 run;
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
80:21 81:26 82:34 83:36
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.STRIP_TRIMLEFT_TRIMNLEFT may be incomplete. When this step was stopped there were 0 observations and 4
variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 632.96k
OS Memory 29864.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 25 Switch Count 2
Page Faults 0
Page Reclaims 116
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
87
88 data work.compress_compbl;
89 set work.cities;
ERROR: File WORK.CITIES.DATA does not exist.
90 string = "*" || city || "*";
91 compress = "*" || compress(city) || "*";
92 compbl = "*" || compbl(city) || "*";
93
94 drop city;
95 run;
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
90:21 91:32 92:28
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.COMPRESS_COMPBL may be incomplete. When this step was stopped there were 0 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 631.96k
OS Memory 29864.00k
Timestamp 07/17/2021 03:57:50 PM
Step Count 26 Switch Count 2
Page Faults 0
Page Reclaims 89
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
96
97 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
109
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.