Hi Tom,
Thanks for your reply. I have attached images for you to see; The first is the Excel file of the original dataset & 1 of the SAS dataset created. There is much data on the Excel Feb19 sheet. I have not attached the data itself for confidentiality reasons. I hope you understand and thank you for your tremendous effort in helping me solve this problem. I am aware of the date variable. Just not sure how to exclude it from stdize procedure. I know I can specify which variables to include, but considering there are 21 variables, and I must repeat this process with 2 other datasets, it would be nicer if there was a way to exclude the date alone. I am not interested in turning this into CSVs, because I already went down that path & there are other problems with conversion to csv which require extra coding and time which I, sadly, do not have. On a positive note, I am confident that we are very close to the solution.
🙂
:
Are those two photographs supposed to be of the same data? The dates look totally different. It is really hard to see how those are comparing the same things. Are you sure you are viewing in Excel the same file you are asking SAS to read?
Here is quick DATA _NULL_ you can use to help you look at the actual values in your SAS dataset.
This will read in the data from the FEB19 dataset and show the values of ALL of the variables (plus the automatic loop counter _N_) for the first and last observation.
data _null_;
set feb19 end=eof;
if _n_=1 or EOF then put (_n_ _all_) (=) ;
run;
Run that and look at the values of some of your key variables, like DATE, NETWORK and CHANNEL and then compare them to the values in the first (second row) and last observation in the Excel file.
Do they match? If not how are they different?
Not try the same thing for the JAN19 dataset.
Then try this little step to combine the two.
data _null_;
set jan19 feb19 end=eof;
if _n_=1 or eof then put (_n_ _all_) (=);
run;
Does the first observation match the first observation in the JAN19 dataset? Does the last match the last in the FEB19 dataset?
Here is what that results will look like using one of the sample datasets in SASHELP library.
523 data _null_; 524 set sashelp.class end=eof; 525 if _n_=1 or eof then put (_n_ _all_) (=) ; 526 run; _N_=1 Name=Alfred Sex=M Age=14 Height=69 Weight=112.5 _N_=19 Name=William Sex=M Age=15 Height=66.5 Weight=112 NOTE: There were 19 observations read from the data set SASHELP.CLASS.
i FINALLY FOUND THE SOLUTION!!!
There are multiple aspects involved in this process so hopefully this will help whoever wants to use Excel files to import into SAS!
/*Assigning librefs*/
libname SASData "E:\DI_Project\Data\SAS";
libname WorkData "E:\DI_Project\Data\WorkData";
/*Excel LIBNAME REF FOR DATA IMPORT*/
libname Paid "E:\DI_Project\Data\RawData\Paidsearch2019.xlsx"
header=yes /*Tells SAS variable names are in 1st row*/
dbsaslabel=none /*Removes labels*/
scantime=yes /*Tells SAS to read date & time variables as datetime*/
mixed=yes /*Tells SAS not to put missing values if a variable has both character & numeric data*/
stringDates=no; /*Tells SAS NOT to treat dates as text strings or character variables*/
/*VERIFY VARIABLE LENGTHS & TYPES*/
proc contents data=paid."jan19$"n; run;
proc contents data=paid."feb19$"n; run;
proc contents data=paid."mar19$"n; run;
proc contents data=paid."apr19$"n; run;
proc contents data=paid."may19$"n; run;
proc contents data=paid."jun19$"n; run;
proc contents data=paid."jul19$"n; run;
proc contents data=paid."aug19$"n; run;
proc contents data=paid."sep19$"n; run;
proc contents data=paid."oct19$"n; run;
proc contents data=paid."nov19$"n; run;
proc contents data=paid."dec19$"n; run;
/*JANUARY IMPORT*/
/*Formatting January Date Variable*/
data jandate; /*Output file or File being written to*/
retain Date; /*Tells SAS to retain the date formatting from Excel (treat as a date)*/
format Date ddmmyy10.; /*Tells SAS to format date from date9. to ddmmyy10., but you can choose whatever format desired*/
set paid."Jan19$A1:U470"n; /*Input file or File being read from*/
run;
/*VERIFY DATE FORMAT CHANGE*/
proc contents data=jandate;
run;
/*METHOD 1: RE-ASSIGNING VARIABLE LENGTHS & TYPES*/
/*#1 - Create macrovariable varlist containing a list of variable names from Jan*/
proc contents data=jandate noprint out=metaclass;
run;
options nocenter nodate;
title 'Meta Data for jandate';
proc print data=metaclass;
var name type length varnum;
run;
data _null_;
length allvars $1000;
retain allvars '';
set metaclass end=eof;
allvars = trim(left(allvars))||''||left(name);
if eof then call symput('varlist',allvars);
run;
%put &varlist; /*Macrovariable is created!!!*/
/*#2 - MODIFY VARIABLE TYPE & LENGTH IN JANDATE*/
/*Essentially this is renaming the variables, creating new variables with the old names & inputting the data, so it's like you modified the types & lengths, when in reality you created new variables altogether.*/
data sasdata.jan19 (drop=v1-v21); /*Creating new permanent SAS dataset for merging later & renaming variables*/
retain &varlist; *<-- preserve variable order ;
length *<-- define new types/lengths;
Campaign_name $100 Campaign_ID $15 Network $15
Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions Channel
Bounce_rate Interaction_rate CTR Conversion_rate Daily_budget Cost CPC CPA Conversion_Value ROAS 8;
format *<-- recreate formats;
Campaign_name $CHAR100. Campaign_ID $CHAR15. Network $CHAR15.
Date ddmmyy10.
Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions 10.
Channel 1.
Bounce_rate Interaction_rate CTR Conversion_rate 4.2
Daily_budget Cost CPC CPA Conversion_Value ROAS DOLLAR10.2;
set jandate (rename=( Campaign_name=v1 Campaign_ID=v2 Date=v3 Network=v4 Channel=v5 Impressions=v6
Bounce_rate=v7 Ave_pageviews=v8 Ave_timeonsite_sec=v9 Ave_frequency=v10
Interaction_rate=v11 Daily_budget=v12 Clicks=v13 CTR=v14 Cost=v15 CPC=v16
Conversions=v17 Conversion_rate=v18 CPA=v19 Conversion_Value=v20 ROAS=v21));
Campaign_name = put(v1,char100.); *<-- redefine variable;
Campaign_ID = put(v2,char15.); *<-- redefine variable;
Date = v3;
Network = put(v4,char15.); *<-- redefine variable;
Channel = v5;
Impressions = v6;
Bounce_rate = v7;
Ave_pageviews = v8;
Ave_timeonsite_sec = v9;
Ave_frequency = v10;
Interaction_rate = v11;
Daily_budget = v12;
Clicks = v13;
CTR = v14;
Cost = v15;
CPC = v16;
Conversions = v17;
Conversion_rate = v18;
CPA = put(v19,8.); *<-- redefine variable;
Conversion_Value = v20;
ROAS = v21;
run;
/*FEBRUARY IMPORT: REPEAT EVERYTHING EXCEPT THE CREATION OF THE MACRO VARIABLE*/
/*Formatting January Date Variable*/
data febdate;
retain Date;
format Date ddmmyy10.;
set paid."feb19$A1:U374"n;
run;
/*Verifying Date Format Change*/
proc contents data=febdate;
run;
/*RE-ASSIGNING VARIABLE LENGTHS & TYPES*/
/* Modifying variable type and length of febdate */
data sasdata.feb19 (drop=v1-v21); /*Creating new permanent SAS dataset for merging later & renaming variables*/
retain &varlist; *<-- preserve variable order ;
length *<-- define new types/lengths;
Campaign_name $100 Campaign_ID $15 Network $15
Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions Channel
Bounce_rate Interaction_rate CTR Conversion_rate Daily_budget Cost CPC CPA Conversion_Value ROAS 8;
format *<-- recreate formats;
Campaign_name $CHAR100. Campaign_ID $CHAR15. Network $CHAR15.
Date ddmmyy10.
Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions 10.
Channel 1.
Bounce_rate Interaction_rate CTR Conversion_rate 4.2
Daily_budget Cost CPC CPA Conversion_Value ROAS DOLLAR10.2;
set febdate (rename=( Campaign_name=v1 Campaign_ID=v2 Date=v3 Network=v4 Channel=v5 Impressions=v6
Bounce_rate=v7 Ave_pageviews=v8 Ave_timeonsite_sec=v9 Ave_frequency=v10
Interaction_rate=v11 Daily_budget=v12 Clicks=v13 CTR=v14 Cost=v15 CPC=v16
Conversions=v17 Conversion_rate=v18 CPA=v19 Conversion_Value=v20 ROAS=v21));
Campaign_name = put(v1,char100.); *<-- redefine variable;
Campaign_ID = put(v2,char15.); *<-- redefine variable;
Date = v3;
Network = put(v4,char15.); *<-- redefine variable;
Channel = v5;
Impressions = v6;
Bounce_rate = v7;
Ave_pageviews = v8;
Ave_timeonsite_sec = v9;
Ave_frequency = v10;
Interaction_rate = v11;
Daily_budget = v12;
Clicks = v13;
CTR = v14;
Cost = v15;
CPC = v16;
Conversions = v17;
Conversion_rate = v18;
CPA = put(v19,8.); *<-- redefine variable;
Conversion_Value = v20;
ROAS = v21;
run;
/*Verifying sasdata.feb19*/
proc contents data=sasdata.feb19;
run;
/*MERGE DATASETS*/
data sasdata.all_paid;
set sasdata.jan19 sasdata.feb19;
run;
/*DATA NOT TRUNCATED!!!! YAAAASSSSSSS :)*/
*****NOTE: Obviously, the reassignment of the variables types & lengths part could be turned into a macro ultimately, or do-loop, but since I do not have time for this project, I will not cover this in this post. Enjoy!*****
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.