BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sleo007
Obsidian | Level 7
Yes, it actually has data. I didn't realize that the date would change format however. How do I prevent the date from changing to 0; like how do I exclude the date variable from proc stdize? Because none of the dates are 0.
Tom
Super User Tom
Super User
You did not do anything to change the format attached to the date value. But since date variables are numeric if you ran some code that changes all of the missing numeric values to zeros then you converted the missing dates to day number zero, which the first day of the year 1960. You can tell the proc you used to convert missing to zero which variables it should operate on.
Please show the data (not photographs) that is in the spreadsheet FEB19 that is not in the SAS dataset(s) that you have created from it. Is the data missing in both the dataset with just the data from one sheet and in the dataset that combined the 2 (or 12) sheets and/or datasets? Is the missing data at the start of the spreadsheet? Then end? Are all of the variables missing? Just some of the 21 variables missing? Which variables are missing? Those on the left of the spreadsheet? On the right?
Perhaps the ranges defined in your spreadsheet do not match the filled cells in the spreadsheet?
To really have a repeatable process flow your source data should be in a more useful format than an Excel file. Like a series of text files that you can read with a data step. Such as CSV files.
Sleo007
Obsidian | Level 7

Hi Tom,

 

Thanks for your reply. I have attached images for you to see; The first is the Excel file of the original dataset & 1 of the SAS dataset created. There is much data on the Excel Feb19 sheet. I have not attached the data itself for confidentiality reasons. I hope you understand and thank you for your tremendous effort in helping me solve this problem. I am aware of the date variable. Just not sure how to exclude it from stdize procedure. I know I can specify which variables to include, but considering there are 21 variables, and I must repeat this process with 2 other datasets, it would be nicer if there was a way to exclude the date alone. I am not interested in turning this into CSVs, because I already went down that path & there are other problems with conversion to csv which require extra coding and time which I, sadly, do not have. On a positive note, I am confident that we are very close to the solution.

🙂

 

:Untitled.pngFeb_Temp_missing.PNG

Tom
Super User Tom
Super User

Are those two photographs supposed to be of the same data? The dates look totally different.  It is really hard to see how those are comparing the same things. Are you sure you are viewing in Excel the same file you are asking SAS to read?

 

Here is quick DATA _NULL_ you can use to help you look at the actual values in your SAS dataset.

This will read in the data from the FEB19 dataset and show the values of ALL of the variables (plus the automatic loop counter _N_) for the first and last observation.

data _null_;
  set feb19 end=eof;
  if _n_=1 or EOF then put (_n_ _all_) (=) ;
run;

Run that and look at the values of some of your key variables, like DATE, NETWORK and CHANNEL and then compare them to the values in the first (second row) and last observation in the Excel file.

Do they match?  If not how are they different?

 

Not try the same thing for the JAN19 dataset.

 

Then try this little step to combine the two.  

data _null_;
  set jan19 feb19 end=eof;
  if _n_=1 or eof then put (_n_ _all_) (=);
run;

Does the first observation match the first observation in the JAN19 dataset? Does the last match the last in the FEB19 dataset?

 

Here is what that results will look like using one of the sample datasets in SASHELP library.

523   data _null_;
524    set sashelp.class end=eof;
525    if _n_=1 or eof then put (_n_ _all_) (=) ;
526   run;

_N_=1 Name=Alfred Sex=M Age=14 Height=69 Weight=112.5
_N_=19 Name=William Sex=M Age=15 Height=66.5 Weight=112
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
Sleo007
Obsidian | Level 7
Yes, I verified the January data & it's the same. Running your code now. Will update soon.
Sleo007
Obsidian | Level 7
So the last observation in January differs. I believe because the campaign ID has 1 number less than the others, & thus all the following variables get truncated. I will try to force length with the procedure from earlier with Campaign_ID to see if this helps solve the problem.
Sleo007
Obsidian | Level 7
So the error I am seeing now is:

26 data jan19;
27 length Campaign_Name $100 Network $15 Campaign_ID $10;
28 set paid.Jan19;
ERROR: Variable Campaign_ID has been defined as both character and numeric.
29 format _character_ ;
30 length CName $100 NNetwork $15 Campaign_ID $10;
31 if vlength(Campaign_Name)=100.
32 then CName=Campaign_Name;
33 else CName = put(Campaign_Name,$CHAR100.);
34 if vlength(Network)=15.
35 then NNetwork=Network;
36 else NNetwork = put(Network,$CHAR15.);
37 if vlength(Campaign_ID)=10.
38 then CID=Campaign_ID;
39 else CID = put(Campaign_ID,$CHAR10.);
40 Drop Campaign_Name Network Campaign_ID;
41 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.JAN19 may be incomplete. When this step was stopped there were 0 observations and 21 variables.
WARNING: Data set WORK.JAN19 was not replaced because this step was stopped.

I need to define it as a character variable. I also manually changed it to text in the Excel file, but SAS still reads it as a BEST. Any idea on how to convert variable type after libref statement or in data step?
Sleo007
Obsidian | Level 7

i FINALLY FOUND THE SOLUTION!!!

 

There are multiple aspects involved in this process so hopefully this will help whoever wants to use Excel files to import into SAS!

 

/*Assigning librefs*/
     libname SASData "E:\DI_Project\Data\SAS";
     libname WorkData "E:\DI_Project\Data\WorkData";

 

/*Excel LIBNAME REF FOR DATA IMPORT*/
     libname Paid "E:\DI_Project\Data\RawData\Paidsearch2019.xlsx"
          header=yes                  /*Tells SAS variable names are in 1st row*/
          dbsaslabel=none         /*Removes labels*/
          scantime=yes              /*Tells SAS to read date & time variables as datetime*/
          mixed=yes                  /*Tells SAS not to put missing values if a variable has both character & numeric data*/
          stringDates=no;          /*Tells SAS NOT to treat dates as text strings or character variables*/

 

/*VERIFY VARIABLE LENGTHS & TYPES*/
proc contents data=paid."jan19$"n; run;
proc contents data=paid."feb19$"n; run;
proc contents data=paid."mar19$"n; run;
proc contents data=paid."apr19$"n; run;
proc contents data=paid."may19$"n; run;
proc contents data=paid."jun19$"n; run;
proc contents data=paid."jul19$"n; run;
proc contents data=paid."aug19$"n; run;
proc contents data=paid."sep19$"n; run;
proc contents data=paid."oct19$"n; run;
proc contents data=paid."nov19$"n; run;
proc contents data=paid."dec19$"n; run;

 

/*JANUARY IMPORT*/
/*Formatting January Date Variable*/
data jandate;                                 /*Output file or File being written to*/
     retain Date;                              /*Tells SAS to retain the date formatting from Excel (treat as a date)*/
     format Date ddmmyy10.;         /*Tells SAS to format date from date9. to ddmmyy10., but you can choose whatever format desired*/
     set paid."Jan19$A1:U470"n;    /*Input file or File being read from*/
run;

 

/*VERIFY DATE FORMAT CHANGE*/
proc contents data=jandate;
run;

 

/*METHOD 1: RE-ASSIGNING VARIABLE LENGTHS & TYPES*/
/*#1 - Create macrovariable varlist containing a list of variable names from Jan*/
proc contents data=jandate noprint out=metaclass;
run;

 

options nocenter nodate;
     title 'Meta Data for jandate';
proc print data=metaclass;
     var name type length varnum;
run;

 

data _null_;
     length allvars $1000;
     retain allvars '';
     set metaclass end=eof;
     allvars = trim(left(allvars))||''||left(name);
     if eof then call symput('varlist',allvars);
run;

 

%put &varlist;          /*Macrovariable is created!!!*/

 

/*#2 - MODIFY VARIABLE TYPE & LENGTH IN JANDATE*/

/*Essentially this is renaming the variables, creating new variables with the old names & inputting the data, so it's like you modified the types & lengths, when in reality you created new variables altogether.*/


data sasdata.jan19 (drop=v1-v21);                  /*Creating new permanent SAS dataset for merging later & renaming variables*/
     retain &varlist;                                             *<-- preserve variable order ;
     length                                                          *<-- define new types/lengths;
          Campaign_name $100 Campaign_ID $15 Network $15
          Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions Channel
          Bounce_rate Interaction_rate CTR Conversion_rate Daily_budget Cost CPC CPA Conversion_Value ROAS 8;
     format                                                           *<-- recreate formats;
          Campaign_name $CHAR100. Campaign_ID $CHAR15. Network $CHAR15.
          Date ddmmyy10.
          Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions 10.
          Channel 1.
          Bounce_rate Interaction_rate CTR Conversion_rate 4.2
          Daily_budget Cost CPC CPA Conversion_Value ROAS DOLLAR10.2;
     set jandate (rename=(  Campaign_name=v1 Campaign_ID=v2 Date=v3 Network=v4 Channel=v5 Impressions=v6
                                          Bounce_rate=v7 Ave_pageviews=v8 Ave_timeonsite_sec=v9 Ave_frequency=v10
                                          Interaction_rate=v11 Daily_budget=v12 Clicks=v13 CTR=v14 Cost=v15 CPC=v16
                                          Conversions=v17 Conversion_rate=v18 CPA=v19 Conversion_Value=v20 ROAS=v21));

          Campaign_name = put(v1,char100.);         *<-- redefine variable;
          Campaign_ID = put(v2,char15.);                *<-- redefine variable;
          Date = v3;
          Network = put(v4,char15.);                         *<-- redefine variable;
          Channel = v5;
          Impressions = v6;
          Bounce_rate = v7;
          Ave_pageviews = v8;
          Ave_timeonsite_sec = v9;
          Ave_frequency = v10;
          Interaction_rate = v11;
          Daily_budget = v12;
          Clicks = v13;
          CTR = v14;
          Cost = v15;
          CPC = v16;
          Conversions = v17;
          Conversion_rate = v18;
          CPA = put(v19,8.);                                    *<-- redefine variable;
          Conversion_Value = v20;
          ROAS = v21;
run;

 

/*FEBRUARY IMPORT: REPEAT EVERYTHING EXCEPT THE CREATION OF THE MACRO VARIABLE*/
/*Formatting January Date Variable*/
data febdate;
     retain Date;
     format Date ddmmyy10.;
     set paid."feb19$A1:U374"n;
run;

 

/*Verifying Date Format Change*/
proc contents data=febdate;
run;

 

/*RE-ASSIGNING VARIABLE LENGTHS & TYPES*/
/* Modifying variable type and length of febdate */
data sasdata.feb19 (drop=v1-v21);                  /*Creating new permanent SAS dataset for merging later & renaming variables*/
     retain &varlist;                                             *<-- preserve variable order ;
     length                                                          *<-- define new types/lengths;
          Campaign_name $100 Campaign_ID $15 Network $15
          Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions Channel
          Bounce_rate Interaction_rate CTR Conversion_rate Daily_budget Cost CPC CPA Conversion_Value ROAS 8;
     format                                                           *<-- recreate formats;
          Campaign_name $CHAR100. Campaign_ID $CHAR15. Network $CHAR15.
          Date ddmmyy10.
          Impressions Ave_pageviews Ave_timeonsite_sec Ave_frequency Clicks Conversions 10.
          Channel 1.
          Bounce_rate Interaction_rate CTR Conversion_rate 4.2
          Daily_budget Cost CPC CPA Conversion_Value ROAS DOLLAR10.2;
     set febdate (rename=(  Campaign_name=v1 Campaign_ID=v2 Date=v3 Network=v4 Channel=v5 Impressions=v6
                                          Bounce_rate=v7 Ave_pageviews=v8 Ave_timeonsite_sec=v9 Ave_frequency=v10
                                          Interaction_rate=v11 Daily_budget=v12 Clicks=v13 CTR=v14 Cost=v15 CPC=v16
                                          Conversions=v17 Conversion_rate=v18 CPA=v19 Conversion_Value=v20 ROAS=v21));

          Campaign_name = put(v1,char100.);         *<-- redefine variable;
          Campaign_ID = put(v2,char15.);                *<-- redefine variable;
          Date = v3;
          Network = put(v4,char15.);                         *<-- redefine variable;
          Channel = v5;
          Impressions = v6;
          Bounce_rate = v7;
          Ave_pageviews = v8;
          Ave_timeonsite_sec = v9;
          Ave_frequency = v10;
          Interaction_rate = v11;
          Daily_budget = v12;
          Clicks = v13;
          CTR = v14;
          Cost = v15;
          CPC = v16;
          Conversions = v17;
          Conversion_rate = v18;
          CPA = put(v19,8.);                                    *<-- redefine variable;
          Conversion_Value = v20;
          ROAS = v21;
run;

 

/*Verifying sasdata.feb19*/
proc contents data=sasdata.feb19;
run;

 

/*MERGE DATASETS*/
data sasdata.all_paid;
     set sasdata.jan19 sasdata.feb19;
run;
/*DATA NOT TRUNCATED!!!! YAAAASSSSSSS :)*/

 

*****NOTE: Obviously, the reassignment of the variables types & lengths part could be turned into a macro ultimately, or do-loop, but since I do not have time for this project, I will not cover this in this post. Enjoy!*****

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 37 replies
  • 1506 views
  • 6 likes
  • 3 in conversation