KHayes
Calcite | Level 5

Hi,

I have a time series dataset - 20 years for 37 countries. The data starts at year 2000 and ends at year 2019 for each country. I am trying to tell data that my variable, Year, is a year that restarts to 2000 once it hits 2019. But I can't seem to get the code right to put it in a year format. I receive this message (below) which deletes the Year observations. 

 

NOTE: Numeric values have been converted to character values at the places given by:

      (Line):(Column).
      4:23
NOTE: Variable charyear is uninitialized.
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).

      740 at 4:8

 

My code is below

data dataset1.pce;

set dataset1.pce;
Year = mdy (1,1,input(charyear, 4.));
format year year4.;

run;

14 REPLIES
ballardw
Super User

I think you need to explain what "I am trying to tell data that my variable, Year, is a year that restarts to 2000 once it hits 2019. But I can't seem to get the code right to put it in a year format." actually means.

That appears to be an issue with the values, not with the format.

Example data would be needed, showing what you currently have and what the result should look like.

 

For instance, the log note very clearly states that the variable CHARYEAR has no values at all (that is what "uninitialized" means), which probably means that no variable with that name existed until you used it in that statement.
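A minimal sketch of what "uninitialized" means here, assuming a small made-up data set (WORK.DEMO and its values are hypothetical, not the poster's data). Referencing a variable that is not in the input data set makes SAS create it on the spot with missing values, which is the pattern the log in the original post shows.

```sas
/* Hypothetical two-row data set: it has a numeric YEAR but no CHARYEAR. */
data work.demo;
  input country $ year;
  datalines;
AUS 2000
AUS 2001
;
run;

data work.demo2;
  set work.demo;
  /* CHARYEAR does not exist in WORK.DEMO, so SAS creates it here as a new
     variable that is missing on every row. The log in the original post
     shows the resulting notes: "Variable charyear is uninitialized" and
     "Missing values were generated". */
  newyear = mdy(1, 1, input(charyear, 4.));
run;

/* NEWYEAR is missing on every row; the original YEAR is left untouched
   only because the result was assigned to a different variable. */
proc print data=work.demo2;
run;
```

Assigning the result to a new variable (NEWYEAR here) rather than back to YEAR keeps the original values available for checking.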

 

You also need to be aware that when you use code like

data dataset1.pce;
set dataset1.pce;

you completely replace the input data set, and a logic error may leave your data unusable.

You really should write to a new data set name when recoding values to prevent such problems.

data dataset1.pce2;
set dataset1.pce;
KHayes
Calcite | Level 5
 

I copied over a snip of the dataset showing one country and a portion of a second country- there is a total of 37 countries with 20 years of data. Year is a predictor variable that starts  at year 2000 for each country until year 2019. I want SAS to know that the variable 'Year' is a year variable for each country. Below is my original code and the second picture happens when I run my code. Just FYI - I changed my code and substituted 'year' where I originally had charyear- the second picture is the result with that change. 

 

Also thanks for the heads up on the dataset name change! 

 

Screen Shot 2021-07-19 at 11.20.45 AM.png

Screen Shot 2021-07-19 at 11.28.45 AM.png

ballardw
Super User

As I said, you likely corrupted the "Year" data you had, setting it all to missing because of the missing values of "charyear". You need to go back in your process and recreate your data.

 

 

I still don't see what may have led you to do anything to the Year variable. It appears to be what you said you needed, as long as you process BY COUNTRY or include Code or Country as a CLASS variable in models. Since I can't code from a picture, I cannot demonstrate any use of BY-group processing.
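A rough sketch of BY-group processing, assuming made-up data: the data set WORK.PCE_DEMO and the variables COUNTRY, YEAR, and PCE are hypothetical stand-ins, since the real data is only visible in the screenshots. The BY statement goes inside whichever step should run once per country, and the data must be sorted by that variable first.

```sas
/* Hypothetical stand-in for the real data: one row per country and year. */
data work.pce_demo;
  input country $ year pce;
  datalines;
USA 2000 10
USA 2001 12
CAN 2000 7
CAN 2001 9
;
run;

/* BY-group processing expects the data sorted by the BY variables. */
proc sort data=work.pce_demo out=work.pce_sorted;
  by country year;
run;

/* In a procedure, BY produces a separate analysis for each country. */
proc means data=work.pce_sorted mean min max;
  by country;
  var pce;
run;

/* In a DATA step, BY creates FIRST.country and LAST.country flags,
   so a year-over-year change can restart at each country's first year. */
data work.pce_change;
  set work.pce_sorted;
  by country;
  change = dif(pce);                    /* difference from the previous row */
  if first.country then change = .;     /* no previous year within country  */
run;
```

With this layout the plain numeric YEAR (2000 to 2019) needs no conversion; the BY variable is what tells SAS where each country's series starts and ends.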

KHayes
Calcite | Level 5

I really appreciate your help!

 

Where would I put the 'By' statement in the code? 

 

This is my code:

 

data dataset1.pce2;
set dataset1.pce;
Year = mdy (1,1,input(year, 4.));
format year year4.;
run;
proc print data=dataset1.pce2;
run;

 

 

This is from my log:


11 libname dataset1 'C:\Users\kjoseph4\Documents';
NOTE: Libref DATASET1 was successfully assigned as follows:
Engine: V9
Physical Name: C:\Users\kjoseph4\Documents
12
13 data dataset1.pce2;
14 set dataset1.pce;
15 Year = mdy (1,1,input(year, 4.));
16 format year year4.;
17 run;

NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
15:23
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
740 at 15:8
NOTE: There were 740 observations read from the data set DATASET1.PCE.
NOTE: The data set DATASET1.PCE2 has 740 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds


18 proc print data=dataset1.pce2;
19 run;

NOTE: There were 740 observations read from the data set DATASET1.PCE2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.37 seconds
cpu time 0.35 seconds

PaigeMiller
Diamond | Level 26
NOTE: Variable charyear is uninitialized.

Your code doesn't work because variable charyear doesn't exist.

 

Then you made some change so that charyear is replaced by year. Please show us the LOG when you run this code. We need to see the ENTIRE log, that's 100% of it, every single character. Do not pick and choose parts of the log to show us, and not show us other parts. PLEASE copy the log as text and paste it into the window that appears when you click on the </> icon — do not skip this step.

 

Also, I'm struggling with the concept of "numeric year to formatted year". If you take the year 2021, and you format it, it is still 2021, what do you gain from this formatting?

--
Paige Miller
KHayes
Calcite | Level 5

Hi Paige, 

Thanks so much for your help. Log is pasted below.

 

I thought we wanted to tell SAS that the observations for the Year variable are years in 4-digit format; otherwise SAS doesn't know what they are. But I could be (and most likely am) wrong!! 🙂 Appreciate your advice!

 

proc print data=dataset1.pce2;
NOTE: Writing HTML Body file: sashtml.htm
10 run;

NOTE: No observations in data set DATASET1.PCE2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.26 seconds
cpu time 0.23 seconds


11 libname dataset1 'C:\Users\kjoseph4\Documents';
NOTE: Libref DATASET1 was successfully assigned as follows:
Engine: V9
Physical Name: C:\Users\kjoseph4\Documents
12
13 data dataset1.pce2;
14 set dataset1.pce;
15 Year = mdy (1,1,input(year, 4.));
16 format year year4.;
17 run;

NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
15:23
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
740 at 15:8
NOTE: There were 740 observations read from the data set DATASET1.PCE.
NOTE: The data set DATASET1.PCE2 has 740 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds


18 proc print data=dataset1.pce2;
19 run;

NOTE: There were 740 observations read from the data set DATASET1.PCE2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.37 seconds
cpu time 0.35 seconds

PaigeMiller
Diamond | Level 26

We're trying to help you, but you have to help us. I gave specific instructions on how to present the log, and in big bold red letters I said do not skip this step. Please follow the instructions and don't skip that step.

KHayes
Calcite | Level 5
11   libname dataset1 'C:\Users\kjoseph4\Documents';
NOTE: Libref DATASET1 was successfully assigned as follows:
      Engine:        V9
      Physical Name: C:\Users\kjoseph4\Documents
12
13   data dataset1.pce2;
14   set dataset1.pce;
15   Year = mdy (1,1,input(year, 4.));
16   format year year4.;
17   run;

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      15:23
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      740 at 15:8
NOTE: There were 740 observations read from the data set DATASET1.PCE.
NOTE: The data set DATASET1.PCE2 has 740 observations and 7 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


18   proc print data=dataset1.pce2;
19   run;

NOTE: There were 740 observations read from the data set DATASET1.PCE2.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.37 seconds
      cpu time            0.35 seconds

Sorry - I'm new to this...trying the best I can 🙂 

PaigeMiller
Diamond | Level 26
NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      15:23

In dataset dataset1.pce, does YEAR exist? Is it character or numeric? What does PROC CONTENTS say? And again, if year is the four digits such as 2021, why do you need to format it so it has four digits????


What happens if you remove these two lines, does it work then?

Year = mdy (1,1,input(year, 4.));
format year year4.;
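
One way to answer the type question, sketched on a hypothetical stand-in data set (WORK.PCE_DEMO is an assumption; the same two steps could be pointed at dataset1.pce instead). PROC CONTENTS lists each variable's type, and the VTYPE function reports it inside a DATA step.

```sas
/* Hypothetical stand-in for DATASET1.PCE; replace with your own data set. */
data work.pce_demo;
  input country $ year;
  datalines;
USA 2000
USA 2001
;
run;

/* The Type column of the variable list shows Num or Char for YEAR. */
proc contents data=work.pce_demo;
run;

/* VTYPE returns N for a numeric variable and C for a character one. */
data _null_;
  set work.pce_demo(obs=1);
  year_type = vtype(year);
  put year_type=;
run;
```

If YEAR is reported as numeric, wrapping it in INPUT() is unnecessary, because INPUT() expects a character argument.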

 

Tom
Super User (accepted solution)

So the log is clearly showing that YEAR is already numeric.  So simplify your code.

data dataset1.pce2;
  set dataset1.pce;
  year = mdy(1,1,year);
  format year year4.;
run;

That will convert a number like 2,021 into a number like 22,281, which is the SAS date value (days since 01JAN1960) for the first day of the year 2021.

Example:

data _null_;
  year=2021 ;
  put year= @ ;
  year=mdy(1,1,year);
  put '-> ' year comma. ' -> ' year date9. ' -> ' year year4. ;
run;

year=2021 -> 22,281 -> 01JAN2021 -> 2021
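
A hedged sketch of why the earlier INPUT(year, 4.) version returned missing values, using made-up variable names (AS_TEXT, FROM_BLANKS, BAD_DATE, GOOD_DATE are illustrative only). When a numeric value is converted to character automatically, SAS uses the BEST12. format, so the digits sit at the right end of a 12-character string and the 4. informat reads only the four leading blanks.

```sas
data _null_;
  year = 2021;

  /* Mimic the automatic numeric-to-character conversion, which uses
     BEST12. and right-aligns the digits in 12 characters. */
  as_text = put(year, best12.);
  put 'Converted text: [' as_text $char12. ']';

  /* The 4. informat reads only the first four characters (all blanks),
     so the result is a missing value. */
  from_blanks = input(as_text, 4.);
  put from_blanks=;

  /* MDY cannot build a date from a missing year. */
  bad_date = mdy(1, 1, from_blanks);
  put bad_date=;

  /* Passing the numeric year directly works. */
  good_date = mdy(1, 1, year);
  put good_date= date9.;
run;
```

This matches the "Numeric values have been converted to character values" note at the position of the INPUT call in the poster's log.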

 

PaigeMiller
Diamond | Level 26

Why do this? Why not leave 2021 unformatted?

Tom
Super User

I have no idea why the poster wants to do it.  But I assume they need an actual DATE value for future processing.

PaigeMiller
Diamond | Level 26

I can't think of how an actual DATE value is useful here. Seems like wasted typing to me.

ballardw
Super User

@PaigeMiller wrote:

I can't think of how an actual DATE value is useful here. Seems like wasted typing to me.


I'm waiting for a follow-up question about why the coefficient for the "year" variable looks funny (i.e. very small) in the output of a regression.
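
A small illustration of that scaling effect, using entirely made-up data (WORK.TREND, YR, YR_DATE, and Y are hypothetical). A SAS date value counts days, so a slope estimated against the date version of the year is expressed per day rather than per year, and should come out at roughly 1/365 of the per-year slope.

```sas
/* Hypothetical data: y rises by about 2 units per calendar year. */
data work.trend;
  do yr = 2000 to 2019;
    y = 2 * (yr - 2000) + mod(yr, 3);   /* small wiggle avoids a perfect fit */
    yr_date = mdy(1, 1, yr);            /* same year stored as a SAS date    */
    output;
  end;
  format yr_date year4.;
run;

/* Slope is in units of y per year. */
proc reg data=work.trend;
  model y = yr;
run;
quit;

/* Slope is in units of y per day, so it is much smaller, even though
   YR_DATE prints as a four-digit year because of the YEAR4. format. */
proc reg data=work.trend;
  model y = yr_date;
run;
quit;
```

The format changes only how the value is displayed; the model still sees the underlying day count.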
