So I've looked on the boards and found a number of things that could help but aren't close enough. As a lot of the examples use datalines and I'm importing data from a text file.
I'm reading in data (First 4 lines from the txt file)
Album,Artist,ReleaseDate,TotalCertifiedCopies,ClaimedSales,Genre Thriller,Michael Jackson,"November 30, 1982",46.3,65,Pop Back in Black,AC/DC,"July 25, 1980",26.1,50,Rock The Dark Side of the Moon,Pink Floyd,"March 1, 1973",24.2,45,Rock
As you can see the ReleaseDate, which I need to extract the year from in a later step is read in as a string. When I import it in SAS uses Datetime format and Anydtdtm for it also.
98 data WORK.IMPORT ; 99 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */ 100 infile REFFILE delimiter = ',' MISSOVER DSD firstobs=2 ; 101 informat Album $52. ; 102 informat Artist $18. ; 103 informat ReleaseDate anydtdtm40. ; 104 informat TotalCertifiedCopies best32. ; 105 informat ClaimedSales best32. ; 106 informat Genre $10. ; 107 format Album $52. ; 108 format Artist $18. ; 109 format ReleaseDate datetime. ; 110 format TotalCertifiedCopies best12. ; 111 format ClaimedSales best12. ; 112 format Genre $10. ; 113 input 114 Album $ 115 Artist $ 116 ReleaseDate 117 TotalCertifiedCopies 118 ClaimedSales 119 Genre $ 120 ; 121 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */ 122 run;
I'm simply struggling to try and extract the year from it. I've tried
/*var = put(datepart(ReleaseDate),date9.); Decade = Year(var);*/ /*input ReleaseDate $20.;*/ new_date = input(ReleaseDate, yymmdd20.); year = year(new_date);
I've tried manually import it with the infile statement and just how it's splitting up wasn't working. But I'm willing to try manually importing again if I need too.
To just post code with some data you can use DATALINES in the infile statement. It's then not hard to repoint this to an external file in your environment. Code like below should work for you.
data WORK.IMPORT;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
/* infile REFFILE delimiter = ',' MISSOVER DSD firstobs=2;*/
infile datalines delimiter = ',' truncover DSD firstobs=2;
informat Album $52.;
informat Artist $18.;
informat ReleaseDate anydtdte40.;
informat TotalCertifiedCopies best32.;
informat ClaimedSales best32.;
informat Genre $10.;
format Album $52.;
format Artist $18.;
format ReleaseDate date9.;
format TotalCertifiedCopies best12.;
format ClaimedSales best12.;
format Genre $10.;
input
Album $
Artist $
ReleaseDate
TotalCertifiedCopies
ClaimedSales
Genre $
;
year=year(ReleaseDate);
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
datalines;
Album,Artist,ReleaseDate,TotalCertifiedCopies,ClaimedSales,Genre
Thriller,Michael Jackson,"November 30, 1982",46.3,65,Pop
Back in Black,AC/DC,"July 25, 1980",26.1,50,Rock
The Dark Side of the Moon,Pink Floyd,"March 1, 1973",24.2,45,Rock
;
run;
Another option to provide code with sample data in an external file is to write this sample data to a temporary file as done below.
This should also make it easy for you to adopt whatever gets proposed to your actual environment.
filename REFFILE temp;
data _null_;
file REFFILE;
input;
put _infile_;
datalines;
Album,Artist,ReleaseDate,TotalCertifiedCopies,ClaimedSales,Genre
Thriller,Michael Jackson,"November 30, 1982",46.3,65,Pop
Back in Black,AC/DC,"July 25, 1980",26.1,50,Rock
The Dark Side of the Moon,Pink Floyd,"March 1, 1973",24.2,45,Rock
;
data WORK.IMPORT;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile REFFILE delimiter = ',' truncover DSD firstobs=2;
informat Album $52.;
informat Artist $18.;
informat ReleaseDate anydtdte40.;
informat TotalCertifiedCopies best32.;
informat ClaimedSales best32.;
informat Genre $10.;
format Album $52.;
format Artist $18.;
format ReleaseDate date9.;
format TotalCertifiedCopies best12.;
format ClaimedSales best12.;
format Genre $10.;
input
Album $
Artist $
ReleaseDate
TotalCertifiedCopies
ClaimedSales
Genre $
;
year=year(ReleaseDate);
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
Release date is a date, and not a datetime. So you probably want the informat ANYDTDTE. Once that is working, the year is found by using the YEAR function.
year=year(releasedate);
What do you exactly mean by this? As of now I'm using the import data function
/* Generated Code (IMPORT) */ /* Source File: TopGrossingAlbumsR.txt */ /* Source Path: /folders/myfolders */ /* Code generated on: 3/2/20, 4:37 PM */ %web_drop_table(WORK.IMPORT); FILENAME REFFILE '/folders/myfolders/TopGrossingAlbumsR.txt'; PROC IMPORT DATAFILE=REFFILE DBMS=DLM OUT=WORK.IMPORT; DELIMITER=","; GETNAMES=YES; DATAROW=2; RUN; PROC CONTENTS DATA=WORK.IMPORT; RUN; %web_open_table(WORK.IMPORT);
So I'm currently using how SAS is giving it to me. Are you saying I should be importing the data myself and telling it what it should be using? I'm just a bit confused on your answer
@rbivens wrote:
What do you exactly mean by this? As of now I'm using the import data function
/* Generated Code (IMPORT) */ /* Source File: TopGrossingAlbumsR.txt */ /* Source Path: /folders/myfolders */ /* Code generated on: 3/2/20, 4:37 PM */ %web_drop_table(WORK.IMPORT); FILENAME REFFILE '/folders/myfolders/TopGrossingAlbumsR.txt'; PROC IMPORT DATAFILE=REFFILE DBMS=DLM OUT=WORK.IMPORT; DELIMITER=","; GETNAMES=YES; DATAROW=2; RUN; PROC CONTENTS DATA=WORK.IMPORT; RUN; %web_open_table(WORK.IMPORT);So I'm currently using how SAS is giving it to me. Are you saying I should be importing the data myself and telling it what it should be using? I'm just a bit confused on your answer
Proc import makes "guesses" as to informats. Sometimes it gets things wrong.
Use the code generated by the proc by copy from the log, paste into the editor and modify.
The example code you show with ANYDTDTM means that SAS seems to have found something it thought was a date with time. You only show a date example. Which should be read with ANYDTDTE informat. so you could change the generated code to use that informat and assign a date format such as WORDDATE. or DATE9.
Or look in your data for different formats of the date value.
To just post code with some data you can use DATALINES in the infile statement. It's then not hard to repoint this to an external file in your environment. Code like below should work for you.
data WORK.IMPORT;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
/* infile REFFILE delimiter = ',' MISSOVER DSD firstobs=2;*/
infile datalines delimiter = ',' truncover DSD firstobs=2;
informat Album $52.;
informat Artist $18.;
informat ReleaseDate anydtdte40.;
informat TotalCertifiedCopies best32.;
informat ClaimedSales best32.;
informat Genre $10.;
format Album $52.;
format Artist $18.;
format ReleaseDate date9.;
format TotalCertifiedCopies best12.;
format ClaimedSales best12.;
format Genre $10.;
input
Album $
Artist $
ReleaseDate
TotalCertifiedCopies
ClaimedSales
Genre $
;
year=year(ReleaseDate);
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
datalines;
Album,Artist,ReleaseDate,TotalCertifiedCopies,ClaimedSales,Genre
Thriller,Michael Jackson,"November 30, 1982",46.3,65,Pop
Back in Black,AC/DC,"July 25, 1980",26.1,50,Rock
The Dark Side of the Moon,Pink Floyd,"March 1, 1973",24.2,45,Rock
;
run;
Another option to provide code with sample data in an external file is to write this sample data to a temporary file as done below.
This should also make it easy for you to adopt whatever gets proposed to your actual environment.
filename REFFILE temp;
data _null_;
file REFFILE;
input;
put _infile_;
datalines;
Album,Artist,ReleaseDate,TotalCertifiedCopies,ClaimedSales,Genre
Thriller,Michael Jackson,"November 30, 1982",46.3,65,Pop
Back in Black,AC/DC,"July 25, 1980",26.1,50,Rock
The Dark Side of the Moon,Pink Floyd,"March 1, 1973",24.2,45,Rock
;
data WORK.IMPORT;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile REFFILE delimiter = ',' truncover DSD firstobs=2;
informat Album $52.;
informat Artist $18.;
informat ReleaseDate anydtdte40.;
informat TotalCertifiedCopies best32.;
informat ClaimedSales best32.;
informat Genre $10.;
format Album $52.;
format Artist $18.;
format ReleaseDate date9.;
format TotalCertifiedCopies best12.;
format ClaimedSales best12.;
format Genre $10.;
input
Album $
Artist $
ReleaseDate
TotalCertifiedCopies
ClaimedSales
Genre $
;
year=year(ReleaseDate);
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
Thank you so much it works!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.