BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Bautista
Calcite | Level 5

Recently I inherited a large dataset (apids.sas7bdat).  It contains a variable called "REPDATE" which is essentially a concatenated YYYY and MM.  For example, if the date when a case was reported was in July 2008 the creators of the dataset combined the year 2008 and the month 07 so the "REPDATE" value is 200807.

For example, the following are the values in the dataset for January 2002, February 1981 and October 1985 respectively:

REPDATE

200201   

198102   

198510   

I would like to "split" the REPDATE variable into two new variables called REPYEAR and REPMONTH. 

Obviously the values for REPYEAR will just be the first 4 numbers in REPDATE.

And the values for REPMONTH will just be the last 2 numbers in REPDATE.

I would still like to keep the REPDATE variable in the dataset.

Does anyone have a suggestion for how I could achieve these two new variables?

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Depending whether repdate is character or numeric...

data have;
length repdatec $6;
input repdate @1 repdatec;
datalines;
200201   
198102   
198510
;

data want;
length yearc $4 monthc $2;
set have;

/* If numeric */
year = floor(repdate/100);
month = mod(repdate,100);

/* If character */
yearc = substr(repdatec,1,4);
monthc = substr(repdatec,5,2);
run;

proc print; run;

PG

PG

View solution in original post

6 REPLIES 6
art297
Opal | Level 21

Questions about repdate:

Is it a numeric where 200201 is really captured as 200,201 or

Is it a numeric where 200201 ia really captured as the SAS date for 1/1/2002 or

Is it a character variable?

PGStats
Opal | Level 21

Depending whether repdate is character or numeric...

data have;
length repdatec $6;
input repdate @1 repdatec;
datalines;
200201   
198102   
198510
;

data want;
length yearc $4 monthc $2;
set have;

/* If numeric */
year = floor(repdate/100);
month = mod(repdate,100);

/* If character */
yearc = substr(repdatec,1,4);
monthc = substr(repdatec,5,2);
run;

proc print; run;

PG

PG
Bautista
Calcite | Level 5

Thank you thank you thank you thank you !  A million thank yous.  Such brilliant minds here.  I tried it and it does work. 


I have a new twist (shown below).


In looking at the dataset, I see the variable REPDATE is a character variable.  This is because I imported the dataset from an external source as shown below, and SAS automatically decided to treat the REPDATE variable as a character variable when it did the import.


PROC IMPORT OUT= WORK.in_apids

            DATAFILE='Pids02q4.dbf'

            DBMS=DBF REPLACE;

     GETDELETED=NO;

RUN;

New twist:  I would like the two newly-created variables to be numeric despite the fact that the REPDATE variable got imported as a character variable.  Is there any way I could change the PROC IMPORT code shown above, to force SAS to import the REPDATE variable as numeric instead?

If I am not mistaken, I think this can be done with the PUT statement like Jonam said, yes?

year=substr(put(repdate,6.),1,4);


My question is: Will the PUT statement shown above work to create a numeric 4-digit variable for year, even though the REPDATE variable is character?  If so, then that would be awesome!


jonam
Calcite | Level 5

yes, you can multiply by 1 which implicitly converts to numeric.

year=substr(put(repdate,6.),1,4) *1;

Linlin
Lapis Lazuli | Level 10

to create numeric year and month:

data want;

set have;

year=input(substr(repdate,1,4),4.);

month=input(substr(repdate,5,2),2.);

run;

jonam
Calcite | Level 5

bautista,

PGstats answered your question already. since you've not mentioned it's char or numeric. you can also use put(),substr() to do it as follows.    

data have;

length repdatec $6;

input repdate @1 repdatec;

datalines;

200201   

198102   

198510

;

data want;

set have;

year=substr(put(repdate,6.),1,4);

year1=substr(put(repdatec,6.),1,4);

proc print;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1459 views
  • 8 likes
  • 5 in conversation