I have a wide dataset where a series of columns are named _201507, _201508, ..., _201603, to indicate the value of a variable during a specific month. I want to create an array of _201507-_201603.
I tried the following code:
array period(*) _201507-_201603;
SumPeriod = sum(of _201507-_201603);
The problem with this code is that SAS doesn't realize that column _201512 is followed by column _201601. Instead it creates columns _201513 through to _201600 which is clearly not what I want.
How can I get around this issue?
array period(*) _201507--_201603;
array period(*) _201507--_201603;
Thank you for your help. This solved my problem.
In the original dataset, the columns were ordered randomly. I used the Query Builder in SAS EG to order them chronologically next to each other. Is there any faster way of ordering columns in situations like these?
data have; retain _id 2; array x{*} _201602 _201511 _201512 _201601 (4*1); run; proc sql; select name into : list separated by ' ' from dictionary.columns where libname='WORK' and memname='HAVE' and prxmatch('/^_\d{6}$/',strip(name)) order by input(substr(name,2),best32.); quit; %put NOTE: &list ;
SAS variable ranges are simply numbers. There is no such thing as a SAS variable "date range". I'd also say it is probably not good practice to hard-code dates into variable names as you lose a lot of flexibility. For example if your table just contained the last 12 month's data, each month your variables would change name.
Why not consider naming your columns like month1 to month12 (or how ever many you want) then add a start_month and an end_month variable to specify the start and the end of the range?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.