I have a character timepoint value that I have coded as numeric for sorting purposes. My possible values are:
Character | Numeric |
Screening | 0 |
Cycle 1 | 1 |
Cycle 2 | 2 |
Cycle 3 | 3 |
… | … |
Cycle 22 | 22 |
Follow-Up | 98 |
Unscheduled | 99 |
In these cases, I usually assign a numeric value to the character value in the order that I want and then apply a format to the numeric values since sorting the character values would result in Cycle 1, Cycle 10, Cycle 11... instead of Cycle 1, Cycle 2, Cycle3.... This seems tedious in my current situation, so I was wondering if there is a more efficient way. I wanted to try a picture format where I just use a "Cycle" prefix to all the number 1-22 and somehow format 0, 98, and 99 as desired. Any ideas on how to proceed?
If not, I may just type everything out in a regular format or use a loop in the future if this arises more often.
Not sure you can get more efficient.
I prefer proc format with a CNTLIN dataset rather than a macro myself, but it's a preference.
data cycle_fmt;
fmtname='cycle_fmt';
type='C';
start=0; label='Screening'; output;
start=98; label='Follow-up'; output;
start=99; label='Unscheduled'; output;
do start=1 to 22;
label=cat("Cycle ", start);
output;
end;
run;
proc format cntlin=cycle_fmt;
run;
Hi mate.
I didn't get exactly what you need.
Here is what i get:
proc format;
value score 1 - 22 = 'Cycle 1-22'
22<- 98 = '0-98'
other = '99'
;run;
data test;
length var 8;
input var;
datalines;
1
2
3
4
22
98
99
;;;;
run;
data want;
set test;
new_var=var;
format new_var score.;
run;
Hope this helps
Actually, this is what I'm trying to do in long hand:
proc format;
value timept
0='Screening'
1='Cycle 1'
2='Cycle 2'
3='Cycle 3'
4='Cycle 4'
5='Cycle 5'
6='Cycle 6'
7='Cycle 7'
8='Cycle 8'
9='Cycle 9'
10='Cycle 10'
11='Cycle 11'
12='Cycle 12'
13='Cycle 13'
14='Cycle 14'
15='Cycle 15'
16='Cycle 16'
17='Cycle 17'
18='Cycle 18'
19='Cycle 19'
20='Cycle 20'
21='Cycle 21'
22='Cycle 22'
98='Follow-up'
99='Unscheduled';
run;
data test;
var=0; output;
do var=1 to 22;
output;
end;
var=98; output;
var=99; output;
run;
data want;
set test;
new_var=var;
format new_var timept.;
run;
If there is a better way than writing out every format option since there is a lot of repeat with "Cycle", I would like to incorporate that.
I think you can get there by creating two formats. I can't check the syntax right now, so you may need to debug:
proc format;
value all 0='Screening' 98='Follow-up' 99='Unscheduled' other=[cycles.];
picture cycles low-high = 'Cycle 99';
run;
It didn't work quite as you presented it, but I made a few changes (see below). My only concern now is how I would get rid of all the leading spaces in the values 1-22:
proc format;
value all
0='Screening' 98='Follow-up' 99='Unscheduled' other=[cycles.];
picture cycles low-high = '99' (prefix='Cycle ');
run;
data test;
var=0; output;
do var=1 to 22;
output;
end;
var=98; output;
var=99; output;
run;
data want;
set test;
new_var=var;
format new_var all.;
run;
To get rid of leading blanks, you would need to set up two ranges in the PICTURE format:
1-9 = '9' (prefix='Cycle')
10-97='99' (prefix='Cycle')
That actually addressed a second question, but not the first. I was getting "Cycle 01" instead of "Cycle 1", which using 2 ranges solved. However, when I use the format, I get a 33 spaces before the text. Maybe not an issue at this point in my project.
I found my issue. The log states that there was not a length specified for the picture format, so it defaulted to a length of 40. I just had to change the length of the cycles format:
proc format;
picture cycles
low-9='9' (prefix='Cycle ')
10-high='99' (prefix='Cycle ');
value timept
0='Screening'
98='Follow-up'
99='Unscheduled'
other=[cycles9.];
run;
There is still a single space in front of Cycle 1 - Cycle 9, but oh well. Thanks for your help!
One last reply. In case you care for future reference, I think this is what I was looking for:
proc format;
picture cycles_one
1-9='9' (prefix='Cycle ');
picture cycles_two
10-90='99' (prefix='Cycle ');
value timept
0='Screening'
1-9=[cycles_one8.]
10-90=[cycles_two9.]
98='Follow-up'
99='Unscheduled';
run;
data test;
var=0; output;
do var=1 to 22;
output;
end;
var=98; output;
var=99; output;
run;
data want;
set test;
new_var=var;
format new_var timept.;
run;
You can ask @BeverlyBrown to change the correct answer 🙂
@Reeza and @djbateman: Anyone can edit their own post (use down arrow by the wheel at upper right of comment.) Or did I misunderstand the question? Certainly possible. In any case, glad to see such healthy discussion and a good outcome!
This seems to do what I want, but anyone feel free to elaborate if there is a cleaner way to do this in the future.
%macro format;
proc format;
value timept
0='Screening'
%do i=1 %to 22;
&i.="Cycle &i."
%end;
98='Follow-up'
99='Unscheduled';
run;
%mend format;
%format;
data test;
var=0; output;
do var=1 to 22;
output;
end;
var=98; output;
var=99; output;
run;
data want;
set test;
new_var=var;
format new_var timept.;
run;
Not sure you can get more efficient.
I prefer proc format with a CNTLIN dataset rather than a macro myself, but it's a preference.
data cycle_fmt;
fmtname='cycle_fmt';
type='C';
start=0; label='Screening'; output;
start=98; label='Follow-up'; output;
start=99; label='Unscheduled'; output;
do start=1 to 22;
label=cat("Cycle ", start);
output;
end;
run;
proc format cntlin=cycle_fmt;
run;
If you already have the data in a data set and can summarize it then use the data to build a format for you. Something like:
Proc sql;
create table CycleFormat as
select distinct numeric as start, character as Label, "MyCycleFormat" as Fmtname, type='N'
from datasetname
;
quit;
Where the text before Fmtname is the name of the format you want to create and type N is for numeric, C would be character.
proc format <library=libname> cntlin=CycleFormat;
run;
The Library part is optional but save it to a permanent library and you can add that library to the search path for formats with:
Options append=(fmtsearch=(libname));
where libname is the name of your permanent library.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.