Dear all,
I have a question: is it possible to assign a unique color to each patient in
a data set of about 1000 patients? The patients are already assigned unique ids.
These colors are required to display their data graphically using, for example, sgplot.
If yes, how can this be programmed?
Use data attribute maps, but you will not be able to visually distinguish between 1000 different colours, so I question the usefulness of such a graphic.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/grstatug/p0wv2bbjftizkpn1puokmcblrw15.htm
To assign the colours uniquely, either find a list of colours as HEX codes somewhere and map one to each ID, or find a formula that lets you increment the RGB values so you can specify your colours in a more systematic fashion.
https://htmlcolorcodes.com/colors/
This may also be a useful reference: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/graphref/p0edl20cvxxmm9n1i9ht3n21eict.htm
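As a rough sketch of the "formula" approach, the data step below steps each RGB channel through 10 evenly spaced values, which gives 10 x 10 x 10 = 1000 colour codes in the SAS CXrrggbb form. The data set name rgbcolors and the variable names are made up for illustration, and the step size of 28 is just one choice that keeps every channel within 0 to 255.

```sas
/* hypothetical sketch: 10 values per channel = 1000 RGB colour codes */
data rgbcolors;
  length rgbcolor $ 8;
  do r = 0 to 252 by 28;       /* 0, 28, ..., 252 */
    do g = 0 to 252 by 28;
      do b = 0 to 252 by 28;
        /* CX prefix marks an RGB colour; HEX2. writes each channel as two hex digits */
        rgbcolor = cats("CX", put(r, hex2.), put(g, hex2.), put(b, hex2.));
        output;
      end;
    end;
  end;
run;
```

Colours near the top of all three ranges are close to white and will be hard to see on a white background, so you may want to drop those rows.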
I hope the IDs are relatively static. If you are going to add IDs constantly, you should let us know now.
Look up DATTRMAP for SGPlot and SGPanel graphs.
You create a data set that sets display properties such as line color, symbol color, text color, line type, and symbol based on the values of a variable.
SAS supports a number of color naming methods. RGB, CMYK, HLS, and HSV are the ones most likely to be useful in creating a large number of "unique" colors.
The example below shows creating some dummy PatientID values and creating colors to assign to each.
data exattrmap;
   /* the keyword variable ID identifies which set of rules in the Dattrmap data set to use */
   id='PATCOLOR';
   do c=0 to 100 by 30;
   do m=0 to 100 by 30;
   do y=0 to 100 by 30;
   do k=0 to 100 by 30;
      /* some fake patient identifiers */
      num+1;
      patientid = put(num,z10.);
      /* the keyword variable Value is compared with your variable's values to pick the properties */
      value=patientid;
      /* the K prefix tells SAS the value is a CMYK color name */
      cmykcolor= cats("K",put(c,hex2.),put(m,hex2.),put(y,hex2.),put(k,hex2.));
      /* these variables are keywords for setting the color of different items in a graph */
      fillcolor=cmykcolor;
      linecolor=cmykcolor;
      markercolor=cmykcolor;
      textcolor=cmykcolor;
      /* this keyword affects legend contents: DATA shows only values that appear in the data,
         ATTRMAP shows all values in the map */
      show='DATA';
      output;
   end;
   end;
   end;
   end;
run;

/* the DATTRMAP is used for variables in a GROUP role in the plot */
proc sgplot data=exattrmap dattrmap=exattrmap;
   /* subset the data a bit */
   where m=60 and k=0;
   scatter x=c y=y / group=patientid attrid=PATCOLOR;
run;
Each of the 4 components of the CMYK color has a range of 00 to FF hex (256 values). The greater the interval in the "by" of the do loops above, the larger the color step. I used a limit of 100 to have a smaller data set to play with for an example. Any method of assigning those values works as long as they stay in the range. Note that the HEX format makes it easy to create the hex digits from decimal values in code.
The data set has the added variables it needs to be used as a DATTRMAP. Normally the Dattrmap data set would be (relatively) static and only used in the Dattrmap option. If you have to add patient ids constantly this may get tedious.
The Sgplot code shows one way to reference the map id. It uses a limited number of data points because this data set would otherwise have many duplicate X, Y pairs.
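To see what the HEX format is doing in those loops, a small throwaway step like the one below writes a few decimal values and their two-digit hex codes to the log, along with the CMYK name built from them. The variable names n, code and example are only for illustration.

```sas
/* illustration only: decimal value -> two hex digits, e.g. 255 becomes FF */
data _null_;
  do n = 0, 30, 60, 90, 255;
    code = put(n, hex2.);
    put n= code=;
  end;
  /* one CMYK color name assembled the same way as in the loops above */
  example = cats("K", put(30, hex2.), put(0, hex2.), put(60, hex2.), put(0, hex2.));
  put example=;
run;
```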
@ballardw thanks for the example. I am very glad to know that this is possible. I tried your example with my data (with exactly 835 patient ids). It worked up to the point where I applied the do loop. I get multiple observations per patient id, which makes the data very long. Is there any solution to that?
The Do loop was just an example to create fake data to demonstrate. You would want to create a data set with the unique values of your patients and then attach the color and other required variables for the Dattrmap data set.
Something like this:
/* one way to create a data set with unique patient id values */
proc sql;
   create table pats as
   select distinct patientid
   from yourdataset
   ;
quit;

data attrs;
   /* this creates 1296 colors and the Dattrmap variables except the Value */
   id='PATCOLOR';
   do c=0 to 255 by 50;
   do m=0 to 255 by 50;
   do y=0 to 255 by 50;
   do k=0 to 255 by 50;
      cmykcolor= cats("K",put(c,hex2.),put(m,hex2.),put(y,hex2.),put(k,hex2.));
      fillcolor=cmykcolor;
      linecolor=cmykcolor;
      markercolor=cmykcolor;
      textcolor=cmykcolor;
      show='DATA';
      output;
   end;
   end;
   end;
   end;
run;

data patattrmap;
   merge pats(rename=(patientid=value) in=in1)
         attrs
   ;
   if in1;
run;
The Merge above has no BY statement, so it just matches the two data sets by record order of appearance. Since the Attrs data set may have more colors than there are patients, the In= option creates a temporary variable that tells whether the current observation comes from the patient data. The values are 1/0 (true/false). The IF only keeps observations that have patient id data.
If your patientid variable is not character you may have issues.
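If patientid is numeric, one possible workaround is to build the Value variable as character text instead of just renaming. The sketch below assumes the pats and attrs data sets from the code above and uses VVALUE to take the formatted value of patientid. Whether that text matches what the plot uses for the group values depends on the format attached to patientid in your plotting data, so treat this as something to test rather than a guaranteed fix. The length of 32 is an arbitrary choice.

```sas
/* hypothetical variant of the merge for a numeric patientid */
data patattrmap;
  merge pats(in=in1) attrs;
  if in1;
  length value $ 32;
  /* formatted value of the numeric id, with leading/trailing blanks removed */
  value = strip(vvalue(patientid));
  drop patientid;
run;
```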
Caveat: Dattrmap only applies to variables used with Group. The default limit on the number of group values is 1000, which shouldn't be a problem with your case but large numbers of groups can make many graphs hard to read and the legend can take up a large amount of space.
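A sketch of how the finished map might be used with real data is below. The variables visitday and labvalue are placeholders for whatever you are plotting. NOAUTOLEGEND suppresses the legend, which would otherwise try to list every patient, and the commented ODS GRAPHICS line shows where the group limit could be raised if you ever have more groups than the default allows.

```sas
/* only needed when there are more group values than the default limit */
/* ods graphics / groupmax=2000; */

proc sgplot data=yourdataset dattrmap=patattrmap noautolegend;
  /* visitday and labvalue are hypothetical variable names */
  series x=visitday y=labvalue / group=patientid attrid=PATCOLOR;
run;
```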
If you place the resulting dataset in a permanent library you don't have to rebuild it unless the patients change.
@ballardw okay, thanks, I will try that.
@ballardw thanks, it worked.