mmea
Quartz | Level 8

Hi

I have this example dataset.

 

data WORK.EXAMPLE;
  infile datalines delimiter=',' truncover; 
  input test_date :date9.  event_text :$100.  date_of_event :date9.  ALLE :$12.;
    format test_date  date_of_event  ddmmyyd8.;
datalines4;
01JAN2020,event1,01JAN2020, method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method2
02JAN2020,event2,02JAN2020,method2
02JAN2020,event2,02JAN2020,method2
02JAN2020,event2,02JAN2020,method2
03JAN2020,.,.,.
03JAN2020,.,.,.
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method1
04JAN2020,event3,04JAN2020,method1
06JAN2020,.,.,.
06JAN2020,.,.,.
07JAN2020,.,.,.
07JAN2020,.,.,.
08JAN2020,event4,08JAN2020,method1 
08JAN2020,event4,08JAN2020,method1  
08JAN2020,event4,08JAN2020,method1  
09JAN2020,event5, 09JAN2020,method1 
09JAN2020,event5, 09JAN2020,method1 
09JAN2020,event5, 09JAN2020,method1 
09JAN2020,event5, 09JAN2020,method1 
09JAN2020,event5, 09JAN2020,method1 
;;;;

I wish to make the following plot, where CASES on the Y-axis is the actual number of tests based on test_date. The X-axis should show the specific dates based on date_of_event. The event_text should be placed above every event date, and a curve should illustrate the number of tests over time.

My original data is big, so the curve will be noticeable; in this small example it may look unpleasant. The curve should be based on the variable ALLE. Some datalines have an empty value (.), which means there was no specific event on that date.

But is this possible?

test.jpg

 

 

PeterClemmensen
Tourmaline | Level 20

Do you want the dates with no events included in the plot?

mmea
Quartz | Level 8

Yes - if possible 🙂

PeterClemmensen
Tourmaline | Level 20

See if you can use this as a template

 

data WORK.EXAMPLE;
infile datalines delimiter=',' truncover dsd; 
input test_date :date9. event_text $ date_of_event :date9. ALLE $;
format test_date date_of_event ddmmyy8.;
datalines4;
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method2
02JAN2020,event2,02JAN2020,method2
02JAN2020,event2,02JAN2020,method2
02JAN2020,event2,02JAN2020,method2
03JAN2020,,,                      
03JAN2020,,,                      
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method1
04JAN2020,event3,04JAN2020,method1
06JAN2020,,,                      
06JAN2020,,,                      
07JAN2020,,,                      
07JAN2020,,,                      
08JAN2020,event4,08JAN2020,method1
08JAN2020,event4,08JAN2020,method1
08JAN2020,event4,08JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
;;;;

proc summary data = example missing nway;
   class test_date event_text;
   output out = plot;
run;

proc sgplot data = plot noautolegend;
   vbarparm category = test_date response = _freq_ / datalabel = event_text;
   series x = test_date y = _FREQ_;
run;

 

Result:

 

 

SGPlot4.png

 

mmea
Quartz | Level 8

Is it possible to draw the bars as lines only?

PeterClemmensen
Tourmaline | Level 20

Yes 🙂

 

data WORK.EXAMPLE;
infile datalines delimiter=',' truncover dsd; 
input test_date :date9. event_text $ date_of_event :date9. ALLE $;
format test_date date_of_event ddmmyy8.;
datalines4;
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method1
01JAN2020,event1,01JAN2020,method2
02JAN2020,event2,02JAN2020,method2
02JAN2020,event2,02JAN2020,method2
02JAN2020,event2,02JAN2020,method2
03JAN2020,,,                      
03JAN2020,,,                      
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method2
04JAN2020,event3,04JAN2020,method1
04JAN2020,event3,04JAN2020,method1
06JAN2020,,,                      
06JAN2020,,,                      
07JAN2020,,,                      
07JAN2020,,,                      
08JAN2020,event4,08JAN2020,method1
08JAN2020,event4,08JAN2020,method1
08JAN2020,event4,08JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
09JAN2020,event5,09JAN2020,method1
;;;;

proc summary data = example missing nway;
   class test_date event_text;
   output out = plot;
run;

title 'Title Here';
proc sgplot data = plot noautolegend;
   needle x =  test_date y = _FREQ_ / markers datalabel = event_text datalabelpos = top;
   series x = test_date y = _FREQ_;
   yaxis offsetmin=0 label = 'Some Text';
   xaxis label = 'Some Text';
run;
title;

mmea
Quartz | Level 8

Thank you!!

One last question.

As you can see in the example data, there can be different methods, 1 and 2.

Now the line represents them all, but if in the future I want to split the line by method, so that 2 lines appear, one for each method, is that possible?

 

PeterClemmensen
Tourmaline | Level 20

Yes, this is possible. Do something like this. However, you then need to decide on the logic for how the line should be drawn.

 

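/* GROUP = ALLE needs ALLE in the summarised data, so add it to the CLASS statement */
proc summary data = example missing nway;
   class test_date event_text ALLE;
   output out = plot;
run;

title 'Title Here';
proc sgplot data = plot noautolegend;
   needle x =  test_date y = _FREQ_ / group = ALLE groupdisplay=cluster markers nomissinggroup
                                      datalabel = event_text datalabelpos = top;
   *series x = test_date y = _FREQ_;
   yaxis offsetmin=0 label = 'Some Text';
   xaxis label = 'Some Text';
run;
title;

mmea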
Quartz | Level 8

The line should preferably look like the drawing I showed, so not a zigzag.

But I guess that with a lot of data it will look like my example drawing?

PeterClemmensen
Tourmaline | Level 20

Yeah, but suppose you have two 'lines' for a date. What line should the series plot connect to if any? Or do you want the series to be the average of the two lines?

mmea
Quartz | Level 8

Can they connect to both lines, so that the lines merge into one, if you understand what I mean? So there are two lines, one for each method, and when a date has both methods, the lines fuse into one.

PeterClemmensen
Tourmaline | Level 20

Sorry, I don't understand?
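
If the intent is simply one line per method, a minimal sketch might look like the following. It assumes the WORK.EXAMPLE data set from earlier in the thread, with the method stored in ALLE; the output name plot_by_method is made up for illustration. Rows without a method are dropped before counting, and each method gets its own series, so on dates where both methods occur the two lines are drawn separately rather than fused.

```sas
/* Sketch only: count tests per date and method.                    */
/* Assumes WORK.EXAMPLE as defined above; plot_by_method is a       */
/* hypothetical data set name.                                      */
proc summary data = example nway;
   where not missing(ALLE);          /* skip rows with no method */
   class test_date ALLE;
   output out = plot_by_method;      /* _FREQ_ holds the count */
run;

/* One series per method; the default legend identifies the lines */
proc sgplot data = plot_by_method;
   series x = test_date y = _FREQ_ / group = ALLE markers;
   yaxis offsetmin = 0 label = 'Number of tests';
   xaxis label = 'Test date';
run;
```

Whether the two series should instead be averaged or combined on shared dates is a separate design decision that this sketch does not address.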

mmea
Quartz | Level 8

I cleaned my data so that I only have these variables (in reality I now have over 200.000 observations):

data WORK.EXAMPLE;
infile datalines delimiter=',' truncover dsd; 
input event_text $ date_of_event :date9. method $;
format date_of_event ddmmyy8.;
datalines4;
eventtext1,15MAR2020,method1
eventtext2,18JAN2020,method1
eventtext3,18sep2020,method1
eventttext4,28oct2020,method1
eventtext5,19AUG2020,method2
eventtext6,02JAN2020,method2
eventtext7,08JUN2020,method2
eventtext8,10MAR2020,method2
;;;;

I use this code for my real data :

 

proc summary data = EXAMPLE missing nway;
   class date_of_event event_text ;
   output out = plot;
run;

title 'Title Here';
proc sgplot data = plot noautolegend;
   needle x =  date_of_event y = _FREQ_ / markers datalabel = event_text datalabelpos = top;
   series x = date_of_event  y = _FREQ_;
   yaxis offsetmin=0 label = 'Antal test';
   xaxis label = 'Dato';
run;
title;

But my output messes up the x-axis in some way. (The output is from my real data, where the text in the plot is the event_text. The x-axis should show all months, but only the months with an event should be highlighted with a line and text.)

 

 

mmea_0-1609339981904.png

 

mmea
Quartz | Level 8

When I use this code on my real data (I have over 600000 observations), the plot looks like this.

I have dates from January until now.

Is there a better way to display this?

Even though I have so many observations, I only have a few dates and events.

 

mmea_0-1609329113492.png

 

ballardw
Super User

That sort of plot typically comes from one of two things: one or a small number of values with an extreme value for the Y variable, or an incorrectly specified Y axis, where you expected values up to 600000 for Y but the actual range is less than 1000 or so.

 

Since you didn't bother to show the code actually used for the procedure or data it is hard to tell which is more likely to apply.

The labels on your Xaxis also make me wonder about your actual X values. You say "dates from January" but the plot doesn't show anything from January until March.
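
Both possibilities can be checked with a quick look at the summarised data before plotting. The sketch below assumes the PLOT data set produced by the PROC SUMMARY step posted above (variables date_of_event, event_text and _FREQ_). The second step is one possible way to get a tick for every month, and it assumes a SAS 9.4 release where VALUESFORMAT= is available on the XAXIS statement.

```sas
/* Sketch only: inspect the range of X and Y in the summarised data. */
/* Assumes WORK.PLOT from the PROC SUMMARY step above.               */
proc sql;
   select count(*)                         as n_rows,
          min(date_of_event) format=date9. as first_date,
          max(date_of_event) format=date9. as last_date,
          min(_FREQ_)                      as min_freq,
          max(_FREQ_)                      as max_freq
   from plot;
quit;

/* Needles only at event dates, on a time axis with monthly ticks */
proc sgplot data = plot noautolegend;
   needle x = date_of_event y = _FREQ_ / markers datalabel = event_text datalabelpos = top;
   xaxis type = time interval = month valuesformat = monyy7. label = 'Dato';
   yaxis offsetmin = 0 label = 'Antal test';
run;
```

If first_date is later than expected, or max_freq is far larger than the other counts, that points to the data rather than the plot statements.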
