Recently in the SAS Community Library: Your often contains the information you need, but not sequenced in the order required for processing. @SASJedi shows you how to properly sequence data so you can compare the data in one table to the data in another, conduct merges or joins and more.
Hello
Lets say that have data set has wide structure than I know how to calculate -First_Smallest,Second_Smallest,Third_Smallest..
Let's say that have data set has ling structure.
What is the way to calculate-First_Smallest,Second_Smallest,Third_Smallest from this data set?
(Without change the structure from long to wide)?
Note-
First_Smallest is the value in smallest observation
Second_Smallest is the value in second smallest observation
Third_Smallest is the value in third smallest observation
data have1;
input CustID VAR $ mon1 mon2 mon3 mon4 mon5 mon6;
cards;
1 x 10 20 30 40 50 60
1 W 60 50 40 30 20 10
1 Z 30 30 30 0 0 0
2 x 30 20 10 10 40 0
;
Run;
data want1;
set have1;
First_Smallest=smallest(1,mon1,mon2,mon3,mon4,mon5,mon6) ;
Second_Smallest=smallest(2,mon1,mon2,mon3,mon4,mon5,mon6) ;
Third_Smallest=smallest(3,mon1,mon2,mon3,mon4,mon5,mon6) ;
Run;
data have2;
input CustID VAR $ month $ value;
cards;
1 x mon1 10
1 x mon2 20
1 x mon3 30
1 x mon4 40
1 x mon5 50
1 x mon6 60
1 w mon1 60
1 w mon2 50
1 w mon3 40
1 w mon4 30
1 w mon5 20
1 w mon6 10
1 z mon1 30
1 z mon2 30
1 z mon3 30
1 z mon4 0
1 z mon5 0
1 z mon6 0
2 x mon1 30
2 x mon2 20
2 x mon3 10
2 x mon4 10
2 x mon5 40
2 x mon6 0
;
run;
... View more
Hi, I have been working on the Essentials 1 Case Study looking at TSA Claims data. I have reviewed all the documents and am still having troubles with my code. For the total number of date issues, I got 3938 but the document says it should be 4241. For the part with the macro for StateName, every time I run it there is no results/output and the log says: NOTE: No observations were selected from data set TSA.CLAIMS_CLEANED. NOTE: There were 0 observations read from the data set TSA.CLAIMS_CLEANED. WHERE 0 /* an obviously FALSE WHERE clause */ ; I even copied and pasted all of the code from the document and still receive the same number (3938) and same log message from my original code. When the macro is for "C" then there are results that show up, but none for the example with "California" (as per the document code). I noticed in the proc contents that the length of State and StateName was only 1, perhaps that's the problem? I tried changing the length of those in the data steps, but that didn't change anything. Can someone please help me make this work? Below is my code after I set up my library: options validvarname=v7;
proc import datafile="&path/TSAClaims2002_2017.csv"
dbms=csv out=ClaimsImport replace;
guessingrows=max;
run;
/* EXPLORE DATA */
/* Look at first 20 observations */
proc print data=tsa.claimsimport (obs=20);
run;
/* Look at the descriptor portion of the data, sorted bycolumn order. */
proc contents data=tsa.claimsimport varnum;
run;
proc freq data=tsa.claimsimport;
tables Claim_Site Disposition Claim_Type / nocum nopercent;
tables Incident_Date Date_Received / nocum nopercent;
format Incident_Date Date_Received year4.;
run;
/* PREPARE DATA */
/* Remove entirely duplicate rows and create new table */
proc sort data=tsa.claimsimport
out=tsa.Claims_NoDups
nodupkey;
by _all_;
run;
proc sort data=tsa.claims_nodups;
by Incident_Date;
run;
data tsa.Claims_Cleaned;
set tsa.claims_nodups;
/* Clean Claim_Site by replacing missing and - values with Unknown */
if Claim_Site in ('-',"") then Claim_Site = "Unknown";
/*Clean Disposition */
if Disposition in ("-", "") then
Disposition = 'Unknown';
else if Disposition = 'Closed: Canceled' then
Disposition = 'Closed:Canceled';
else if Disposition = 'losed: Contractor Claim' then
Disposition ='Closed:Contractor Claim';
/*Clean Claim_Type*/
if Claim_Type in ("-","") then Claim_Type = "Unknown";
else if Claim_Type = 'Passenger Property Loss/Injury' then Claim_Type = 'Passenger Property Loss';
else if Claim_Type = 'Passenger Property Loss/Injur' then Claim_Type = 'Passenger Property Loss';
else if Claim_Type = 'Property Damage/Personal Injury' then Claim_Type = 'Property Damage';
/* Convert State values to uppercase and all StateName Columns to proper case */
State=upcase(state);
StateName=propcase(StateName);
/* Create columns that identifies rows with issues */
if (Incident_Date > Date_Received or
Incident_Date = . or
Date_Received = . or
year(Incident_Date) < 2002 or
year(Incident_Date) > 2017 or
year(Date_Received) < 2002 or
year(Date_Received) > 2017)
then Date_Issues = "Needs Review";
/* Create permanent labels for columns */
format Incident_Date Date_Received date9. Close_Amount dollar20.2;
label Airport_Code = "Airport Code"
Airport_Name = "Airport Name"
Claim_Number = "Claim Number"
Claim_Site = "Claim Site"
Claim_Type = "Claim Type"
Close_Amount = "Close Amount"
Date_Issues = "Date Issues"
Date_Received = "Date Received"
Incident_Date = "Incident Date"
Item_Category = "Item Category";
/* Drop County and City Columns */
drop County City;
run;
/* ANALYZE */
/* How many date issues are in the overall data? 3938 */
title "Overall Date Issues in the Data";
proc freq data=tsa.Claims_Cleaned;
table Date_Issues / nocum nopercent;
run;
title;
/* How many claims per year of Incident_date are there in the overall data?
Values vary between years. There were 2122 in 2002 and 8340 in 2017 */
/* Include a plot */
ods graphics on;
title "Overall Claims by Year";
proc freq data=tsa.Claims_Cleaned;
table Incident_Date / nocum nopercent plots=freqplot;
format Incident_Date year4.;
where Date_Issues is missing; /* filter where rows DO NOT have any date issues */
run;
/* What are the frequency values for Claim_Type, Claim_Site, and Disposition for */
/* a selected state? NOTE: User should be able to dynamically input a specific state */
%let StateName=California;
proc freq data=tsa.Claims_Cleaned order=freq;
tables Claim_Type Claim_Site Disposition / nocum nopercent;
where StateName="&StateName" and Date_Issues is null; /* Filter for only rows where colunm has missing value */
run;
proc means data=tsa.Claims_Cleaned mean min max sum maxdec=0;
var Close_Amount;
where StateName="&StateName" and Date_Issues is null;
run;
... View more
My company creates xls output using ODS TAGSETS.EXCELXP. We want to export directly to xlsx using ODS EXCEL, but we have some limitations. We were excited when we finally upgraded from SAS 9.4 M3 to M6 thinking those limitations would be resolved, but not all were. We still have a lingering issue. In ODS TAGSETS, we can apply the AUTOFIT_HEIGHT='yes' option to make the row heights adjust based on the text. That option seems to have disappeared in ODS EXCEL. I know ODS EXCEL has ROW_HEIGHT and ABSOLUTE_ROW_HEIGHT, but they do not have the same functionality. I can use these to either speicfy the same height for all rows, or I can speicify the height of each row individually. This becomes ridiculously unfeasible when dealing with 1000s of records. This is a bit of a deal breaker because our text wraps, and we don't want wrapped text to be hidden and missed by a shortened row.
This is example code of how we use ODS TAGSETS, and we want to replicate this output with ODS EXCEL:
/*** Template intended for xls output ***/
ODS PATH work.templat(update) sasuser.templat(read) sashelp.tmplmst(read);
proc template;
define style work.newxls;
parent=styles.normal;
style default / fontsize = 14pt;
style Data from Column /
bordertopwidth = 0px
borderrightstyle = solid
bordertopcolor = #AAC1D9
padding = 3
borderbottomcolor = #AAC1D9
borderbottomwidth = 1px
borderbottomstyle = solid
fontweight = medium
borderrightwidth = 1px
verticalalign = _undef_
fontfamily = "<sans-serif>, <MTsans-serif>, Helvetica, sans-serif"
borderleftstyle = solid
borderrightcolor = #AAC1D9
borderleftcolor = #AAC1D9
textalign = _undef_
fontsize = 10pt
borderleftwidth = 0px
bordertopstyle = solid;
style Header from Column /
bordertopwidth = 0px
borderrightstyle = solid
bordertopcolor = #AAC1D9
padding = 3
borderbottomcolor = #AAC1D9
backgroundcolor = #faf3d4
borderbottomwidth = 1px
borderbottomstyle = solid
fontweight = bold
borderrightwidth = 1px
verticalalign = middle
fontfamily = "<sans-serif>, <MTsans-serif>, sans-serif"
borderleftstyle = solid
borderrightcolor = #AAC1D9
borderleftwidth = 0px
borderleftcolor = #AAC1D9
textalign = center
fontsize = 12pt
color = cx000000
bordertopstyle = solid;
end;
run;
/*** This code will use the NEWXLS template to generate xls output and properly uses AUTOFIT_HEIGHT and WRAPTEXT. ***/
%let ods_options=%str( row_repeat = 'header'
autofilter = 'yes'
orientation = 'landscape'
frozen_headers='5'
center_horizontal = 'yes'
fittopage = 'yes'
pages_fitwidth = '1'
pages_fitheight = '100'
autofit_height = 'yes'
embedded_titles = 'yes'
embedded_footnotes = 'yes'
wraptext = 'yes'
gridlines = 'yes'
sheet_interval = 'none');
ods listing close;
ods tagsets.excelxp file="c:\temp\Test_Output.xls" style=work.newxls;
title1 j=l "Vertex Pharmaceuticals Incorporated";
title2 j=l "Protocol: ";
title3 j=l "Test Report Demonstraiting Row Height - Written for SAS 9.4 M3";
ods tagsets.excelxp options (&ods_options. sheet_name='Test Case' absolute_column_width='12');
proc report data=sashelp.aacomp nowindows split='^';
column locale key lineno text;
define locale / 'Short Variable' style(column)={tagattr='format:@'};
define key / 'Medium Variable' style(column)={tagattr='format:@'};
define lineno / 'Short Variable' style(column)={tagattr='format:@'};
define text / 'Long Variable Text with a Long Label to Illustrate Wrapping' style(column)={tagattr='format:@'};
run;
ods tagsets.excelxp close;
ods listing;
This is example code of how we would use ODS EXCEL, but we get can't find a replacement for AUTOFIT_HEIGHT. Notice how wrapped text is hidden from view:
/*** Template intended for direct xlsx output ***/
ODS PATH work.templat(update) sasuser.templat(read) sashelp.tmplmst(read);
proc template;
define style work.odsexcel;
parent=styles.excel;
class systemtitle/
fontsize=12pt
color=black;
class systemfooter/
color=black;
class header/
fontsize=12pt
fontweight=bold
verticalalign=middle
textalign=center
color=black
backgroundcolor=#faf3d4;
class data/
fontsize=10pt
verticalalign=_undef_
textalign=_undef_;
end;
run;
/***
This code will use the ODSEXCEL template to generate xlsx output but does not allow the same options.
WRAPTEXT is no longer an option. It produces a warning. But it can be replaced with the FLOW option.
AUTOFIT_HEIGHT is no longer an option. It produces a warning. I can't find an equivalent option.
***/
%let ods_options=%str( row_repeat = 'header'
autofilter = 'yes'
orientation = 'landscape'
frozen_headers='5'
center_horizontal = 'yes'
fittopage = 'yes'
pages_fitwidth = '1'
pages_fitheight = '100'
/* autofit_height = 'yes'*/
embedded_titles = 'yes'
embedded_footnotes = 'yes'
/* wraptext = 'yes'*/
flow='tables'
gridlines = 'yes'
sheet_interval = 'none');
ods listing close;
ods excel file="c:\temp\Test_Output.xlsx" style=work.odsexcel;
title1 j=l "Vertex Pharmaceuticals Incorporated";
title2 j=l "Protocol: ";
title3 j=l "Test Report Demonstraiting Row Height - Written for SAS 9.4 M6";
ods excel options (&ods_options. sheet_name='Test Case' absolute_column_width='12');
proc report data=sashelp.aacomp nowindows split='^';
column locale key lineno text;
define locale / 'Short Variable' style(column)={tagattr='format:@'};
define key / 'Medium Variable' style(column)={tagattr='format:@'};
define lineno / 'Short Variable' style(column)={tagattr='format:@'};
define text / 'Long Variable Text with a Long Label to Illustrate Wrapping' style(column)={tagattr='format:@'};
run;
ods excel close;
ods listing;
Please let me know how I can escalate this to be fixed. We are running into issues where xls files are too large to email, and we don't want to have to manually save them as xlsx files.
... View more
Hello
For each custID+VAR I calculated UCL and LCL (control limits).
Then I want to calculate for each CustID+VAR the average without extreme values
Note-Extreme values are values lower LCL or higher UCL
What is the best way to do it
Data have;
input CustID mon X Y Z;
cards;
111 6 10 20 300 10 80
111 5 11 12 13 2 15
111 4 14 15 16 7 20
111 3 17 18 19 6 25
111 2 20 21 22 5 25
111 1 23 24 25 4 30
222 6 40 41 80 10 50
222 5 37 38 39 12 50
222 4 15 13 12 2 20
222 3 12 14 16 5 20
222 2 50 30 20 5 40
222 1 15 16 40 3 25
;
Run;
proc sort data=have;by CustID mon ;Run;
proc transpose data=have out=have2(Rename=(_NAME_=VAR COL1=value));
by CustID mon ;
var X Y Z;
quit;
proc summary data=have2 nway;
class CustID var;
var VALUE;
output out=summary (drop=_type_ _freq_ rename=(CV=CV6 mean=AVG6 N=N6 std=std6 max=max6 min=min6)) cv=CV mean= mean n=n std=std min=min max=max;
run;
data summary2;
set summary;
UCL=AVG6+std6;
LCL=AVG6-std6;
Run;
Data have;
input CustID VAR $ AVG_no_EXT;
cards;
This is the requyired data set
... View more
Hi Need help to Highlight mismatched rows(compare 2-3 , 4-5,6-7 rows so on dynamically) data Using SAS while generating a multi sheet excel file . SAS is on Unix server. I am using SAS EG 7.1 Below is the data I have to compare every two rows and highlight the mismatched data while generating the Excel and send it via email. Note - there are multiple sheets and columns vary (sometimes there maybe 10 columns, sometimes it may be 18) I have SAS code(used ODS) to generate excel with multiple sheets and send it via email and separate a VBA macro to highlight the mismatched data which I run manually. But I am unable to integrate both in to single code so that no manual work is needed, I really appreciate if anyone can suggest any way to do it in one single code. Below is sample dataset , rows and column are dynamically created based on the project. Need to compare 2-3,4-5, 6-7..so-on rows and highlight(I have given blue to the font but i need whole box as yellow) only when they don’t match. ssn firstname lastname empid address city state zip 000-11-1234 abc d 1001 plotno 123 abcd rd 12345 000-11-1234 abc e 1001 plotno 123 abcd rd 12345 111-12-3456 efgh f 2001 house 897 tyui fg 23456-789 111-12-3456 efgh f 2001 hs 897 tyui fg 23456 444-56-7890 wrt y 3001 xc 12345 444-56-7890 wert y 3002 xc 12345 my VBA code: Sub compare_Auto() Dim LastColumn As Long LastColumn = ActiveSheet.Range("A1").CurrentRegion.Columns.Count Dim rng As Range, rngDiff As Range Set rng = Range(Cells(2, 7), Cells(3, LastColumn)) Do While Application.CountA(rng) > 0 'loop while have content Set rngDiff = Nothing 'reset range On Error Resume Next 'ignore "No cells were found" error if no differences Set rngDiff = rng.ColumnDifferences(Comparison:=rng.Cells(1)) On Error GoTo 0 'stop ignoring errors If Not rngDiff Is Nothing Then rngDiff.Interior.ColorIndex = 6 'color all differences rngDiff.Offset(-1).Interior.ColorIndex = 6 End If Set rng = rng.Offset(2) 'two rows down Loop End Sub
... View more