About Vicente95

Vicente95 · ‎07-11-2021

Oh wow! Thank you for letting me know. On the same topic of PROC LOGISTIC, how can I fit a binary logistic regression model with bonus as the outcome variable and basement_area, fireplaces, and lot_shape_2 as the predictor variables? In this model, I want to specify "Regular" as the reference group for the lot_shape_2 variable and "0" as the reference level for the fireplaces variable. Would it look something like this?

    PROC LOGISTIC DATA = WORK.ameshousing3
        PLOTS(ONLY) = (EFFECT (CLBAND X = (FIREPLACES)) ODDSRATIO (TYPE = HORIZONTALSTAT));
        CLASS lot_shape_2 (PARAM = REF REF = 'Regular')
              fireplaces (PARAM = REF REF = '0');
        MLogit1: MODEL bonus(event='1') = basement_area lot_shape_2 fireplaces / clodds=pl;
        UNITS fireplaces=1;
        ODDSRATIO 'Comparisons of Type' lot_shape_2 / DIFF=ALL CL=PL;
        TITLE 'Bonus Eligibility Status Model';
    RUN;

Vicente95 · ‎07-11-2021

Ok, thank you for the response. How can I fix it?

Vicente95 · ‎07-11-2021

Currently, I am not getting the Odds Ratio and Relative Risks table from the SAS code below when using the ameshousing3.sas7bdat dataset. I expect SAS to produce the table in the Results Viewer, but that is not happening. Does anyone know what I am doing wrong? I have attached the SAS dataset to this question along with a screenshot of the output that I am receiving.

    PROC FREQ DATA = WORK.ameshousing3;
        TABLES fireplaces * bonus / CHISQ RELRISK NOROW NOCOL NOPERCENT;
        FORMAT bonus bonusfmt.;
        TITLE "Association Analysis - Chi-Sq and Odds Ratio";
    RUN;
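One possible cause, offered as a sketch rather than a diagnosis: PROC FREQ computes the RELRISK statistics only for 2x2 tables, so if fireplaces takes more than two values the Odds Ratio and Relative Risks table is not produced. The example below assumes that is the situation and collapses fireplaces into a two-level flag. The dataset work.demo, the variable any_fireplace, and the counts are invented for illustration and are not taken from ameshousing3.

```sas
/* Made-up cell counts: fireplaces has three levels (0, 1, 2). */
data work.demo;
    input fireplaces bonus count;
    /* Hypothetical two-level flag: 1 if the house has any fireplace, else 0 */
    any_fireplace = (fireplaces > 0);
    datalines;
0 0 40
0 1 10
1 0 25
1 1 20
2 0 5
2 1 8
;
run;

/* any_fireplace * bonus is a 2x2 table, so RELRISK can be computed. */
proc freq data=work.demo;
    tables any_fireplace * bonus / chisq relrisk norow nocol nopercent;
    weight count;   /* each input row is a cell count, not a single house */
    title "Association Analysis - Chi-Sq and Odds Ratio (2x2 sketch)";
run;
```

Whether collapsing the levels is appropriate depends on the analysis question. If all levels must be kept, odds ratios for each level against a reference can come from PROC LOGISTIC instead.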

Vicente95 · ‎07-02-2021

Thank you. Is there a better way to shorten the following SAS code? The idea is to fill in the missing make values from the description column by matching the automaker keyword, while keeping the non-missing model values intact. Based on the results, it seems to work.

    DATA WORK.usedcars6;
        SET WORK.usedcars5;
        IF make='' AND FIND(description,'ford') THEN make='ford';
        ELSE IF make='' AND FIND(description,'chevy') or FIND(description,'chevrolet') THEN make='chevrolet';
        ELSE IF make='' AND FIND(description,'toyota') THEN make='toyota';
        ELSE IF make='' AND FIND(description,'honda') THEN make='honda';
        ELSE IF make='' AND FIND(description,'nissan') THEN make='nissan';
        ELSE IF make='' AND FIND(description,'jeep') THEN make='jeep';
        ELSE IF make='' AND FIND(description,'ram') THEN make='ram';
        ELSE IF make='' AND FIND(description,'gmc') THEN make='gmc';
        ELSE IF make='' AND FIND(description,'bmw') THEN make='bmw';
        ELSE IF make='' AND FIND(description,'dodge') THEN make='dodge';
        ELSE IF make='' AND FIND(description,'mercedes-benz') OR FIND(description,'mercedes') OR FIND(description,'mercedes benz') OR FIND(description,'benz') THEN make='mercedes-benz';
        ELSE IF make='' AND FIND(description,'hyundai') THEN make='hyundai';
        ELSE IF make='' AND FIND(description,'subaru') THEN make='subaru';
        ELSE IF make='' AND FIND(description,'lexus') THEN make='lexus';
        ELSE IF make='' AND FIND(description,'kia') THEN make='kia';
        ELSE IF make='' AND FIND(description,'audi') THEN make='audi';
        ELSE IF make='' AND FIND(description,'cadillac') THEN make='cadillac';
        ELSE IF make='' AND FIND(description,'acura') THEN make='acura';
        ELSE IF make='' AND FIND(description,'chrysler') THEN make='chrysler';
        ELSE IF make='' AND FIND(description,'mazda') THEN make='mazda';
        ELSE IF make='' AND FIND(description,'buick') THEN make='buick';
        ELSE IF make='' AND FIND(description,'infiniti') THEN make='infiniti';
        ELSE IF make='' AND FIND(description,'lincoln') THEN make='lincoln';
        ELSE IF make='' AND FIND(description,'volvo') THEN make='volvo';
        ELSE IF make='' AND FIND(description,'mitsubishi') THEN make='mitsubishi';
        ELSE IF make='' AND FIND(description,'mini') THEN make='mini';
        ELSE IF make='' AND FIND(description,'jaguar') THEN make='jaguar';
        ELSE IF make='' AND FIND(description,'pontiac') THEN make='pontiac';
        ELSE IF make='' AND FIND(description,'porsche') THEN make='porsche';
        ELSE IF make='' AND FIND(description,'saturn') THEN make='saturn';
        ELSE IF make='' AND FIND(description,'mercury') THEN make='mercury';
        ELSE IF make='' AND FIND(description,'alfa-romeo') OR FIND(description,'alfa romeo') THEN make='alfa-romeo';
        ELSE IF make='' AND FIND(description,'tesla') THEN make='tesla';
        ELSE IF make='' AND FIND(description,'fiat') THEN make='fiat';
    RUN;
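One way the chain above could be shortened is with two parallel temporary arrays and a loop, sketched below on a small invented dataset. Two points are worth flagging. First, SAS evaluates AND before OR, so in the chain above the `or FIND(...)` branches are not guarded by `make=''` and can overwrite a non-missing make. The sketch avoids that by checking `missing(make)` once, outside the loop. Second, FIND matches substrings, so a keyword such as 'ram' also matches 'program'; FINDW may be a safer choice. The sample rows and the shortened keyword list are assumptions for illustration only.

```sas
/* Hypothetical sample standing in for WORK.usedcars5 */
data work.usedcars5;
    infile datalines dsd truncover;
    length make $20 description $100;
    input make $ description $;
    datalines;
,clean 2012 ford f-150 for sale
,low mileage chevy silverado
honda,will trade for a toyota
,no automaker mentioned
;
run;

data work.usedcars6;
    set work.usedcars5;
    /* keyword[i] is searched for in description; maker[i] is the value assigned */
    array keyword[5] $20 _temporary_ ('ford' 'chevy'     'chevrolet' 'toyota' 'honda');
    array maker[5]   $20 _temporary_ ('ford' 'chevrolet' 'chevrolet' 'toyota' 'honda');
    /* Only rows with a missing make are touched; stop at the first match */
    if missing(make) then do i = 1 to dim(keyword) until (not missing(make));
        /* trim() removes the trailing blanks that pad each array element */
        if find(description, trim(keyword[i])) then make = maker[i];
    end;
    drop i;
run;
```

Adding a manufacturer then means adding one entry to each array rather than another ELSE IF branch. Keyword order matters, because the first matching keyword wins.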

Vicente95 · ‎07-02-2021

Hello Team, I have a used cars dataset with hundreds of thousands of missing values. The dataset looks something like this (I filtered the manufacturer to show only toyota for brevity's sake):

| region | price | year | manufacturer | model | condition | cylinders | fuel | odometer | title_status | transmission | drive | size | type | paint_color | state | posting_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ventura county | $7,800 | 2007 | toyota | | | | gas | 10000000 | clean | automatic | | | | | ca | 2021-04-30T11:48:47-0700 |
| south florida | $14,498 | 2007 | toyota | fj cruiser 4x4 | | | diesel | 7777777 | clean | automatic | | | | | fl | 2021-05-01T07:38:28-0400 |
| flagstaff / sedona | $3,900 | 1998 | toyota | 4-runner sr5 | good | 6 cylinders | gas | 349000 | clean | automatic | rwd | mid-size | SUV | green | az | 2021-04-05T09:56:10-0700 |
| phoenix | $3,500 | 1998 | toyota | 4runner | good | 6 cylinders | gas | 347000 | clean | automatic | rwd | | SUV | silver | az | 2021-05-03T11:27:32-0700 |
| colorado springs | $6,500 | 1998 | toyota | | good | 8 cylinders | gas | 345000 | clean | automatic | 4wd | full-size | SUV | grey | co | 2021-04-28T10:05:44-0600 |
| seattle-tacoma | $1,150 | 2009 | toyota | prius | fair | | hybrid | 345000 | clean | automatic | | | hatchback | blue | wa | 2021-05-03T10:06:08-0700 |
| knoxville | $2,000 | 1998 | toyota | camry ce | good | 4 cylinders | gas | 344200 | clean | automatic | | | sedan | | tn | 2021-05-03T22:07:49-0400 |
| knoxville | $2,000 | 1998 | toyota | camry ce | good | 4 cylinders | gas | 344200 | clean | automatic | | | sedan | | tn | 2021-04-30T21:54:27-0400 |
| knoxville | $2,000 | 1998 | toyota | camry ce | good | 4 cylinders | gas | 344200 | clean | automatic | | | sedan | | tn | 2021-04-26T13:20:34-0400 |
| knoxville | $2,200 | 1998 | toyota | camry ce | good | 4 cylinders | gas | 344200 | clean | automatic | | | sedan | | tn | 2021-04-23T14:43:32-0400 |
| SF bay area | $6,900 | 2001 | toyota | 4runner | | 6 cylinders | gas | 342200 | clean | automatic | | | | | ca | 2021-05-03T18:18:25-0700 |
| champaign urbana | $1,000 | 1997 | toyota | camry | fair | 4 cylinders | gas | 342000 | clean | automatic | fwd | full-size | sedan | red | il | 2021-04-08T16:24:29-0500 |
| kansas city, MO | $2,996 | 2000 | toyota | tundra 2wd truck | excellent | 6 cylinders | gas | 342000 | clean | automatic | rwd | | pickup | white | ks | 2021-04-26T16:13:17-0500 |
| fort collins / north CO | $2,000 | 1999 | toyota | 4runner | fair | | gas | 340500 | clean | automatic | 4wd | | SUV | white | co | 2021-04-07T10:55:29-0600 |
| orange county | $8,600 | 2003 | toyota | tundra | | 8 cylinders | gas | 340000 | clean | automatic | | | | | ca | 2021-04-28T15:54:43-0700 |

I need to keep the existing rows that have no missing values intact and untouched, especially the model and manufacturer columns, while replacing all the missing engine (cylinders) values using the make/model columns. Note that all values are already lower-case, so there is no need for any upper or lower casing.

My goal is to use an if-else statement in conjunction with a wildcard and a list to fill in the missing values in the cylinders column based on the car model. Some values for the car model are missing, which I predict will cause problems.

Here is the sample logic that I want to apply for Toyota. If the SAS code works for Toyota, I can apply it to other car makers such as Honda, BMW, Lexus, Mercedes-Benz, etc. I can also use the same logic for the transmission, drive train, size, and type columns.

4 cylinders: if the model starts with (is in or like) "camry", "corolla", "rav4", "prius", "yaris", "matrix", "echo", "previa" and cylinders is missing/blank, then cylinders equals "4 cylinders".

6 cylinders: else if the model starts with "avalon", "highlander", "4runner", "sienna", "tacoma", "fj", "venza", "cruiser" and cylinders is missing/blank, then cylinders equals "6 cylinders".

8 cylinders: else if the model starts with "tundra", "sequoia" and cylinders is missing/blank, then cylinders equals "8 cylinders".

I imagine this if-else statement will be very long since there are 38 car makers/manufacturers in the dataset. Any guidance is appreciated.
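The Toyota rules described above could be sketched with the colon modifier on the IN operator (`in:`), which compares only the leading characters and so acts as a "starts with" test. This is an illustration under assumptions, not a vetted mapping of models to engines: the dataset names work.toyota_sample and work.toyota_filled and the sample rows are invented, and the prefix lists are copied from the post.

```sas
/* Hypothetical sample rows; blank fields are missing values */
data work.toyota_sample;
    infile datalines dsd truncover;
    length manufacturer $20 model $30 cylinders $12;
    input manufacturer $ model $ cylinders $;
    datalines;
toyota,camry ce,
toyota,4runner,
toyota,tundra 2wd truck,
toyota,prius,4 cylinders
toyota,,
;
run;

data work.toyota_filled;
    set work.toyota_sample;
    /* Rows that already have a cylinders value are left untouched */
    if manufacturer = 'toyota' and missing(cylinders) then do;
        /* in: matches when model begins with any of the listed prefixes */
        if model in: ('camry', 'corolla', 'rav4', 'prius', 'yaris', 'matrix', 'echo', 'previa')
            then cylinders = '4 cylinders';
        else if model in: ('avalon', 'highlander', '4runner', 'sienna', 'tacoma', 'fj', 'venza', 'cruiser')
            then cylinders = '6 cylinders';
        else if model in: ('tundra', 'sequoia')
            then cylinders = '8 cylinders';
    end;
run;
```

A row with a missing model matches none of the prefixes, so its cylinders value stays missing instead of being guessed. Because the test is prefix-based, "cruiser" would not match a model entered as "land cruiser"; that case would need FIND or FINDW instead.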

Vicente95 · ‎06-30-2021

Unfortunately, the code did not work. Do you know what the error messages mean?

    3004  DATA want;
    3005    SET have;
    3006
    3007    array columns manufacturer condition fuel title_status transmission drive size type paint_color;
    3008    array strings [9] $300 _temporary_ (
    3009    /*manufacturer*/ 'gmc | hyundai | toyota | mitsubishi | ford | chevrolet | ram | buick | jeep | dodge | subaru | nissan | audi | rover | lexus
    3010    | honda | chrysler | mini | pontiac | mercedes-benz | cadillac | bmw | kia | volvo | volkswagen | jaguar | acura | saturn | mazda |
    3011    mercury | lincoln | infiniti | ferrari | fiat | tesla | land rover | harley-davidson | datsun | alfa-romeo | morgan | aston-martin | porche
    NOTE: The quoted string currently being processed has become more than 262 characters long. You might have unbalanced quotation marks.
    3012    | hennessey'
    3013    /*condition*/ 'excellent | good | fair | like new | salvage | new'
    3014    /*fuel*/ 'gas | hybrid | diesel |electric'
    3015    /*title_status*/ 'clean | lien | rebuilt | salvage | missing | parts only'
    3016    /*transmission*/ 'automatic | manual'
    3017    /*drive*/ '4x4 | awd | fwd | rwd | 4wd'
    3018    /*size*/ 'mid-size | full-size | compact | sub-compact'
    3019    /*type*/ 'sedan | truck | SUV | mini-van | wagon | hatchback | coupe | pickup | convertible | van | bus | offroad'
    3020    /*paint_color*/ 'red | grey | blue | white | custom | silver | brown | black | purple | green | orange | yellow'
    3021    );
    3022
    3023    length word $50;
    3024    do col=1 to dim(columns);
    3025      do index=1 to countw(strings[col]),'|') while (missing(columns[col])) ;
                                               -
                                               388
                                               200
                                               76
    ERROR 388-185: Expecting an arithmetic operator.
    ERROR 200-322: The symbol is not recognized and will be ignored.
    ERROR 76-322: Syntax error, statement will be ignored.
    3026        word=left(scan(strings[col],index,'|'));
    3027        if findw(description,word,,'it') then columns[col]=word;
    3028      end;
    3029    end;
    3030
    3031    if missing(cylinders) then do;
    3032      index=findw(description,'cylinders',,'it');
    3033      if index then do;
    3034        cylinders=scan(substrn(description,1,index-1),-1)||' cylinders';
    3035      end;
    3036    end;
    3037
    3038    drop word col index ;
    3039  run;
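The three ERROR lines in the log above all point at log line 3025, where a closing parenthesis ends the COUNTW call before its delimiter argument: `countw(strings[col]),'|')`. Moving that parenthesis so the delimiter sits inside the call, `countw(strings[col],'|')`, should clear them. The NOTE about the 262-character string is only informational here, since the quotation marks are balanced. The small step below is a standalone sketch of the corrected loop on one invented list, with results written to the log.

```sas
data _null_;
    length list $100 word $50;
    list = 'gas | hybrid | diesel | electric';
    /* Delimiter is the second argument of COUNTW, inside the parentheses */
    do index = 1 to countw(list, '|');
        /* strip() removes the blanks around each pipe-delimited item */
        word = strip(scan(list, index, '|'));
        put index= word=;
    end;
run;
```

This writes one line per item to the log, which is a quick way to confirm that the list is being split as intended before using it in the full step.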

Vicente95 · ‎06-30-2021

Nice! This is what I was looking for 🙂 I will try the code and let you know how it goes.

Vicente95 · ‎06-30-2021

Yes, I received your response. Thank you for that.

Vicente95 · ‎06-30-2021

So, I want the equivalent SAS code of what is contained in the screenshot since the author wrote it in Python. What would the SAS code resemble? As for the cylinders list, it would be: %let cylinders=(3 cylinders,4 cylinders,5 cylinders,6 cylinders,8 cylinders,10 cylinders,12 cylinders); Is there a regex to shorten the cylinders list in SAS?
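On the regex question, SAS supports Perl regular expressions through the PRX functions, so the cylinders list could be expressed as one pattern. The sketch below is an assumption about how that might look, not a translation of the Python code in the screenshot. The dataset work.cyl_demo and the sample descriptions are invented.

```sas
data work.cyl_demo;
    infile datalines truncover;
    length description $100 cylinders $12;
    input description $char100.;
    retain re;
    /* Compile the pattern once; the alternation mirrors the values in the %let list */
    if _n_ = 1 then re = prxparse('/\b(3|4|5|6|8|10|12) cylinders\b/');
    /* Capture buffer 0 is the whole matched text, e.g. "6 cylinders" */
    if prxmatch(re, description) then cylinders = prxposn(re, 0, description);
    drop re;
    datalines;
runs great 6 cylinders automatic
v8 engine, 8 cylinders, clean title
no engine info here
;
run;
```

Replacing the alternation with `\d+` would accept any number of cylinders, at the cost of also accepting implausible values.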

Vicente95 · ‎06-28-2021

I agree with what you said; however, I was assigned this dataset as my personal project. There is not much I can do other than use it.

Vicente95 · ‎06-28-2021

Yes, I did. It worked just fine; however, there are hundreds of thousands of results, so my PC lags when I scroll down to view them. P.S. I am no SAS expert or statistician, let alone someone with your level of knowledge. I am a novice who started learning at the start of this year, which is why I asked: I am stuck.

Vicente95 · ‎06-28-2021

That makes sense. However, is there a "simpler" or more optimal way to do this for all 349K observations? I get that the Ford F-150 is 8 cylinders, but how could I accurately fill in the remaining cylinders for every model in the dataset?
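One data-driven possibility, sketched under assumptions: take the most frequent non-missing cylinders value for each model and use it to fill the gaps, so no model has to be typed by hand. The names work.cars, work.cyl_counts, work.cyl_mode, cyl_mode, and cylinders_filled are hypothetical, and the sample rows are invented.

```sas
/* Hypothetical sample; blank fields are missing values */
data work.cars;
    infile datalines dsd truncover;
    length model $30 cylinders $12;
    input model $ cylinders $;
    datalines;
camry,4 cylinders
camry,4 cylinders
camry,6 cylinders
camry,
tundra,8 cylinders
tundra,
,
;
run;

/* Count how often each cylinders value occurs for each model */
proc sql;
    create table work.cyl_counts as
    select model, cylinders, count(*) as n
    from work.cars
    where not missing(model) and not missing(cylinders)
    group by model, cylinders;
quit;

/* Keep the most frequent value per model (ties are broken arbitrarily) */
proc sort data=work.cyl_counts;
    by model descending n;
run;

data work.cyl_mode;
    set work.cyl_counts;
    by model;
    if first.model;
    keep model cylinders;
    rename cylinders = cyl_mode;
run;

/* Fill only where cylinders is missing; existing values are preserved */
proc sql;
    create table work.cars_filled as
    select c.model,
           c.cylinders,
           coalescec(c.cylinders, m.cyl_mode) as cylinders_filled
    from work.cars as c
    left join work.cyl_mode as m
        on c.model = m.model;
quit;
```

Rows with a missing model, or a model that never appears with a cylinders value, stay missing. The filled values are only as reliable as the existing entries, so spot-checking a few models against known specifications would be sensible.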

Vicente95 · ‎06-27-2021

As for the question, "why don't you profile cylinders by model," I am using the following dataset: https://www.kaggle.com/austinreese/craigslist-carstrucks-data I would not know how to profile cylinders by model when there are hundreds of thousands of them.

Online Status	Offline
Date Last Visited	‎07-12-2021 03:30 AM

Re: Not receiving RELRISK plot in frequency table

Re: Not receiving RELRISK plot in frequency table

Not receiving RELRISK plot in frequency table

Re: IF-ELSE Statement in Conjunction with Wildcard Strings

IF-ELSE Statement in Conjunction with Wildcard Strings

Re: Use values from an existing column to fill missing values in other...

Re: Use values from an existing column to fill missing values in other...

Re: Use values from an existing column to fill missing values in other...

Re: Use values from an existing column to fill missing values in other...

Re: Use values from an existing column to fill missing values in other...

Use values from an existing column to fill missing values in other col...

Re: Removing the last 10 characters from a string and converting it to...

Re: Removing the last 10 characters from a string and converting it to...

Re: Removing the last 10 characters from a string and converting it to...

Re: Removing the last 10 characters from a string and converting it to...