About ericdrosano

ericdrosano · ‎06-13-2024

I will send the data in a proper format next time. I wish I could use numeric values, but it's more complicated than just an A-F scale, including QN, INC, NC, P, and many more "grades".

ericdrosano · ‎06-13-2024

My data comprises multiple IDs and grades for those IDs in multiple terms. I want to condense the dataset so that each unique ID has only one line while all of the columns retain the full grade information. It currently looks like this:

ID SP20 SM20 FA20 SP21 SM21 FA21 SP22
1001 A+
1002 D
1002 F
1003 B
1004 A+
1005 B-
1006 A+
1007 F
1007 B+

Please note that IDs 1002 and 1007 each have two lines because they took the course twice, i.e., in two different semesters. I want it to become this:

ID SP20 SM20 FA20 SP21 SM21 FA21 SP22
1001 A+
1002 D F
1003 B
1004 A+
1005 B-
1006 A+
1007 F B+

Please note that IDs 1002 and 1007 are now on only one line while the grades from each semester are now included. Attached is a csv of the same data above.
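One common way to collapse rows like these is the self-UPDATE idiom, sketched below. This is only an illustration under stated assumptions: the dataset names have and want are hypothetical, and because the post does not show which term each grade belongs to, the term placements in the sample rows are made up. UPDATE ignores missing values in the transaction data by default, so each non-blank grade is carried onto the single output row per ID.

```sas
/* Hypothetical sample data: the term in which each grade falls is assumed */
data have;
   infile datalines dsd truncover;
   length ID 8 SP20 SM20 FA20 SP21 SM21 FA21 SP22 $3;
   input ID SP20 SM20 FA20 SP21 SM21 FA21 SP22;
   datalines;
1001,A+,,,,,,
1002,,D,,,,,
1002,,,,F,,,
1003,,,B,,,,
1007,F,,,,,,
1007,,,,,,,B+
;
run;

/* UPDATE needs the data sorted by the BY variable */
proc sort data=have;
   by ID;
run;

/* Master is an empty copy of HAVE; every row of HAVE is a transaction.   */
/* Blank (missing) values in a transaction do not overwrite values that   */
/* an earlier row for the same ID already supplied.                       */
data want;
   update have(obs=0) have;
   by ID;
run;

proc print data=want noobs;
run;
```

One caveat with this approach: if an ID has two non-blank grades in the same term column, the later row's grade replaces the earlier one.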

ericdrosano · ‎01-21-2022

The dataset in the first image does contain each firm's data in its own columns. The dataset in the second image is the pairwise combination of each firm, each year, with each of its SIC3 peers in that year. Thus, all the data in the first image's dataset is what I use for pairwise matching to create the dataset in the second image. As far as uploading data, I may be able to once SAS has finished running my current code, which is already about 10 minutes in and which I expect will take much longer.

ericdrosano · ‎01-21-2022

My data looks like: I'm trying to do the following: "The correlations of OX of each member firm are determined with every other member firm in the same SIC3 group, by firm, over the five-year period." I have tried pairwise matching by SIC3YEAR, and this successfully gives me the following: but I can't seem to get correlations between each individual firm-year and its SIC3 peers, which I believe I need in order to sum the SIC3 total before dividing by n. I have used so many variations of proc corr with ox and p_ox, and even done the grouping before pairwise matching, but I am doing something wrong. Code help is greatly appreciated; thank you!

ericdrosano · ‎12-24-2021

One fix, though. Where you say to undo the ods exclude all with: ods include all; that seems to be incorrect. I've looked up the fix, and I think it should instead be: ods exclude none; Notwithstanding, your solution was still timely and helpful!
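For reference, the pattern under discussion can be sketched as follows, reusing the sashelp.baseball example from the original question in this thread. It is a minimal sketch rather than the exact code from the reply: ods exclude all suppresses displayed output while ods output still captures the table, and ods exclude none restores normal output afterwards.

```sas
/* Suppress all displayed output; tables are still generated internally */
ods exclude all;

/* Capture the ParameterEstimates table into the dataset PARMS */
ods output ParameterEstimates = parms;

proc reg data=sashelp.baseball;
   model logsalary = nHits nBB YrMajor / clb;
quit;

/* Restore normal output (there is no "ods include all" statement) */
ods exclude none;

/* PARMS now holds the estimates and can be printed or processed */
proc print data=parms noobs;
run;
```

The noprint option on proc reg is left off deliberately: as the thread notes, it stops the procedure from producing the tables that ods output needs.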

ericdrosano · ‎12-24-2021

That worked like a charm! It seems that whenever I was running the previous commands on my data, it was overwhelming SAS on my computer. So, I was needing a way around the printing, and your solution was a dandy. Thank you very much! 😁 Also, yes, I was popping noprint right after proc reg, which, as expected, was killing the printing and the output.

ericdrosano · ‎12-24-2021

I am borrowing from the code on this SAS help page. I am attempting to output the parameter estimates from a proc reg, but without printing the results. The code from the SAS help page is as follows:

ods output ParameterEstimates = parms;
proc reg data=sashelp.baseball;
   model logsalary = nHits nBB YrMajor / clb;
quit;
proc print data=parms noobs;
run;

This copies the parameter estimate data to the 'parms' dataset, like I want to do with my own dataset; however, if I tell it to not print, like below, it will not output the parameter estimates. It even says exactly this in the warning. So, is there a way to get those estimates into a dataset without having to print them? I have some datasets that are using hundreds of variables, so I don't want to have to go through printing just to get those results.

ericdrosano · ‎11-12-2021

Thank you for your feedback. 🙂 Another reason I avoid GLM is that it causes problems with some of the IVs that I'm using as fixed effects. Maybe it's because I'm not running GLM correctly, or with the right options, but I'll get '0.000000000' and 'B' for my main variables of interest depending upon certain FE variables. I was trying to avoid what you're suggesting, in terms of creating additional data sets, but it's possible that may be the best solution.

ericdrosano · ‎11-12-2021

I am running OLS regressions using proc reg and have created dummy variables for years in order to implement a year fixed effect in my model. I do NOT want to use proc glm to run my fixed effects regressions because proc glm requires suppressing the intercept. Since some of my fixed effects involve variables that range in the upper hundreds, I would like to keep using proc reg, and report the parameter estimates of the independent variables, but suppress those variables used as fixed effects in the print out of the results. For example, if my model looked like the following:

ods select NObs ANOVA FitStatistics ParameterEstimates;
proc reg data = testdata;
   model dv = IV1 IV2 IV3 IV4 y2000--y2020;
run;
quit;

The parameter estimates will include all of y2000--y2020, but I would rather suppress them. Is this possible?

ericdrosano · ‎06-30-2021

Double quotes versus single quotes... got me crying over here!!! Thank you mightily!

ericdrosano · ‎06-30-2021

I am running the following code, but when calling the macro-variable I just created, "dir", libname does not work. As you can see, the "dir" macro-variable is the EXACT same path used to create the library testa, which successfully creates the library.

libname testa 'C:\Users' ;
%let dir = C:\Users;
libname testb '&dir';
libname testc '&dir.';
libname testd ' &dir ';
libname teste ' &dir. ';

Every time I run this code, I get the same 'NOTE: Library TEST_X_ does not exist.' message for each test, i.e., TESTB, TESTC, TESTD, & TESTE. (For brevity, I did not include ALL of those error messages.)

159 libname testa 'C:\Users' ;
NOTE: Libref TESTA was successfully assigned as follows:
      Engine: V9
      Physical Name: C:\Users
160 %let dir = C:\Users;
161 libname testb '&dir';
NOTE: Library TESTB does not exist.

Your help is appreciated!
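As the follow-up reply in this thread indicates, the difference comes down to quoting: SAS does not resolve macro variables inside single quotes, so the libname statements above receive the literal text &dir instead of the path. A minimal sketch of the contrast, reusing the same dir macro variable (the C:\Users path is the one from the post):

```sas
%let dir = C:\Users;

/* Single quotes: the macro processor leaves &dir untouched,       */
/* so libname looks for a folder literally named "&dir"            */
libname testb '&dir';

/* Double quotes: &dir resolves to C:\Users before libname runs    */
libname testc "&dir";

/* %put writes the resolved value to the log for a quick check     */
%put dir resolves to &dir;
```

With the double-quoted form, the log should report the libref as successfully assigned, just as it does for testa.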

ericdrosano · ‎06-22-2021

I saw your reply after I had already found the solution. 🙂

ericdrosano · ‎06-22-2021

I found a solution to my problem. Since I wanted to match each firm (gvkey) with its 10 nearest industry-year peers, I first created the gindfyear variable, which is a combination of industry (gind) and year (fyear). This allowed me to create the pairwise combinations based upon matched industry-year. Since I began with 72,693 observations, I did not want to create 5,284,199,556 firm-firm pairwise combinations!! This first batch of code did this:

proc sql;
   create table want as
   select distinct a.gvkey, a.zip_code as main_zip, b.gvkey as peer, b.zip_code as peer_zip, a.gindfyear
   from have a, have b
   where a.gindfyear eq b.gindfyear and a.gvkey ne b.gvkey;
quit;

As you can see above, it creates 5 variables: gvkey, main zip, peer, peer zip, and industry-year. This resulted in 9,861,622 obs, which is far fewer than the potential 5 billion I was reluctant to create. I then use the zipcitydistance function in SAS to determine the distances between each main and peer. Finally, I ranked them, keeping only the four closest. The final code is below:

data want2;
   set want;
   dist = zipcitydistance(main_zip, peer_zip);
run;

proc sort data=want2;
   by gvkey gindfyear dist;
run;

data want3;
   set want2;
   by gvkey gindfyear dist;
   retain closest;
   if first.gindfyear then do;
      closest = 1;
      output;
   end;
   else if closest lt 4 then do;
      closest = closest + 1;
      output;
   end;
run;

This correctly yields 726,930 observations since I began with 72,693 observations (10x for each).

ericdrosano · ‎06-22-2021

I am attempting to create 12 additional variables containing the following three columns repeated 4x: peer ID, peer zip code, peer distance. The first three columns would be the closest peer (cls1gvk, cls1zip, cls1dist) and each subsequent iteration finds the next closest peer, and so on until the 4th closest peer. Peers are established based upon industry classification, 'gind' in Compustat, and year, 'fyear' in Compustat.

Current plan of action: The only path I can see forward is to create a unique DB for every unique gind-fyear (of which I have 888). They would each begin as a 2xn, where the variables are gvkey and zip code. I would then perform a pairwise combination of each gvkey (3rd column), add the matched zip to each paired gvkey (4th column), compute the distance (zipcitydistance function) between zip codes for each (5th), rank them (6th), and finally remove all of those ranks greater than four.

The data I have looks like:

gvkey   fyear  gind    zip code
100001  2018   200555  10021
100001  2019   200555  10021
100002  2018   200555  10021
100002  2019   200555  10021
100003  2018   200555  10021
100003  2019   200555  10021
100004  2018   200555  10021
100004  2019   200555  10021
100005  2018   200555  10021
100005  2019   200555  10021
100006  2018   200555  10021
100006  2019   200555  10021
100007  2018   312448  10022
100007  2019   312448  10022
100008  2018   312448  10022
100008  2019   312448  10022
100009  2018   312448  10022
100009  2019   312448  10022
100010  2018   312448  10022
100010  2019   312448  10022
100011  2018   312448  10022
100011  2019   312448  10022
100012  2018   312448  10022
100012  2019   312448  10022

Any help is appreciated.

ericdrosano · ‎11-16-2020

Your solution was the correct solution for the question I asked; thank you! What is funny is that I discovered that all I had ever needed to do was use proc rank initially, and then I would never have had the problem I created for myself in the first place! 🙃

proc rank data=rt2 out=test ties=low descending;
   by day;
   var users;
   ranks usersrank;
run;

Nevertheless, thank you for your help!

Online Status	Offline
Date Last Visited	‎09-02-2024 05:51 PM

Re: Condense multiple observations into one retaining all variables

Condense multiple observations into one retaining all variables

Re: Getting Mean Correlations

Getting Mean Correlations

Re: Output proc reg Parameter Estimates without printing

Re: Output proc reg Parameter Estimates without printing

Output proc reg Parameter Estimates without printing

Re: Suppress the Parameter Estimates of SOME Independent Variables in ...

Suppress the Parameter Estimates of SOME Independent Variables in PROC...

Re: libname won't call the macro variable

libname won't call the macro variable

Re: How do I determine the 4 Closest Peers by Distance?

Re: How do I determine the 4 Closest Peers by Distance?

How do I determine the 4 Closest Peers by Distance?

Re: LAG Not Copying Data Lag Line