The Fama-MacBeth (1973) regression is a two-step procedure. The first step estimates a cross-sectional regression in each of the T time periods, and the second step takes the time-series averages of the coefficients from those T cross-sectional regressions. The standard errors are adjusted for cross-sectional dependence. This is generally an acceptable solution when there is a large number of cross-sectional units and a relatively small time series for each cross-sectional unit. However, if both cross-sectional and time-series dependencies are suspected in the data set, then Newey-West consistent standard errors can be an acceptable solution.
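In symbols, the procedure can be sketched as follows. The notation is ours and is not taken from the asreg documentation: $\hat{\beta}_t$ denotes the coefficient vector from the cross-sectional regression in period $t$, and $T$ is the number of time periods.

$$
\hat{\beta}_{FMB} = \frac{1}{T}\sum_{t=1}^{T}\hat{\beta}_t ,
\qquad
\widehat{\mathrm{se}}\left(\hat{\beta}_{FMB}\right) = \sqrt{\frac{1}{T(T-1)}\sum_{t=1}^{T}\left(\hat{\beta}_t-\hat{\beta}_{FMB}\right)^{2}}
$$

The t-statistic is the ratio of the two, evaluated with $T-1$ degrees of freedom.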

## Estimation Procedure

The Fama-MacBeth (FMB) regression can be easily estimated in Stata using the **asreg** package. Consider the following three steps for estimating an FMB regression in Stata.

1. Arrange the data as panel data and use the `xtset` command to tell Stata about it.

2. Install **asreg** from SSC with this line of code:

`ssc install asreg`

3. Apply the **asreg** command with the **fmb** option.

## An Example

We shall use the grunfeld dataset in our example. Let’s download it first:

`webuse grunfeld`

This data is already xtset, with the following command:

`xtset company year`

Assume that we want to estimate an FMB regression where the dependent variable is *invest* and the independent variables are *mvalue* and *kstock*. Just like the regress command, **asreg** uses the first variable as the dependent variable and the rest of the variables as independent variables. Using the *grunfeld* data, the **asreg** command for the FMB regression is given below:

`asreg invest mvalue kstock, fmb`

**Fama-MacBeth (1973) Two-Step procedure**

Number of obs = 200, Num. time periods = 20, F(2, 19) = 195.04, Prob > F = 0.0000, avg. R-squared = 0.8369

| invest | Coef. | Fama-MacBeth Std. Err. | t | P>\|t\| | 95% Conf. Interval (lower) | 95% Conf. Interval (upper) |
|---|---|---|---|---|---|---|
| mvalue | .1306047 | .0093422 | 13.98 | 0.000 | .1110512 | .1501581 |
| kstock | .0729575 | .0277398 | 2.63 | 0.016 | .0148975 | .1310176 |
| _cons | -14.75697 | 7.287669 | -2.02 | 0.057 | -30.01024 | .496295 |
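To see what the **fmb** option is doing, the two steps can be sketched by hand with built-in Stata commands. This is only an illustration, under the assumption that the point estimates and default standard errors are the time-series mean of the yearly coefficients and the standard error of that mean; it is not asreg's internal code. Note that `statsby` with the `clear` option replaces the data in memory with one row of coefficients per year.

```stata
* Step 1: one cross-sectional regression per year.
* statsby stores the coefficients as _b_mvalue, _b_kstock and _b_cons.
webuse grunfeld, clear
statsby _b, by(year) clear: regress invest mvalue kstock

* Step 2: a constant-only regression returns the time-series mean of each
* yearly coefficient and the standard error of that mean (T - 1 = 19 df).
regress _b_mvalue
regress _b_kstock
regress _b_cons
```

If the assumption holds, the constant in each of the three regressions should line up with the corresponding Coef. and Std. Err. in the asreg output above.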

## Newey-West standard errors

If Newey-West standard errors are required for the second stage regression, we can use the option **newey**(*integer*). The integer value specifies the number of lags for estimation of Newey-West consistent standard errors. Please note that without the **newey** option, asreg reports ordinary OLS standard errors for the second stage. This option accepts only integers: for example, **newey**(*1*) or **newey**(*4*) are acceptable, but **newey**(*1.5*) or **newey**(*2.3*) are not. So if we were to use two lags with the Newey-West errors for the above command, we would type:

`asreg invest mvalue kstock, fmb newey(2)`

**Fama-MacBeth Two-Step procedure (Newey SE)** (Newey-West adj. Std. Err. using lags(2))

Number of obs = 200, Num. time periods = 20, F(2, 19) = 39.73, Prob > F = 0.0000, avg. R-squared = 0.8369

| invest | Coef. | Newey-FMB Std. Err. | t | P>\|t\| | 95% Conf. Interval (lower) | 95% Conf. Interval (upper) |
|---|---|---|---|---|---|---|
| mvalue | .1306047 | .0150138 | 8.70 | 0.000 | .0991804 | .1620289 |
| kstock | .0729575 | .0375046 | 1.95 | 0.067 | -.0055406 | .1514557 |
| _cons | -14.75697 | 8.394982 | -1.76 | 0.095 | -32.32787 | 2.813928 |
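The idea behind the Newey-West adjustment in the second stage can be sketched with Stata's built-in `newey` command applied to the yearly coefficient series. We have not checked whether asreg applies exactly the same small-sample adjustment as `newey`, so treat the standard errors from this sketch as an approximation of the Newey-FMB column rather than a replication of it. The point estimates are simply the time-series means.

```stata
* First stage by hand: yearly cross-sectional regressions
webuse grunfeld, clear
statsby _b, by(year) clear: regress invest mvalue kstock

* Declare the yearly coefficients as a time series so that lags are defined
tsset year

* Constant-only regressions with Newey-West standard errors, two lags
newey _b_mvalue, lag(2)
newey _b_kstock, lag(2)
newey _b_cons, lag(2)
```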

If, for some reason, we wish to display the first-stage cross-sectional regressions of the FMB procedure, we can use the option **first**. And if we wish to save the first-stage results to a file, we can use the option **save**(*filename*). The commands for these options look like this:

`asreg invest mvalue kstock, fmb newey(2) first`

`asreg invest mvalue kstock, fmb newey(2) first save(FirstStage)`

**First stage Fama-MacBeth regression results**

| _TimeVar | _obs | _R2 | _b_mva~e | _b_kstock | _Cons |
|---|---|---|---|---|---|
| 1935 | 10 | .865262 | .1024979 | -.0019948 | .3560334 |
| 1936 | 10 | .6963937 | .0837074 | -.0536413 | 15.21895 |
| 1937 | 10 | .6637627 | .0765138 | .2177224 | -3.386471 |
| 1938 | 10 | .7055773 | .0680178 | .2691146 | -17.5819 |
| 1939 | 10 | .8266015 | .0655219 | .1986646 | -21.15423 |
| 1940 | 10 | .8392551 | .095399 | .2022906 | -27.04707 |
| 1941 | 10 | .8562148 | .1147638 | .177465 | -16.51949 |
| 1942 | 10 | .857307 | .1428251 | .071024 | -17.61828 |
| 1943 | 10 | .842064 | .1186095 | .1054119 | -22.7638 |
| 1944 | 10 | .875515 | .1181642 | .0722072 | -15.82815 |
| 1945 | 10 | .9067973 | .1084709 | .0502208 | -10.51968 |
| 1946 | 10 | .8947517 | .1379482 | .0054134 | -5.990657 |
| 1947 | 10 | .8912394 | .163927 | -.0037072 | -3.732489 |
| 1948 | 10 | .7888235 | .1786673 | -.0425555 | 8.53881 |
| 1949 | 10 | .8632568 | .1615962 | -.0369651 | 5.178286 |
| 1950 | 10 | .8577138 | .1762168 | -.0220956 | -12.17468 |
| 1951 | 10 | .873773 | .1831405 | -.1120569 | 26.13816 |
| 1952 | 10 | .8461224 | .1989208 | -.067495 | 7.29284 |
| 1953 | 10 | .8892606 | .1826739 | .0987533 | -50.15255 |
| 1954 | 10 | .8984501 | .1345116 | .3313746 | -133.3931 |
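As a quick check on how the two stages fit together, the 20 first-stage *mvalue* coefficients from the table above can be typed into Stata and averaged. The variable name `b_mvalue` is our own choice for this sketch. The mean should correspond to the second-stage coefficient on *mvalue*, and the standard deviation divided by the square root of the number of periods to its Fama-MacBeth standard error.

```stata
clear
input year b_mvalue
1935 .1024979
1936 .0837074
1937 .0765138
1938 .0680178
1939 .0655219
1940 .095399
1941 .1147638
1942 .1428251
1943 .1186095
1944 .1181642
1945 .1084709
1946 .1379482
1947 .163927
1948 .1786673
1949 .1615962
1950 .1762168
1951 .1831405
1952 .1989208
1953 .1826739
1954 .1345116
end

* Second stage: time-series mean and the standard error of that mean
summarize b_mvalue
display "FMB coefficient: " r(mean)
display "FMB std. err.:   " r(sd)/sqrt(r(N))
```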

## More on FMB regression

FMB regression – what, how and where

## Comments

**Dr. Hassan Raza**, March 24, 2019 at 11:25 am

Dear Sir,

Hope you are fine and in good health. I am one of your students from the Bara-Gali workshop. I am applying the Fama and MacBeth regression to Pakistan Stock Exchange firms using monthly data (data sheet attached herewith). I have some queries regarding asreg.

This code provides the second-stage Fama and MacBeth results, but when I check the first stage it only shows me … (dots) in the first process. Why?

When the same procedure is applied to the global market excess return, it omits that variable and provides results for the constant term only. Why?

Sorry for taking your precious time. Please also let me know about any upcoming workshop on Stata.

**Attaullah Shah**, March 24, 2019 at 11:35 am

A bit of code was missing, which I have added. The updated version can be downloaded from SSC in a week or so. However, at the moment, there is a workaround and you do not need to wait for the updated version. Just add the `save` option to the line and it will work as expected. As a bonus, you get the first-stage regression output in a file.

**Dr. Hassan Raza**, March 24, 2019 at 11:43 am

Thank you so much, sir. What about when I regressed against the excess global premium? It omitted the said variable and reported only the constant. Sorry for taking your time.

**Attaullah Shah**, March 24, 2019 at 11:45 am

Since the FMB regression is a cross-sectional regression estimated in each time period, the variables need to vary across entities. Your gspc_return variable seems to be constant within a given period. If you look at the first month, you will see that all the values of this variable are the same within that month, and the same is true of the other months; therefore, the regression does not find any cross-sectional variation with which to fit the model.
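A quick way to check for this problem before running the FMB regression is to flag variables that take a single value within each period. The sketch below uses the grunfeld data from the post; the variable `mkt` is a made-up market-wide factor created only for illustration and is not part of the dataset.

```stata
webuse grunfeld, clear

* Hypothetical market-wide factor: one value per year, shared by all companies
bysort year: egen mkt = mean(invest)

* Flag = 1 when the smallest and largest values within a year coincide,
* i.e. the variable has no cross-sectional variation in that year
bysort year (mkt): gen byte mkt_invariant = (mkt[1] == mkt[_N])
bysort year (mvalue): gen byte mvalue_invariant = (mvalue[1] == mvalue[_N])

tabulate mkt_invariant
tabulate mvalue_invariant
```

A variable flagged in every period, like `mkt` here, cannot be estimated in the first-stage cross-sectional regressions and will be omitted.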

**Mathias**, April 12, 2019 at 10:37 am

Dear Attaullah Shah,

Is the F value in `asreg Y X, fmb by(time)` defined as the time-series average of the F values from the cross-sectional regressions?

Thank you for your asreg package, which is very useful to me.

Regards,

Mathias

**Attaullah Shah**, April 13, 2019 at 11:26 am

Mathias

The F-value is reported directly from the mvreg regression that is estimated for all the cross-sectional regressions of the first stage of FMB.

**Anonymous**, April 26, 2019 at 6:26 pm

Dear Attaullah Shah,

Is it possible to generate the adj. R^2? Thank you!

**Monica**, April 26, 2019 at 6:28 pm

Dear Attaullah Shah,

Is it possible to derive the adj. R^2 variable? Thank you.

**Marie**, May 9, 2019 at 3:01 pm

Hey,

I am a little bit unsure how I should understand the procedure.

Does this mean that you estimate one regression for each year across the firms? Or do you estimate one regression for each firm (even though the panel may be unbalanced, so that some periods may be missing, both over the long time interval and in consecutive periods), and then take the average of this coefficient for each year, given the firms present in each period?

Thank you!

**Attaullah Shah**, May 9, 2019 at 4:07 pm

Marie

To understand the FMB procedure, you should first study the Fama and MacBeth (1973) paper and the relevant literature elsewhere. The procedure estimates a cross-sectional regression in each period in the first step. In the second step, all those cross-sectional coefficients are averaged across time periods. The standard errors are adjusted for cross-sectional dependence; see the Fama and MacBeth (1973) paper for more details.

Reference

Fama, E. F., & MacBeth, J. D. (1973). Risk, return, and equilibrium: Empirical tests. *Journal of Political Economy*, 81(3), 607-636.

**Thomas A.**, May 14, 2019 at 5:03 pm

Dear Attaullah Shah,

First of all, thank you for your website; it has been a great support to me.

However, I have problems using fmb on my data set. I have a panel dataset with monthly fund returns, from which I want to get the average alpha using the Fama-French 3-factor model. When I set `xtset Fund Time`, I always get omitted variables. The paper I am referring to does the same but does not get omitted variables. Do you have an idea what I’m doing wrong?

I am using: `asreg fund_return mktfrf smb hml, fmb`

**Attaullah Shah**, May 15, 2019 at 2:01 am

Thomas

A similar issue is reported every now and then on Statalist. A more recent thread on the Statalist discusses the issue of variables that are invariant cross-sectionally. Please go there and read the thread.

**Thomas A.**, May 15, 2019 at 3:53 pm

Thank you for the answer,

not sure if I got it right. The Fama-French factors are panel-invariant variables, and thus the variables get omitted. But why do so many research papers state that they are using FMB in this context, since they all face the same problem? Is there a step to perform before using asreg fmb to get variant variables, or would an xtset to time id help?

**Attaullah Shah**, May 15, 2019 at 4:18 pm

Thomas

We would be interested in posting relevant text from such papers here. If you

this will cause asreg to first estimate a time series regression for each company and then report the averages of those time series regressions.

**Thomas A.**, May 15, 2019 at 8:19 pm

Happy to share that paper with you, but since it is a working paper that is not published yet, I would prefer to send it in private. Just leave me an e-mail address to send it to.

**Attaullah Shah**, May 16, 2019 at 9:34 pm

Thomas

What I meant was to share text from the mentioned papers that use Fama and French factors in Fama and MacBeth (1973) regression.

**Thomas A.**, May 16, 2019 at 10:44 pm

Attaullah,

here is a link to one paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3081166

I am referring to the description of Table 2 in particular.

**Attaullah Shah**, May 17, 2019 at 1:03 am

On page 9 of the mentioned paper, the author writes:

“Table 2 shows by-fund average fund performance with Fama and MacBeth (1973) standard errors based on monthly returns.”

Therefore, the author does not estimate cross-sectional regressions in the first stage of the Fama and MacBeth (1973) procedure. Rather, he estimates a time-series regression for each fund and then finds averages across all funds.

**Antonio**, May 25, 2019 at 10:09 pm

Dear Sir,

I was wondering how to run a Fama and MacBeth regression over 25 portfolios.

In accordance with your code, the first variable needs to be the dependent variable, while the following variables are considered independent variables. Basically, I would like to calculate the risk premium of a factor over the 25 value- and size-sorted portfolios. Therefore, in my case I would have several dependent variables and just one independent variable.

Thanks for your availability

**Attaullah Shah**, May 26, 2019 at 1:17 am

Antonio

To answer your question, I have written this post.

**Antonio**, May 26, 2019 at 10:19 pm

Dear Sir,

thanks for your detailed answer, but unfortunately your example does not fit my dataset.

In my dataset the independent variable (for example, the market excess return) has the same value for each portfolio, while in your case the independent variable has a different value for each portfolio. In fact, when I try to use your code I do not get any coefficient for the market risk premium.

Thanks again for your availability

**Attaullah Shah**, May 26, 2019 at 11:06 pm

Yes, cross-sectionally invariant variables will be omitted in Fama and MacBeth regressions. There was a lengthy discussion of this issue on Statalist; it might be helpful for you. The post can be read here

**Marie**, May 28, 2019 at 6:43 pm

Thank you for the reply. I have an additional question. Do you know if you can obtain reliable estimates when using this approach with T=27, where the first 7 periods have between 60 and 150 observations each, while the later periods have between 200 and 600 yearly observations? I was thinking of cutting the period, because the low reliability of the first 7 periods may influence the total estimate. I found that my results are significantly different when using T=27 and T=20, due to the limited data in the first years. However, I was unable to find more information online on this issue.

I would be really thankful if you had any articles in mind discussing this issue.

Thank you for your time.

**Marie**, May 28, 2019 at 7:58 pm

Thank you for the reply.

I have another concern that I would like to ask you about. I have a panel dataset where T=27. However, in 7 of the years I only have 62-128 observations, while I have 150-600 yearly observations in the following 20 years. I am wondering if you know of any problems with a small T combined with a small (and increasing) N. I have not been able to find articles concerning this issue so far.

I tried using FMB across the entire 27 years; however, the results are significantly different from the results I obtain when using only T=20. So I am looking for any critique of putting a relatively large weight (26%) on the 7 years whose betas are estimated on only approximately 9% of the total firm-years.

Thank you for the help.

**Juan**, June 3, 2019 at 3:47 pm

Hi professor, thank you so much for your post and help overall. I have a question, however, regarding the formation period for the betas. How do you specify how many days, months or years you want to use to form the rolling betas?

Thanks,

Juan

**Jon**, June 9, 2019 at 1:20 am

Dear Attaullah,

I am investigating the relationship between Abnormal Google Search Volume and Abnormal Returns. The data is collected from the S&P 500 with a time span of 5 years. The independent variables are standardized and all rows containing NA are removed. asreg works just fine without newey, but when newey is included I am unable to run it.

`# this code works`

`# this code does not work`

I get the following error:

no observations

Any help would be greatly appreciated!

**Attaullah Shah**, June 9, 2019 at 12:44 pm

Jon

To debug the issue, I would need the following

1. A sample of your data that generates the said error

2. The asreg full command that you have used

Can you please share the above with my dropbox email attashah15@hotmail.com or simply email these.

**Attaullah Shah**, June 9, 2019 at 5:35 pm

Jon

Thanks for sending me your dataset. It turns out the problem is not with asreg; it is with your date variable. It has a significant number of gaps, which the `newey()` option cannot handle. In other words, you are using a lag length of 8 with the `newey()` option; however, the gaps in your date variable are larger than 8 units, and hence you get the error of no observations.

**Shaika**, June 29, 2019 at 6:24 pm

Hi Sir,

Can we not use time series regression first and then cross-sectional in step two to avoid cross-sectional invariance of fama-french factor?

If we can, how can we use asreg for it?

**Attaullah Shah**, June 29, 2019 at 6:47 pm

Shaika

It’s a question of theory. Does your theory suggest that? You have to dig deep and read the literature of the relevant field. If your literature allows that, then asreg can very easily implement that. The reason I am not showing the command to do that in asreg here is the potential misuse. Readers might not read the full story and quickly jump to do what you are asking for.

**Shaika**, June 30, 2019 at 2:53 pm

Hi Sir,

Thanks for your reply. I saw that some of the literature reports regression coefficients of the Fama-French factors with the Fama-MacBeth procedure. Running the time-series regressions first would be the only option to avoid cross-sectional invariance in this case. I tried to alter the xtset command and was able to get the results. Is this the way of doing it?

Thanks

**Attaullah Shah**, June 30, 2019 at 3:53 pm

Shaika

This is against the spirit of Fama and MacBeth (1973). You might be missing some important steps of the papers you are referring to. Can you give full references to those papers here and copy paste the relevant text from them?

**Gabriel**, July 3, 2019 at 2:49 pm

Hello Sir,

Thank you for the detailed and understandable explanation.

Personally, I am testing the Arbitrage Pricing Theory model using the Fama-MacBeth procedure. However, my data is monthly for 10 companies and 5 independent variables.

My question is: when I run the fmb procedure, how do I get the final coefficients for each company/dependent variable?

**Attaullah Shah**, July 3, 2019 at 3:53 pm

Gabriel

You have asked how to get the individual coefficients of the independent variables for each company in the Fama and MacBeth (1973) procedure. Well, I would refer you to the start of this blog page. It mentions

So the final step would just show the averages of the coefficients estimated in the first step. The first step is to estimate as many cross-sectional regressions as there are time periods. In other words, there are no company-specific coefficients in the final step. If you want to report the first-stage results, then just add the `first` option alongside `fmb`, as shown in the blog above.

If you still cannot figure it out, then you can consider our paid help.

**Patrick Larsen**, July 19, 2019 at 5:21 pm

Hi Sir,

I have been using the fmb-procedure during my dissertation and it has been working like a charm!

I produce consistent estimates and correct the time-series dependence with newey-west errors.

I run the regression in order to control for heterogeneity within mutual funds, and I wish to study the residuals over time in order to study price dispersion. Is it possible to receive cross-sectional residuals for each firm with this method? I basically wish to study whether high-cost funds have consistently been high-cost funds over the period. Method was inspired by:

Lach (2002) – Existence and Persistence of Price Dispersion: an Empirical Analysis

Michael Cooper, Michael Halling and Wenhao Yang – The Mutual Fund Fee Puzzle

When I try to predict residuals, I get the error “option residuals not allowed”. I realize that the procedure theoretically doesn’t include specific companies and basically pulls a random sample, but I have a rather consistent, yet unbalanced, panel.

Best regards

Patrick

**Attaullah Shah**, July 20, 2019 at 12:14 am

Patrick

Thanks for the feedback and for asking about the possibility of generating residuals with FMB. As you have mentioned yourself, this option is not yet available, and adding it would require a sufficient amount of time. I do not have patrons who would support adding further features to asreg. If you are interested, you can drop me an email at attaullah.shah@imsciences.edu.pk

**Juan Meng**, September 13, 2019 at 6:52 pm

Dear sir,

I have several questions about using the Fama-MacBeth regression.

First, my data is quarterly. Will that affect my results? I mean, will the results be less reliable than with monthly data?

Second, how about the `xtfmb` command? I get the same result as with `asreg`. But how can I choose the lag when using `xtfmb`?

Finally, in my data, T=42. However, when I add the zfc variable, which has some missing values, the results are as follows. Is that OK? Moreover, the R2 is not so good. Is that OK?

**Attaullah Shah**, September 17, 2019 at 8:02 am

Juan Meng

(1) Statistically speaking, there is general agreement on “the more, the merrier”, and this is the case with monthly data as compared to quarterly data.

(2) Yes, xtfmb and asreg produce exactly the same result; the only difference lies in the calculation time. asreg is much faster, and the difference in calculation time balloons as we use more data.

(3) Usually, a lower R-squared is an indication of omitted variable bias. There is no standard against which a lower or higher value can be compared. You may read several papers on this topic in your domain of research and see how the R-squared of your model compares.

**Saif Ullah**, October 28, 2019 at 7:34 pm

Thanks for sharing useful resources. When we want to compute the Fama and MacBeth model without an intercept, the asreg command does not omit it.

Is there any other option for this?

**Attaullah Shah**, October 28, 2019 at 10:57 pm

Currently, asreg does not support the `noconstant` option with the Fama and MacBeth regression.

**Saif Ullah**, October 29, 2019 at 11:52 am

Thanks for your response. Can you recommend any alternative? I want to apply the Fama and MacBeth regression with and without a constant.

**Attaullah Shah**, October 29, 2019 at 12:15 pm

Saif Ullah

It is hard to tell, the reason being that Fama and MacBeth (1973) did not use any variation of their model without a constant. Hence, academics and developers have not bothered to code the model without a constant.

**Jerome Rebe**, November 27, 2019 at 9:13 pm

Dear Professor Shah,

please excuse the lengthy post. I am running into some trouble using asreg with the fmb option.

A sample of the data I use is attached at the bottom.

I am running the following regression:

My question is: is there a way to keep one of the dummy variables fixed over time as the reference group? As of now, if you look at the output produced by `first`, the command seems to pick the reference dummy at random over time. For example, one month it uses dummy1 as the reference group and the next month it uses dummy5.

Is there a way to fix this, so that for example dummy5 is the reference group over all months?

I am very thankful for your response, have a blessed day!

**Attaullah Shah**, December 7, 2019 at 10:02 am

Jerome Rebe

This will require a fundamental change inside the asreg code. Currently, I am a bit over-burdened and cannot find enough motivation to do that. Anyway, thanks for reporting this and bringing it to my attention.

**Gerad Ong**, December 7, 2019 at 9:53 am

I was running Fama-MacBeth two-stage regressions (stage 1) and saw discrepancies between the means in the output table below and the ones computed in Excel, for both the slope coefficients and the intercept (see the attached Excel working and the output table below).

Can you kindly advise?

Many thanks!

**Attaullah Shah**, December 7, 2019 at 9:54 am

Hello Gerad Ong

Can you please share a dataset that reproduces the error?

**Gerad Ong**, December 7, 2019 at 9:57 am

Hi Prof Shah,

Thanks, I just checked the data points and noticed that the negative signs on some of them changed to positive after I exported the table to Excel.

I exported the table again and the mean figures seem to match up now.

Many thanks for asreg!

Thanks,

Gerad

**Prince**, January 6, 2020 at 6:45 pm

Hello Prof, is there a way to fix this problem? I have gaps in my dates, and therefore asreg is unable to produce results when I add newey(2). Your answer to the earlier question was: “Jon, Thanks for sending me your dataset. Turns out the problem is not with asreg, it is with your date variable. It has a significant number of gaps which the newey() option cannot handle. In other words, you are using the lag length of 8 with the newey() option, however, the gaps in your date variable are larger than 8 units and hence you get the error of no observations.” Is there a way to fix this? Thank you, Prof.

**Finance student**, May 5, 2020 at 1:33 pm

Hi,

I have the same problem as Jon above regarding the newey(8) argument.

You say the explanation is “…however, the gaps in your date variable are larger than 8 units and hence you get the error of no observations.” How do you cope with this? Is it impossible to use newey when you have some gaps in the date variable?
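On the gap question raised in the last few comments: one way to inspect the problem, and a workaround that is sometimes used, is to re-index the time variable so that the observed periods become consecutive. The sketch below uses made-up data, and the variable names `id`, `date`, `y` and `x` are placeholders. Treating the periods on either side of a gap as adjacent is itself an assumption that may or may not suit a given dataset, and we have not verified that this resolves the `no observations` error in every case.

```stata
* Made-up panel: two units observed on dates 1, 2, 12 and 13 (gap of 10)
clear
input id date y x
1  1 2.0 1.0
1  2 2.5 1.5
1 12 3.1 2.0
1 13 3.0 2.2
2  1 1.0 0.5
2  2 1.4 0.9
2 12 1.9 1.1
2 13 2.2 1.6
end

* Declaring the panel on the raw date: Stata reports gaps in the time variable
xtset id date

* Consecutive index over the dates that actually occur: 1, 2, 3, 4
egen long t = group(date)
xtset id t
list id date t, sepby(id)
```

With the panel declared on `t`, lags are taken across neighbouring observed periods, and the **asreg** command with the **fmb** and **newey**() options would then be run as before.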