Author Archives: Attaullah Shah


Quick Table for Converting Different Dates to Stata Format


Daily Dates

Dates copied into Stata from the internet, CSV files, or other sources are usually stored as string variables, which the Data Editor displays in red. Before we can use Stata's time-series or panel-data capabilities, we need to convert the string date to a Stata date. In the following table, the first column shows different formats in which the date may already be recorded when it is brought into Stata, and the second column shows example code that converts it into a Stata date. The converted date is stored as a number (the count of days since 01jan1960), so we also need to set a display format to make it human-readable. The %td format does that; for example, we can use the code format mydate %td

text Code Output
gen mydate=date(text, "MDY")
gen mydate=date(text, "MDY")
gen mydate=date(text, "YMD")
gen mydate=date(text, "MY")
gen mydate=date(text,"MDY",1999)
gen mydate=date(text,"MDY",2019)
gen mydate=date(text,"MDY",2000)
gen mydate=date(text,"MDY",2050)
gen mydate=date(text,"MDY",2050)
gen mydate=date(text, "YMD")
gen mydate=date(text, "20YMD")

Example using some data

* Enter example data
input str9 text

* Now convert the variable text to Stata date
gen mydate=date(text, "DMY")

* Change the display format
format mydate %td
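
The input command needs data lines and a closing end before it will run. As a minimal sketch, the three string dates below are made up for illustration and are not taken from the original example; they show the same two steps on a small dataset.

```stata
* Hypothetical example data: day-month-year strings (made up for illustration)
clear
input str9 text
"15jan2019"
"03feb2020"
"28dec2021"
end

* Convert the string to a Stata daily date (a count of days since 01jan1960)
gen mydate = date(text, "DMY")

* Show the numeric date in human-readable form
format mydate %td
list text mydate
```

If the mask does not match the strings (for example "MDY" for these values), date() returns missing values, so it is worth checking the result with list or count if missing(mydate).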


From daily to other frequencies

From daily to Code
Weekly
gen weekly_date = wofd(daily_date)
Monthly
gen monthly_date = mofd(daily_date)
Quarterly
gen quarterly_date = qofd(daily_date)
Yearly
gen year = year(daily_date)



Example using some data

* Enter example data
input str9 text

* Now convert the variable text to Stata date
gen daily_date=date(text, "DMY")
format daily_date %td

* Create a weekly date
gen weekly_date = wofd(daily_date)
format weekly_date %tw

* Create a monthly date
gen monthly_date = mofd(daily_date)
format monthly_date %tm

* Create a quarterly date
gen quarterly_date = qofd(daily_date)
format quarterly_date %tq

* Create a yearly date
gen year = year(daily_date)



From other frequencies to daily

If we already have dates in weekly, monthly, quarterly, or yearly frequencies, we can convert them back to daily dates. The second column in the following table provides an example of a given format in which the date is already recorded, and the third column presents the code that creates a daily date. To see the code in action, download this do file and execute it. After downloading, change the file extension from doc to do.

From  given_date Code
gen daily_date = dofw(given_date)
gen daily_date = dofm(given_date)
gen daily_date = dofq(given_date)
gen daily_date = dofy(given_date)
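
It may help to see what these functions return: converting from a lower frequency gives the first day of the period. The sketch below uses a single made-up quarterly value, built with the tq() pseudofunction, purely for illustration.

```stata
* One hypothetical quarterly date: the second quarter of 2021
clear
set obs 1
gen given_date = tq(2021q2)
format given_date %tq

* Convert to a daily date; the result is the first day of that quarter
gen daily_date = dofq(given_date)
format daily_date %td
list given_date daily_date
```

Because the result is always the start of the period, the day-of-month information of any original daily date cannot be recovered this way.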


Complex Conversions

If we already have dates in weekly, monthly, or quarterly frequencies, we can convert them to another frequency by first converting them back to daily dates. The second column in the following table provides an example of a given format in which the date is already recorded, and the third column presents the code that converts the date to the desired frequency.

From  given_date Code
Weekly to monthly
gen monthly_date = mofd(dofw(given_date))
Monthly to weekly
gen weekly_date = wofd(dofm(given_date))
Quarterly to monthly
gen monthly_date = mofd(dofq(given_date))
Monthly to quarterly
gen quarterly_date = qofd(dofm(given_date))
Weekly to quarterly
gen quarterly_date = qofd(dofw(given_date))
Quarterly to weekly
gen weekly_date = wofd(dofq(given_date))
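
The pattern is always the same: go down to a daily date with one of the dofw(), dofm(), or dofq() functions, then up to the target frequency with wofd(), mofd(), or qofd(). A small sketch with three made-up monthly dates (created with the tm() pseudofunction, not taken from the original post):

```stata
* Hypothetical monthly dates: 2020m1, 2020m2, 2020m3
clear
set obs 3
gen given_date = tm(2020m1) + _n - 1
format given_date %tm

* Monthly -> daily -> quarterly
gen quarterly_date = qofd(dofm(given_date))
format quarterly_date %tq

* Monthly -> daily -> weekly (the week containing the first day of the month)
gen weekly_date = wofd(dofm(given_date))
format weekly_date %tw

list
```

The inner function must match the frequency the date is stored in, and the outer function must match the frequency wanted; mixing them up produces numbers that display as plausible but wrong dates.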




asdoc: Exporting customized descriptive statistics from Stata to MS Word / RTF


Osama Mahmood has asked:

If I want to report 25th and 75th percentiles for variables through asdoc, then how would I do that? And what if I do not want to report the Min and Max?

Answer: In this YouTube video, I have shown various methods in which descriptive statistics can be reported using asdoc. What Osama has asked for is possible with customized descriptive statistics, using the stat() option of asdoc. Only the statistics listed in stat() are reported, so leaving min and max out of the list drops them from the table. With option stat(), we can choose from the following statistics. The first word on each line of the following list is the keyword that requests the corresponding statistic.

N Number of observations
mean Arithmetic mean
sd Standard deviation
semean Standard error of the mean
sum Sum / total
range Range
min The smallest value
max The largest value
count Counts the number of non-missing observations
var Variance
cv Coefficient of variation
skewness Skewness
kurtosis Kurtosis
iqr Interquartile range
p1 1st percentile
p5 5th percentile
p10 10th percentile
p25 25th percentile
p50 Median (50th percentile)
p75 75th percentile
p99 99th percentile
tstat t-statistic for the test that the mean of the given variable equals 0


Example 1: Mean, sd, 25th percentile, median, and 75th percentile

sysuse auto, clear
asdoc sum, stat(mean sd p25 p50 p75) replace


Example 2: Mean, sd, 25th percentile, median, 75th percentile, range, and t-statistic

asdoc sum, stat(mean sd p25 p50 p75 range tstat) replace
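
To address the second part of the question directly: stat() reports only what is listed, so min and max disappear when they are left out. The sketch below is one possible combination; the variable list and the use of dec(2) together with stat() are chosen here only for illustration.

```stata
sysuse auto, clear

* N, mean, sd, and the quartiles for two variables; min and max are not listed,
* so they are not reported. dec(2) is assumed to control the decimal places.
asdoc sum price mpg, stat(N mean sd p25 p50 p75) dec(2) replace
```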


asdoc: Export matrix to MS Word | the case of xttab command in Stata


asdoc provides a variety of ways in which results from various Stata commands can be exported to MS Word or an RTF file. In this blog post, I show how to export a Stata matrix to MS Word. Usually, Stata commands leave results in r() or e(), as scalars or macros and sometimes as a matrix. Consider the example of the xttab command. xttab is a generalization of tabulate oneway. It performs one-way tabulations and decomposes counts into between and within components in panel data. The command returns its results in the r(results) matrix, which we can then send to MS Word.


The syntax

asdoc uses the following syntax for exporting a matrix to a Word document.

asdoc wmat, matrix(matrix_name) [rnames(row names) cnames(column names) replace append other_options]



wmat is the command name, an abbreviation for write matrix. Option matrix() is required and takes the name of an existing matrix. Options rnames() and cnames() are optional and specify the row names and column names of the matrix. If these options are left out, the existing row and column names of the matrix are used. Other options of asdoc can also be used with wmat. For example, replace will replace an existing output file, while append will append to the existing file. fs() sets the font size, while option title() can be used to specify the title of the matrix in the output file.

An example: The case of xttab command

The dataset that we shall use is from the help file of xttab.

webuse nlswork
xtset id year
xttab race
mat T = r(results)
asdoc wmat, mat(T) replace



1. The first line downloads the example data

2. The second line declares the data as panel data

3. The third line tabulates the race variable

4. The fourth line copies the r(results) matrix left by the xttab command into a matrix named T

5. The fifth line writes the T matrix to a Word file. wmat is a sub-command in asdoc for writing matrix data to the output file. The two words after the comma are options of asdoc. The first option, mat(T), tells asdoc the name of the matrix to export. The second option, replace, tells asdoc to replace any existing output file.
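
As a rough sketch of the optional parts of the syntax, the following builds a small matrix by hand and exports it with custom names and a title. The matrix A, its values, and the names are hypothetical, and passing the names as space-separated words inside rnames() and cnames() is an assumption about how these options read their arguments.

```stata
* Hypothetical 2 x 2 matrix, made up for illustration
matrix A = (10, 20 \ 30, 40)

* Export it with custom row and column names and a title
* (space-separated names are an assumption, not confirmed by the post)
asdoc wmat, matrix(A) rnames(Low High) cnames(Before After) title(Example matrix) replace
```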

asdoc produces the following Table.

Results Table

Over a grouping variable?

If we wish to do the above for each category of a grouping variable, such as msp, which has two categories (0 and 1), we can use the if qualifier and append the results to the same file:

xttab race if msp == 1
mat T = r(results)
asdoc wmat, mat(T) replace title(When msp == 1)
xttab race if msp == 0
mat T = r(results)
asdoc wmat, mat(T) title(When msp == 0)
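
With more than two categories, repeating the three lines by hand gets tedious. The same steps can be sketched as a loop; the local macro opt is a hypothetical helper introduced here so that only the first pass uses replace, relying on asdoc appending by default afterwards.

```stata
webuse nlswork, clear
xtset idcode year

* Only the first table should replace the file; later tables are appended
local opt replace
foreach g in 1 0 {
    xttab race if msp == `g'
    matrix T = r(results)
    asdoc wmat, matrix(T) title(When msp == `g') `opt'
    * Empty the macro so that subsequent runs append to the same file
    local opt
}
```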


When msp == 1

When msp == 0


asdoc: using option row for creating customized tables row by row in Stata | MS Word



Option row is a new feature in version 2.0 of asdoc. This feature allows building tables in pieces. That is good news for those who want to make highly customized tables from Stata output.

This feature can be considered an advanced topic and might not suit Stata beginners. With many other Stata commands, using asdoc is exceptionally easy; you can read this concise blog post for some basic examples. However, if you are already familiar with Stata macros and the results returned in r() and e(), then you should continue reading this post.

How does option row work?

Option row allows building a table row by row from text and statistics. In each run of asdoc with option row, a row is added to the output table. The syntax for using this option is given below:

asdoc, row(data1, data2, data3, ...)

As shown above, we type nothing between the word asdoc and the comma, so all arguments of the command come after the comma. The first required option is row(data1, data2, …). Here data1, data2, … can each be a numeric value, a string, or both. Within the parentheses after option row, the pieces of data are separated by commas, and each piece is written to a separate cell in the output table. If a cell is to be left empty, a backslash code is entered in its place; the accum example later in this post uses \i for this.


We can use the following options when using option row.

dec(): Specifies the number of decimal places. If not used, the default is three decimal places. For example, dec(2) reports two decimal places.

title(): Adds a title to the table. This option works only when the row option is used for the first time in the creation of a table. For example, title(Descriptive Statistics).

replace: Replaces any existing file. Without option replace, the default is to append results.

save(): Saves the file with the specified name. For example, save(Table 1) will save the file with the name Table 1.

A simple example

To understand how option row works, let us first write the table's column titles and then some data. Let us create a table that has four columns, named KP, Sindh, Baluchistan, and Punjab. We shall write the table title as Provincial GDP of Pakistan. The first row is the header row:

asdoc, row(Years, KP, Sindh, Baluchistan, Punjab) title(Provincial GDP of Pakistan over years) replace

The above line of code generates the table title and the header row. Please note that we also included Years in the table columns because we shall report the provincial GDP over years, therefore we need one additional column for displaying the year labels in the first column. Now let us continue writing to this table. Make sure that you close the Word file before writing additional rows to it.

asdoc, row(1999, 2500.55, 4000.35, 1000.21,  5500.74) dec(2)

In the second line of code, we did not write replace as we wanted to append the results to the same file “MyFile.doc” and we also skipped the title option. We used option dec(2) to report two decimal points with numeric values. We can continue writing additional rows to this table.

asdoc, row(2000, 2600.25, 4500.35, 1100, 5700.87) dec(2)


Collecting stats with option accum

We can create a table from text and statistics that are collected from different Stata commands. There is one challenge in developing such a flexible table with option row: a given row has to be written in one go. Once a row is written, no further cells can be appended to it. This means that we first need to collect all the required bits of information before writing a row. Collecting and holding these bits of information can be tricky or time-consuming. To facilitate this process, asdoc offers option accum(data1 data2,…). The word accum is an abbreviation that I use for accumulate. The syntax of this option is given below:

asdoc, accum(data1 data2 data3 data4 data5 ...) [ dec(#) show ]

This command can be run repeatedly, as long as the limit on what the global macro can hold is not reached. Each run accumulates text and statistics and holds them in the global macro ${accum}. Once we have accumulated all the needed bits of information in the global macro, its contents can be written to the Word table with option row. Option show can be used to display the contents of the global macro ${accum}. Assume that we want to build an odd table that presents the number of observations, mean, and standard deviation for two variables in two different time periods. The following code builds such a table:

webuse grunfeld, clear

asdoc, row( \i, \i, invest, \i, \i, kstock,\i) replace

asdoc, row( Periods, N, Mean, SD, N, Mean, SD)

sum invest if inrange(year , 1935, 1945)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

sum kstock if inrange(year , 1935, 1945)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

asdoc, row( 1935-1945, $accum)



1. The second row of our required table needs a total of 7 cells, which is why we created 7 cells in the first line of code. The text \i is a way of entering an empty cell. We entered empty cells so that the variable names invest and kstock are written in the middle of the table.

2. The second line of code writes the table header row.

3. The third line finds summary statistics. We shall collect our required statistics from the results that the sum command leaves behind in r().

4. The fourth line accumulates the required statistics for our first variable, invest.

5. We are not yet writing the accumulated statistics to the Word file. So we find statistics for our second variable, kstock, in the fifth line.

6. We again accumulate the needed statistics for our second variable in the sixth line.

7. Since our row of required statistics is now complete, we write the accumulated statistics and the first-row label, i.e., 1935-1945, to our Word file.

Let us write one more row to the table. This time, the statistics are based on the years 1946-1954.

sum invest if inrange(year , 1946, 1954)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

sum kstock if inrange(year , 1946, 1954)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

asdoc, row( 1946-1954, $accum)
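
When there are many periods or variables, the same sequence can be sketched as a pair of loops. This is only an illustration of the pattern above: the local macros lo and hi are hypothetical names introduced here, and the sketch assumes, as the two hand-written blocks imply, that ${accum} starts empty again after each row is written.

```stata
webuse grunfeld, clear

* Header rows, as in the step-by-step example
asdoc, row(\i, \i, invest, \i, \i, kstock, \i) replace
asdoc, row(Periods, N, Mean, SD, N, Mean, SD)

* One table row per period
foreach p in "1935 1945" "1946 1954" {
    local lo : word 1 of `p'
    local hi : word 2 of `p'
    * Accumulate N, mean, and sd for each variable in turn
    foreach v in invest kstock {
        quietly summarize `v' if inrange(year, `lo', `hi')
        asdoc, accum(`r(N)', `r(mean)', `r(sd)')
    }
    * Write the period label followed by the accumulated statistics
    asdoc, row(`lo'-`hi', $accum)
}
```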


Need more examples?

There was a question on Statalist about a customized table for reporting ttest results. Liu Qiang made excellent use of the option row() of asdoc. See his solution here



Research Topics in Finance: Asset Pricing



1 Investor sentiment: Does it augment the performance of asset pricing models?

2 Mispricing and the five-factor model

3 Size, value, profitability, and investment: Evidence from emerging markets

4 Noisy prices and the Fama–French five-factor asset pricing model

5 Cross-sectional tests of the CAPM and Fama–French three-factor model

6 Decomposing the size, value and momentum premia of the Fama–French–Carhart four-factor model

7 Monday effect in Fama–French’s RMW factor

8 Digesting anomalies in emerging markets: A comparison of factor pricing models

9 Q-theory, mispricing, and profitability premium

10 Limits of arbitrage and idiosyncratic volatility

11 Is size dead? A review of the size effect in equity returns

12 Market states and the risk-based explanation of the size premium

13 Market volatility and momentum

14 A risk-return explanation of the momentum-reversal “anomaly”

15 Time-varying risk, mispricing attributes, and the accrual premium

16 Bayesian tests of global factor models

17 Model comparison tests of linear factor models in stock returns

18 Multi-factor asset pricing models: Factor construction choices and the revisit of pricing factors

19 Idiosyncratic volatility in the Asian equity market

20 What global economic factors drive emerging Asian stock market returns?


asdoc version 2 : Summary of New features | export Stata output to MS Word


Version 2.0 of asdoc is here. This version brings several improvements, adds new features, and fixes minor bugs in the earlier version. The following is a summary of the new features and updates.


Brief Introduction of asdoc

asdoc sends Stata output to Word / RTF format. asdoc creates high-quality, publication-ready tables from various Stata commands such as summarize, correlate, pwcorr, tab1, tab2, tabulate1, tabulate2, tabstat, ttest, regress, table, amean, proportions, means, and many more. Using asdoc is easy: we just need to add asdoc as a prefix to Stata commands. asdoc has several built-in routines for dedicated calculations and for making nicely formatted tables.


How to update

The program can be updated by using the following command from the Stata command window

ssc install asdoc, replace


New Features in Version 2.0

1.  Wide regression tables

This is a new format in which regression tables can be reported. In this format, the variables are shown in columns and one regression is reported per row. Therefore, this type of regression table is ideal for portfolios, industries, years, etc. Here is one example of a wide regression table. asdoc allows a significant amount of customization for wide tables, including asterisks for showing significance, reporting t-statistics and standard errors either below regression coefficients or sideways, controlling decimal points, reporting additional regression statistics such as adjusted R2, RMSE, RSS, etc., adding multiple tables in the same file, and several other features. Read this post to know more about the wide table format.


2. Allowing by-group regressions

Version 2.0 of asdoc provides the convenience of estimating regressions over groups and summarizing the regression estimates in nicely formatted tables. This feature follows Stata's standard bysort prefix. It works with all three types of regression tables of asdoc: detailed regression tables, nested tables, and wide tables. In this blog post, I show some examples of by-group regressions.


3. Allowing by-group descriptive statistics

Using the bysort prefix with asdoc, we can now find default, detailed, and customized summary statistics over groups. Details related to this feature will be added later on in a blog post.
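
A minimal sketch of what the bysort prefix might look like with asdoc, assuming the prefix is placed before asdoc in the same way as with ordinary Stata commands; the auto dataset and the variables are chosen only for illustration.

```stata
sysuse auto, clear

* Default summary statistics, separately for domestic and foreign cars
bysort foreign: asdoc sum price mpg weight, replace

* A by-group regression, appended to the same file
bysort foreign: asdoc reg price mpg weight
```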


4. Option label with tabulate and regress commands

Option label can now be used with regression and tabulation commands. Using this option, asdoc will report variable labels instead of variable names. In case variable labels are empty, then the variable names are reported.


5. Developing tables row by row using option row

Option row is a new feature in version 2. Option row allows building a table row by row from text and statistics. In each run of asdoc with option row, a row is added to the output table. This is a useful feature when statistics are collected from different Stata commands to build customized tables. To know more about this option, read this blog post.


6.  Accumulate text or numbers with option accum

Option accum allows accumulating text or numbers in a global macro. Once accumulated, the contents of the macro can then be written to an output file using option row.


7. Saving files in different folders

One additional feature of version 2.0 is the ability to write new files or append to existing files in different folders.



asdoc : Easily create Summary Stats in Stata and send it to MS Word: A video example


In this video post, I show the use of asdoc for different types of summary statistics in Stata and for sending them to MS Word. Examples given in this video include:

  1. Default summary statistics that include the number of observations, mean, standard deviation, minimum, and maximum
  2. Detailed summary statistics that include the number of observations, mean, standard deviation, 1st percentile, median, 99th percentile, skewness, and kurtosis
  3. Customized summary statistics
  4. Controlling the number of decimal points
  5. Creating new files
  6. Appending to existing files

I would really appreciate it if you commented on this video on YouTube and subscribed to my channel, if you have not already done so.


Dropping i.dummies from regression | asdoc | Word | Stata


Question: I have time and location dummies which I want to include in the regression, but I do not want to report them in the nested regression tables created with asdoc. How can I do that?

If you have not already installed asdoc, you can install it from SSC by typing the following in the Stata command window:

ssc install asdoc

Let’s use an example data set.

use, clear

This dataset has four main independent variables, named x1, x2, x3, and x4, and a set of dummy variables that will be constructed from the variables year (2001-2005) and location (1-3). Let us estimate the following regressions:

asdoc reg y x1 x2 x3 i.year i.location, nest drop(i.year i.location) replace

asdoc reg y x1 x2 x4 i.year i.location, nest drop(i.year i.location)


In the above two lines, we have estimated two regressions and sent their output to a Word file. In the first line, we estimated a regression with the three main independent variables x1, x2, and x3 and included the year and location dummies on the fly. The second line replaces x3 with x4. The option nest creates a nested regression table. The option drop(i.year i.location) drops these dummy variables from the regression table; however, they remain in the estimated regression. The two lines produce the following regression table in MS Word.
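
For readers who want to try the drop() option without the example dataset, here is a comparable sketch on Stata's built-in auto data. The choice of variables, with rep78 standing in for the year and location dummies, is only for illustration.

```stata
sysuse auto, clear

* First model: the rep78 dummies are estimated but hidden from the table
asdoc reg price mpg weight i.rep78, nest drop(i.rep78) replace

* Second model: added as a new column of the same nested table
asdoc reg price mpg length i.rep78, nest drop(i.rep78)
```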



Exporting tabs and cross-tabs to MS Word from Stata with asdoc


For installation and other uses of asdoc, please see this short blog post.

Tabulation and Cross-tabs with asdoc

Exporting tables created by Stata commands such as tab, tabulate1, tabulate2, table, tabsum, tab1, tab2, and others to MS Word is super easy with asdoc. As with other commands, we just need to add asdoc as a prefix to the tabulation commands, which include tabulate, tabulate1, tabulate2, tab1, tab2, etc. Since frequency tables in Stata can assume different structures, asdoc writes these tables from log files.


One-way table

Example: One-way table

sysuse auto, clear 
asdoc tabulate rep78, replace


Please note that replace is an asdoc option that replaces the existing file. If we wanted to add to the existing file instead, we would use option append in place of replace.


Two-way table of frequencies


webuse citytemp2, clear

asdoc tabulate region agecat, replace


Example: Include row percentages


asdoc tabulate region agecat , nokey row replace

Note: nokey suppresses the display of the key above two-way tables.


Example: Include column percentages

asdoc tabulate region agecat , nokey column replace

Example: Include row percentages, suppress frequency counts

asdoc tabulate region agecat, nokey row nofreq replace


One- and two-way tables of summary statistics

Example: One-way tabulation with summary statistics


sysuse auto, clear
asdoc tabulate rep78, summarize(mpg) replace


Example: Two-way tabulation with summary statistics

generate wgtcat = autocode(weight, 4, 1760, 4840)

asdoc tabulate wgtcat foreign, summarize(mpg) replace


Example: Suppress frequencies

asdoc tabulate wgtcat foreign, summarize(mpg) nofreq replace


Multiple-way tabulation (tab1)

tab1 produces a one-way tabulation for each variable specified in varlist.

Example: Multiple-way tabulation

sysuse nlsw88, clear
asdoc tab1 race married grade, replace


Two-way for all possible combinations (tab2)

Example: Two-way tabulations for all pairs of the listed variables

asdoc tab2 race south, replace


Stata Rolling command vs asreg for rolling regressions: Similarities and differences


Karina van Kuijk asked the following question:


I need to calculate the factor sensitivity of firms to ultimately sort portfolio’s based on this factor. I have found the asreg Stata code on your website and I was wondering if this code would be useful for my purpose. However, if I compare the rolling Stata code with your asreg program on a small dataset, I won’t get the same results.


The key difference between Stata’s official rolling command and asreg [see this blog entry for installation] is speed: asreg is an order of magnitude faster than rolling. There are other differences in how the two calculate the regression components in a rolling window. For example, rolling reports statistics only once the rolling window reaches the required length, while asreg reports statistics as soon as the number of observations exceeds the number of parameters being estimated. Therefore, if we have one independent variable and use a rolling window of 10 periods, rolling will report statistics from the 10th period in the dataset. However, asreg will report statistics from the 3rd observation (there are two parameters here, the coefficient of the independent variable and the intercept). To bring the results of asreg in line with those of the rolling command, let us use an example:



Let us use the grunfeld data, which has 10 companies and 20 years of time series for each company. We shall use the variable invest as the dependent variable and mvalue as the independent variable. Therefore, the rolling command will look like:


webuse grunfeld

rolling _b, window(10) saving(beta, replace): reg invest mvalue

The results from the rolling command are reported below only for the first company


company start end _b_cons _b_mvalue
1 1935 1944 186.5406 .0562316
1 1936 1945 196.1084 .0573704
1 1937 1946 106.4769 .0847188
1 1938 1947 53.12083 .1053145
1 1939 1948 364.5426 .0359897
1 1940 1949 372.5457 .0400371
1 1941 1950 360.8489 .04835
1 1942 1951 213.7943 .090357
1 1943 1952 119.8572 .1195415
1 1944 1953 -284.6031 .2229699
1 1945 1954 -496.6066 .2841584


To find similar results with asreg, we shall type:

bysort company: asreg invest mvalue, wind(year 10)


asreg generated the following results for the first company:


company year _Nobs _R2 _adjR2 _b_cons _b_mvalue
1 1935 . . . . .
1 1936 . . . . .
1 1937 3 .98568503 .97137006 192.3812 .04135324
1 1938 4 .91957661 .87936492 129.06727 .05411168
1 1939 5 .86795099 .82393465 129.91674 .05233687
1 1940 6 .69944952 .6243119 108.59266 .06102699
1 1941 7 .54085608 .4490273 91.235677 .06942586
1 1942 8 .31250011 .19791679 182.86065 .05101677
1 1943 9 .25355654 .14692176 197.08754 .05052367
1 1944 10 .24298452 .14835759 186.54064 .05623158
1 1945 10 .20582267 .10655051 196.10839 .05737045
1 1946 10 .29515806 .20705282 106.47691 .0847188
1 1947 10 .3728928 .2945044 53.120829 .10531451
1 1948 10 .05894158 -.05869073 364.54258 .03598974
1 1949 10 .1461912 .0394651 372.54574 .04003715
1 1950 10 .18946219 .08814496 360.84887 .04834995
1 1951 10 .41646846 .34352702 213.79429 .09035704
1 1952 10 .38796888 .31146499 119.85717 .11954148
1 1953 10 .69741758 .65959478 -284.60313 .22296989
1 1954 10 .67138447 .63030752 -496.6066 .28415839


As mentioned above, asreg does not wait for the window to contain the full number of periods. Therefore, results from the rolling command and asreg start to match only from the 10th observation, i.e., the year 1944. If you would like asreg to ignore observations until a minimum number of periods is available, you can use the option min. So to match the results with the rolling command, we can type:

bysort company: asreg invest mvalue, wind(year 10) min(9)


and there you go, asreg produces the same coefficients as the rolling command, with blistering speed.
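
As a cross-check that needs neither rolling nor asreg, the first full window for company 1 can be estimated directly with regress. This is only a sanity check on a single window, not a replacement for either command; the coefficients it displays should agree with the 1935-1944 row of the rolling output and the 1944 row of the asreg output shown above.

```stata
webuse grunfeld, clear

* One plain regression over the first 10-year window of company 1
regress invest mvalue if company == 1 & inrange(year, 1935, 1944)

* Coefficients to compare with _b_mvalue and _b_cons in the tables above
display "slope     = " _b[mvalue]
display "intercept = " _b[_cons]
```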


Please do cite asreg in your research


In-text citation

Rolling regressions were estimated using asreg, a Stata program written by Shah (2017).



Shah, Attaullah, (2017), ASREG: Stata module to estimate rolling window regressions. Fama-MacBeth and by(group) regressions,