Category Archives: Stata Programs


asdoc: Exporting customized descriptive statistics from Stata to MS Word / RTF

Category:asdoc,Blog

Osama Mahmood asked:

If I want to report 25th and 75th percentiles for variables through asdoc, then how would I do that? And what if I do not want to report the Min and Max?

Answer: In this YouTube video, I show various ways of reporting descriptive statistics with asdoc. What Osama has asked for is possible with customized descriptive statistics, using the stat() option of asdoc. With stat(), we can choose from the following statistics. The first word on each line of the list below is the control word that requests the corresponding statistic; to leave out a statistic such as Min or Max, simply do not list its control word.

N Number of observations
mean Arithmetic mean
sd Standard deviation
semean Standard error of the mean
sum Sum / total
range Range
min The smallest value
max The largest value
count Counts the number of non-missing observations
var Variance
cv Coefficient of variation
skewness Skewness
kurtosis Kurtosis
iqr Interquartile range
p1 1st percentile
p5 5th percentile
p10 10th percentile
p25 25th percentile
p50 Median or the 50th percentile
p75 75th percentile
p99 99th percentile
tstat t-statistic for the test that the mean of the given variable equals 0

   

Example 1: Mean, sd, 25th percentile, median, and 75th percentile

 sysuse auto
asdoc sum, stat(mean sd p25 p50 p75) replace

   

Example 2: Mean, sd, 25th percentile, median, 75th percentile, range, and t-statistic

 asdoc sum, stat(mean sd p25 p50 p75 range tstat) replace
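Before exporting, it can help to preview the same statistics in the Results window. The following is a minimal sketch, assuming that official Stata's tabstat is an acceptable cross-check (it is not part of asdoc) and that asdoc sum accepts a variable list in the same way summarize does. The variables price and mpg are chosen only for illustration.

```stata
sysuse auto, clear

* Preview with official Stata: same statistics, one row per variable
tabstat price mpg, statistics(mean sd p25 p50 p75) columns(statistics)

* Export the same statistics with asdoc; min and max are left out
* simply because they are not listed in stat()
* (assumption: asdoc sum takes a varlist like summarize does)
asdoc sum price mpg, stat(mean sd p25 p50 p75) replace
```

If the numbers in the Word table differ from the tabstat preview, the first thing to check is whether both commands used the same sample.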



asdoc: Export matrix to MS Word | the case of xttab command in Stata

Category:asdoc,Blog

asdoc provides a variety of ways in which results from various Stata commands can be exported to MS Word or an RTF file. In this blog post, I show how to export a Stata matrix to MS Word. Usually, Stata commands leave results in r() or e() macros and sometimes in a Stata matrix. Consider the example of the xttab command. xttab is a generalization of tabulate oneway. It performs one-way tabulations and decomposes counts into between and within components in panel data. The command returns its results in the r(results) matrix, which we can then send to MS Word.

 

The syntax

asdoc uses the following syntax for exporting a matrix to a Word document.

asdoc wmat, matrix(matrix_name) [rnames(row names) cnames(column names) replace append other_options]

 

Description

wmat is the command name, an abbreviation for writing matrix. Option matrix() is required; it takes the name of an existing matrix. Options rnames() and cnames() are optional and specify the row names and column names of the matrix. If they are left out, the existing row and column names of the matrix are used. Other options of asdoc can also be used with wmat. For example, replace will replace an existing output file, while append will append to the existing file. fs() sets the font size, while option title() can be used to specify the title of the matrix in the output file.

An example: The case of xttab command

The dataset that we shall use is from the help file of xttab.

webuse nlswork
xtset idcode year
xttab race
mat T = r(results)
asdoc wmat, mat(T) replace

 

Explanation

1. The first line downloads the example data

2. The second line declares the data as panel data

3. The third line tabulates the race variable

4. The fourth line creates a matrix with the name T from the r(results) matrix returned by the xttab command

5. The fifth line writes the T matrix to a Word file. wmat is a sub-command in asdoc for writing matrix data to the output file. The two words after the comma are options of asdoc. The first option tells asdoc the name of the matrix that has to be exported. The second option tells asdoc to replace any existing output file.
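The same steps apply to any matrix, not only to one returned by xttab. The following is a minimal sketch with a small hand-made matrix; the matrix name A, its values, and its row and column names are hypothetical and chosen only for illustration.

```stata
* Build a small 2 x 2 matrix by hand (values are arbitrary)
matrix A = (1, 2 \ 3, 4)

* Give the rows and columns readable names; asdoc wmat uses the
* existing names when rnames() and cnames() are not specified
matrix rownames A = first second
matrix colnames A = low high

* Inspect the matrix in Stata before exporting it
matrix list A
display "rows: " rowsof(A) "  columns: " colsof(A)

* Export to Word with the syntax described above
asdoc wmat, matrix(A) title(A hand-made matrix) replace
```

Listing the matrix first is a cheap way to confirm that the names and dimensions are what we expect before they end up in the Word file.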

asdoc produces the following Table.

Results Table

      value  Overall:Freq  Overall:Pe~t  Between:Freq  Between:Pe~t  Within:Per~t
r1    1      20180         70.723        3329          70.664
r2    2      8051          28.215        1325          28.126
r3    3      303           1.062         57            1.21
r4    3      28534         100           4711          100

 

Over a grouping variable?

If we wish to do the above for each category of the grouping variable msp, which has two categories, i.e., 0 and 1, we can use the if qualifier and append the results to the same file:

xttab race if msp == 1
mat T = r(results)
asdoc wmat, mat(T) replace title(When msp == 1)
xttab race if msp == 0
mat T = r(results)
asdoc wmat, mat(T) title(When msp == 0)

 

When msp == 1

 

      value  Overall:Freq  Overall:Pe~t  Between:Freq  Between:Pe~t  Within:Per~t
r1    1      13321         77.475        2747          75.405        100
r2    2      3682          21.414        853           23.415        100
r3    3      191           1.111         43            1.18          100
r4    3      17194         100           3643          100           100

 


When msp == 0

 

      value  Overall:Freq  Overall:Pe~t  Between:Freq  Between:Pe~t  Within:Per~t
r1    1      6853          60.517        2046          65.724        100
r2    2      4359          38.493        1036          33.28         100
r3    3      112           .989          31            .996          100
r4    3      11324         100           3113          100           100
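With more than two categories, repeating the three lines by hand becomes tedious. The following is a minimal sketch of the same export written as a loop; the local macro names g and mode are hypothetical, and the loop runs from 1 down to 0 only to keep the order of the tables shown above.

```stata
webuse nlswork, clear
xtset idcode year

* First pass replaces the output file, later passes append to it
local mode replace
forvalues g = 1(-1)0 {
    xttab race if msp == `g'
    matrix T = r(results)
    asdoc wmat, mat(T) title(When msp == `g') `mode'
    local mode append
}
```

Holding replace or append in a local macro keeps the asdoc line identical on every pass, so only the first table overwrites an older output file.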

 



asdoc: using option row for creating customized tables row by row in Stata | MS Word

Category:asdoc,Blog Tags : 

Introduction

Option row is a new feature in version 2.0 of asdoc. This feature allows building tables in pieces. That is good news for those who want to make highly customized tables from Stata output.

This feature can be considered an advanced topic and might not be good for Stata beginners. With many other Stata commands, using asdoc is exceptionally easy. You can read this concise blog post for some basic examples of using asdoc. However, if you are already familiar with Stata macros and results returned in r() and e() macros, then you should continue reading this post.

 

How does option row work?


Option row allows building a table row by row from text and statistics. In each run of asdoc with option row, a row is added to the output table. The syntax for using this option is given below:

asdoc, row(data1, data2, data3, ...)

As shown above, we type nothing between the word asdoc and the comma; all other arguments of the command come after the comma. The first required option is row(data1, data2, …). Here data1, data2, … can be numeric values, strings, or both. Inside the parentheses of option row, the pieces of data are separated by commas, and each piece is written to a separate cell in the output table. To leave a cell empty, type \i in its place, that is “\i,”.

Options


We can use the following options when using option row.

dec(): for specifying the number of decimal points. If not used, the default is to use three decimal points. An example of using this option could be dec(2) for using two decimal points.

title(): This will add a title to the table. This option works only when the row option is used for the first time in the creation of a table. For example, title(Descriptive Statistics).

replace: This option will replace any existing file. Without option replace, the default is to append results.

save(): This will save file with the specified name. For example, save(Table 1) will save the file with the name Table 1. 

A simple example


To understand how option row works, let us first write the table column titles and then some data. Let us create a table that has four data columns, named KP, Sindh, Baluchistan, and Punjab, plus a first column for the years. We shall write the table title as Provincial GDP of Pakistan over years. So the first row is the header row

 asdoc, row(Years, KP, Sindh, Baluchistan, Punjab) title(Provincial GDP of Pakistan over years) replace

The above line of code generates the table title and the header row. Please note that we also included Years in the table columns because we shall report the provincial GDP over years, therefore we need one additional column for displaying the year labels in the first column. Now let us continue writing to this table. Make sure that you close the Word file before writing additional rows to it.

asdoc, row(1999, 2500.55, 4000.35, 1000.21,  5500.74) dec(2)

In the second line of code, we did not write replace as we wanted to append the results to the same file “MyFile.doc” and we also skipped the title option. We used option dec(2) to report two decimal points with numeric values. We can continue writing additional rows to this table.

asdoc, row(2000, 2600.25, 4500.35, 1100, 5700.87) dec(2)

Collecting stats with option accum

We can create a table from text and statistics that are collected from different Stata commands. There is one challenge to developing such a flexible table with option row – that a given row has to be written in one go. So once a row is written, no further cells can be appended to the same row. This means that we need to first collect all the required bits of information before writing a row. Collecting and holding these bits of information can be tricky or too time-consuming. To facilitate this process, asdoc offers option accum(data1 data2,…). The word accum is an abbreviation that I use for accumulate. The syntax of this option is given below:

asdoc, accum(data1 data2 data3 data4 data5 ...) [ dec(#) show ]

The above command can be run repeatedly, as long as the limit on what a global macro can hold is not reached. It accumulates text and statistics from different runs of asdoc and holds them in the global macro ${accum}. Once we have accumulated all the needed bits of information in the global macro, its contents can be written to the Word table with option row. Option show can be used to show the contents of the global macro ${accum}. Assume that we want to build an odd table that presents the number of observations, mean, and standard deviation for two variables in two different time periods. The table has one row per period, with N, Mean, and SD reported under each variable name, and can be built as follows:

webuse grunfeld, clear

asdoc, row( \i, \i, invest, \i, \i, kstock,\i) replace

asdoc, row( Periods, N, Mean, SD, N, Mean, SD)

sum invest if inrange(year , 1935, 1945)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

sum kstock if inrange(year , 1935, 1945)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

asdoc, row( 1935-1945, $accum)

Explanation:

1. The second row of our required table reveals that a total of 7 cells are needed; this is why we created 7 cells in the first line of code. The text “\i,” is a way of entering an empty cell. We entered empty cells so that the variable names invest and kstock are written in the middle of the table.

2. The second line of code writes the table header row.

3. The third line finds summary statistics. We shall collect our required statistics from the macros that are left behind in r() by the sum command.

4. The fourth line accumulates the required statistics for our first variable invest

5. We are not yet writing the accumulated statistics to the Word file. So we find statistics for our second variable kstock in the fifth line.

6. We again accumulate the needed statistics for our second variable in the sixth line.

7. Since our row of required statistics is now complete, we write the accumulated statistics and the first-row label, i.e., 1935-1945, to our Word file.

Let us write one more row to the table. This time, the statistics are based on the years 1946-1954.

sum invest if inrange(year , 1946, 1954)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

sum kstock if inrange(year , 1946, 1954)

asdoc, accum(`r(N)', `r(mean)', `r(sd)')

asdoc, row( 1946-1954, $accum)
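The two blocks above repeat the same steps for each period, so they can be folded into a loop. The following is a minimal sketch that reuses only the asdoc calls shown above. The local macro names (period, first, last, v) are hypothetical, and the sketch assumes, as the second block above implies, that asdoc empties the accumulated global after each row is written.

```stata
webuse grunfeld, clear

* Two header rows, exactly as in the step-by-step version
asdoc, row( \i, \i, invest, \i, \i, kstock, \i) replace
asdoc, row(Periods, N, Mean, SD, N, Mean, SD)

* One table row per period
foreach period in "1935 1945" "1946 1954" {
    local first : word 1 of `period'
    local last  : word 2 of `period'

    * Accumulate N, mean and sd for each variable in turn
    foreach v in invest kstock {
        quietly summarize `v' if inrange(year, `first', `last')
        asdoc, accum(`r(N)', `r(mean)', `r(sd)')
    }

    * Write the row label followed by the six accumulated statistics
    asdoc, row(`first'-`last', $accum)
}
```

Adding a third period then only requires one more quoted pair in the outer foreach list.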


Dropping i.dummies from regression | asdoc | Word | Stata

Category:asdoc,Blog,Stata Programs Tags : 

Question: I have time and location dummies that I want to include in the regression, but I do not want to report them in the nested regression tables created with asdoc. How can I do that?

If you have not already installed asdoc, you can install it from SSC by typing the following in the Stata command window:

ssc install asdoc

Let’s use an example data set.

use http://fintechprofessor.com/regdata.dta, clear

This dataset has four main independent variables, named x1, x2, x3, and x4, and a set of possible dummy variables that will be constructed from the variables year (from 2001 to 2005) and location (from 1 to 3). Let us estimate the following regressions:

asdoc reg y x1 x2 x3 i.year i.location, nest drop(i.year i.location) replace

asdoc reg y x1 x2 x4 i.year i.location, nest drop(i.year i.location)

Explanations

In the above two lines, we have estimated two regressions and sent their output to a Word file. In the first line, we estimated a regression with the three main independent variables x1, x2, and x3 and included the year and location dummies on the fly. The second line does the same with x4 in place of x3. The option nest creates a nested regression table. The option drop(i.year i.location) drops these dummy variables from the regression table; they are still included in the regression itself. The two lines produce the following regression table in MS Word.





Stata Rolling command vs asreg for rolling regressions: Similarities and differences

Category:Stata Programs Tags : 

Karina van Kuijk asked the following question:

Question: 

I need to calculate the factor sensitivity of firms to ultimately sort portfolios based on this factor. I have found the asreg Stata code on your website and I was wondering if this code would be useful for my purpose. However, if I compare the rolling Stata code with your asreg program on a small dataset, I don’t get the same results.

Answer 

The key difference between Stata’s official rolling command and asreg [see this blog entry for installation] is speed: asreg is an order of magnitude faster than rolling. The two also differ in how they calculate the regression components in a rolling window. The rolling command reports statistics only once the rolling window reaches the required length, whereas asreg reports statistics as soon as the number of observations exceeds the number of parameters being estimated. Therefore, if we have one independent variable and use a rolling window of 10 periods, rolling will report statistics from the 10th period in the dataset, while asreg will report statistics from the 3rd observation (there are two parameters here, the coefficient of the independent variable and the intercept). The following example shows how to bring the results of asreg in line with those of the rolling command.

 

Example

Let us use the grunfeld data that has 10 companies and 20 years of time series for each company. We shall use the variables invest as dependent variable and mvalue as the independent variable.  Therefore, the rolling command will look like:

 

webuse grunfeld

rolling _b, window(10) saving(beta, replace): reg invest mvalue

The results from the rolling command are reported below only for the first company

 

company start end _b_cons _b_mvalue
1 1935 1944 186.5406 .0562316
1 1936 1945 196.1084 .0573704
1 1937 1946 106.4769 .0847188
1 1938 1947 53.12083 .1053145
1 1939 1948 364.5426 .0359897
1 1940 1949 372.5457 .0400371
1 1941 1950 360.8489 .04835
1 1942 1951 213.7943 .090357
1 1943 1952 119.8572 .1195415
1 1944 1953 -284.6031 .2229699
1 1945 1954 -496.6066 .2841584

 

To find similar results with asreg, we shall type:

bysort company: asreg invest mvalue, wind(year 10)

 

asreg generated the following results for the first company:

 

company year _Nobs _R2 _adjR2 _b_cons _b_mvalue
1 1935 . . . . .
1 1936 . . . . .
1 1937 3 .98568503 .97137006 192.3812 .04135324
1 1938 4 .91957661 .87936492 129.06727 .05411168
1 1939 5 .86795099 .82393465 129.91674 .05233687
1 1940 6 .69944952 .6243119 108.59266 .06102699
1 1941 7 .54085608 .4490273 91.235677 .06942586
1 1942 8 .31250011 .19791679 182.86065 .05101677
1 1943 9 .25355654 .14692176 197.08754 .05052367
1 1944 10 .24298452 .14835759 186.54064 .05623158
1 1945 10 .20582267 .10655051 196.10839 .05737045
1 1946 10 .29515806 .20705282 106.47691 .0847188
1 1947 10 .3728928 .2945044 53.120829 .10531451
1 1948 10 .05894158 -.05869073 364.54258 .03598974
1 1949 10 .1461912 .0394651 372.54574 .04003715
1 1950 10 .18946219 .08814496 360.84887 .04834995
1 1951 10 .41646846 .34352702 213.79429 .09035704
1 1952 10 .38796888 .31146499 119.85717 .11954148
1 1953 10 .69741758 .65959478 -284.60313 .22296989
1 1954 10 .67138447 .63030752 -496.6066 .28415839

 

As mentioned above, asreg does not wait for the window to fill with the required number of periods. Therefore, results from the rolling command and asreg start to match only from the 10th observation, i.e., the year 1944. If you would like asreg to skip windows in which the minimum number of periods is not available, you can use the option min. So to match the results with the rolling command, we can type:

bysort company: asreg invest mvalue, wind(year 10) min(9)

 

and there you go, asreg produces the same coefficients as the rolling command, with blistering speed.
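As a cross-check on the window logic, the estimates from incomplete windows can also be blanked out by hand after a plain asreg run. This is a minimal sketch that relies only on the variable names shown in the asreg output above (_Nobs, _R2, _adjR2, _b_cons, _b_mvalue). It is an alternative illustration, not a replacement for option min.

```stata
webuse grunfeld, clear
bysort company: asreg invest mvalue, wind(year 10)

* Keep estimates only where the window holds all 10 observations
foreach v of varlist _R2 _adjR2 _b_cons _b_mvalue {
    replace `v' = . if _Nobs < 10
}

* For company 1, the first non-missing row should now be 1944,
* the end year of the first window reported by rolling
list company year _Nobs _b_cons _b_mvalue if company == 1, noobs
```

With a 10-period window starting in 1935, the first complete window ends in 1935 + 10 - 1 = 1944, which is why the two sets of results line up from that year onward.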

 

Please do cite asreg in your research

 

In-text citation

Rolling regressions were estimated using asreg, a Stata program written by Shah (2017).

 

Bibliography

Shah, Attaullah (2017). ASREG: Stata module to estimate rolling window regressions, Fama-MacBeth and by(group) regressions. https://EconPapers.repec.org/RePEc:boc:bocode:s458339.



asdoc : options and examples

Category:asdoc,Stata Programs Tags : 

Introduction

asdoc sends Stata output to Word / RTF format. asdoc creates high-quality, publication-ready tables from various Stata commands such as
summarize, correlate, pwcorr, tab1, tab2, tabulate1, tabulate2, tabstat, ttest, regress, table, amean, proportions, means, and many more.
Using asdoc is pretty easy. We need to just add asdoc as a prefix to Stata commands. asdoc has several built-in routines for dedicated
calculations and making nicely formatted tables.

 

asdoc Options

How to enter asdoc options and Stata_command options?
Both the asdoc options and the Stata_command-specific options should be entered after the comma. asdoc parses both types of options itself. For example, the following command has both types of options.

asdoc sum, detail replace dec(3)

Option detail belongs to the sum command of Stata, whereas options replace and dec(3) are asdoc options.

Following options are used for controlling the behavior of asdoc:

1.1 Replace / append:

We shall use option replace when an existing output file needs to be replaced. On the other hand, we shall use option append if we want to
append results to the existing file. Both the options are optional. Therefore, if none of these options are used, asdoc will first determine
whether a file with a similar name exists in the current directory. If it exists, asdoc will assume an append option. If the file does not
exist, it will create a new file with the default name “Myfile.doc”

Example 1 : running asdoc without replace or append (first time)

sysuse auto
asdoc sum

The above lines of code will generate a new file with the name Myfile.doc. Next, if we estimate a table of correlation, we can replace the
existing file Myfile.doc or append to it. Again, if we do not use any of these options, option append will be assumed. So;

Example 2 : running asdoc without replace or append (second time)

asdoc cor
OR
asdoc cor, append

Both of the above commands serve the same purpose. The file Myfile.doc will now contain a table of summary statistics, followed by a table of correlations. However, had we typed the following, the file would contain only the table of correlations.

asdoc cor, replace

1.2 rowappend:

To develop a table row by row from different runs of the asdoc, we need to use option rowappend. This option can be used with ttest, customized summary statistics, or in other instances where the table headers and structure do not change
and appendable statistics have a similar structure as those already in the table.
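A minimal sketch of how rowappend might be used with ttest, assuming the auto dataset and that each one-sample test contributes one row to a shared table. The test value (0), the choice of variables, and the title text are only for illustration.

```stata
sysuse auto, clear

* First test creates the file and the table header
asdoc ttest price == 0, replace title(One-sample t-tests: mean == 0)

* Later tests add one row each to the same table
asdoc ttest mpg == 0, rowappend
asdoc ttest weight == 0, rowappend
```

This works because all three tests produce the same set of columns, which is the condition described above for using rowappend.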

 

1.3 save (file_name):

Option save(file_name) is optional. It is used to specify the name of the output file. If it is left out, the default name, Myfile.doc, will be used. If the .doc extension is not preferred, option save has to be used with the desired extension, such as .rtf

Example 3 : Naming the output file

asdoc sum, save(summary.doc)
OR
asdoc sum, save(summary.rtf)

 

1.4 title(table_title)

Option title(table_title) is optional. It is used to specify the table title. If it is left out, a default table title will be used.

asdoc sum, save(summary.doc) title(Descriptive statistics)

 

1.5 Font size i.e. fs(#)

The default font size of asdoc is 10 pt. Option fs(#) can be used to change it. For example, fs(12) or fs(8), etc.

 

1.6 Decimal points i.e. dec(#)

The default decimal points in many commands are 3. In some commands, the decimal points are borrowed from the Stata output and hence they cannot be changed. In several commands, it is possible to change decimal points with option dec(#). For example, dec(2) or dec(4), etc.

 

1.7 Adding text lines to the output file i.e. text(text lines)

We can write text to our output file with option text(text lines). This is useful when we want to add details or comments with the Stata
output. In fact, this option makes asdoc really flexible in terms of adding tables and paragraph at the same time. We never have to leave the
Stata interface to add comments or interpretation with the results. One trick that we can play is to use option fs() to change font size and
mark headings and sub-headings in the document. Consider the following examples [I have copied some text from www.wikipedia.org for this example]

1. Write a heading "Details on Cars" in our document
asdoc, text(Details on Cars) fs(16) replace

 

2. Now add some text

asdoc, text(A car is a wheeled motor vehicle used for transportation) append fs(10) 
asdoc, text(Most definitions of car say they run primarily on roads, seat one ) append fs(10)
asdoc, text(to eight people, have four tires, and mainly transport people.) append fs(10)

 

3. Now add some statistics

sysuse auto, clear
asdoc sum, append fs(10)

 

1.8 Hide Stata output with option hide

We can suppress Stata output with option hide. It is important to mention that option hide might not work with some of the Stata commands (asdoc creates output from log files in some cases).

 

1.9 Getting Stata commands in output files (cmd)

If we need to report the Stata command in the output file, we can use the option cmd.

 

1.10 Abbreviate variable names with option (abb(#))

In case variable names are lengthy, they can be abbreviated in the output file with option abb(#). For example, abb(8). In many cases, the default value is 10. However, when option label is used, this value is set to abb + 22

 

1.11 Report variable labels with option (label)

Several commands allow reporting variable labels instead of variable names. For example, the most commonly used commands for reporting statistics are correlate and summarize. Both of these commands allow option label. For example :

asdoc cor, label
asdoc sum, label

 

1.12 Always report equal decimal points (tzok)

By default, asdoc drops trailing zeros and reports only the valid decimal points. However, we can use the option tzok, i.e., trailing zeros OK, to report an equal number of decimal points for all values even if the trailing digits are zero. Therefore, using option dec(4) for reporting 4 decimal points, the value 2.1 will be reported as follows with and without option tzok.

Default style 2.1
with tzok option 2.1000
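The difference is the same as the one between a general and a fixed display format in Stata itself. The sketch below first shows that analogy with official display formats and then, assuming that dec() and tzok can be combined with asdoc sum as described above, requests both styles in the Word output. The variables price and mpg are illustrative choices.

```stata
* Analogy in official Stata: the general format drops trailing
* zeros, the fixed format keeps them
display %9.0g 2.1
display %9.4f 2.1

sysuse auto, clear

* Default style: trailing zeros are dropped
asdoc sum price mpg, dec(4) replace

* With tzok: every value is reported with four decimal places
asdoc sum price mpg, dec(4) tzok append
```

Fixed decimals are usually the better choice for journal tables, where columns of numbers are expected to line up.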

 




Publication quality regression tables with asdoc in Stata – video example

Category:asdoc Tags : 

Creating publication-quality tables in Stata with asdoc is as simple as adding asdoc to Stata commands as a prefix. asdoc can create two types of regression tables. The first type (call it detailed) is the detailed table that combines key statistics from Stata’s regression output with some additional statistics such as the mean and standard deviation of the dependent variable. This table is the default option in asdoc. The second type (call it nested) is a compact table that nests more than one regression in one table.

In this video post, I show how to use asdoc to produce the following nested table. 

 

 



How to export high-quality table of correlations from Stata to MS Word

Category:asdoc Tags : 

 

For creating a high-quality publication-ready table of correlations from Stata output, we need to install asdoc program from SSC first.

ssc install asdoc, update

Once the installation is complete, we shall add the word asdoc to the cor or correlate command of Stata. Since we estimate correlations among all numeric variables of a dataset with just cor, we shall add asdoc as a prefix to the cor command. For our example, let us load the auto.dta data from the Stata example files.

 

Example 1: Make a table of correlation for all variables.

sysuse auto, clear
asdoc cor

 

 

Example 2: We can report variable labels instead of variable names

asdoc cor, label replace

 


 

Further, it is possible to write the names of the variables in the column headings instead of sequential numbers. For this, we invoke the option nonum, as in Example 3.

 

Example 3: Write variable names in column headers

sysuse auto, clear

asdoc pwcorr, nonum replace

 

Read also : 

Table of contents of asdoc

Generate correlation table with significance/stars

Generate a table of descriptive statistics

 

 

 

 

 



How to use asdoc : a basic example

Category:asdoc Tags : 

How to use asdoc

Using asdoc is pretty easy. You just need to add asdoc as a prefix to Stata commands. For example, we use the sum command to find summary statistics of all numeric variables in the dataset, so we shall add asdoc as a prefix to sum. Let us load the auto.dta dataset for practice, find summary stats of all numeric variables, and send the output to MS Word with asdoc

sysuse auto
asdoc sum

And voila, a beautiful table of descriptive statistics is ready [click here to see it].

And for correlations, we shall use asdoc cor. If we want to append the results to the same file, we can add append after the comma or leave it out (append is the default; use replace to replace the existing file)

asdoc cor
OR 
asdoc cor, append

 

See also the following resources related to asdoc.

YouTube Video: Descriptive / Summary Statistics from Stata in Word with asdoc

YouTube Video: Create publication quality table of correlation in Stata with asdoc

YouTube Video: Writing all statistics to a single Word file from Stata with asdoc

YouTube Video: Create publication quality regression tables in Stata with asdoc

See a Table of Contents that shows what else asdoc can do

 



asdoc : Sends Stata output to MS Word

Category:Stata Programs

About asdoc

asdoc is a Stata program that makes the process of sending Stata output to MS Word super easy. asdoc creates high quality, publication-ready tables from various Stata commands such as summarize, correlate, tabstat, cross-tabs, regressions, t-tests, flexible table, and many more.

Installation

The program can be installed by typing the following from the Stata command window:

ssc install asdoc, update

Table of contents

1. Introduction

1.1 asdoc: short introduction and examples
1.2 Commands for controlling asdoc

2. Summary statistics
2.1 Basic summary statistics
2.2 Customized summary statistics

3. Correlations

3. Correlations [Blog Post]
3. Correlations [YouTube Video]

4. Regressions
4.1 Full regression tables [YouTube Video]
4.2 Compact / nested tables (publication quality)
4.3 Regression over a grouping variable (YouTube Video)

5. Frequency tables
5.1 One-way tabulation (tabulate1)
5.2 Two-way tabulation (tabulate2)
5.3 One- and two-way tables of summary statistics (tabsum)
5.4 Multiple-way tables (tab1)
5.5 All-possible two-way tables (tab2)

6. Compact tables (tabstat)
6.1 Without groups
6.2 With groups

7. Flexible table of statistics (table)
7.1 One-way table
7.2 Two-way table
7.3 Three-way table
7.4 Four-way table

8. T-tests
8.1 one-sample t-test
8.2 two-sample using groups
8.3 two-sample using variables
8.4 paired t-test

9. Table of means, std., and frequencies (tabsum)

10. Means
10.1 Arithmetic / harmonic / geometric means
10.2 Proportions
10.3 Ratio
10.4 Total

11. List command

12. Writing matrix to a Word / RTF file

13. The survey prefix command

14. Customized tables with option row