## Summary Statistics

### Default summary statistics

By default, asdocx reports the number of observations, mean, standard deviation, minimum, and maximum for all numeric variables. Therefore, we do not need to type variable names after the `sum` command.

```
* Load example dataset
sysuse auto, clear

* Estimate summary statistics
asdocx sum, replace
```

Table: Descriptive Statistics

| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| price | 74 | 6165.257 | 2949.496 | 3291 | 15906 |
| mpg | 74 | 21.297 | 5.786 | 12 | 41 |
| rep78 | 69 | 3.406 | 0.99 | 1 | 5 |
| headroom | 74 | 2.993 | 0.846 | 1.5 | 5 |
| trunk | 74 | 13.757 | 4.277 | 5 | 23 |
| weight | 74 | 3019.459 | 777.194 | 1760 | 4840 |
| length | 74 | 187.932 | 22.266 | 142 | 233 |
| turn | 74 | 39.649 | 4.399 | 31 | 51 |
| displacement | 74 | 197.297 | 91.837 | 79 | 425 |
| gear_ratio | 74 | 3.015 | 0.456 | 2.19 | 3.89 |
| foreign | 74 | 0.297 | 0.46 | 0 | 1 |

### Statistics for selected variables

When summary statistics are required only for selected variables, list the variable names after `sum`. For example, to report summary statistics for price, trunk, mpg, weight, and foreign, the code and results are as follows.

`asdocx sum price trunk mpg weight foreign,  replace`

Table: Descriptive Statistics

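| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| price | 74 | 6165.257 | 2949.496 | 3291 | 15906 |
| trunk | 74 | 13.757 | 4.277 | 5 | 23 |
| mpg | 74 | 21.297 | 5.786 | 12 | 41 |
| weight | 74 | 3019.459 | 777.194 | 1760 | 4840 |
| foreign | 74 | 0.297 | 0.46 | 0 | 1 |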

### Reporting variable labels

Variable labels can be reported with the option `label`. This option works with all other tables that asdocx can create. If a variable does not have a label, then the variable name is reported.

`asdocx sum price trunk mpg weight foreign,  replace label`

Table: Descriptive Statistics

| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| price | 74 | 6165.257 | 2949.496 | 3291 | 15906 |
| Trunk space (.. ft.) | 74 | 13.757 | 4.277 | 5 | 23 |
| Mileage (mpg) | 74 | 21.297 | 5.786 | 12 | 41 |
| Weight (lbs.) | 74 | 3019.459 | 777.194 | 1760 | 4840 |
| Car origin | 74 | 0.297 | 0.46 | 0 | 1 |

### Using [if] [in] conditions

asdocx accepts `if` and `in` conditions just like any other Stata command. Both `if` and `in` must come after the variable list and before the comma. In the following example, we report descriptive statistics for cars whose price is greater than 4000.

`asdocx sum if price > 4000,  replace`

Table: Descriptive Statistics

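| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| price | 63 | 6586.81 | 3003.064 | 4010 | 15906 |
| mpg | 63 | 20.444 | 5.488 | 12 | 41 |
| rep78 | 59 | 3.356 | 0.978 | 1 | 5 |
| headroom | 63 | 3.016 | 0.889 | 1.5 | 5 |
| trunk | 63 | 14.317 | 4.261 | 5 | 23 |
| weight | 63 | 3125.714 | 772.782 | 1760 | 4840 |
| length | 63 | 191.238 | 21.603 | 147 | 233 |
| turn | 63 | 40.111 | 4.381 | 31 | 51 |
| displacement | 63 | 208.095 | 92.766 | 85 | 425 |
| gear_ratio | 63 | 2.984 | 0.454 | 2.19 | 3.89 |
| foreign | 63 | 0.286 | 0.455 | 0 | 1 |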
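
The same placement rule applies when a variable list is combined with a qualifier. The following is a minimal sketch, assuming asdocx is installed: the choice of variables and the `in 1/40` range are arbitrary illustrations, and the `count` line is plain Stata used only to check the subsample size before exporting.

```stata
* Load example dataset
sysuse auto, clear

* Built-in check: number of cars that satisfy the condition
* (should match the Obs column for price in the table above)
count if price > 4000

* Variable list first, then the if qualifier, then the comma and options
asdocx sum price mpg weight if price > 4000, replace

* The in qualifier goes in the same position (observations 1 to 40 here)
asdocx sum price mpg weight in 1/40, replace
```

Checking the subsample with `count` first is an easy way to confirm that the condition selects the intended observations before a table is written.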

### Controlling decimal points

The number of decimal places can be controlled using the `dec()` option. The default is to report three decimal places. To report four decimal places, add the `dec(4)` option.

`asdocx sum ,  dec(4)`

Table: Descriptive Statistics

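| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| cob | 170 | 0.1824 | 0.3873 | 0 | 1 |
| ridageyr | 170 | 31.1882 | 25.5146 | 0 | 80 |
| sex | 170 | 0.5294 | 0.5006 | 0 | 1 |
| married | 92 | 0.5217 | 0.5023 | 0 | 1 |
| ridreth3 | 170 | 2.9412 | 1.7497 | 1 | 7 |
| hs | 170 | 45.8353 | 49.0983 | 0 | 99 |
| poor | 148 | 0.4932 | 0.5017 | 0 | 1 |
| insured | 169 | 0.8343 | 0.3729 | 0 | 1 |
| rtnhcpl | 170 | 0.9353 | 0.2467 | 0 | 1 |
| srhealth | 170 | 0.8353 | 0.372 | 0 | 1 |
| cursmk | 36 | 0.3889 | 0.4944 | 0 | 1 |
| alc_use | 36 | 1.1111 | 0.9495 | 0 | 2 |
| dropped | 170 | 0.5118 | 0.5013 | 0 | 1 |
| mec8yr | 170 | 8083.9818 | 8791.0193 | 0 | 39463.809 |
| sdmvpsu | 170 | 1.6647 | 0.6612 | 1 | 3 |
| sdmvstra | 170 | 95.6 | 4.0403 | 90 | 103 |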

### tzok: Reporting an equal number of decimal places

In the preceding example, we requested four decimal places using the option `dec(4)`. However, some of the values are shown with fewer decimal places, or with none at all. The reason is that asdocx drops trailing zeros by default. We can force asdocx to report the same number of decimal places for every value, even when the trailing digits are zeros, by using the option `tzok`, that is, trailing zeros OK.

`asdocx sum ,  dec(4) tzok`

Table: Descriptive Statistics

| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| cob | 170 | 0.1824 | 0.3873 | 0.0000 | 1.0000 |
| ridageyr | 170 | 31.1882 | 25.5146 | 0.0000 | 80.0000 |
| sex | 170 | 0.5294 | 0.5006 | 0.0000 | 1.0000 |
| married | 92 | 0.5217 | 0.5023 | 0.0000 | 1.0000 |
| ridreth3 | 170 | 2.9412 | 1.7497 | 1.0000 | 7.0000 |
| hs | 170 | 45.8353 | 49.0983 | 0.0000 | 99.0000 |
| poor | 148 | 0.4932 | 0.5017 | 0.0000 | 1.0000 |
| insured | 169 | 0.8343 | 0.3729 | 0.0000 | 1.0000 |
| rtnhcpl | 170 | 0.9353 | 0.2467 | 0.0000 | 1.0000 |
| srhealth | 170 | 0.8353 | 0.3720 | 0.0000 | 1.0000 |
| cursmk | 36 | 0.3889 | 0.4944 | 0.0000 | 1.0000 |
| alc_use | 36 | 1.1111 | 0.9495 | 0.0000 | 2.0000 |
| dropped | 170 | 0.5118 | 0.5013 | 0.0000 | 1.0000 |
| mec8yr | 170 | 8083.9818 | 8791.0193 | 0.0000 | 39463.8086 |
| sdmvpsu | 170 | 1.6647 | 0.6612 | 1.0000 | 3.0000 |
| sdmvstra | 170 | 95.6000 | 4.0403 | 90.0000 | 103.0000 |

### Detailed summary statistics

In Stata, detailed summary statistics are normally obtained with the `summarize, detail` or `sum, detail` command. To make a table of detailed summary statistics, we just add `detail` after the comma in the asdocx `sum` command. With this option, the table reports the following statistics: `observations`, `mean`, `standard deviation`, `minimum`, `maximum`, `1st percentile`, `99th percentile`, `skewness`, and `kurtosis`. If additional statistics or a specific combination of statistics are required, we can use the custom statistics option (see the following section).

`asdocx sum,  replace detail`

Table: Descriptive Statistics

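| Variables | Obs | Mean | Std. Dev. | Min | Max | p1 | p99 | Skew. | Kurt. |
|---|---|---|---|---|---|---|---|---|---|
| price | 74 | 6165.3 | 2949.5 | 3291 | 15906 | 3291 | 15906 | 1.7 | 4.8 |
| mpg | 74 | 21.3 | 5.8 | 12 | 41 | 12 | 41 | 0.9 | 4 |
| rep78 | 69 | 3.4 | 1 | 1 | 5 | 1 | 5 | -0.1 | 2.7 |
| headroom | 74 | 3 | 0.8 | 1.5 | 5 | 1.5 | 5 | 0.1 | 2.2 |
| trunk | 74 | 13.8 | 4.3 | 5 | 23 | 5 | 23 | 0 | 2.2 |
| weight | 74 | 3019.5 | 777.2 | 1760 | 4840 | 1760 | 4840 | 0.1 | 2.1 |
| length | 74 | 187.9 | 22.3 | 142 | 233 | 142 | 233 | 0 | 2 |
| turn | 74 | 39.6 | 4.4 | 31 | 51 | 31 | 51 | 0.1 | 2.2 |
| displacement | 74 | 197.3 | 91.8 | 79 | 425 | 79 | 425 | 0.6 | 2.4 |
| gear_ratio | 74 | 3 | 0.5 | 2.2 | 3.9 | 2.2 | 3.9 | 0.2 | 2.1 |
| foreign | 74 | 0.3 | 0.5 | 0 | 1 | 0 | 1 | 0.9 | 1.8 |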
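
To cross-check a single row of the detailed table, the same quantities can be read from Stata's built-in `summarize, detail`, which leaves them in `r()`. This is a sketch using official Stata commands only; it is meant as a sanity check and does not show how asdocx builds the table internally.

```stata
* Load example dataset
sysuse auto, clear

* Run the built-in detailed summary for one variable, without output
quietly summarize price, detail

* Stored results that correspond to the columns of the detail table
display "Obs  = " r(N)
display "Mean = " %9.1f r(mean)
display "SD   = " %9.1f r(sd)
display "Min  = " r(min) "   Max = " r(max)
display "p1   = " r(p1)  "   p99 = " r(p99)
display "Skew = " %4.1f r(skewness) "   Kurt = " %4.1f r(kurtosis)
```

In a sample of only 74 cars, the 1st and 99th percentiles coincide with the minimum and maximum, which is why the p1 and p99 columns repeat the Min and Max columns in the table.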

### Custom summary statistics

To make a table of a specific combination of statistics, use the option `statistics()` or `stat()` with the asdocx `sum` command. The option `statistics()` accepts the following statistics:

| Option | Details |
|---|---|
| N | Number of observations |
| mean | Arithmetic mean |
| sd | Standard deviation |
| semean | Standard error of the mean |
| sum | Sum / total |
| range | Range |
| min | The smallest value |
| max | The largest value |
| count | Counts the number of non-missing observations |
| var | Variance |
| cv | Coefficient of variation |
| skewness | Skewness |
| kurtosis | Kurtosis |
| iqr | Interquartile range |
| p1 | 1st percentile |
| p5 | 5th percentile |
| p10 | 10th percentile |
| p25 | 25th percentile |
| p50 | Median or the 50th percentile |
| p75 | 75th percentile |
| p90 | 90th percentile |
| p99 | 99th percentile |
| tstat | t-statistic for the test that the mean of the given variable equals 0 |

Assume that we wish to report the number of observations, mean, standard deviation, t-value, and the 1st and 99th percentiles for all variables.

`asdocx sum,  replace stat(N mean sd tstat p1 p99)`

Table: Descriptive Statistics

| Variables | N | Mean | Std. Dev. | 1st Perc. | 99th Perc. | t-value |
|---|---|---|---|---|---|---|
| price | 74 | 6165.257 | 2949.496 | 3291 | 15906 | 17.981 |
| mpg | 74 | 21.297 | 5.786 | 12 | 41 | 31.666 |
| rep78 | 69 | 3.406 | 0.99 | 1 | 5 | 28.578 |
| headroom | 74 | 2.993 | 0.846 | 1.5 | 5 | 30.436 |
| trunk | 74 | 13.757 | 4.277 | 5 | 23 | 27.666 |
| weight | 74 | 3019.459 | 777.194 | 1760 | 4840 | 33.421 |
| length | 74 | 187.932 | 22.266 | 142 | 233 | 72.605 |
| turn | 74 | 39.649 | 4.399 | 31 | 51 | 77.527 |
| displacement | 74 | 197.297 | 91.837 | 79 | 425 | 18.481 |
| gear_ratio | 74 | 3.015 | 0.456 | 2.19 | 3.89 | 56.839 |
| foreign | 74 | 0.297 | 0.46 | 0 | 1 | 5.557 |
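
The t-value column can be reproduced by hand as the mean divided by its standard error, that is, a test of the mean against zero. The sketch below uses only built-in Stata commands; the local macro names `se` and `t` are chosen for illustration. That asdocx uses exactly this formula internally is an inference from the reported numbers, not something stated in this section.

```stata
* Load example dataset
sysuse auto, clear

* Basic summary leaves N, mean, and sd in r()
quietly summarize price

* Standard error of the mean: sd / sqrt(N)
local se = r(sd) / sqrt(r(N))

* t-statistic for H0: mean == 0
local t = r(mean) / `se'

display "semean = " %9.3f `se'
display "tstat  = " %9.3f `t'
```

For price this gives a t-statistic of about 17.981, in line with the table above. The built-in `ttest price == 0` reports the same statistic.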

### Statistics over a grouping variable

To report summary statistics separately for each category of a grouping variable, we can use `by(varname)` or the prefix `bysort varname:` with asdocx. Examples of grouping variables include country, year, industry, gender, and family. In the following example, we report the mean, standard deviation, t-value, and the 1st and 99th percentiles for each category of the variable `foreign`. In the auto dataset, the variable `foreign` has two categories: Domestic and Foreign.

