# Data & Code

# Equity Anomaly data

## Portfolio sorts

Stocks are sorted into *N* portfolios using NYSE breakpoints. Returns are value-weighted within each portfolio.
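The sorting procedure can be sketched roughly as follows. This is a minimal illustration, not the code used to build the data: the function names, the use of evenly spaced quantiles as breakpoints, and the use of lagged market equity as the value weight are all assumptions made for the example.

```python
import numpy as np

def sort_portfolios(char, is_nyse, n_portfolios):
    """Assign every stock to one of n_portfolios bins (0 = lowest characteristic).

    Breakpoints are quantiles of the characteristic among NYSE stocks only
    (evenly spaced quantiles are an assumption of this sketch).
    """
    probs = np.linspace(0.0, 1.0, n_portfolios + 1)[1:-1]  # interior quantiles
    breakpoints = np.quantile(char[is_nyse], probs)
    # All stocks, NYSE or not, are binned against the NYSE breakpoints.
    return np.searchsorted(breakpoints, char, side="right")

def value_weighted_returns(ret, lagged_me, bins, n_portfolios):
    """Value-weighted return of each bin; NaN for a bin with no stocks."""
    out = np.full(n_portfolios, np.nan)
    for p in range(n_portfolios):
        mask = bins == p
        if mask.any():
            out[p] = np.average(ret[mask], weights=lagged_me[mask])
    return out

# Hypothetical one-period cross-section of six stocks.
char = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
is_nyse = np.array([True, True, True, True, False, False])
ret = np.array([0.10, 0.20, 0.00, 0.00, 0.00, 0.40])
lagged_me = np.array([1.0, 3.0, 1.0, 1.0, 1.0, 1.0])

bins = sort_portfolios(char, is_nyse, n_portfolios=2)
print(bins, value_weighted_returns(ret, lagged_me, bins, 2))
```

Because breakpoints come from NYSE stocks alone, the bins need not contain equal numbers of stocks once non-NYSE firms are assigned.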

*Current data (July 1963 -- December 2019):* portfolio sorts (daily), portfolio sorts (monthly), portfolio assignments.

References:

Haddad, Kozak, Santosh (2020) *"Factor Timing"*: data (daily), data (monthly).

Giglio, Kelly, Kozak (2020) *"Equity Term Structures without Dividend Strips Data"*: data (daily), data (monthly).

Kozak, Nagel, Santosh (2018) *"Interpreting Factor Models"* use an older version of these data.

## Characteristic-managed portfolios

Portfolios are constructed by weighting each stock by its value of a characteristic signal. Firms with market equity below 0.01% of the aggregate US market cap are removed. Characteristic signals are equal to cross-sectional ranks of a given stock's characteristic, centered, and normalized by the sum of absolute values of all ranks in the cross section.
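A rough sketch of this construction for a single period is given below. It is an illustration under stated assumptions, not the source code linked in the references. The function names are hypothetical, and the sketch assumes that normalization divides by the sum of absolute centered ranks, that ties are broken by position, that aggregate market cap is the sum over the stocks passed in, and that characteristics and market equity are lagged relative to returns.

```python
import numpy as np

def characteristic_signal(char):
    """Centered cross-sectional ranks, scaled so absolute values sum to one.

    Requires at least two stocks; ties are broken by position (an assumption).
    """
    ranks = np.argsort(np.argsort(char)) + 1.0   # ordinal ranks 1..n
    centered = ranks - ranks.mean()              # center around zero
    return centered / np.abs(centered).sum()     # normalize (assumed form)

def managed_portfolio_return(char_lag, me_lag, ret, min_share=1e-4):
    """Return of the characteristic-managed portfolio for one period."""
    # Drop firms below 0.01% of aggregate market cap (here: sum of me_lag).
    keep = me_lag >= min_share * me_lag.sum()
    weights = characteristic_signal(char_lag[keep])
    return float(weights @ ret[keep])

# Hypothetical cross-section; the last firm is tiny and gets filtered out.
char_lag = np.array([0.5, 2.0, 1.0, 3.0, 10.0])
me_lag = np.array([1.0, 1.0, 1.0, 1.0, 1e-6])
ret = np.array([0.04, 0.00, 0.08, 0.02, 1.00])

print(characteristic_signal(char_lag[:4]))
print(managed_portfolio_return(char_lag, me_lag, ret))
```

Under these assumptions the weights sum to zero and their absolute values sum to one, so each portfolio is a zero-investment long-short position of fixed gross size.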

*Current data (July 1963 -- December 2019):* daily, monthly.

References:

Kozak, Nagel, Santosh (2020) *"Shrinking the Cross-Section"*: data (daily), data (monthly), source code (**a new version with L1L2 penalty**), slides (TeX).

Kozak and Santosh (2020) *"Why do Discount Rates Vary?"* (used a subset of the data above).

## Characteristic signals

This panel dataset contains the value of each characteristic signal for each stock at each point in time. Firms with market equity below 0.01% of the aggregate US market cap are removed. Characteristic signals are equal to cross-sectional ranks of a given stock's characteristic, centered, and normalized by the sum of absolute values of all ranks in the cross section.

*Current data (July 1963 -- December 2019):* characteristic signals.

References:

Kozak (2020) *"Kernel Trick for the Cross-Section"*: data.

# Synthetic equity strip yield data (preliminary)

This dataset is constructed using the model in Giglio, Kelly, Kozak (2021) *"Equity Term Structures without Dividend Strips Data"*. It contains end-of-month equity yields $e_{t,n}$, as defined by equation (27) in the paper, for the aggregate market index at maturities of 1--100 years and for the cross-section of 100 portfolios (50 long and 50 short ends of anomalies below) at maturities of 1--15 years. Note that the S&P 500 strips data (tradable contracts) correspond most closely to the *sizeS* cross-sectional portfolio in these data (large firms).

*Current data (August 1975 -- September 2020):* aggregate and cross-sectional synthetic equity strip yields.

References:

Giglio, Kelly, Kozak (2021) *"Equity Term Structures without Dividend Strips Data"*: data.