Data & Code

Equity Anomaly data


Anomaly definitions


Portfolio sorts

Stocks are sorted into N portfolios using NYSE breakpoints. Returns within each portfolio are value-weighted.
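The sort described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build the datasets: the record keys (`signal`, `me`, `ret`, `nyse`), the function name `sort_portfolios`, the inclusive quantile method, the treatment of stocks that fall exactly on a breakpoint, and the use of the supplied market equity as the weight are all hypothetical choices.

```python
from bisect import bisect_right
from statistics import quantiles


def sort_portfolios(stocks, n_portfolios):
    """Sort one cross-section into bins and value-weight returns within each bin.

    `stocks` is a list of dicts with hypothetical keys:
      'signal' : the sorting characteristic
      'me'     : market equity used as the portfolio weight (assumption)
      'ret'    : the stock return over the period
      'nyse'   : True if the stock is NYSE-listed
    """
    # Breakpoints are quantiles of the NYSE subsample only.
    nyse_signals = [s["signal"] for s in stocks if s["nyse"]]
    breakpoints = quantiles(nyse_signals, n=n_portfolios, method="inclusive")

    assignments = []
    weighted_ret = [0.0] * n_portfolios
    total_me = [0.0] * n_portfolios
    for s in stocks:
        # Number of breakpoints <= signal gives a bin index in 0..n_portfolios-1.
        # A stock exactly on a breakpoint goes to the upper bin (assumption).
        k = bisect_right(breakpoints, s["signal"])
        assignments.append(k)
        weighted_ret[k] += s["me"] * s["ret"]
        total_me[k] += s["me"]

    # Value-weighted return per bin; None marks an empty bin.
    vw_returns = [wr / me if me > 0 else None
                  for wr, me in zip(weighted_ret, total_me)]
    return assignments, vw_returns


# Made-up cross-section: four NYSE stocks and one small non-NYSE stock.
stocks = [
    {"signal": 1.0, "me": 100.0, "ret": 0.10, "nyse": True},
    {"signal": 2.0, "me": 300.0, "ret": 0.02, "nyse": True},
    {"signal": 3.0, "me": 50.0, "ret": -0.04, "nyse": True},
    {"signal": 4.0, "me": 150.0, "ret": 0.00, "nyse": True},
    {"signal": 10.0, "me": 1.0, "ret": 0.50, "nyse": False},
]
assignments, vw_returns = sort_portfolios(stocks, n_portfolios=2)
```

In this toy example the non-NYSE stock does not move the breakpoint, and its small market equity gives it little weight in the portfolio it lands in.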


Current data (July 1963 -- December 2019): portfolio sorts (daily), portfolio sorts (monthly), portfolio assignments.


References:

      1. Haddad, Kozak, Santosh (2020) "Factor Timing": data (daily), data (monthly).
      2. Giglio, Kelly, Kozak (2020) "Equity Term Structures without Dividend Strips Data": data (daily), data (monthly).
      3. Kozak, Nagel, Santosh (2018) "Interpreting Factor Models" use an older version of these data.


Characteristic-managed portfolios

Portfolios are constructed by weighting each stock by the value of its characteristic signal. Firms with market equity below 0.01% of the aggregate US market cap are removed. Each characteristic signal is the cross-sectional rank of a given stock's characteristic, centered and normalized by the sum of absolute values of all ranks in the cross section.
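A minimal Python sketch of the rank transformation and the resulting portfolio return for a single cross-section follows. It rests on assumptions not stated in the text: the normalization is read as dividing by the sum of absolute centered ranks, ties are broken by input order, and the function names `rank_signal` and `managed_return` are hypothetical.

```python
def rank_signal(values):
    """Cross-sectional ranks, centered, then scaled to unit sum of absolute values.

    Assumptions: ties are broken by input order, and the normalization
    divides by the sum of absolute *centered* ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = float(r)

    mean_rank = (n + 1) / 2.0
    centered = [r - mean_rank for r in ranks]
    scale = sum(abs(c) for c in centered)
    if scale == 0.0:
        # A single-stock cross-section has no dispersion in ranks.
        return [0.0] * n
    return [c / scale for c in centered]


def managed_return(characteristic, returns):
    """Return of a portfolio that weights each stock by its signal."""
    weights = rank_signal(characteristic)
    return sum(w * r for w, r in zip(weights, returns))


# Made-up cross-section of three stocks.
characteristic = [0.5, 2.0, 1.0]
returns = [0.10, 0.04, -0.02]

signal = rank_signal(characteristic)          # [-0.5, 0.5, 0.0]
port_ret = managed_return(characteristic, returns)  # approximately -0.03
```

Under this reading the weights sum to zero and their absolute values sum to one, so each portfolio is a zero-investment long-short position with a fixed gross exposure.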


Current data (July 1963 -- December 2019): daily, monthly.


References:

      1. Kozak, Nagel, Santosh (2020) "Shrinking the Cross-Section": data (daily), data (monthly), source code, slides (TeX).
      2. Kozak and Santosh (2020) "Why do Discount Rates Vary?" (used a subset of the data above).


Characteristic signals

This panel dataset contains the value of each characteristic signal for each stock at each point in time. Firms with market equity below 0.01% of the aggregate US market cap are removed. Each characteristic signal is the cross-sectional rank of a given stock's characteristic, centered and normalized by the sum of absolute values of all ranks in the cross section.
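The size filter applied to the panel can be illustrated as below. This is a hedged sketch: the row keys (`date`, `id`, `me`), the function name `filter_small_firms`, and the choice to compute the aggregate market cap as the sum over all firms present in the input on each date are assumptions made for illustration.

```python
from collections import defaultdict

MIN_CAP_SHARE = 0.0001  # 0.01% of aggregate market cap, as stated in the text


def filter_small_firms(panel):
    """Drop, date by date, firms below MIN_CAP_SHARE of total market equity.

    `panel` is a list of dicts with hypothetical keys 'date', 'id', 'me'.
    The aggregate is the sum of 'me' over all rows with the same date
    (assumption: the input already covers the whole market).
    """
    total_me = defaultdict(float)
    for row in panel:
        total_me[row["date"]] += row["me"]

    return [row for row in panel
            if row["me"] >= MIN_CAP_SHARE * total_me[row["date"]]]


# Made-up panel with two dates.
panel = [
    {"date": "2019-11", "id": "A", "me": 9000.0},
    {"date": "2019-11", "id": "B", "me": 999.5},
    {"date": "2019-11", "id": "C", "me": 0.5},   # below 0.01% of 10000
    {"date": "2019-12", "id": "A", "me": 5.0},
    {"date": "2019-12", "id": "B", "me": 5.0},
]
kept = filter_small_firms(panel)
kept_ids = [(row["date"], row["id"]) for row in kept]
```

Because the threshold is recomputed for every date, a firm can enter or leave the panel over time as its share of the aggregate market changes.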


Current data (July 1963 -- December 2019): characteristic signals.


References:

      1. Kozak (2020) "Kernel Trick for the Cross-Section": data.



Synthetic equity strip yield data (preliminary)

This dataset is constructed using the model in Giglio, Kelly, Kozak (2020) "Equity Term Structures without Dividend Strips Data". The data contain end-of-month equity yields, e_{t,n}, as defined by equation (27) in the paper. Yields are provided for the aggregate market index at maturities of 1--100 years, and for a cross-section of 100 portfolios (50 long and 50 short ends of anomalies below) at maturities of 1--15 years. Note that the S&P 500 strips data (tradable contracts) correspond most closely to the sizeS cross-sectional portfolio in these data (large firms).


Current data (August 1975 -- November 2016): aggregate and cross-sectional synthetic equity strip yields.


References:

      1. Giglio, Kelly, Kozak (2020) "Equity Term Structures without Dividend Strips Data": data.