Problem Set 4

Author

Lukas Hager

Published

April 19, 2024

This problem set must be submitted on Canvas by 11:59 PM Pacific Time on April 24, 2024.

The theme of this week’s problem set: can we predict stock prices?

Exercise 0

Please write a function that takes no arguments and returns a link to your solutions on GitHub.

Use the following shell:

def github() -> str:
    """
    Some docstrings.
    """

    return "https://github.com/<user>/<repo>/blob/main/<filename.py>"

Exercise 1

Please write a function called load_data that accesses the file on Tesla stock price history on the course website and returns that data as a pd.DataFrame.
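As a rough illustration of the loading step, the sketch below reads a small in-memory CSV with pd.read_csv. The file format (CSV), the column names Date and Close, and every value shown are assumptions made for the example; the actual file on the course website may differ, and its URL is not reproduced here.

```python
import io

import pandas as pd

# Made-up stand-in for the course file; the column names are assumptions.
csv_text = """Date,Close
2024-01-02,10.0
2024-01-03,12.0
2024-01-05,15.0
"""

# pd.read_csv also accepts a URL or a file path in place of the file-like object.
toy_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(toy_df.dtypes)
```

Parsing the date column at load time means later steps can filter and shift by date without converting strings again.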

Use the following shell:

import pandas as pd

def load_data() -> pd.DataFrame:
    """
    Some docstrings.
    """

    return None

Exercise 2

Please write a function called plot_close which takes the output of load_data() as defined above, as well as an optional start and end date (strings formatted as ‘YYYY-MM-DD’), and plots the closing price of the stock between those dates as a line graph. Please include the date range in the title of the graph. Note that this function needn’t return anything; it should just plot a graph using matplotlib.
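The date filtering and title formatting might look something like the following sketch on toy data. The column names Date and Close and all values are assumptions for illustration, not the layout of the actual dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy data (made up); the 'Date' and 'Close' column names are assumptions.
toy = pd.DataFrame(
    {
        "Date": pd.date_range("2024-01-01", periods=10, freq="D"),
        "Close": [10, 11, 12, 11, 13, 14, 13, 15, 16, 15],
    }
)

start, end = "2024-01-03", "2024-01-07"

# With a DatetimeIndex, .loc slices by date strings and includes both endpoints.
window = toy.set_index("Date").loc[start:end]

fig, ax = plt.subplots()
ax.plot(window.index, window["Close"])
ax.set_title(f"Closing price, {start} to {end}")
ax.set_xlabel("Date")
ax.set_ylabel("Close")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

Calling plt.show() afterwards displays the figure in an interactive session.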

Use the following shell:

def plot_close(df: pd.DataFrame, start: str = '2010-06-29', end: str = '2024-04-15') -> None:
    """
    Some docstrings.
    """

Exercise 3

Note

We’re going to test the random walk hypothesis (if you’re curious, see this Fama paper). Here’s the idea:

Null hypothesis: the movement of a stock today and the movement of the stock tomorrow are uncorrelated.
Alternative hypothesis: the movement of a stock today and the movement of the stock tomorrow are correlated, so we can use data from today to predict the price tomorrow.

Fama (and many economists) believe that we cannot reject the null hypothesis – let’s test it with one stock (Tesla).

Please write a function called autoregress that takes a single argument df (the output of Exercise 1) and returns the t statistic on \(\hat{\beta}_0\) from the regression

\[ \Delta x_t = \beta_0 \times \Delta x_{t-1} + \varepsilon_t \]

where \(x_t\) is the close price at time \(t\) and \(\Delta x_t = x_t-x_{t-1}\). Note that the regression should not have an intercept. Please use HC1 standard errors for the regression.

Caution

Make sure you’re only using observations where you have consecutive days of data (that is, you need data from both time \(t\) and time \(t-1\) to compute \(\Delta x_t\)). Check out the freq argument of pd.DataFrame.shift, which could be helpful.
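To see what the freq argument of shift does, consider the toy series below, which is missing one date. The dates and prices are made up, and reading "consecutive days" as consecutive calendar days is an assumption of this sketch.

```python
import pandas as pd

# Toy close prices with a gap: 2024-01-04 is missing (made-up data).
dates = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-05", "2024-01-06"])
close = pd.Series([10.0, 12.0, 15.0, 14.0], index=dates, name="Close")

# With freq set, shift moves the index labels forward one calendar day
# instead of moving the values down one row.
prev_close = close.shift(1, freq="D")

# Subtraction aligns on the date index, so a date whose previous calendar
# day is absent gets NaN and is dropped.
delta = (close - prev_close).dropna()
print(delta)
```

Only 2024-01-03 and 2024-01-06 survive, because those are the only dates with an observation on the preceding calendar day. A plain close.diff() would instead treat 2024-01-05 as following 2024-01-03.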

Use the following shell:

def autoregress(df: pd.DataFrame) -> float:
    """
    Some docstrings.
    """

    return None

Exercise 4

Let’s specify the analysis slightly differently. Please write a function called autoregress_logit that takes a single argument df (the output of Exercise 1) and returns the t statistic on \(\hat{\beta}_0\) from the logistic regression

\[ \mathbb{P}(\Delta x_t > 0) = \frac{\exp(\beta_0 \times \Delta x_{t-1})}{1 + \exp(\beta_0 \times \Delta x_{t-1})} \]

Use the following shell:

def autoregress_logit(df: pd.DataFrame) -> float:
    """
    Some docstrings.
    """

    return None

Exercise 5

Please write a function called plot_delta that takes a single argument df (the output of Exercise 1) and plots \(\Delta x_t\) for the full dataset. Note that this function needn’t return anything; it should just plot a graph using matplotlib.

Use the following shell:

def plot_delta(df: pd.DataFrame) -> None:
    """
    Some docstrings.
    """

Note

Not to be turned in, but to be considered: does your plot align with your intuition from the regressions you just ran?