Econ 41 Review

  1. Discrete Random Variables. Suppose that we are interested in the number of cups of coffee drunk in a day by a
    (randomly selected) student at UCLA. This quantity can be represented as a random variable Y with
    probability mass function:
    \[
    p_Y(a) =
    \begin{cases}
      \frac{1}{4} & \text{if } a \in \{0, 1, 2\} \\
      \frac{1}{8} & \text{if } a = 3 \\
      \frac{3}{32} & \text{if } a = 4 \\
      c & \text{if } a = 5 \\
      0 & \text{otherwise,}
    \end{cases}
    \]
    where c is an unknown constant.
    (a) Explain why the number of cups of coffee drunk in a day by a randomly selected student at UCLA
    is a random variable.
    (b) What is the relevant outcome space of the random variable Y?
    (c) Explain what the distribution of this random variable represents. In other words, the distribution of
    Y assigns a probability to any subset of the outcome space. How do we interpret this probability?
    (d) Solve for c. (Hint: Recall that $P_Y(O_Y) = 1$, so that $\sum_{a \in O_Y} p_Y(a)$ must equal one.)
    (e) What is the probability that a randomly selected student at UCLA drinks at least 3 cups of coffee
    a day, $P_Y(Y \ge 3)$?
    (f) What is the expected number of cups of coffee drunk per day by a randomly selected student at
    UCLA?
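One way to check hand calculations for parts (d) through (f) is to do the arithmetic with exact fractions. The sketch below is only a checking aid: the variable names (`known_pmf`, `prob_at_least_3`, `expected_cups`) are illustrative and not part of the problem.

```python
from fractions import Fraction

# Known part of the pmf from the problem statement; the mass at a = 5 is the unknown c.
known_pmf = {
    0: Fraction(1, 4),
    1: Fraction(1, 4),
    2: Fraction(1, 4),
    3: Fraction(1, 8),
    4: Fraction(3, 32),
}

# Probabilities over the whole outcome space must sum to one, which pins down c.
c = 1 - sum(known_pmf.values())
pmf = {**known_pmf, 5: c}

# P(Y >= 3): add up the mass on every outcome that is at least 3.
prob_at_least_3 = sum(p for a, p in pmf.items() if a >= 3)

# E[Y]: probability-weighted average of the outcomes.
expected_cups = sum(a * p for a, p in pmf.items())

print("c =", c)
print("P(Y >= 3) =", prob_at_least_3)
print("E[Y] =", expected_cups)
```

Working in `Fraction` rather than floats avoids rounding, so the printed values can be compared directly with answers derived by hand.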
  2. Continuous Random Variables. Suppose that we are interested in the income of a randomly selected
    Angeleno. The distribution of incomes (in tens of thousands of dollars) for residents of Los Angeles
    can be described as a random variable, X, with the following pdf.
    \[
    f_X(a) =
    \begin{cases}
      0.11 - ca & \text{if } 0 \le a \le 10 \\
      0 & \text{otherwise,}
    \end{cases}
    \]
    where c is an unknown constant.
    (a) What is the outcome space of X, $O_X$?
    (b) Using the relationship
    \[ P_X(l \le X \le m) = \int_l^m f_X(a)\, da, \]
    explain why the pdf must always be weakly positive, $f_X(a) \ge 0$, for any $a \in \mathbb{R}$.
    (c) Because $P_X(O_X) = 1$, we must have that $\int_0^{10} f_X(a)\, da = 1$. Using this fact, solve for c.
    (d) What is the expected value of X, E[X]?
    (e) What is the variance of X, Var(X)?
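The integrals in parts (c) through (e) only involve powers of a, so they can be checked with the power rule and exact fractions. This is a minimal sketch for verifying hand work; the helper names `integrate_monomial` and `moment` are hypothetical and introduced only for this check.

```python
from fractions import Fraction

LOWER, UPPER = Fraction(0), Fraction(10)   # support of X from the problem
INTERCEPT = Fraction(11, 100)              # the 0.11 in f_X(a) = 0.11 - c*a


def integrate_monomial(power):
    """Integral of a**power over [LOWER, UPPER], using the power rule."""
    return (UPPER ** (power + 1) - LOWER ** (power + 1)) / (power + 1)


# Total probability: INTERCEPT * int(1 da) - c * int(a da) = 1, solved for c.
c = (INTERCEPT * integrate_monomial(0) - 1) / integrate_monomial(1)


def moment(k):
    """E[X**k], the integral of a**k * (INTERCEPT - c*a) over [LOWER, UPPER]."""
    return INTERCEPT * integrate_monomial(k) - c * integrate_monomial(k + 1)


mean = moment(1)
variance = moment(2) - mean ** 2   # Var(X) = E[X^2] - (E[X])^2

print("c =", c)
print("E[X] =", mean)
print("Var(X) =", variance)
```

Note that `moment(0)` returns the total probability, so it doubles as a sanity check that the solved c makes the pdf integrate to one.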
  3. Variance and Covariance. Let Y be a random variable representing income (in tens of thousands of
    dollars) and X be a random variable representing years of education. Suppose that the marginal
    distribution of X is described by its probability mass function
    \[
    p_X(x) =
    \begin{cases}
      0.05 & \text{if } x \in \{1, 2, \ldots, 12\} \\
      0.09 & \text{if } x \in \{13, 14, 15, 16\} \\
      0.04 & \text{if } x \in \{17\} \\
      0 & \text{otherwise.}
    \end{cases}
    \]
    The marginal distribution of Y is described by its probability density function
    \[
    f_Y(y) =
    \begin{cases}
      0.1 & \text{if } 0 \le y \le 10 \\
      0 & \text{otherwise.}
    \end{cases}
    \]
    (a) What is the expectation of Y, E[Y]? What is its variance, Var(Y)?
    (b) What is the expectation of X, E[X]? What is its variance, Var(X)?
    (c) Using E[YX] = 60, compute the covariance between Y and X, Cov(X, Y).
    (d) Calculate the correlation coefficient between X and Y,
    \[ \rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}. \]
    (e) What does this covariance tell us about the relationship between education levels and income? Is
    there a positive or negative association?
    (f) Should we interpret this result as a causal relationship between education and income? What are
    some reasons we may want to refrain from this interpretation?
    (g) (Challenge) A common inequality used in econometrics is the Cauchy-Schwarz inequality. It
    states that, for any random variables X and Y, and any functions g(·) and h(·),
    \[ \left| E[g(X) h(Y)] \right| \le \sqrt{E[g^2(X)]}\, \sqrt{E[h^2(Y)]}. \]
    Use this inequality to show why the correlation coefficient is bounded between negative one and
    one, $-1 \le \rho_{XY} \le 1$. (Hint: Try $g(x) = x - \mu_X$ and $h(y) = y - \mu_Y$.)
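The moments in parts (a) through (d) can be cross-checked numerically. The sketch below builds the pmf of X from the problem statement and uses the standard mean and variance formulas for a uniform distribution, since f_Y is constant on [0, 10]. All variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# pmf of X (years of education), copied from the problem statement.
pmf_x = {x: Fraction(5, 100) for x in range(1, 13)}
pmf_x.update({x: Fraction(9, 100) for x in range(13, 17)})
pmf_x[17] = Fraction(4, 100)

mean_x = sum(x * p for x, p in pmf_x.items())
var_x = sum(x ** 2 * p for x, p in pmf_x.items()) - mean_x ** 2

# Y has a constant density on [0, 10], i.e. it is uniform on that interval.
# For a uniform on [lo, hi]: mean = (lo + hi) / 2, variance = (hi - lo)^2 / 12.
lo, hi = Fraction(0), Fraction(10)
mean_y = (lo + hi) / 2
var_y = (hi - lo) ** 2 / 12

# Cov(X, Y) = E[XY] - E[X]E[Y], with E[XY] = 60 given in part (c).
e_xy = Fraction(60)
cov_xy = e_xy - mean_x * mean_y

# Correlation needs square roots, so switch to floats at the last step.
corr_xy = float(cov_xy) / sqrt(float(var_x) * float(var_y))

print("E[X] =", mean_x, " Var(X) =", var_x)
print("E[Y] =", mean_y, " Var(Y) =", var_y)
print("Cov(X, Y) =", cov_xy, " corr =", round(corr_xy, 4))
```

The computed correlation should land inside [-1, 1], which is the bound part (g) asks you to prove in general.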

Introduction to Single Linear Regression

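  4. Useful Equalities. Recall that in deriving the form of $\hat{\beta}_1$ we used the following equalities:
    \[
    \frac{1}{n}\sum_{i=1}^n (Y_i - \bar{Y})(X_i - \bar{X}) = \frac{1}{n}\sum_{i=1}^n Y_i X_i - \bar{Y}\bar{X}
    \quad\text{and}\quad
    \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar{X})^2.
    \]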
    Show one of these equalities (you only need to show one or the other).
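Before attempting the algebra, it can help to confirm numerically that both equalities hold on some data. The sketch below uses simulated data that is entirely made up for illustration (the seed, sample size, and data-generating line are assumptions, not part of the problem). A numerical check is not a proof.

```python
import random

random.seed(0)  # fixed seed so the check is reproducible

# Hypothetical sample: X is standard normal, Y is a noisy linear function of X.
n = 50
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [2 + 3 * x + random.gauss(0, 1) for x in xs]


def mean(values):
    """Sample average (1/n) * sum of the values."""
    return sum(values) / len(values)


x_bar, y_bar = mean(xs), mean(ys)

# First equality: centered cross-product vs. raw cross-product minus product of means.
lhs_cov = mean([(y - y_bar) * (x - x_bar) for x, y in zip(xs, ys)])
rhs_cov = mean([y * x for x, y in zip(xs, ys)]) - y_bar * x_bar

# Second equality: centered second moment vs. raw second moment minus squared mean.
lhs_var = mean([(x - x_bar) ** 2 for x in xs])
rhs_var = mean([x ** 2 for x in xs]) - x_bar ** 2

print(abs(lhs_cov - rhs_cov), abs(lhs_var - rhs_var))
```

Both printed differences should be zero up to floating-point rounding. The algebraic proof follows from expanding the product and using the fact that the sample average of $X_i$ is $\bar{X}$.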
  5. Assumptions for Inference. Suppose we are interested in the relationship between the size of the average
    American’s social circle, X, and whether or not they are unemployed, Y. To investigate this relationship
    we want to estimate the following regression equation¹
    \[ Y = \beta_0 + \beta_1 X + \epsilon, \qquad E[\epsilon] = E[\epsilon X] = 0. \]
    To estimate the regression coefficient parameters we collect a sample of size n, $\{Y_i, X_i\}_{i=1}^n$. Recall
    that for valid asymptotic inference on our estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ we require the following assumptions:
    Random Sampling, Homoskedasticity, and Rank Condition.
    • Random Sampling: Assume that the pairs $(Y_i, X_i)$ are independently and identically distributed draws from the
    population of interest, $(Y_i, X_i) \overset{\text{i.i.d.}}{\sim} (Y, X)$.
    • Homoskedasticity: Assume that $\mathrm{Var}(\epsilon \mid X = x) = \sigma_\epsilon^2$ for all possible values of x.
    • Rank Condition: There must be at least two distinct values of X that appear in the population.
    (a) Suppose we collect our sample by only randomly surveying people on UCLA campus. Which
    assumption would be violated?
    (b) Suppose we collect our sample and find that everyone appears to have exactly one friend. Which
    assumption would be violated? Why is this a problem when computing the line of best fit through
    our sample?
    (c) Suppose random sampling, homoskedasticity, and the rank condition are all satisfied, but n = 10.
    Why might inferences based on the approximation
    \[ \frac{\hat{\beta}_1 - \beta_1}{\hat{\sigma}_{\beta_1} / \sqrt{n}} \sim N(0, 1) \]
    not be valid?
  6. Hypothesis Testing. Suppose now that we are interested in investigating the relationship between the
    size of someone’s social circle, X, and their income (in tens of thousands of dollars), Y. We want to
    estimate the following linear regression model
    \[ Y = \beta_0 + \beta_1 X + \epsilon, \qquad E[\epsilon] = E[\epsilon X] = 0. \]
    ¹ Recall that this regression specification corresponds to finding the line of best fit parameters
    $(\beta_0, \beta_1) = \arg\min_{b_0, b_1} E[(Y - b_0 - b_1 X)^2]$ and defining $\epsilon = Y - \beta_0 - \beta_1 X$.
    To do so we collect a random sample of size n = 64, $\{Y_i, X_i\}_{i=1}^{64}$, and find that
    $\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2 = 100$, $\frac{1}{n}\sum_{i=1}^n (Y_i - \bar{Y})(X_i - \bar{X}) = 225$, $\bar{Y} = 5.5$, and $\bar{X} = 1.5$.
    (a) Using this information, find and interpret $\hat{\beta}_1$ and $\hat{\beta}_0$.
    (b) After finding $\hat{\beta}_0$ and $\hat{\beta}_1$, describe how you would construct the estimated residuals $\hat{\epsilon}_i$.
    (c) We find that $\frac{1}{n}\sum_{i=1}^n \hat{\epsilon}_i^2 = 36$. Use this and the result that, for n large,
    \[ \frac{\hat{\beta}_1 - \beta_1}{\hat{\sigma}_{\beta_1} / \sqrt{n}} \sim N(0, 1), \]
    to compute the (approximate) probability that, if the true value were $\beta_1 = 0$, we would see
    a value of $|\hat{\beta}_1|$ equal to or larger than the one that we observed.
    (d) Use this result to test, at level α = 0.1, the hypotheses
    \[ H_0 : \beta_1 = 0 \quad \text{vs.} \quad H_1 : \beta_1 \ne 0. \]
    (e) Conduct this test in another fashion by constructing the test statistic $t^*$ and comparing it to either
    $z_{0.95} = 1.64$ or $z_{0.9} = 1.28$ (indicate which value you are comparing the test statistic to).
    (f) Construct a 90% confidence interval for β1. How could we use this to conduct the hypothesis test
    in part (d)?
    (g) Suppose that we find we made an error in our calculation and actually $\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2 = 1$. If
    all other values stayed the same, how would this change the result of the hypothesis test in part
    (d)?
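The calculations in parts (a) and (c) through (g) can be sketched in a few lines of Python. Two things here are assumptions rather than statements from the problem: the helper `slope_test` is a hypothetical name, and the sketch takes $\hat{\sigma}_{\beta_1}^2$ to be the homoskedastic estimate, the average squared residual divided by the sample variance of X. If your course defines $\hat{\sigma}_{\beta_1}$ differently, adjust that line.

```python
from math import sqrt
from statistics import NormalDist


def slope_test(var_x, cov_xy, resid_var, n, alpha=0.10):
    """Two-sided z-test of H0: beta1 = 0 from sample moments (hypothetical helper)."""
    beta1_hat = cov_xy / var_x
    # ASSUMPTION: homoskedastic formula, sigma_hat_beta1^2 = resid_var / var_x.
    sigma_beta1 = sqrt(resid_var / var_x)
    std_err = sigma_beta1 / sqrt(n)
    t_stat = beta1_hat / std_err              # test statistic under H0: beta1 = 0
    std_normal = NormalDist()
    p_value = 2 * (1 - std_normal.cdf(abs(t_stat)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    ci = (beta1_hat - z_crit * std_err, beta1_hat + z_crit * std_err)
    return {
        "beta1_hat": beta1_hat,
        "std_err": std_err,
        "t_stat": t_stat,
        "p_value": p_value,
        "z_crit": z_crit,
        "ci": ci,
        "reject": abs(t_stat) > z_crit,
    }


# Sample moments given in the problem.
y_bar, x_bar = 5.5, 1.5
result = slope_test(var_x=100.0, cov_xy=225.0, resid_var=36.0, n=64)
beta0_hat = y_bar - result["beta1_hat"] * x_bar

# Part (g): same inputs except the sample variance of X is 1 instead of 100.
result_g = slope_test(var_x=1.0, cov_xy=225.0, resid_var=36.0, n=64)

print(beta0_hat, result)
print(result_g)
```

The part (g) call takes "all other values stayed the same" literally, so the average squared residual is held at 36 even though the slope estimate changes.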