New: Frontend mock interviews added to Exponent Practice →

Top Data Scientist Statistics & Experimentation Interview Questions

Review this list of 7 statistics & experimentation data scientist interview questions and answers verified by hiring managers and candidates.
  • Microsoft logoAsked at Microsoft 
    S
    B

    "In the Transformer architecture, the decoder differs from the encoder primarily in its additional mechanisms designed to handle autoregressive sequence generation. Here's a breakdown of the key differences: Self-Attention Mechanism: Encoder: The encoder has a standard self-attention mechanism that allows each token to attend to all other tokens in the input sequence. Decoder: The decoder has two types of self-attention. The first is the same as in the encoder, but the second is mas"

    Ranj A. - "In the Transformer architecture, the decoder differs from the encoder primarily in its additional mechanisms designed to handle autoregressive sequence generation. Here's a breakdown of the key differences: Self-Attention Mechanism: Encoder: The encoder has a standard self-attention mechanism that allows each token to attend to all other tokens in the input sequence. Decoder: The decoder has two types of self-attention. The first is the same as in the encoder, but the second is mas"See full answer

    Data Scientist
    Statistics & Experimentation
  • F
    L
    U

    "Is there a reason a confidence interval was used to solve this problem over just using the mean/expected value directly?"

    Aarav G. - "Is there a reason a confidence interval was used to solve this problem over just using the mean/expected value directly?"See full answer

    Data Scientist
    Statistics & Experimentation
  • T

    "I would use A/B testing to see if the new feature would be incrementally beneficial. To begin the testing, we should define what's the goal of this testing. Let's say the new feature would increase the average number of trade by X. Then randomly assign the clients to two groups, control and test group. Control group doesn't see the new feature and the test group see the new feature. We could also stratified sampling if we want to make sure cover different customer segmentation. During this desig"

    Jiin S. - "I would use A/B testing to see if the new feature would be incrementally beneficial. To begin the testing, we should define what's the goal of this testing. Let's say the new feature would increase the average number of trade by X. Then randomly assign the clients to two groups, control and test group. Control group doesn't see the new feature and the test group see the new feature. We could also stratified sampling if we want to make sure cover different customer segmentation. During this desig"See full answer

    Data Scientist
    Statistics & Experimentation
  • McKinsey logoAsked at McKinsey 
    F

    "The cases where data is under heavy outlier influence. Since mean fluctuates due to the presence of an outlier, median might be a better measure"

    Himani E. - "The cases where data is under heavy outlier influence. Since mean fluctuates due to the presence of an outlier, median might be a better measure"See full answer

    Data Scientist
    Statistics & Experimentation
  • Meta (Facebook) logoAsked at Meta (Facebook) 
    Video answer for 'Explain Bayes' theorem.'
    C

    "Is it bad to get the answer a different way? Will they mark that as not knowing Bayes Theorem or just correct as it is an easier way to get the answer? The way I went is to look at what happens when the factory makes 100 light bulbs. Machine A makes 60 of which 3 are faulty, Machine B makes 40 of which 1.2 are faulty. Therefore the pool of faulty lightbulbs is 3/4.2 = 5/7 from machine A and 1.2/4.2 = 3/7 from Machine B."

    Will I. - "Is it bad to get the answer a different way? Will they mark that as not knowing Bayes Theorem or just correct as it is an easier way to get the answer? The way I went is to look at what happens when the factory makes 100 light bulbs. Machine A makes 60 of which 3 are faulty, Machine B makes 40 of which 1.2 are faulty. Therefore the pool of faulty lightbulbs is 3/4.2 = 5/7 from machine A and 1.2/4.2 = 3/7 from Machine B."See full answer

    Data Scientist
    Statistics & Experimentation
    +2 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • Amazon logoAsked at Amazon 
    Video answer for 'What are common linear regression problems?'
    C

    "I can try to summarize their discussion as I remembered. Linear regression is one of the method to predict target (Y) using features (X). Formula for linear regression is a linear function of features. The aim is to choose coefficients (Teta) of the prediction function in such a way that the difference between target and prediction is least in average. This difference between target and prediction is called loss function. The form of this loss function could be dependent from the particular real"

    Ilnur I. - "I can try to summarize their discussion as I remembered. Linear regression is one of the method to predict target (Y) using features (X). Formula for linear regression is a linear function of features. The aim is to choose coefficients (Teta) of the prediction function in such a way that the difference between target and prediction is least in average. This difference between target and prediction is called loss function. The form of this loss function could be dependent from the particular real"See full answer

    Data Scientist
    Statistics & Experimentation
    +3 more
  • Z

    "Probability that one of the coupons is used = 1 - Probability that no coupon is used = 1 - nC0 p^0 * (1-p)^n = 1 -(1-p)^n"

    Chetak C. - "Probability that one of the coupons is used = 1 - Probability that no coupon is used = 1 - nC0 p^0 * (1-p)^n = 1 -(1-p)^n"See full answer

    Data Scientist
    Statistics & Experimentation
Showing 1-7 of 7