Data Interview Question

Data Quality on Model Validity

Requirements Clarification & Assessment

  1. Model Type:

    • The model in question is a logistic regression model.
    • Its fitted coefficients depend on the scale and distribution of the input data.
  2. Key Variable:

    • The model relies heavily on a single variable.
    • Typical values of this variable include 50.00, 100.00, and 40.00.
  3. Data Quality Issue:

    • Some values have lost their decimal points due to an error, e.g., 100.00 recorded as 10000.
    • Each affected value is inflated by two orders of magnitude.
  4. Impact on Model:

    • Logistic regression is sensitive to outliers: extreme feature values act as high-leverage points that can distort the fitted coefficient.
    • The error creates such outliers by inflating the affected values by a factor of 100.
  5. Objective:

    • Determine if the model remains valid with the erroneous data.
    • Identify strategies to rectify the model if it is invalid.
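
The scenario above can be illustrated with a minimal Python sketch. It is not a prescribed solution. The coefficients `INTERCEPT` and `SLOPE` are hypothetical values chosen only to show the effect, and the helper names `flag_lost_decimal` and `repair` are illustrative. The detection rule assumes that all values are positive, that fewer than half of them are corrupted (so the median is still a clean value), and that legitimate values never exceed ten times the median.

```python
import math
import statistics


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients for a one-variable logistic model (assumption).
INTERCEPT, SLOPE = -3.0, 0.05


def predict(x: float) -> float:
    """Predicted probability for a single value of the key variable."""
    return sigmoid(INTERCEPT + SLOPE * x)


def flag_lost_decimal(values, factor=100.0):
    """Flag values that look inflated by `factor`.

    A value is flagged when it lies above the geometric midpoint between
    the median and factor * median, i.e. when it is closer on a log scale
    to an inflated value than to a typical one.
    """
    median = statistics.median(values)
    threshold = median * math.sqrt(factor)
    return [v > threshold for v in values]


def repair(values, factor=100.0):
    """Divide flagged values by `factor`; leave the others unchanged."""
    flags = flag_lost_decimal(values, factor)
    return [v / factor if flagged else v for v, flagged in zip(values, flags)]


clean = [50.00, 100.00, 40.00]
corrupted = [50.00, 10000.0, 40.00]  # 100.00 recorded as 10000

flags = flag_lost_decimal(corrupted)
repaired = repair(corrupted)

print("flags:   ", flags)
print("repaired:", repaired)
print("p(100.00) =", round(predict(100.00), 4))
print("p(10000)  =", round(predict(10000.0), 4))
```

With these hypothetical coefficients, the corrupted value pushes the prediction to saturation, which is the practical sense in which scoring on erroneous data is invalid. If the corrupted values were also present in the training data, the model would need to be refit on the repaired values rather than only rescored.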