@basselkarami
Last active March 16, 2022 10:25
Utility function for sanity checks: tests whether the model's output increases (or at least does not decrease) when the given input columns are incremented.
import numpy as np


def sanity_check_sum(model, dataframe, cols, delta=1):
    '''Calculates the success rate on a basic sanity check. A "delta" value is
    added to columns in a dataframe, and the newly predicted house price should
    be at least as high as the existing prediction, since the addition is
    supposed to be an added feature of the house, such as a bigger area, better
    condition or a better view.

    Args:
        model: sklearn or other model with a predict() method
        dataframe: pandas dataframe with the dataset to be tested
        cols: column or list of columns in dataframe to be incremented by delta
        delta (optional): Value added to each column before predicting the
            price on the updated dataframe. Defaults to 1.

    Returns:
        Fraction of observations (between 0 and 1, rounded to 4 decimals)
        for which the sanity check passes on every column
    '''
    if isinstance(cols, str):
        cols = [cols]
    # Predictions on the unmodified data, used as the baseline for every column
    baseline = model.predict(dataframe)
    test_results = []
    for col in cols:
        # Increment one column at a time, leaving the input dataframe untouched
        dataframe_post = dataframe.copy(deep=True)
        dataframe_post[col] = dataframe_post[col] + delta
        test_results.append(model.predict(dataframe_post) >= baseline)
    # Check if the test is passed on every column (AND logic)
    test_results = np.all(test_results, axis=0)
    return round(np.mean(test_results), 4)
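
A minimal usage sketch, assuming a toy stand-in for a fitted model. `ToyLinearModel`, the `area` and `age` columns and the coefficients are hypothetical and chosen only for illustration; the function is repeated in compact form so the example runs on its own.

```python
import numpy as np
import pandas as pd


def sanity_check_sum(model, dataframe, cols, delta=1):
    # Compact copy of the function above, so this example is self-contained.
    if isinstance(cols, str):
        cols = [cols]
    baseline = model.predict(dataframe)
    results = []
    for col in cols:
        bumped = dataframe.copy(deep=True)
        bumped[col] = bumped[col] + delta
        results.append(model.predict(bumped) >= baseline)
    return round(np.mean(np.all(results, axis=0)), 4)


class ToyLinearModel:
    """Hypothetical stand-in for a fitted model: price = 100*area - 5*age."""

    def predict(self, X):
        return (100 * X["area"] - 5 * X["age"]).to_numpy()


# Made-up data, for illustration only.
houses = pd.DataFrame({"area": [50, 80, 120], "age": [10, 30, 5]})
model = ToyLinearModel()

# Price rises with area, so every row passes the check on "area".
area_rate = sanity_check_sum(model, houses, "area")

# Price falls with age, so no row passes once "age" is included (AND logic).
both_rate = sanity_check_sum(model, houses, ["area", "age"])

print(area_rate, both_rate)
```

Here `area_rate` is 1.0 and `both_rate` is 0.0. Because the per-column results are combined with AND logic, a single column that lowers the prediction is enough to fail a row, so `cols` should contain only features that are expected to raise the price.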