@AaronTorgerson
Last active December 1, 2015 17:40
Clean out bad data from Django model records that won't load. (DANGER: DESTROYS DATA)
def fix_data(model_manager, fields_to_update_with_defaults_dict, commit=False):
    """Report records that fail to load; if commit is True, overwrite the given fields on them."""
    bad_ids = []
    all_ids = model_manager.values_list('id', flat=True)
    print("There are {} total IDs.".format(len(all_ids)))
    for b_id in all_ids:
        try:
            # Loading one record at a time isolates the rows that raise.
            model_manager.get(id=b_id)
        except Exception:
            bad_ids.append(b_id)
    print("There were {0} bad IDs: ".format(len(bad_ids)))
    print(", ".join(str(bad_id) for bad_id in bad_ids))
    if commit:
        # fix 'em: overwrite the listed fields with defaults (destroys the old values)
        model_manager \
            .filter(id__in=bad_ids) \
            .update(**fields_to_update_with_defaults_dict)
    else:
        print("Didn't fix, just reported.")
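A minimal sketch of how the dry-run and commit modes behave, written so it runs without Django or a database. `FakeManager` and `FakeQuerySet` are hypothetical in-memory stand-ins that mimic only the three calls the function makes (`values_list`, `get`, `filter(...).update(...)`). The `find_bad_ids` helper, the returned list of bad IDs, and the `secret` field are assumptions added for illustration and are not part of the gist.

```python
class FakeQuerySet(object):
    """Hypothetical stand-in for the queryset returned by filter()."""

    def __init__(self, manager, ids):
        self.manager = manager
        self.ids = list(ids)

    def update(self, **fields):
        # Overwrite the given fields and treat the row as loadable afterwards.
        for pk in self.ids:
            self.manager.rows[pk].update(fields)
            self.manager.unreadable.discard(pk)
        return len(self.ids)


class FakeManager(object):
    """Hypothetical stand-in for a Django model manager."""

    def __init__(self, rows, unreadable_ids):
        self.rows = rows                        # {id: {field: value}}
        self.unreadable = set(unreadable_ids)   # ids whose load raises

    def values_list(self, field, flat=True):
        return sorted(self.rows)

    def get(self, id):
        if id in self.unreadable:
            raise ValueError("cannot decrypt row {}".format(id))
        return self.rows[id]

    def filter(self, id__in):
        return FakeQuerySet(self, id__in)


def find_bad_ids(manager):
    """Return the ids of records that raise when loaded one at a time."""
    bad_ids = []
    for pk in manager.values_list('id', flat=True):
        try:
            manager.get(id=pk)
        except Exception:
            bad_ids.append(pk)
    return bad_ids


def fix_data(manager, defaults, commit=False):
    """Variant of the gist's function that returns the bad ids instead of printing."""
    bad_ids = find_bad_ids(manager)
    if commit:
        manager.filter(id__in=bad_ids).update(**defaults)
    return bad_ids


manager = FakeManager(
    rows={1: {"secret": "ok"}, 2: {"secret": "garbled"}, 3: {"secret": "ok"}},
    unreadable_ids=[2],
)

dry_run = fix_data(manager, {"secret": ""})              # report only
value_after_dry_run = manager.rows[2]["secret"]          # still the old value
committed = fix_data(manager, {"secret": ""}, commit=True)
remaining = fix_data(manager, {"secret": ""})            # nothing left to fix
```

Running the dry run first and checking the reported IDs before passing `commit=True` keeps the destructive step deliberate, since the update replaces the old field values on every bad record.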
@AaronTorgerson

I wrote this to clean out bad data on a pre-production environment with data that had been encrypted a long time ago with a key that has long since been replaced. Since we're testing a big data migration, I wanted to keep as much of the data intact as possible.
