@lukeplausin
Created January 26, 2024 12:55

Revisions

  1. lukeplausin created this gist Jan 26, 2024.
    42 changes: 42 additions & 0 deletions undelete_gcs_objects.py
    @@ -0,0 +1,42 @@
    # This gist describes how to undelete files in GCS while retaining most of the metadata.
    # GCS doesn't seem to use delete markers like AWS. If anyone knows a better way please say so in the comments.

    # I couldn't find any examples of how to do this other than in the UI.
    # GCS documentation doesn't seem to include instructions for restoring versioned objects in Python, but I found
    # that this method works. Confusingly, the documentation uses terms like version, generation, and revision
    # interchangeably. I couldn't understand how object deletion works on versioned buckets.

    import tempfile
    from google.cloud import storage

    # Set your bucket settings here

    bucket_name = 'my_bucket'
    directory_name = 'this/is/my/path/'
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)

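    # List all noncurrent object versions in the directory of interest.
    # Note: time_deleted is set on every noncurrent version, so this also picks up
    # versions that were overwritten rather than deleted.
    deleted_blobs = [
        blob for blob in bucket.list_blobs(prefix=directory_name, versions=True)
        if blob.time_deleted
    ]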

    for blob in deleted_blobs:
        # Download the old generation into an anonymous temporary file. The object
        # name is not used in the file name because it may contain '/'.
        with tempfile.TemporaryFile() as tf:
            blob.download_to_file(tf)

            # Create a new blob generation
            new_blob = bucket.blob(blob_name=blob.name)
            # Copy old properties over (_properties is a private attribute of the
            # client library and may change between releases)
            new_blob._properties = blob._properties.copy()

            # Clear the ones that mark file generation/deletion/revision/version of the old object
            new_blob._properties.pop('timeDeleted', None)
            new_blob._properties.pop('generation', None)
            new_blob._properties.pop('metageneration', None)

            # Upload the old version as another new version (rewind=True seeks to the start first)
            new_blob.upload_from_file(tf, rewind=True)
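
A possible alternative, offered as a hedged sketch rather than a tested replacement for the script above: copy the old generation over its own name on the server side, so nothing is downloaded. It assumes the installed `google-cloud-storage` release provides `Bucket.copy_blob` with a `source_generation` argument. The helper names `latest_deleted_generations` and `restore_generation` are hypothetical and not part of the library. The first helper restores only the newest generation of each object, and skips objects that still have a live version.

```python
from collections import defaultdict


def latest_deleted_generations(blobs):
    """Return one noncurrent blob per object name that has no live version.

    `blobs` is any iterable of objects exposing `name`, `generation` and
    `time_deleted`, for example the result of
    bucket.list_blobs(prefix=directory_name, versions=True).
    """
    by_name = defaultdict(list)
    for blob in blobs:
        by_name[blob.name].append(blob)

    to_restore = []
    for versions in by_name.values():
        # A version without time_deleted is live, so the object still exists.
        if any(v.time_deleted is None for v in versions):
            continue
        # Otherwise keep only the most recent generation of that name.
        to_restore.append(max(versions, key=lambda v: int(v.generation)))
    return to_restore


def restore_generation(bucket, blob):
    """Copy one specific generation back over its own name (server side).

    Assumption: bucket.copy_blob accepts source_generation, as in recent
    google-cloud-storage releases.
    """
    return bucket.copy_blob(
        blob, bucket, new_name=blob.name, source_generation=blob.generation
    )
```

With the `bucket` and `directory_name` from the script above, usage would look like `for blob in latest_deleted_generations(bucket.list_blobs(prefix=directory_name, versions=True)): restore_generation(bucket, blob)`. Whether every metadata field survives the copy is worth checking on a test bucket first.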