Describe table output:

```python
>>> spark.sql('describe table extended test.sample').show(truncate=False, n=100)
+----------------------------+--------------------------------------------------------------+----------------------------+
|col_name                    |data_type                                                     |comment                     |
+----------------------------+--------------------------------------------------------------+----------------------------+
|some_id                     |bigint                                                        |some identifier column      |
|created_at                  |timestamp                                                     |when this record was created|
|status                      |string                                                        |status of this record       |
|stats                       |struct                                                        |stats for this record       |
|ds                          |string                                                        |date partition              |
|# Partition Information     |                                                              |                            |
|# col_name                  |data_type                                                     |comment                     |
|ds                          |string                                                        |date partition              |
|                            |                                                              |                            |
|# Detailed Table Information|                                                              |                            |
|Database                    |test                                                          |                            |
|Table                       |sample                                                        |                            |
|Owner                       |kandelin                                                      |                            |
|Created Time                |Tue Jul 26 09:15:46 EDT 2022                                  |                            |
|Last Access                 |UNKNOWN                                                       |                            |
|Created By                  |Spark 3.2.1                                                   |                            |
|Type                        |EXTERNAL                                                      |                            |
|Provider                    |parquet                                                       |                            |
|Location                    |file:/tmp/warehouse/test/sample                               |                            |
|Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe   |                            |
|InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat |                            |
|OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat|                            |
|Partition Provider          |Catalog                                                       |                            |
+----------------------------+--------------------------------------------------------------+----------------------------+
```

Define a helper that extracts the metadata rows from the above output into a dict. The metadata rows are the only ones whose `col_name` begins with an uppercase letter, so a simple `rlike('^[A-Z]')` filter isolates them from the schema columns and the `# ...` section headers:

```python
from typing import Dict

from pyspark.sql import SparkSession
from pyspark.sql.functions import col


def get_describe_table_metadata(spark: SparkSession, table: str) -> Dict[str, str]:
    """Extract table metadata from the `describe table extended` command, returned as a dict."""
    metadata_df = spark.sql(f'describe table extended {table}').where(col('col_name').rlike('^[A-Z]'))
    metadata_dict = {row.col_name.lower(): row.data_type for row in metadata_df.collect()}
    return metadata_dict
```

Which outputs the following:

```python
>>> get_describe_table_metadata(spark, 'test.sample')
{
    "database": "test",
    "table": "sample",
    "owner": "kandelin",
    "created time": "Tue Jul 26 09:15:46 EDT 2022",
    "last access": "UNKNOWN",
    "created by": "Spark 3.2.1",
    "type": "EXTERNAL",
    "provider": "parquet",
    "location": "file:/tmp/warehouse/test/sample",
    "serde library": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
    "inputformat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    "outputformat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    "partition provider": "Catalog"
}
```
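The filtering step itself is plain string matching, so it can be sketched without a live SparkSession. A minimal plain-Python sketch of the same logic, using a handful of hypothetical `(col_name, data_type)` rows to stand in for the rows a real `describe table extended` would return:

```python
import re

# Hypothetical rows mimicking `describe table extended` output:
# schema columns, a '# ...' section header, and metadata rows.
rows = [
    ("some_id", "bigint"),
    ("ds", "string"),
    ("# Partition Information", ""),
    ("Database", "test"),
    ("Table", "sample"),
    ("Type", "EXTERNAL"),
    ("Provider", "parquet"),
]

# Same predicate as rlike('^[A-Z]'): keep only rows whose col_name
# starts with an uppercase letter, i.e. the metadata rows.
metadata = {
    name.lower(): value
    for name, value in rows
    if re.match(r"^[A-Z]", name)
}

print(metadata)
```

Note that the `^[A-Z]` anchor relies on the table's columns being named in lowercase; a real column whose name starts with a capital letter would slip into the metadata dict.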