I’m using Google BigQuery to do some ETL, and have found OpenAI’s products to be enormously helpful for the task.
A new client recently asked for some assistance working with Sunshine Act data. Since I expect additional asks about this data set to come in over time, rather than fuss with the generic UI, I decided to load the entire dataset into BigQuery instead. ChatGPT’s o3-mini-high
model has generated schemas and ETL queries extremely well, accelerating my work by at least a factor of 2x.
For example, given this prompt, where ChatGPT had already generated the BigQuery schema for me:
Here is a BigQuery schema in JSON format. Please write a BigQuery SQL query that will group together any fields that seem related, either in a RECORD field, REPEATED field, or a REPEATED RECORD field. For example, columns alpha_1 and alpha_2 should probably be grouped together in a REPEATED field, and columns alpha_1_bravo alpha_1_charlie alpha_2_bravo alpha_2_charlie should probably be grouped together in a REPEATED RECORD field. I will use the query to create a new table, in the style of CREATE TABLE xyz AS SELECT ...; Please ensure that all column names in the created table are lowercase. Please exclude NULL/empty values and empty STRUCT/RECORD values from REPEATED columns.
[
{ "name": "Change_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Noncovered_Recipient_Entity_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Teaching_Hospital_CCN", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Teaching_Hospital_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Teaching_Hospital_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Profile_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_NPI", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_First_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Middle_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Last_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Name_Suffix", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_Primary_Business_Street_Address_Line1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_Primary_Business_Street_Address_Line2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_City", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_Zip_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_Province", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Recipient_Postal_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Primary_Type_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Primary_Type_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Primary_Type_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Primary_Type_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Primary_Type_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Primary_Type_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Specialty_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Specialty_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Specialty_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Specialty_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Specialty_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_Specialty_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_License_State_code1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_License_State_code2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_License_State_code3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_License_State_code4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_Recipient_License_State_code5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Covered_Recipient_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Profile_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_NPI", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_First_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Middle_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Last_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Name_Suffix", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Business_Street_Address_Line1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Business_Street_Address_Line2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_City", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Zip_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Province", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Postal_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Primary_Type_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Primary_Type_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Primary_Type_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Primary_Type_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Primary_Type_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Primary_Type_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Specialty_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Specialty_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Specialty_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Specialty_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Specialty_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_Specialty_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_License_State_code1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_License_State_code2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_License_State_code3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_License_State_code4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_1_License_State_code5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Covered_Recipient_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Profile_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_NPI", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_First_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Middle_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Last_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Name_Suffix", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Business_Street_Address_Line1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Business_Street_Address_Line2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_City", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Zip_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Province", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Postal_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Primary_Type_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Primary_Type_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Primary_Type_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Primary_Type_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Primary_Type_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Primary_Type_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Specialty_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Specialty_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Specialty_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Specialty_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Specialty_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_Specialty_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_License_State_code1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_License_State_code2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_License_State_code3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_License_State_code4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_2_License_State_code5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Covered_Recipient_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Profile_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_NPI", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_First_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Middle_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Last_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Name_Suffix", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Business_Street_Address_Line1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Business_Street_Address_Line2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_City", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Zip_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Province", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Postal_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Primary_Type_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Primary_Type_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Primary_Type_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Primary_Type_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Primary_Type_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Primary_Type_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Specialty_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Specialty_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Specialty_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Specialty_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Specialty_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_Specialty_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_License_State_code1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_License_State_code2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_License_State_code3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_License_State_code4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_3_License_State_code5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Covered_Recipient_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Profile_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_NPI", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_First_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Middle_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Last_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Name_Suffix", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Business_Street_Address_Line1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Business_Street_Address_Line2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_City", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Zip_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Province", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Postal_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Primary_Type_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Primary_Type_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Primary_Type_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Primary_Type_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Primary_Type_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Primary_Type_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Specialty_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Specialty_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Specialty_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Specialty_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Specialty_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_Specialty_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_License_State_code1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_License_State_code2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_License_State_code3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_License_State_code4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_4_License_State_code5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Covered_Recipient_Type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Profile_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_NPI", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_First_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Middle_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Last_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Name_Suffix", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Business_Street_Address_Line1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Business_Street_Address_Line2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_City", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Zip_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Province", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Postal_Code", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Primary_Type_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Primary_Type_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Primary_Type_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Primary_Type_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Primary_Type_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Primary_Type_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Specialty_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Specialty_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Specialty_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Specialty_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Specialty_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_Specialty_6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_License_State_code1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_License_State_code2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_License_State_code3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_License_State_code4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Principal_Investigator_5_License_State_code5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Submitting_Applicable_Manufacturer_or_Applicable_GPO_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_ID", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_State", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Country", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Related_Product_Indicator", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_or_Noncovered_Indicator_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Product_Category_or_Therapeutic_Area_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Drug_or_Biological_NDC_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Device_or_Medical_Supply_PDI_1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_or_Noncovered_Indicator_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Product_Category_or_Therapeutic_Area_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Drug_or_Biological_NDC_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Device_or_Medical_Supply_PDI_2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_or_Noncovered_Indicator_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Product_Category_or_Therapeutic_Area_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Drug_or_Biological_NDC_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Device_or_Medical_Supply_PDI_3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_or_Noncovered_Indicator_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Product_Category_or_Therapeutic_Area_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Drug_or_Biological_NDC_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Device_or_Medical_Supply_PDI_4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Covered_or_Noncovered_Indicator_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Product_Category_or_Therapeutic_Area_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Drug_or_Biological_NDC_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Associated_Device_or_Medical_Supply_PDI_5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Total_Amount_of_Payment_USDollars", "type": "NUMERIC", "mode": "NULLABLE" },
{ "name": "Date_of_Payment", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Form_of_Payment_or_Transfer_of_Value", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Expenditure_Category1", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Expenditure_Category2", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Expenditure_Category3", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Expenditure_Category4", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Expenditure_Category5", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Expenditure_Category6", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Preclinical_Research_Indicator", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Delay_in_Publication_Indicator", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Name_of_Study", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Dispute_Status_for_Publication", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Record_ID", "type": "INTEGER", "mode": "NULLABLE" },
{ "name": "Program_Year", "type": "INTEGER", "mode": "NULLABLE" },
{ "name": "Payment_Publication_Date", "type": "STRING", "mode": "NULLABLE" },
{ "name": "ClinicalTrials_Gov_Identifier", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Research_Information_Link", "type": "STRING", "mode": "NULLABLE" },
{ "name": "Context_of_Research", "type": "STRING", "mode": "NULLABLE" }
]
ChatGPT returned the following query:
CREATE OR REPLACE TABLE `${DATASET}.general_${PGYR}` AS
SELECT
Change_Type AS change_type,
Covered_Recipient_Type AS covered_recipient_type,
Noncovered_Recipient_Entity_Name AS noncovered_recipient_entity_name,
-- Group teaching hospital info into a RECORD
STRUCT(
Teaching_Hospital_CCN AS ccn,
Teaching_Hospital_ID AS id,
Teaching_Hospital_Name AS name
) AS teaching_hospital,
-- Group covered recipient personal info into a RECORD
STRUCT(
Covered_Recipient_Profile_ID AS profile_id,
Covered_Recipient_NPI AS npi,
Covered_Recipient_First_Name AS first_name,
Covered_Recipient_Middle_Name AS middle_name,
Covered_Recipient_Last_Name AS last_name,
Covered_Recipient_Name_Suffix AS name_suffix
) AS covered_recipient,
-- Group address info into a RECORD
STRUCT(
Recipient_Primary_Business_Street_Address_Line1 AS address_line1,
Recipient_Primary_Business_Street_Address_Line2 AS address_line2,
Recipient_City AS city,
Recipient_State AS state,
Recipient_Zip_Code AS zip_code,
Recipient_Country AS country,
Recipient_Province AS province,
Recipient_Postal_Code AS postal_code
) AS recipient_address,
-- Group primary type columns into an ARRAY of STRINGs (excluding null/empty)
ARRAY(
SELECT type
FROM UNNEST([
Covered_Recipient_Primary_Type_1,
Covered_Recipient_Primary_Type_2,
Covered_Recipient_Primary_Type_3,
Covered_Recipient_Primary_Type_4,
Covered_Recipient_Primary_Type_5,
Covered_Recipient_Primary_Type_6
]) AS type
WHERE type IS NOT NULL AND type <> ''
) AS covered_recipient_primary_types,
-- Group specialty columns into an ARRAY of STRINGs (excluding null/empty)
ARRAY(
SELECT spec
FROM UNNEST([
Covered_Recipient_Specialty_1,
Covered_Recipient_Specialty_2,
Covered_Recipient_Specialty_3,
Covered_Recipient_Specialty_4,
Covered_Recipient_Specialty_5,
Covered_Recipient_Specialty_6
]) AS spec
WHERE spec IS NOT NULL AND spec <> ''
) AS covered_recipient_specialties,
-- Group license state codes into an ARRAY of STRINGs (excluding null/empty)
ARRAY(
SELECT lic
FROM UNNEST([
Covered_Recipient_License_State_code1,
Covered_Recipient_License_State_code2,
Covered_Recipient_License_State_code3,
Covered_Recipient_License_State_code4,
Covered_Recipient_License_State_code5
]) AS lic
WHERE lic IS NOT NULL AND lic <> ''
) AS covered_recipient_license_state_codes,
-- Group principal investigator details into a REPEATED RECORD
ARRAY(
SELECT AS STRUCT
pi.covered_recipient_type,
pi.profile_id,
pi.npi,
pi.first_name,
pi.middle_name,
pi.last_name,
pi.name_suffix,
pi.business_street_address_line1,
pi.business_street_address_line2,
pi.city,
pi.state,
pi.zip_code,
pi.country,
pi.province,
pi.postal_code,
ARRAY(
SELECT pt
FROM UNNEST([pi.primary_type_1, pi.primary_type_2, pi.primary_type_3, pi.primary_type_4, pi.primary_type_5, pi.primary_type_6]) pt
WHERE pt IS NOT NULL AND pt <> ''
) AS primary_types,
ARRAY(
SELECT spec
FROM UNNEST([pi.specialty_1, pi.specialty_2, pi.specialty_3, pi.specialty_4, pi.specialty_5, pi.specialty_6]) spec
WHERE spec IS NOT NULL AND spec <> ''
) AS specialties,
ARRAY(
SELECT lic
FROM UNNEST([pi.license_state_code1, pi.license_state_code2, pi.license_state_code3, pi.license_state_code4, pi.license_state_code5]) lic
WHERE lic IS NOT NULL AND lic <> ''
) AS license_state_codes
FROM UNNEST([
STRUCT(
Principal_Investigator_1_Covered_Recipient_Type AS covered_recipient_type,
Principal_Investigator_1_Profile_ID AS profile_id,
Principal_Investigator_1_NPI AS npi,
Principal_Investigator_1_First_Name AS first_name,
Principal_Investigator_1_Middle_Name AS middle_name,
Principal_Investigator_1_Last_Name AS last_name,
Principal_Investigator_1_Name_Suffix AS name_suffix,
Principal_Investigator_1_Business_Street_Address_Line1 AS business_street_address_line1,
Principal_Investigator_1_Business_Street_Address_Line2 AS business_street_address_line2,
Principal_Investigator_1_City AS city,
Principal_Investigator_1_State AS state,
Principal_Investigator_1_Zip_Code AS zip_code,
Principal_Investigator_1_Country AS country,
Principal_Investigator_1_Province AS province,
Principal_Investigator_1_Postal_Code AS postal_code,
Principal_Investigator_1_Primary_Type_1 AS primary_type_1,
Principal_Investigator_1_Primary_Type_2 AS primary_type_2,
Principal_Investigator_1_Primary_Type_3 AS primary_type_3,
Principal_Investigator_1_Primary_Type_4 AS primary_type_4,
Principal_Investigator_1_Primary_Type_5 AS primary_type_5,
Principal_Investigator_1_Primary_Type_6 AS primary_type_6,
Principal_Investigator_1_Specialty_1 AS specialty_1,
Principal_Investigator_1_Specialty_2 AS specialty_2,
Principal_Investigator_1_Specialty_3 AS specialty_3,
Principal_Investigator_1_Specialty_4 AS specialty_4,
Principal_Investigator_1_Specialty_5 AS specialty_5,
Principal_Investigator_1_Specialty_6 AS specialty_6,
Principal_Investigator_1_License_State_code1 AS license_state_code1,
Principal_Investigator_1_License_State_code2 AS license_state_code2,
Principal_Investigator_1_License_State_code3 AS license_state_code3,
Principal_Investigator_1_License_State_code4 AS license_state_code4,
Principal_Investigator_1_License_State_code5 AS license_state_code5
),
STRUCT(
Principal_Investigator_2_Covered_Recipient_Type AS covered_recipient_type,
Principal_Investigator_2_Profile_ID AS profile_id,
Principal_Investigator_2_NPI AS npi,
Principal_Investigator_2_First_Name AS first_name,
Principal_Investigator_2_Middle_Name AS middle_name,
Principal_Investigator_2_Last_Name AS last_name,
Principal_Investigator_2_Name_Suffix AS name_suffix,
Principal_Investigator_2_Business_Street_Address_Line1 AS business_street_address_line1,
Principal_Investigator_2_Business_Street_Address_Line2 AS business_street_address_line2,
Principal_Investigator_2_City AS city,
Principal_Investigator_2_State AS state,
Principal_Investigator_2_Zip_Code AS zip_code,
Principal_Investigator_2_Country AS country,
Principal_Investigator_2_Province AS province,
Principal_Investigator_2_Postal_Code AS postal_code,
Principal_Investigator_2_Primary_Type_1 AS primary_type_1,
Principal_Investigator_2_Primary_Type_2 AS primary_type_2,
Principal_Investigator_2_Primary_Type_3 AS primary_type_3,
Principal_Investigator_2_Primary_Type_4 AS primary_type_4,
Principal_Investigator_2_Primary_Type_5 AS primary_type_5,
Principal_Investigator_2_Primary_Type_6 AS primary_type_6,
Principal_Investigator_2_Specialty_1 AS specialty_1,
Principal_Investigator_2_Specialty_2 AS specialty_2,
Principal_Investigator_2_Specialty_3 AS specialty_3,
Principal_Investigator_2_Specialty_4 AS specialty_4,
Principal_Investigator_2_Specialty_5 AS specialty_5,
Principal_Investigator_2_Specialty_6 AS specialty_6,
Principal_Investigator_2_License_State_code1 AS license_state_code1,
Principal_Investigator_2_License_State_code2 AS license_state_code2,
Principal_Investigator_2_License_State_code3 AS license_state_code3,
Principal_Investigator_2_License_State_code4 AS license_state_code4,
Principal_Investigator_2_License_State_code5 AS license_state_code5
),
STRUCT(
Principal_Investigator_3_Covered_Recipient_Type AS covered_recipient_type,
Principal_Investigator_3_Profile_ID AS profile_id,
Principal_Investigator_3_NPI AS npi,
Principal_Investigator_3_First_Name AS first_name,
Principal_Investigator_3_Middle_Name AS middle_name,
Principal_Investigator_3_Last_Name AS last_name,
Principal_Investigator_3_Name_Suffix AS name_suffix,
Principal_Investigator_3_Business_Street_Address_Line1 AS business_street_address_line1,
Principal_Investigator_3_Business_Street_Address_Line2 AS business_street_address_line2,
Principal_Investigator_3_City AS city,
Principal_Investigator_3_State AS state,
Principal_Investigator_3_Zip_Code AS zip_code,
Principal_Investigator_3_Country AS country,
Principal_Investigator_3_Province AS province,
Principal_Investigator_3_Postal_Code AS postal_code,
Principal_Investigator_3_Primary_Type_1 AS primary_type_1,
Principal_Investigator_3_Primary_Type_2 AS primary_type_2,
Principal_Investigator_3_Primary_Type_3 AS primary_type_3,
Principal_Investigator_3_Primary_Type_4 AS primary_type_4,
Principal_Investigator_3_Primary_Type_5 AS primary_type_5,
Principal_Investigator_3_Primary_Type_6 AS primary_type_6,
Principal_Investigator_3_Specialty_1 AS specialty_1,
Principal_Investigator_3_Specialty_2 AS specialty_2,
Principal_Investigator_3_Specialty_3 AS specialty_3,
Principal_Investigator_3_Specialty_4 AS specialty_4,
Principal_Investigator_3_Specialty_5 AS specialty_5,
Principal_Investigator_3_Specialty_6 AS specialty_6,
Principal_Investigator_3_License_State_code1 AS license_state_code1,
Principal_Investigator_3_License_State_code2 AS license_state_code2,
Principal_Investigator_3_License_State_code3 AS license_state_code3,
Principal_Investigator_3_License_State_code4 AS license_state_code4,
Principal_Investigator_3_License_State_code5 AS license_state_code5
),
STRUCT(
Principal_Investigator_4_Covered_Recipient_Type AS covered_recipient_type,
Principal_Investigator_4_Profile_ID AS profile_id,
Principal_Investigator_4_NPI AS npi,
Principal_Investigator_4_First_Name AS first_name,
Principal_Investigator_4_Middle_Name AS middle_name,
Principal_Investigator_4_Last_Name AS last_name,
Principal_Investigator_4_Name_Suffix AS name_suffix,
Principal_Investigator_4_Business_Street_Address_Line1 AS business_street_address_line1,
Principal_Investigator_4_Business_Street_Address_Line2 AS business_street_address_line2,
Principal_Investigator_4_City AS city,
Principal_Investigator_4_State AS state,
Principal_Investigator_4_Zip_Code AS zip_code,
Principal_Investigator_4_Country AS country,
Principal_Investigator_4_Province AS province,
Principal_Investigator_4_Postal_Code AS postal_code,
Principal_Investigator_4_Primary_Type_1 AS primary_type_1,
Principal_Investigator_4_Primary_Type_2 AS primary_type_2,
Principal_Investigator_4_Primary_Type_3 AS primary_type_3,
Principal_Investigator_4_Primary_Type_4 AS primary_type_4,
Principal_Investigator_4_Primary_Type_5 AS primary_type_5,
Principal_Investigator_4_Primary_Type_6 AS primary_type_6,
Principal_Investigator_4_Specialty_1 AS specialty_1,
Principal_Investigator_4_Specialty_2 AS specialty_2,
Principal_Investigator_4_Specialty_3 AS specialty_3,
Principal_Investigator_4_Specialty_4 AS specialty_4,
Principal_Investigator_4_Specialty_5 AS specialty_5,
Principal_Investigator_4_Specialty_6 AS specialty_6,
Principal_Investigator_4_License_State_code1 AS license_state_code1,
Principal_Investigator_4_License_State_code2 AS license_state_code2,
Principal_Investigator_4_License_State_code3 AS license_state_code3,
Principal_Investigator_4_License_State_code4 AS license_state_code4,
Principal_Investigator_4_License_State_code5 AS license_state_code5
),
STRUCT(
Principal_Investigator_5_Covered_Recipient_Type AS covered_recipient_type,
Principal_Investigator_5_Profile_ID AS profile_id,
Principal_Investigator_5_NPI AS npi,
Principal_Investigator_5_First_Name AS first_name,
Principal_Investigator_5_Middle_Name AS middle_name,
Principal_Investigator_5_Last_Name AS last_name,
Principal_Investigator_5_Name_Suffix AS name_suffix,
Principal_Investigator_5_Business_Street_Address_Line1 AS business_street_address_line1,
Principal_Investigator_5_Business_Street_Address_Line2 AS business_street_address_line2,
Principal_Investigator_5_City AS city,
Principal_Investigator_5_State AS state,
Principal_Investigator_5_Zip_Code AS zip_code,
Principal_Investigator_5_Country AS country,
Principal_Investigator_5_Province AS province,
Principal_Investigator_5_Postal_Code AS postal_code,
Principal_Investigator_5_Primary_Type_1 AS primary_type_1,
Principal_Investigator_5_Primary_Type_2 AS primary_type_2,
Principal_Investigator_5_Primary_Type_3 AS primary_type_3,
Principal_Investigator_5_Primary_Type_4 AS primary_type_4,
Principal_Investigator_5_Primary_Type_5 AS primary_type_5,
Principal_Investigator_5_Primary_Type_6 AS primary_type_6,
Principal_Investigator_5_Specialty_1 AS specialty_1,
Principal_Investigator_5_Specialty_2 AS specialty_2,
Principal_Investigator_5_Specialty_3 AS specialty_3,
Principal_Investigator_5_Specialty_4 AS specialty_4,
Principal_Investigator_5_Specialty_5 AS specialty_5,
Principal_Investigator_5_Specialty_6 AS specialty_6,
Principal_Investigator_5_License_State_code1 AS license_state_code1,
Principal_Investigator_5_License_State_code2 AS license_state_code2,
Principal_Investigator_5_License_State_code3 AS license_state_code3,
Principal_Investigator_5_License_State_code4 AS license_state_code4,
Principal_Investigator_5_License_State_code5 AS license_state_code5
)
]) AS pi
WHERE pi.profile_id IS NOT NULL AND pi.profile_id <> ''
) AS principal_investigators,
-- Group manufacturer/GPO details into a RECORD
STRUCT(
Submitting_Applicable_Manufacturer_or_Applicable_GPO_Name AS submitting_applicable_manufacturer_or_applicable_gpo_name,
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_ID AS id,
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name AS name,
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_State AS state,
Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Country AS country
) AS manufacturer_gpo,
-- Group payment details into a RECORD
STRUCT(
Total_Amount_of_Payment_USDollars AS amount,
Date_of_Payment AS date,
Form_of_Payment_or_Transfer_of_Value AS form
) AS payment,
-- Group expenditure categories into an ARRAY (non-null/empty)
ARRAY(
SELECT cat
FROM UNNEST([
Expenditure_Category1,
Expenditure_Category2,
Expenditure_Category3,
Expenditure_Category4,
Expenditure_Category5,
Expenditure_Category6
]) AS cat
WHERE cat IS NOT NULL AND cat <> ''
) AS expenditure_categories,
Preclinical_Research_Indicator AS preclinical_research_indicator,
Delay_in_Publication_Indicator AS delay_in_publication_indicator,
Name_of_Study AS name_of_study,
Dispute_Status_for_Publication AS dispute_status_for_publication,
Record_ID AS record_id,
Program_Year AS program_year,
Payment_Publication_Date AS payment_publication_date,
ClinicalTrials_Gov_Identifier AS clinicaltrials_gov_identifier,
Research_Information_Link AS research_information_link,
Context_of_Research AS context_of_research
FROM `${DATASET}.OP_DTL_GNRL_PGYR${PGYR}_P01302025_01212025`;
This alone would have taken at least an hour of work. Instead, I was done for the low, low price of 1:57sec of reasoning.
Anyone who thinks LLMs aren’t changing the way we will all do work isn’t paying attention. It’s not coming for our jobs, but it is coming to help.