I have been trying to manually upload the JSON into BigQuery, but I get the following error message.
Error while reading data, error message: JSON parsing error in row starting at position 0: Repeated field must be imported as a JSON array. Field: custom_fields.value.
I already converted the file into newline-delimited JSON, so that is not the problem. When looking at the custom_fields.value from the error I get this:
$ cat convert2working.json | jq '.custom_fields[].value'
0
"Basics of information security\n"
"2021"
The problem seems to be that custom_fields.value has different data types.
How can I “homogenize” those data types? Or do you have another solution? I would prefer to stay in JavaScript.
Here is a shortened version of my JSON code:
{ "id": "example", "custom_fields": [ { "id": "example", "name": "Interval", "type": "drop_down", "type_config": { "default": 0, "placeholder": null, "options": [ { "id": "example", "name": "yearly", "color": null, "orderindex": 0 } ] }, "date_created": "1611228211303", "hide_from_guests": false, "value": 0, "required": false }, { "id": "example", "name": "Description", "type": "text", "type_config": {}, "date_created": "1611228263444", "hide_from_guests": false, "value": "Basics of information securityn", "required": false }, { "id": "example", "name": "Year", "type": "number", "type_config": {}, "date_created": "1611228275285", "hide_from_guests": false, "value": "2021", "required": false } ] }
Answer
You need to normalize your data structure so that BigQuery is able to auto-detect a consistent schema. Because the value property is once a number and once a string, this auto-detection fails.
There are multiple ways to normalize your data. I’m not 100% sure which way will work best for BigQuery, which claims to analyze up to the first 100 rows for schema auto-detection.
The first attempt is to put the different types of values into different fields:
const fields = data.custom_fields.map(x => {
  const f = { id: x.id, name: x.name };
  f['value_' + x.type] = x.value; // e.g. value_drop_down, value_text, value_number
  return f;
});
This will yield:
[{ id: "example", name: "Interval", value_drop_down: 0 }, { id: "example", name: "Description", value_text: "Basics of information security↵" }, { id: "example", name: "Year", value_number: "2021" }]
I’m not sure if this is a structure where BigQuery can reliably merge the inferred schema for these fields, because it might e.g. only encounter value_number in the first 100 rows and would therefore not handle value_drop_down.
A more reliable approach (assuming you know all the different values of type) is to transform the records explicitly into the same structure. This also has the advantage of being able to run any specialized transformations on field values (such as conversions, lookups etc.):
const fields2 = data.custom_fields.map(x => ({
  id: x.id,
  name: x.name,
  value_text: x.type === 'text' ? x.value : null,
  value_number: x.type === 'number' ? parseInt(x.value, 10) : null,
  value_dropdown: x.type === 'drop_down'
    ? x.type_config.options.find(o => o.orderindex === x.value).name
    : null
}));
You might have to make some of the transform logic a bit more robust depending on your data (e.g. if values are optional or can be empty). Using your example data this transformation yields:
[{ "id": "example", "name": "Interval", "value_text": null, "value_number": null, "value_dropdown": "yearly", "value_drop_down": 0 }, { "id": "example", "name": "Description", "value_text": "Basics of information securityn", "value_number": null, "value_dropdown": null }, { "id": "example", "name": "Year", "value_text": null, "value_number": "2021", "value_dropdown": null }]
I’ve created a JSFiddle where you can play around with this code.
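Once the records are normalized, you still need to write them out as newline-delimited JSON (one object per line, no wrapping array) before loading them into BigQuery. A minimal Node.js sketch, assuming the transformed records are in fields2 and "custom_fields.ndjson" is just an example filename:

const fs = require('fs');

// One JSON object per line, no surrounding array – the format BigQuery expects.
const ndjson = fields2.map(f => JSON.stringify(f)).join('\n');
fs.writeFileSync('custom_fields.ndjson', ndjson + '\n');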