Problem with repeated objects while importing JSON into Google BigQuery

I have been trying to manually upload the JSON into BigQuery, but I get the following error message.

JavaScript

I already converted the file into newline-delimited JSON, so that is not the problem. When I look at the custom_fields.value named in the error, I get this:

JavaScript

The problem seems to be that custom_fields.value has a different data type from one entry to the next.

How can I “homogenize” those data types, or is there another solution? I would prefer to stay in JavaScript.

Here is a shortened version of my JSON code:

JavaScript

Answer

You need to normalize your data structure so that BigQuery can auto-detect a consistent schema. Auto-detection fails because the value property is a number in some entries and a string in others, so no single column type fits.

There are multiple ways to normalize your data. I’m not 100% sure which way will work best for BigQuery, which claims to analyze up to the first 100 rows for schema auto-detection.

A first attempt is to put values of different types into different fields:

JavaScript

This will yield:

JavaScript

I’m not sure whether BigQuery can reliably merge the inferred schema for a structure like this. For example, if it only encounters value_number in the first 100 rows, it may not handle value_dropdown when that field shows up later.
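As a rough illustration of the type-specific-field idea, here is a minimal sketch. The sample record and the names `id`, `name`, and `splitValueByType` are assumptions made up for this example, as are the type names `number` and `dropdown`; adjust them to whatever your data actually contains.

```javascript
// Hypothetical sample data; only `custom_fields`, `type`, and `value`
// are taken from the question, everything else is an assumption.
const records = [
  {
    id: "a1",
    custom_fields: [
      { name: "Estimate", type: "number", value: 42 },
      { name: "Stage", type: "dropdown", value: "open" },
    ],
  },
];

// Move each `value` into a key named after its type, e.g. `value_number`,
// so that no single key holds more than one data type.
function splitValueByType(record) {
  return {
    ...record,
    custom_fields: (record.custom_fields || []).map(({ value, ...rest }) => ({
      ...rest,
      [`value_${rest.type}`]: value,
    })),
  };
}

// Newline-delimited JSON: one record per line.
const ndjson = records
  .map((record) => JSON.stringify(splitValueByType(record)))
  .join("\n");
console.log(ndjson);
```

Each entry now carries only the key that matches its own type, which is exactly why auto-detection may still miss a key that does not appear in the sampled rows.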

A more reliable approach (assuming you know all the different values of type) is to transform the records explicitly into the same structure. This also lets you run specialized transformations on field values, such as conversions or lookups.

JavaScript

You might have to make some of the transform logic a bit more robust depending on your data (e.g. if values are optional or can be empty). Using your example data this transformation yields:

JavaScript
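A minimal sketch of such an explicit transformation could look like the following. It assumes the only types are `number` and `dropdown`, and the names `transformers`, `emptyValues`, `normalizeField`, and `normalizeRecord`, as well as the sample record, are hypothetical. Missing or empty values are left as `null` rather than converted.

```javascript
// One converter per known type. The type names are assumptions;
// replace them with the types that occur in your data.
const transformers = {
  number: (value) => ({ value_number: Number(value) }),
  dropdown: (value) => ({ value_dropdown: String(value) }),
};

// Every custom field gets the same set of keys, defaulting to null.
const emptyValues = { value_number: null, value_dropdown: null };

function normalizeField({ value, ...rest }) {
  const transform = transformers[rest.type];
  if (!transform) {
    // Fail loudly so an unknown type is noticed before uploading.
    throw new Error(`Unhandled custom field type: ${rest.type}`);
  }
  // Treat missing or empty values as null instead of converting them.
  const isEmpty = value === undefined || value === null || value === "";
  return { ...rest, ...emptyValues, ...(isEmpty ? {} : transform(value)) };
}

function normalizeRecord(record) {
  return {
    ...record,
    custom_fields: (record.custom_fields || []).map(normalizeField),
  };
}

// Hypothetical sample record for illustration only.
const sample = {
  id: "a1",
  custom_fields: [
    { name: "Estimate", type: "number", value: "42" },
    { name: "Stage", type: "dropdown", value: 3 },
    { name: "Budget", type: "number", value: "" },
  ],
};
const normalized = normalizeRecord(sample);
console.log(JSON.stringify(normalized));
```

Because every entry has the same keys and each key holds a single type, the result does not depend on which rows happen to be sampled first. Throwing on an unknown type is a design choice of this sketch; you could instead map unknown types to a string field.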

I’ve created a JSFiddle where you can play around with this code.