I have been trying to manually upload the JSON into BigQuery, but I get the following error message.
Error while reading data, error message: JSON parsing error in row starting at position 0: Repeated field must be imported as a JSON array. Field: custom_fields.value.
I already converted the file into newline-delimited JSON, so that is not the problem. When looking at the custom_fields.value from the error I get this:
$ cat convert2working.json | jq '.custom_fields[].value'
0
"Basics of information security\n"
"2021"
The problem seems to be that custom_fields.value has different data types.
How can I “homogenize” those data types? Or do you have another solution? I would prefer to stay in JavaScript.
Here is a shortened version of my JSON code:
{ "id": "example", "custom_fields": [ { "id": "example", "name": "Interval", "type": "drop_down", "type_config": { "default": 0, "placeholder": null, "options": [ { "id": "example", "name": "yearly", "color": null, "orderindex": 0 } ] }, "date_created": "1611228211303", "hide_from_guests": false, "value": 0, "required": false }, { "id": "example", "name": "Description", "type": "text", "type_config": {}, "date_created": "1611228263444", "hide_from_guests": false, "value": "Basics of information securityn", "required": false }, { "id": "example", "name": "Year", "type": "number", "type_config": {}, "date_created": "1611228275285", "hide_from_guests": false, "value": "2021", "required": false } ] }
Answer
You need to normalize your data structure so that BigQuery is able to auto-detect a consistent schema. Because the value property is once a number and once a string, this auto-detection fails.
There are multiple ways to normalize your data. I’m not 100% sure which way will work best for BigQuery, which claims to analyze up to the first 100 rows for schema auto-detection.
The first attempt is to put the different types of values into different fields:
const fields = data.custom_fields.map(x => {
  const f = { id: x.id, name: x.name };
  f['value_' + x.type] = x.value; // e.g. value_drop_down, value_text, value_number
  return f;
});
This will yield:
[{ id: "example", name: "Interval", value_drop_down: 0 }, { id: "example", name: "Description", value_text: "Basics of information security↵" }, { id: "example", name: "Year", value_number: "2021" }]
I’m not sure if this is a structure where BigQuery can reliably merge the inferred schema for these fields, because it might e.g. only encounter value_number in the first 100 rows and would therefore not handle value_drop_down.
A more reliable approach (assuming you know all the different values of type) is to transform the records explicitly into the same structure. This also has the advantage of being able to run any specialized transformations on field values (such as conversions, lookups etc.):
const fields2 = data.custom_fields.map(x => ({
  id: x.id,
  name: x.name,
  value_text: x.type === 'text' ? x.value : null,
  value_number: x.type === 'number' ? parseInt(x.value, 10) : null,
  value_dropdown: x.type === 'drop_down'
    ? x.type_config.options.find(o => o.orderindex === x.value).name
    : null
}));
You might have to make some of the transform logic a bit more robust depending on your data (e.g. if values are optional or can be empty). Using your example data this transformation yields:
[{ "id": "example", "name": "Interval", "value_text": null, "value_number": null, "value_dropdown": "yearly", "value_drop_down": 0 }, { "id": "example", "name": "Description", "value_text": "Basics of information securityn", "value_number": null, "value_dropdown": null }, { "id": "example", "name": "Year", "value_text": null, "value_number": "2021", "value_dropdown": null }]
I’ve created a JSFiddle where you can play around with this code.
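Once the records are normalized, you still need to write them out as newline-delimited JSON (one object per line, no wrapping array) before loading them into BigQuery. A minimal Node.js sketch, assuming the transformed records are in fields2 and "custom_fields.ndjson" is just an example filename:

const fs = require('fs');

// One JSON object per line, no surrounding array – the format BigQuery expects.
const ndjson = fields2.map(f => JSON.stringify(f)).join('\n');
fs.writeFileSync('custom_fields.ndjson', ndjson + '\n');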