How to split each element of a string array into different categories?

I am using Node.js to fetch data from a website. Once I have the fetched data, I want to insert it into a MySQL database. Fetching the URL gives me a JSON dump of information.

After formatting the string, I am using an array to store my data. Below is an example of my output:

['table_id: 0xFC (252) SCTE 35',
  "section_syntax_indicator: '0'",
  "private_indicator: '0'",
  "reserved: '11'",
  'section_length: 0x39 (57)',
  'protocol_version: 0',
  'encrypted_packet: 0 no part of this message is encrypted',
  'encryption_algorithm: 0 No encryption',
  'pts_adjustment: 0xFFFF7F18 (33000) > Time: 95443.4 sec > (hh:mm:ss.ms) 26:30:43.351',
  'cw_index: 0x00 (0)',
  'tier: 0x0FFF (4095)',
  'splice_command_length: 0x0005 (5)',
  'splice_command_type: 0x06 (6) time_signal [] time_signal: > Time: 19345 sec > (hh:mm:ss.ms) 05:22:24.972',
  'time_specified_flag: 1 presence of the pts_time field',
  'reserved: 0x3F (63)',
  'pts_time: PTS: 1741047514 [0x67C646DA] > Time: 19345 sec > (hh:mm:ss.ms) 05:22:24.972',
  'descriptor_loop_length: 35 [] Descriptors: [] segmentation_descriptor (0x02): Content Identification (0x01)',
  'descriptor_tag: 0x02 (2)',
  'descriptor_length: 0x21 (33)',
  'identifier: 0x43554549 (CUEI)',
  'segmentation_event_id: 0x00000001 (1)',
  "segmentation_event_cancel_indicator: '0' a previously sent segmentation event, identified by segmentation_event_id, has NOT been cancelled",
  'reserved: 0x7F (127)',
  "program_segmentation_flag: '1' the message refers to a Program Segmentation Point and that the mode is the Program Segmentation Mode whereby all PIDs/components of the program are to be segmented",
  "segmentation_duration_flag: '0' No presence of segmentation_duration field",
  "delivery_not_restricted_flag: '1' the next five bits are reserved",
  'reserved: 0x1F (31)',
  'segmentation_upid_type: 0x01 (1) Deprecated: use type 0x0C; The segmentation_upid does not follow a standard naming scheme.',
  'segmentation_upid_length: 0x12 (18)',
  'segmentation_type_id: 0x01 (1) Content Identification',
  'segment_num: 0x01 (1)',
  'segments_expected: 0x01 (1)',
  'CRC_32: 0x46D15AF3 CRC OK'
]

I want to split each element of this array into different categories.

For example: "time_specified_flag: 1 presence of the pts_time field"

I want to split this element into 3 categories (parameter, value, description), like [time_specified_flag, 1, presence of the pts_time field].

Edit: this is the original JSON dump as a string: "dump" : "======================================================================================================================================\r\n| PID: 0401 [SEC -> SCTE-35] length: 60[0x003C] status: VALID |\r\n--------------------------------------------------------------------------------------------------------------------------------------\r\nAddress Parameter Length Value Description\r\n--------------------------------------------------------------------------------------------------------------------------------------\r\n<span class="hex_tree_child">0x0000,0: [-] PID: 0401 [SEC -> SCTE-35] <0x2A,0> : (37 nodes in 3 levels)</span>\r\n0x0000,0: * table_id <0x1,0> : 0xFC (252) SCTE 35\r\n0x0001,0: * section_syntax_indicator <0x0,1> : '0'\r\n0x0001,1: * private_indicator <0x0,1> : '0'\r\n0x0001,2: * reserved <0x0,2> : '11'\r\n0x0001,4: * section_length <0x1,4> : 0x39 (57)\r\n0x0003,0: * protocol_version <0x1,0> : 0\r\n0x0004,0: * encrypted_packet <0x0,1> : 0 no part of this message is encrypted\r\n0x0004,1: * encryption_algorithm <0x0,6> : 0 No encryption\r\n0x0004,7: * pts_adjustment <0x4,1> : 0xFFFF7F18 (-33000) => Time: 95443.4 sec => (hh:mm:ss.ms) 26:30:43.351\r\n0x0009,0: * cw_index <0x1,0> : 0x00 (0)\r\n0x000A,0: * tier <0x1,4> : 0x0FFF (4095)\r\n0x000B,4: * splice_command_length <0x1,4> : 0x0005 (5)\r\n0x000D,0: * splice_command_type <0x1,0> : 0x06 (6) time_signal\r\n<span class="hex_tree_child">0x000E,0: [-] time_signal <0x5,0> : => Time: 19345 sec => (hh:mm:ss.ms) 05:22:24.972</span>\r\n0x000E,0: * time_specified_flag <0x0,1> : 1 presence of the pts_time field\r\n0x000E,1: * reserved <0x0,6> : 0x3F (63)\r\n0x000E,7: * pts_time <0x4,1> : PTS: 1741047514 [0x67C646DA] => Time: 19345 sec => (hh:mm:ss.ms) 05:22:24.972\r\n0x0013,0: * descriptor_loop_length <0x2,0> : 35\r\n<span class="hex_tree_child">0x0015,0: [-] Descriptors <0x11,0> :</span>\r\n<span class="hex_tree_child">0x0015,0: [-] segmentation_descriptor (0x02) <0x11,0> : Content Identification (0x01)</span>\r\n0x0015,0: * descriptor_tag <0x1,0> : 0x02 (2)\r\n0x0016,0: * descriptor_length <0x1,0> : 0x21 (33)\r\n0x0017,0: * identifier <0x4,0> : 0x43554549 (CUEI)\r\n0x001B,0: * segmentation_event_id <0x4,0> : 0x00000001 (1)\r\n0x001F,0: * segmentation_event_cancel_indicator <0x0,1> : '0' a previously sent segmentation event, identified by segmentation_event_id, has NOT been cancelled\r\n0x001F,1: * reserved <0x0,7> : 0x7F (127)\r\n0x0020,0: * program_segmentation_flag <0x0,1> : '1' the message refers to a Program Segmentation Point and that the mode is the Program Segmentation Mode whereby all PIDs/components of the program are to be segmented\r\n0x0020,1: * segmentation_duration_flag <0x0,1> : '0' No presence of segmentation_duration field\r\n0x0020,2: * delivery_not_restricted_flag <0x0,1> : '1' the next five bits are reserved\r\n0x0020,3: * reserved <0x0,5> : 0x1F (31)\r\n0x0021,0: * segmentation_upid_type <0x1,0> : 0x01 (1) Deprecated: use type 0x0C; The segmentation_upid does not follow a standard naming scheme.\r\n0x0022,0: * segmentation_upid_length <0x1,0> : 0x12 (18)\r\n0x0035,0: * segmentation_type_id <0x1,0> : 0x01 (1) Content Identification\r\n0x0036,0: * segment_num <0x1,0> : 0x01 (1)\r\n0x0037,0: * segments_expected <0x1,0> : 0x01 (1)\r\n0x0038,0: * CRC_32 <0x4,0> : 0x46D15AF3 CRC OK\r\n======================================================================================================================================\r\n",

Here is my code snippet to format this blob of data:

My code to format the JSON response

Is this something that is possible to do?

Answer

@TayshawnHill … was this already sufficient? … Version 1 matches/captures exactly the pts_time-like formats/categories, whereas version 2 matches/captures the more generic formats/categories. – Peter Seliger

@PeterSeliger version 2 groups them how I would like them. However, I am not sure how to make use of the regex to put the parameter, value, and description into an object. – Tayshawn Hill

… here we go …

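function createKeyValueAndDescriptionList(pattern) {
  // Matches lines whose value is itself a labelled item, e.g. `PTS: 1741047514 [0x67C646DA]`;
  // the value runs up to the first `>`, the rest is the description.
  const regXCompositeValue = (/^(?<key>[^:]+):\s*(?:(?<value>[A-Za-z]+:\s*[^>]+))\s*(?<description>.*)/);
  // Matches the generic case: a word or quoted token, optionally followed by
  // parenthesized parts such as `0xFC (252)`; the rest is the description.
  const regXGenericValue = (/^(?<key>[^:]+):\s*(?<value>[\w']+(?:\s*\([^)]*\))*)\s*(?<description>.*)/);

  // Try the more specific pattern first; the `{}` fallback keeps the
  // destructuring from throwing when neither pattern matches.
  const { groups } = (
    regXCompositeValue.exec(pattern) ||
    regXGenericValue.exec(pattern) ||
    {}
  );
  // Return [key, value, description], or an empty array for a non-matching line.
  return (groups && [

    groups.key,
    groups.value.trim(),
    groups.description || '',

  ] || []);
}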

console.log([

  'table_id: 0xFC (252) SCTE 35',
  "section_syntax_indicator: '0'",
  "private_indicator: '0'",
  "reserved: '11'",
  'section_length: 0x39 (57)',
  'protocol_version: 0',
  'encrypted_packet: 0 no part of this message is encrypted',
  'encryption_algorithm: 0 No encryption',
  'pts_adjustment: 0xFFFF7F18 (33000) > Time: 95443.4 sec > (hh:mm:ss.ms) 26:30:43.351',
  'cw_index: 0x00 (0)',
  'tier: 0x0FFF (4095)',
  'splice_command_length: 0x0005 (5)',
  'splice_command_type: 0x06 (6) time_signal [] time_signal: > Time: 19345 sec > (hh:mm:ss.ms) 05:22:24.972',
  'time_specified_flag: 1 presence of the pts_time field',
  'reserved: 0x3F (63)',
  'pts_time: PTS: 1741047514 [0x67C646DA] > Time: 19345 sec > (hh:mm:ss.ms) 05:22:24.972',
  'descriptor_loop_length: 35 [] Descriptors: [] segmentation_descriptor (0x02): Content Identification (0x01)',
  'descriptor_tag: 0x02 (2)',
  'descriptor_length: 0x21 (33)',
  'identifier: 0x43554549 (CUEI)',
  'segmentation_event_id: 0x00000001 (1)',
  "segmentation_event_cancel_indicator: '0' a previously sent segmentation event, identified by segmentation_event_id, has NOT been cancelled",
  'reserved: 0x7F (127)',
  "program_segmentation_flag: '1' the message refers to a Program Segmentation Point and that the mode is the Program Segmentation Mode whereby all PIDs/components of the program are to be segmented",
  "segmentation_duration_flag: '0' No presence of segmentation_duration field",
  "delivery_not_restricted_flag: '1' the next five bits are reserved",
  'reserved: 0x1F (31)',
  'segmentation_upid_type: 0x01 (1) Deprecated: use type 0x0C; The segmentation_upid does not follow a standard naming scheme.',
  'segmentation_upid_length: 0x12 (18)',
  'segmentation_type_id: 0x01 (1) Content Identification',
  'segment_num: 0x01 (1)',
  'segments_expected: 0x01 (1)',
  'CRC_32: 0x46D15AF3 CRC OK',

].map(createKeyValueAndDescriptionList));

console.log([

  "foo:------------",
  "bar:++++++++++++",
  "baz:############",

].map(createKeyValueAndDescriptionList));
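
The follow-up comment asks how to get the parameter, value and description into an object rather than a list. A minimal sketch of one way to do that, reusing the same two patterns: the helper name `createRecord` and the property names `parameter`, `value` and `description` are assumptions chosen to match the wording of the question, not part of the original snippet.

```javascript
// The same two patterns as in the function above.
const regXCompositeValue =
  /^(?<key>[^:]+):\s*(?<value>[A-Za-z]+:\s*[^>]+)\s*(?<description>.*)/;
const regXGenericValue =
  /^(?<key>[^:]+):\s*(?<value>[\w']+(?:\s*\([^)]*\))*)\s*(?<description>.*)/;

// Hypothetical helper: returns { parameter, value, description },
// or null when neither pattern matches the line.
function createRecord(line) {
  const match = regXCompositeValue.exec(line) || regXGenericValue.exec(line);
  if (!match) {
    return null;
  }
  const { key, value, description } = match.groups;
  return {
    parameter: key.trim(),
    value: value.trim(),
    description: description.trim(),
  };
}

const records = [
  'time_specified_flag: 1 presence of the pts_time field',
  'pts_time: PTS: 1741047514 [0x67C646DA] > Time: 19345 sec > (hh:mm:ss.ms) 05:22:24.972',
  "reserved: '11'",
  'foo:------------', // matches neither pattern and is dropped below
]
  .map(createRecord)
  .filter(Boolean); // discard the null entries

console.log(records);
```

Each record then carries named fields, which should map more directly onto the columns of a database table than positional array entries do. Lines that match neither pattern are filtered out here; whether to drop or log them is a design choice left open.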