String tokenizer method

Consider strings with this format:

id-string1-string2-string3.extension

where id, string1, string2 and string3 can be strings of variable length, and extension is an image file extension.

For example, two possible strings could be:

Il2dK-Ud2d9-Kod2d-d9dwo.jpg

j54fwf3da-7jrg-9eujodww-kio98ujk.png

I need a tokenizer method in JavaScript for an Express/Node.js API that takes these strings as input and outputs an object in this format:

{a: id-string1-string2, b: string3, c: extension}

For the example strings this tokenizer should then output:

{a: Il2dK-Ud2d9-Kod2d, b: d9dwo, c: jpg}

{a: j54fwf3da-7jrg-9eujodww, b: kio98ujk, c: png}

I think this can be done with a regex. I tried match(/[^-]+/g), but this tokenizes every substring. I need a way to skip the first two “-” characters, but I couldn’t figure out how.
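
To make the problem concrete, here is a small sketch of what that attempt returns for the first example string. The variable names input and tokens are only illustrative and are not part of the original code.

```javascript
const input = 'Il2dK-Ud2d9-Kod2d-d9dwo.jpg';

// Every run of non-hyphen characters becomes its own token,
// and the extension stays attached to the last token.
const tokens = input.match(/[^-]+/g);

console.log(tokens); // [ 'Il2dK', 'Ud2d9', 'Kod2d', 'd9dwo.jpg' ]
```

The result has four tokens instead of the three fields wanted, which is why the first two hyphens need to be treated differently from the last one.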

Do you have any ideas? Or could you provide me a better solution instead of using regex? Thanks very much!

Answer

You can achieve this using split:

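const str = 'Il2dK-Ud2d9-Kod2d-d9dwo.jpg';
// Separate the extension from the rest of the name (assumes a single dot).
const [restStr, c] = str.split('.');
// Split on the last hyphen; the capture group keeps string3 in the result.
// [^-]+ matches any run of characters other than a hyphen, including uppercase.
const [a, b] = restStr.split(/-([^-]+)$/);
const result = { a, b, c };
console.log(result); // { a: 'Il2dK-Ud2d9-Kod2d', b: 'd9dwo', c: 'jpg' }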
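
Since the question also asks for an alternative to regex, the same idea can be sketched with lastIndexOf and slice. This is only one possible approach under stated assumptions: the function name tokenize is hypothetical, and returning null for input without a dot or a hyphen is a choice made here, not something the question specifies.

```javascript
// Hypothetical helper: splits "id-string1-string2-string3.extension"
// into { a, b, c } without using a regular expression.
function tokenize(str) {
  // Position of the dot that starts the extension.
  const dot = str.lastIndexOf('.');
  if (dot === -1) return null; // assumption: malformed input yields null

  // Position of the last hyphen before the extension.
  const dash = str.lastIndexOf('-', dot);
  if (dash === -1) return null; // assumption: malformed input yields null

  return {
    a: str.slice(0, dash),       // id-string1-string2
    b: str.slice(dash + 1, dot), // string3
    c: str.slice(dot + 1),       // extension
  };
}

console.log(tokenize('Il2dK-Ud2d9-Kod2d-d9dwo.jpg'));
// { a: 'Il2dK-Ud2d9-Kod2d', b: 'd9dwo', c: 'jpg' }
console.log(tokenize('j54fwf3da-7jrg-9eujodww-kio98ujk.png'));
// { a: 'j54fwf3da-7jrg-9eujodww', b: 'kio98ujk', c: 'png' }
```

Because it only looks for the last dot and the last hyphen before it, this version does not depend on how many hyphens appear in the id part or on which characters the segments contain.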
User contributions licensed under: CC BY-SA