Skip to content
Advertisement

Generating strings of length l with hamming distance d

I’m working on a variation of the Motif Search Problem and have hit a block on the logic of one of the subparts.

The problem is:

You have the alphabet — ['A', 'C', 'G', 'T']

Now, you are given an array of strings made of these alphabets with length L and are given a hamming distance D.

For example, if we take L=3, d=2 and are given the array [“ACT”,”TGC”,”GTA”].

For each string in the array, we must generate strings of the same length (3 in this case) using the alphabet which have a hamming distance of d. The hamming distance d means that the final string will only have d edits (two if d = 2) as compared to the original string.

So if we take the example of ACT from the array, we must generate GTT, AAG, CGT and so on. And I want to store these generated strings in a list.

A recursive function that breaks down the string seems like the most efficient way to do it but I’m having trouble implementing it. Any ideas for the logic explained with code/pseudocode? I need to be able to extend this problem to L = 15 and d = 5.

Example of Hamming distance d = 1.

enter image description here

Advertisement

Answer

This can get you started. It doesn’t handle creating strings of different length than the source. Perhaps you could add that.

JavaScript code:

JavaScript
Advertisement