
How do I optimally distribute values over an array of percentages?

Let’s say I have the following code:

arr = [0.1,0.5,0.2,0.2]; //The percentages (or decimals) we want to distribute them over.
value = 100; //The amount of things we have to distribute
arr2 = [0,0,0,0] //Where we want how many of each value to go

Working out how to distribute a hundred proportionally over the array is simple; it’s a case of:

0.1 * 100 = 10
0.5 * 100 = 50
...

Or doing it using a for loop:

for (var i = 0; i < arr.length; i++) {
    arr2[i] = arr[i] * value;
}

However, let’s say each counter is an object and thus has to be whole. How can I distribute them as evenly as possible for a different value? Let’s say the value becomes 12.

0.1 * 12 = 1.2
0.5 * 12 = 6
...

How do I deal with the decimal when I need it to be whole? Rounding means that I could potentially not have the 12 pieces needed.
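
A quick check makes the shortfall concrete. The sketch below uses a hypothetical helper, naiveRound (not part of the code above), that rounds each share on its own; with a value of 12 the pieces add up to 11, so one item goes missing.

```javascript
// Hypothetical helper: rounds every share independently of the others.
function naiveRound(shares, total) {
    return shares.map(function (share) {
        return Math.round(share * total);
    });
}

var naiveParts = naiveRound([0.1, 0.5, 0.2, 0.2], 12);
var naiveSum = naiveParts.reduce(function (a, b) { return a + b; }, 0);

console.log(naiveParts); // [ 1, 6, 2, 2 ]
console.log(naiveSum);   // 11, one short of 12
```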

A correct algorithm would –

Take an input/iterate through an array of values (for this example we’ll be using the array defined above).

Turn it into a set of whole values which, added together, equal the value (which will equal 100 for this).

Output an array of values which, for this example, will look something like [10,50,20,20] (these are all whole and add up to 100, which is the total we need).

If any value is not whole, it should make it whole in such a way that the whole array still adds up to the value needed (100).

TL;DR dealing with decimals when distributing values over an array and attempting to turn them into an integer

Note – Should this be posted on a different Stack Exchange site? My need is programming, but the actual question will likely be solved using mathematics. Also, I had no idea how to word this question, which makes googling incredibly difficult. If I’ve missed something incredibly obvious, please tell me.


Answer

You should round all values as you assign them, using a rounding scheme that is known to distribute the rounding error uniformly. Finally, the last value is assigned differently, so that the sum comes out to exactly the desired total.

Let’s start slowly, or things get very confusing. First, let’s see how to assign the last value so that the sum matches the desired total.

// we will need this later on
sum = 0;

// assign all values but the last
for (i = 0; i < output.length - 1; i++)
{
    output[i] = input[i] * total;
    sum += output[i];
}

// last value must honor the total constraint
output[i] = total - sum;

That last line needs some explanation. When the for(..) loop finishes, i is one more than the last value it took inside the loop, so it is:

output.length - 1 // last index

The value we assign is chosen so that the sum of all elements is equal to total. We already computed the sum in a single pass while assigning the values, and thus don’t need to iterate over the elements a second time to determine it.

Next, we will approach the rounding problem. Let’s simplify the above code so that it uses a function on which we will elaborate shortly after:

sum = 0;
for (i = 0; i < output.length - 1; i++)
{
    output[i] = u(input[i], total);
    sum += output[i];
}

output[i] = total - sum;

As you can see, nothing has changed but the introduction of the u() function. Let’s concentrate on this now.

There are several ways to implement u().

DEFINITION
u(c, total) ::= c * total

By this definition you get the same result as above. It is precise and good, but as you asked, you want the values to be natural numbers (i.e. whole numbers). So while this is already perfect for real numbers, for natural numbers we have to round it. Let’s suppose we use the simple rounding rule for integers:

[ 0.0, 0.5 [  => round down
[ 0.5, 1.0 [  => round up

This is achieved with:

function u(c, total)
{
    return Math.round(c * total);
}

When you are unlucky, you may round up (or round down) so many values that the last-value correction is not enough to honor the total constraint, and in general all values will seem to be off by too much. This is a well-known problem; a multi-dimensional solution exists for drawing lines in 2D and 3D space, called the Bresenham algorithm.
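
To see the unlucky case concretely, here is a minimal sketch that uses plain Math.round() together with the last-value correction. The function name roundWithLastFix and the input (four equal shares of 0.25, total of 2) are assumptions picked for illustration; the values are exact in binary floating point, so the outcome does not depend on rounding noise.

```javascript
// Sketch: plain Math.round per value, with the last value absorbing the error.
function roundWithLastFix(input, total) {
    var output = new Array(input.length);
    var sum = 0;
    var i;
    for (i = 0; i < input.length - 1; i++) {
        output[i] = Math.round(input[i] * total);
        sum += output[i];
    }
    output[i] = total - sum; // last value honors the total constraint
    return output;
}

// 0.25 * 2 = 0.5, and Math.round(0.5) is 1, so three slots take 1 each.
var unlucky = roundWithLastFix([0.25, 0.25, 0.25, 0.25], 2);
console.log(unlucky); // [ 1, 1, 1, -1 ]
```

The total is still 2, but only because the last slot went negative to pay for three round-ups.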

To keep things simple, I’ll show you here how to implement the Bresenham idea in one dimension (which is your case).

Let’s first discuss a term: the remainder. This is what is left after you have rounded your numbers. It is computed as the difference between what you wish and what you really have:

DEFINITION
WISH ::= c * total
HAVE ::= Math.round(WISH)
REMAINDER ::= WISH - HAVE

Now think about it. The remainder is like the piece of paper that you discard when you cut out a shape from a sheet. That remaining paper is still there, but you throw it away. Instead, just add it to the next cut-out so it is not wasted:

WISH ::= c * total + REMAINDER_FROM_PREVIOUS_STEP
HAVE ::= Math.round(WISH)
REMAINDER ::= WISH - HAVE

This way you keep the error and carry it over to the next partition in your computation. This is called amortizing the error.

Here is an amortized implementation of u():

// amortized is defined outside u because we need to have a side-effect across calls of u
function u(c, total)
{
    var real, natural;

    real = c * total + amortized;
    natural = Math.round(real);
    amortized = real - natural;

    return natural;
}
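
As a rough trace of this amortized version on the question’s data, the sketch below folds u() and the surrounding loop into one hypothetical function, distributeRounded, and keeps the carry in a local variable rather than a global. It is an illustration under those assumptions, not a replacement for the code above.

```javascript
// Sketch: carry the rounding remainder forward; the carry lives in a
// local variable instead of a global.
function distributeRounded(input, total) {
    var output = new Array(input.length);
    var amortized = 0; // remainder carried from the previous step
    var sum = 0;
    var i;
    for (i = 0; i < input.length - 1; i++) {
        var real = input[i] * total + amortized; // WISH
        output[i] = Math.round(real);            // HAVE
        amortized = real - output[i];            // REMAINDER
        sum += output[i];
    }
    output[i] = total - sum; // last value honors the total constraint
    return output;
}

console.log(distributeRounded([0.1, 0.5, 0.2, 0.2], 12)); // [ 1, 6, 3, 2 ]
```

The third slot wishes for 2.4, inherits a carried remainder of about 0.2, and therefore rounds up to 3, which brings the total back to 12.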

If you prefer, you can use another rounding rule, such as Math.floor() or Math.ceil().

What I would advise is to use Math.floor(), because it is guaranteed to honor the total constraint: as long as the input values are non-negative and sum to 1, the carried remainder always stays in [0, 1), so the running sum never overshoots and the last value cannot go negative. With Math.round() you get smoother amortization, but you risk a last value that is not positive. You might end up with something like this:

[ 1, 0, 0, 1, 1, 0, -1 ]

Only when ALL VALUES are far away from 0 can you be confident that the last value will also be positive. So, for the general case, the Bresenham algorithm would use flooring, resulting in this final implementation:

function u(c, total)
{
    var real, natural;

    real = c * total + amortized;
    natural = Math.floor(real); // just to be on the safe side
    amortized = real - natural;

    return natural;
}

sum = 0;
amortized = 0;
for (i = 0; i < output.length - 1; i++)
{
    output[i] = u(input[i], total);
    sum += output[i];
}

output[i] = total - sum;

Obviously, the input and output arrays must have the same size, and the values in input must form a partition (sum up to 1).

This kind of algorithm is very common in probabilistic and statistical computations.
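
For a quick check against the numbers in the question, the floor-based version can be wrapped into a single function. This is only a sketch: the name distribute is hypothetical, and it assumes non-negative shares that sum to 1 and a non-negative integer total.

```javascript
// Sketch of the floor-based version as one function (name is hypothetical).
// Assumes input holds non-negative shares summing to 1 and total is a
// non-negative integer.
function distribute(input, total) {
    var output = new Array(input.length);
    var amortized = 0; // remainder carried from the previous step
    var sum = 0;
    var i;
    for (i = 0; i < input.length - 1; i++) {
        var real = input[i] * total + amortized;
        output[i] = Math.floor(real);
        amortized = real - output[i];
        sum += output[i];
    }
    output[i] = total - sum; // last value honors the total constraint
    return output;
}

console.log(distribute([0.1, 0.5, 0.2, 0.2], 100)); // [ 10, 50, 20, 20 ]
console.log(distribute([0.1, 0.5, 0.2, 0.2], 12));  // [ 1, 6, 2, 3 ]
```

Because shares such as 0.1 are not exact in binary floating point, a product can land slightly below a whole number and get floored one lower than exact arithmetic would give. The carry absorbs that difference, so the total still holds, although an individual slot may shift by one.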

User contributions licensed under: CC BY-SA