Skip to content Skip to sidebar Skip to footer

Find Two Numbers In Array Such That All Elements Between Them Is Smaller Than Smallest Of Two Number

how to find pairs in array such that the elements which are between them are smaller than smallest element from the pair. Eg. 10 4 6 8 7 In the above pairs can be formed like: (10,

Solution 1:

You don't need to take a minimum. If the new endpoint is greater than the maximum you've detected so far, then you have a winner.

def findPairs(values):
    arr = []
    forstartinrange(len(values)-1):
        currMax =0forendinvalues[start+1:]:
            if end> currMax:
                arr.append((values[start], end))
                currMax =endreturn arr

nums = [10, 4, 6, 8, 7]
print(findPairs( nums ))

Solution 2:

I thought some more at the solution I hinted at, in a comment above, and I think it could work. You will never get better than O(n²) worst-case, since you might have to output O(n²) pairs, but the thought is that you might want to have an algorithm that is fast when you do not have to output much, and then (necessarily) slow when there is much to output. An O(n+z) algorithm, where z is the size of the output, is optimal for a case where you need to look at all the data, and you need to output all the pairs.

So, my solution, sketched below does that. It cannot compete with a fast O(n²) solution if the output is large--there is more overhead in it--but asymptotically it should work fine when you output o(n²).

The idea is only for intervals longer than two, so I treat all neighbours as a special case that I just report. I can easily do that in O(n). For the longer intervals, I consider the input a sequence of hills and valleys, with hill tops at the positions i where x[i-1] < x[i] and x[i] < x[i+1]. I don't know how you want to treat cases where you have runs of the same value, but you can adapt the code to handle it. It just depends on what you want to do with them.

If I have two neighbouring hill tops, there is a valley between them, and I must report intervals running down both sides. In the middle of the valley there will be some values that I cannot use, initially it is the minimal value in an interval (that cannot be included), and later I mask out larger chunks, turning small hills into valleys.

So, overall, I take pairs of neighbouring hill tops, output intervals on the hill sides between them.

defreport_valid_intervals(x: list[int]):
    """Report intervals (i,j) such that for all i <= k < j, x[k] <= min(x[i],x[j])."""# All neighbours are valid, and these are a special case that we handle first.for i inrange(len(x) - 1):
        yield i, i+1# Identify hills and valleys in the sequence. These are the valleys we use# to report intervals where the left and right points are larger than the# interval values. Identifying the hill tops (tops) and valleys (masks)# is done in O(n).
    masks = [False] * len(x)
    tops = find_tops(x, masks)

    # Now process valleys one by one. In each valley, we report the intervals inside it# in O(z) -- linear in what we report -- and then we update the mask, throwing# away one hill top (making it into a valley) for each pair, also in O(z) (we# only look at indices we also used in the output). When we are through all the# valleys we have elimiated at least half of them. Because we eliminate half each# time, the running time (excluding reporting) is the sum n + n/2 + n/4 + ... a# geomemtric sum that converges to O(n).whilelen(tops) > 1:
        for left, right inzip(tops, tops[1:]):
            for i, j in report_intervals_in_valley(x, left, right, masks):
                yield i, j
            update_valley_mask(x, left, right, masks)

        # eliminate the smaller tops
        tops = [top for top in tops ifnot masks[top]]

You can identify the hills and valleys as easy as this (but you might want to redefine the tops to handle cases where you have runs of the same value).

# Identifying hills and valleys...deffind_tops(x: list[int], masks: list[bool]) -> list[int]:
    """Locate the tops of the hills and mask out the bottom of the valleys.
    Runs in O(n) and only runs once."""# Get hill tops
    start = [0] if x[0] > x[1] else []
    end = [len(x)-1] if x[-1] > x[-2] else []
    mid = [i for i inrange(1, len(x)-1) if x[i-1] < x[i] > x[i+1]]
    tops = start + mid + end

    # mask valley bottoms. Doesn't matter with the end-points,# since they are only relevant as tops.for i inrange(1, len(x)-1):
        if x[i-1] > x[i] < x[i+1]:
            masks[i] = Truereturn tops

Iteratively running through pairs neighbouring hill tops is linear time, because we eliminate half the tops each time, getting a geometric sum. I am not sure that argument is necessary here, if we analyse the underlying running time a bit more, but it is there if we need it. We just have to process each valley in time proportional to the output we get from it. And here it gets a little tricky.

You can output intervals in a valley by running down one side, and then running down the other side as long as you do not include a value smaller than the next value on the first side. If you output something every time you do this, you get O(z), but you could run down a side and test the first value on the other side repeatedly, without outputting anything. Then you spend O(m) time, where m is the length of the side of the valley, and then the running time goes above O(n+z).

However, we are going to mask out indices that cannot be used after we have processed a valley. All indices with a value smaller than the smallest peak cannot be part of an interval that spans across a hill top, and we can mask those out so we do not consider them. If we process valleys such that we do the expensive work on the smaller hill, we will mask out the indices we process, and since they are only masked out once and then never considered again, we are back to O(n+z).

# When reporting from a valley, work out the valid left and right interval to# consider. Those are the indices from the left/right respectively until we see a# masked index. The masked indices are valleys that we have already processed,# and we do not need to consider anything in there, as intervals spanning them# must also contain values that are too large for a hill top inbetween.defreport_intervals_in_valley_left(x: list[int],
                                    left_top: int, right_top: int,
                                    masks: list[bool]):
    """Report all valid intervals in a valley. The running time is O(lv+z) where
    z is the number of intervals we report and lv is the size of the left side
    of the valley (because we look at all those indices even if we do not
    report). We will mask out those intervals afterward so we never look at
    them again, so they cannot contribute more than O(n) to the algorithm in
    total."""for i inrange(left_top, right_top):
        if masks[i]:  # end of this side of the valleybreakfor j inrange(right_top, left_top, -1):
            if masks[j]:
                breakif x[j] < x[i + 1]:  # we cannot go below the next on the other sidebreak# no more hits from iyield i, j


defreport_intervals_in_valley_right(x: list[int],
                                     left_top: int, right_top: int,
                                     masks: list[bool]):
    """Report all valid intervals in a valley. The running time is O(rv+z) where
    z is the number of intervals we report and rv is the size of the right side
    of the valley (because we look at all those indices even if we do not
    report). We will mask out those intervals afterward so we never look at
    them again, so they cannot contribute more than O(n) to the algorithm in
    total."""for j inrange(right_top, left_top, -1):
        if masks[j]:
            breakfor i inrange(left_top, right_top):
            if masks[i]:  # end of this side of the valleybreakif x[i] < x[j - 1]:  # we cannot go below the next on the other sidebreak# no more hits from iyield i, j


defreport_intervals_in_valley(x: list[int],
                               left_top: int, right_top: int,
                               masks: list[bool]):
    """Report all valid intervals in a valley. The running time is O(v+z) where
    z is the number of intervals we report and v is the size of the side of the
    valley that will be masked out and not contribute to the running time again."""# make sure it is the side with the smallest top that risks doing# the most workif x[left_top] < x[right_top]:
        yieldfrom report_intervals_in_valley_left(x, left_top, right_top, masks)
    else:
        yieldfrom report_intervals_in_valley_right(x, left_top, right_top, masks)

When you update a valley by masking, you remove the values smaller than the smallest peak. Here, my current code doesn't quite do the right thing.

defupdate_valley_mask(x: list[int],
                       left_top: int, right_top: int,
                       masks: list[bool]):
    """Mask out the indices that are smaller than the smallest top. Those cannot
    be part of valid pairs when we merge this valey with its neighbours. The running
    time is less than the time we spent reporting intervals, since each index we
    consider was part of at least one pair that we outputted."""
    mi = min(x[left_top], x[right_top])
    masks[left_top if x[left_top] == mi else right_top] = Truefor i inrange(left_top, right_top):  # range is entire interval, but we breakif masks[i]:
            break# we enter the already masked, so get out to get the correct running timeif x[i] < mi:
            masks[i] = Truefor i inrange(right_top, left_top, -1):
        if masks[i]:
            breakif x[i] < mi:
            masks[i] = True

The code processes the sides of the valley, and it looks at indices that do not get masked out. That means that it doesn't guarantee the O(n+z) running time. However, it should be simple enough to keep track of the closest masked index on the left and right of each hill top. We know the tops of our valley when we mask, so it would just be updating a list. If we do that, then we can start from the closest masked value and work our way up towards the hill top, and then we will mask all the indices we process exact the last, and then the running time is correct again.

I can have a go at it, but I suspect this is only of theoretical interest. The O(n²) solutions are probably fast enough for most cases (and if you have to output many intervals you cannot do better). But I'm reasonably sure that you can do it in O(n+z), and probably smarter than this solution.

Post a Comment for "Find Two Numbers In Array Such That All Elements Between Them Is Smaller Than Smallest Of Two Number"