scrollSpy for an infinite list

Have you ever heard of the term "scrollSpy"?

A scrollSpy is a component that watches the scroll position of a container in order to perform actions based on what is currently visible in the browser's viewport (such as updating navigation links).

The principle is quite simple: let's say you have an page with many sections comprised of a title and some paragraphs. You also have a navigation sidebar that contains all the section titles in the page. When the the scroll position of the page reaches a section, you want to mark the corresponding title as "active" in the sidebar.

Context

This is the behavior we needed for one of our internal tools called the Technical facts dashboard.
Basically, this dashboard is intended to register every DM internal event that could have an impact on our production environment. Events of such nature are actually quite frequent and produce a very long list of activities. Events are grouped by day, so the list can also be seen as a chronologically descending list of days, each of which contains a chronologically descending list of events. The interface looks like this:

Dashboard screenshot

And this is the component that required a scrollSpy to cover the day we're navigating:

Component screenshot

The purpose of this component is to hold the day that is currently "active" (meaning which day section is currently at the top of our browser viewport).

Early on we decided that we would use an infinite list to present the history of events. But infinite lists always come with a few challenges...

The challenge we'll talk about here is how do you connect a scrollSpy to an infinite list with minimal performance impact (because of course, there is no native DOM API function to get the correspondence between a "scroll value" and the corresponding element(s) at that position).

Analysis of current solutions

There are quite a lot of scrollSpy plugins out there but we did not like any of them because they all had the same drawbacks (at least at the time of development of our dashboard):

  1. Initialize up front with the list of all elements already known.

  2. Pre-compute at initialization time each element offset or compute all of them on each scroll event.

This approach did not fit our needs because:

  • Since we deal with an infinite mutable list, we would need to re-initialize the scrollSpy plugin each time the list is modified and we could miss some list mutations.

  • List elements have a variable height so again, we would need to re-initialize scrollSpy plugin each time an element is modified and again we could miss some mutations.

  • Performing the computation of element positions at init time or worse, at scroll time, for a large number of elements can be pretty expensive and we want to maintain a fluid scroll.

We therefore needed a slightly different approach to meet our needs. Luckily, there is a fairly simple way to solve this problem...

It's just a maths problem

All we have to do is to understand that we have no other choice but to compute positions of list elements on the fly during a scroll. But we have to do it in an effective manner, which excludes doing it on the complete element list. The problem then becomes: How can we reduce the "infinite" list of elements to a small enough subset that allows the positions to be calculated fast enough? It's actually easier than it sounds, thanks to the nature of the component we're dealing with: a list of elements stacked one under another. This means that when getting our list of elements from the DOM we can represent it like so:

Elements position graph

To put it differently, the element positions can be seen as a strictly increasing function of the element indices. And this is the key to reduce our huge elements list.

Dichotomy to the rescue

Let's recall what we want: when a user scrolls our elements list, we want to find the element located at the given scroll offset. This requires a trial and error process that can be done in a very effective way with a simple well-known algorithm: the dichotomy algorithm (https://en.wikipedia.org/wiki/Dichotomic_search).

To use this algorithm, we need to find an interval of elements in our huge list in which we are guaranteed to find the currently active element.

An interval is defined by two boundaries and we have to find these "boundary elements" when the list is scrolled. The good news is that when our list container is built, we already have one of our boundaries and we also have our active element: at initialization, our container's scroll offset is 0 and the active element is the first one in the list. Knowing this first boundary we only need to find the second boundary when the user scrolls.

To perform even more accurately with our algorithm, we'd like to start with the smallest possible elements interval. To do so, we make use of the fact that our elements have a minimum height of 50px (note that in the function graph above, all elements have height >= 50px). To simplify the explanation, let's pretend we scroll from 0 to 600px. We then go from this:

scrollOffset = 0;
activeElementIndex = 0;

to this:

scrollOffset = 600;
activeElementIndex = ?;

What we're interested in, of course, is that missing activeElementIndex.

Here are the steps to perform the search of our reduced list elements using the dichotomy method:

  1. The current active element is used as our first boundary element and we'll call it A.

  2. Take an arbitrary boundary element B that we are certain is further than our scroll position
    Since our element are all >= 50px we choose the element with an index = activeElementIndex + (scrollOffsetDelta/50) where scrollOffsetDelta = currentScrollOffset - previousScrollOffset.

    This give us: index = 0 + ((600 - 0) / 50) = 12.

    With this element index, we are sure that our active element is before our boundary element B.

  3. We'll call the interval formed by [A, B] the "working interval".

  4. Split the working interval in two equal parts and call the middle element M.

  5. If the middle element M is over the scroll position, change the working interval to [A, M], otherwise, change the working interval to = [M, B].

    If the interval contains one element we are done, otherwise go back to step 4.

To be completely correct, step 2 should take into account the scroll direction (to search the arbitrary element) and should make sure that the arbitrary element index obtained is one that actually exists (and fallback to the first or last element of the list if it doesn't) but these are just implementation details.

Following this process, we end up with an interval of only one element that is our active element.

Conclusion

When we were first asked to build this scroll-dependent component on our infinite list, we had an immediate concern about performance. But it finally turns out that this problem was less a browser problem than a classical math problem. And it was actually quite interesting to see how simple this could be solved when finding the right abstraction (after all dichotomy is something a lot of people learned at college).

In the end, with this simple algorithm we were able to do a fast enough search through our list than we would have had with the classic incremental iteration used in most scrollSpy plugins.

This kind of performance-specific issue is one that we love to encounter as it really forces us to walk down unfamiliar paths and makes great learning material.

If you're familiar with this problem, do not hesitate to share your thoughts about it!