Choreographing the Crowd: The Precision of Stadium Card Stunts

Tamim Rupo
November 20, 2023
4:39 pm

As we approach college bowl season, football enthusiasts nationwide are anticipating not only the on-field excitement but also the captivating “card stunts” performed by stadium crowds. These highly synchronized performances can produce intricate images that resemble pixelated computer graphics, and the method behind them works much like code.

Michael Littman’s latest book, “Code to Joy: Why Everyone Should Learn a Little Programming,” is replete with similar illustrations of how the machinery in our surroundings functions. It emphasizes that embracing a future filled with automation is feasible as long as we familiarize ourselves with the machines’ language, at least until they become proficient in ours. Covering everything from command sequencing to variable storage, “Code to Joy” serves as an accessible and entertaining primer on the fundamental principles of programming for budding coders of all ages.

YELL OUT FOR THE BLUE!

Card stunts, where a stadium crowd displays colored signs to create a colossal, momentary billboard, resemble flash mobs without the need for special skills or prior practice. Participants merely need to show up and follow concise command sequences. These directives, announced by a stunt leader, tell the audience when to raise poster-sized colored cards, creating a visually striking display. The initial set of instructions covers basics such as listening carefully, holding the card at eye level, facing the indicated color toward the field, and passing cards along the aisle after the stunts without tearing them.

While these instructions may seem straightforward, neglecting them can lead to chaos. Yet inevitably there’s always someone who, once the instructions are over, jokingly asks, “Could you repeat the first one?” I’d likely be that person.

Then comes the pivotal moment, wherein one person in the crowd receives the command sequence:

Blue
Blue
Blue

It might not seem extraordinary on its own, but the true beauty unfolds when considering the larger spectacle. Card stunts capitalize on the organized grid of seats in a stadium, turning the audience into a massive computer display. Each participant acts as a single picture element—essentially, person pixels! Alterations in the held-up cards can transform the image or even cause it to morph, resembling a larger-than-life animated GIF.

Card stunts originated as a participatory activity in college sports during the 1920s. Their popularity waned in the 1970s, when the prevailing sentiment was to embrace individuality. Before that decline, in the 1950s, there was a fervent desire to create ever more intricate displays. Cheer squads designed stunts by hand and painstakingly wrote out individual instructions for each of the thousand seats, showcasing a remarkable dedication to their teams. In the 1960s, a few schools explored the use of computers to streamline the instruction preparation process. They developed programs to convert sequences of hand-drawn images into personalized instructions for each participant.

With computer assistance, individuals could receive more detailed sequences dictating when to lift a card, which color to raise, and when to put it down or switch to another card. Unlike the previous example where people created command sequences for computers, here, the computer generates sequences for people to follow. This computer-supported automation enables the creation of more elaborate stunts. A participant’s sequence of commands might resemble:

up on 001 white
003 blue
005 white
006 red
008 white
013 blue
015 white
021 down
up on 022 white
035 down
up on 036 white
043 blue
044 down
up on 045 white
057 metallic red
070 down

While reading these instructions might not be as enjoyable as witnessing the final product, imagine it as part of an animated Stanford “S.” To execute these commands in synchrony, a stadium announcer calls out the step number (“Forty-one!”), and each participant follows their personalized instructions. Participating in a card stunt is not overly complex, but it serves as a fascinating example of creating and adhering to command sequences where the computer guides individuals. Despite its simplicity, mishaps can still occur. At the 2016 Democratic National Convention, a planned arena-wide card stunt intended as a patriotic display of unity resulted in an unreadable mess, sadly failing to spell out “Stronger Together” as intended.
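To make the announcer-and-sheet mechanism concrete, here is a minimal Python sketch of how one participant might look up which card to show when a step number is called. The list-of-pairs encoding and the names SHEET and card_at are assumptions made for illustration; the step numbers and colors are copied from the sample sheet above.

```python
from bisect import bisect_right

# Hypothetical encoding of one participant's sheet as (step, card) pairs.
# "down" means no card is showing. Values copied from the sample sheet.
SHEET = [
    (1, "white"), (3, "blue"), (5, "white"), (6, "red"), (8, "white"),
    (13, "blue"), (15, "white"), (21, "down"), (22, "white"), (35, "down"),
    (36, "white"), (43, "blue"), (44, "down"), (45, "white"),
    (57, "metallic red"), (70, "down"),
]


def card_at(sheet, step):
    """Return the card showing at an announced step, or None if the card is down."""
    steps = [s for s, _ in sheet]
    i = bisect_right(steps, step) - 1  # index of the last instruction at or before this step
    if i < 0:
        return None  # the stunt has not reached this participant's first instruction
    card = sheet[i][1]
    return None if card == "down" else card


print(card_at(SHEET, 41))  # white
```

When the announcer calls “Forty-one!”, the most recent instruction on this sheet is step 036, so the participant keeps holding white until step 043.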

In the present era, computers simplify the process of translating a photograph into instructions regarding which colors to display and where. Essentially, any digitized image serves as a set of instructions detailing the mixture of red, green, and blue to show at each position in the picture. A notable challenge in converting an image into card-stunt instructions is that typical images comprise millions of colored dots (megapixels), whereas a card-stunt section in a stadium has around a thousand seats, so each seat must stand in for a whole segment of the image. Instead of instructing each person to hold up a thousand small cards, a more practical approach is to compute the average of the colors in the image segment that corresponds to each seat. From the available colors, such as the classic sixty-four Crayola options, the computer selects the one closest to the computed average.

How a computer averages colors may not be immediately evident. Teaching a machine to find the average of an arbitrary mix of colors takes a little care, and it offers a first glimpse of how computers can be instructed to work with images.

Multiple methods exist for averaging colors, and a straightforward approach capitalizes on the fact that each color dot in an image file is stored as the amount of red, green, and blue it contains. Each color component is represented as a whole number between 0 and 255, a range chosen because 255 is the largest value that fits in eight binary digits, or bits. This three-part representation loosely mirrors the way color receptors in the human eye perceive real-world colors. The simplest approach averages the amounts of red, green, and blue across a group of pixels. A further refinement squares the values before averaging and takes the square root afterward for better results, but the crucial point is that there is a systematic method to average a cluster of colored dots, producing a single dot whose color encapsulates the group.
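Both averaging methods can be sketched in a few lines of Python. This is an illustration under simple assumptions, not a definitive implementation: pixels are plain (red, green, blue) tuples with channels from 0 to 255, and the function names mean_rgb and rms_rgb are made up for this example.

```python
import math


def mean_rgb(pixels):
    """Plain per-channel average of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))


def rms_rgb(pixels):
    """Square each channel value, average the squares, then take the square root."""
    n = len(pixels)
    return tuple(
        round(math.sqrt(sum(p[c] ** 2 for p in pixels) / n)) for c in range(3)
    )


block = [(255, 0, 0), (0, 0, 255)]  # one pure red dot and one pure blue dot
print(mean_rgb(block))  # (128, 0, 128)
print(rms_rgb(block))   # (180, 0, 180)
```

For this two-dot block, the squared version gives a noticeably brighter purple than the plain average.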

After producing the average color, the computer requires a method to determine the closest color among the available cards. Is it more akin to burnt sienna or red-orange? An often-used (though not flawless) approach to gauge the similarity of two colors based on their red-blue-green values is the Euclidean distance formula. Expressed as a command sequence, it entails:

1. Take the difference between the amount of red in the two colors and square it.
2. Take the difference between the amount of blue in the two colors and square it.
3. Take the difference between the amount of green in the two colors and square it.
4. Add the three squares together.
5. Take the square root.

To determine which card best represents the average of the colors in the corresponding image section, identify the available color (blue, yellow-green, apricot, timberwolf, mahogany, periwinkle, etc.) with the smallest distance to that average color. This color corresponds to the card that should be held up by the person in that specific grid location.
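The five-step distance recipe and the closest-card search might look like this in Python. The card names and RGB values in CARDS are rough placeholders invented for the example, not official crayon colors, and distance and closest_card are hypothetical helper names.

```python
import math

# Placeholder card colors for illustration only (not official crayon values).
CARDS = {
    "red": (220, 30, 30),
    "blue": (30, 60, 220),
    "white": (255, 255, 255),
    "yellow-green": (190, 220, 90),
}


def distance(a, b):
    """Euclidean distance: square each channel difference, add them, take the root."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_card(color, cards):
    """Return the name of the card whose color has the smallest distance to `color`."""
    return min(cards, key=lambda name: distance(color, cards[name]))


average = (200, 40, 50)  # a reddish average color from some image section
print(closest_card(average, CARDS))  # red
```

Here the distance from the average to the red card is 30, far smaller than the distance to any other card, so the person in that seat holds up red.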

The similarity between this distance calculation and the color averaging operation appears coincidental. Sometimes, a square root is just a square root.

In a broader context, these operations—color averaging and finding the closest color to the average—enable a computer to assist in constructing the command sequence for a card stunt. The computer takes input such as a target image, a seating chart, and a set of available color cards. It then generates a map indicating which card should be held up in each seat to best replicate the image. In this scenario, the computer primarily handles organizational tasks and lacks significant decision-making beyond selecting the closest color. Nevertheless, the key takeaway is that the computer streamlines the process of writing command sequences. The transition is from manually choosing every command for every individual pixel at every moment in the card stunt to selecting images and letting the computer generate the necessary commands.
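Putting the pieces together, a rough Python sketch of the whole pipeline could take an image, a seat grid size, and a set of cards, and return the card for every seat. It assumes the image is a list of rows of (r, g, b) tuples whose dimensions divide evenly by the seat grid; card_map and the tiny test image are invented for illustration.

```python
def card_map(image, seat_rows, seat_cols, cards):
    """Assign one card name per seat by averaging the image block behind that seat."""
    height, width = len(image), len(image[0])
    bh, bw = height // seat_rows, width // seat_cols  # pixels per seat (assumes even division)

    def squared_distance(a, b):
        # Skipping the square root does not change which card is closest.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    result = []
    for r in range(seat_rows):
        row = []
        for c in range(seat_cols):
            block = [
                image[y][x]
                for y in range(r * bh, (r + 1) * bh)
                for x in range(c * bw, (c + 1) * bw)
            ]
            avg = tuple(sum(p[i] for p in block) / len(block) for i in range(3))
            row.append(min(cards, key=lambda name: squared_distance(avg, cards[name])))
        result.append(row)
    return result


R, W, B = (255, 0, 0), (255, 255, 255), (0, 0, 255)
image = [
    [R, R, B, B],
    [R, R, B, W],
]
cards = {"red": R, "white": W, "blue": B}
print(card_map(image, 1, 2, cards))  # [['red', 'blue']]
```

The right-hand seat covers three blue dots and one white dot, so its average is a pale blue and the blue card wins.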

This shift in perspective introduces the prospect of granting the machine greater control over the command-sequence generation process. In the framework of our 2 × 2 grid from Chapter 1, we can transition from telling (providing explicit instructions) to explaining (providing explicit incentives). To illustrate, consider a more challenging version of the color selection problem that engages the computer in more intricate tasks. Suppose we can print cards of any color but are restricted to ordering them in bulk from a print shop, limiting us to eight different card colors. We have the flexibility to choose any colors for that set of eight (corresponding to the number of values achievable with 3 bits, a common computing concept). For instance, we might select blue, green, blue-green, blue-violet, cerulean, indigo, cadet blue, and sky blue to portray a stunning ocean wave in eight shades of blue.

This constraint echoes the limitations of early computer monitors, which could display millions of colors but show only eight distinct ones on the screen at a time.

With this constraint in mind, the task of rendering an image in colored cards becomes more intricate. Not only do we need to decide which color from our set of options to assign to each card, as before, but we must also select which eight colors will constitute the set of color options. For example, if creating a face, a diverse range of skin tones may prove more valuable than distinctions among shades of green or blue. How do we transition from a wish list of colors based on the target image to the shorter list that will compose our set of color options?

Machine learning, particularly the clustering or unsupervised learning approach, can address this color-choice dilemma. I will elucidate this process shortly. However, before delving into it, let’s explore a related problem involving the transformation of a face into a jigsaw puzzle. Similar to the card-stunt example, the computer will design a sequence of commands for rendering a picture, but with a twist—the puzzle pieces available for constructing the image are predetermined. Much like the dance-step example, it employs the same set of commands while considering which sequence produces the desired image.
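As a preview of what clustering can look like, here is a minimal sketch of k-means, one common clustering algorithm, applied to choosing a small palette. It is not necessarily the method the book goes on to describe. The function name kmeans_palette, the fixed number of rounds, and the choice to start from the first k distinct colors are all assumptions made to keep the example short and repeatable.

```python
def kmeans_palette(pixels, k, rounds=10):
    """Pick up to k representative colors from (r, g, b) tuples with a basic k-means loop."""
    # Assumption for repeatability: start from the first k distinct colors.
    centers = []
    for p in pixels:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(rounds):
        # Step 1: put every pixel in the group of its nearest center.
        groups = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            groups[nearest].append(p)
        # Step 2: move each center to the average color of its group.
        centers = [
            tuple(sum(q[c] for q in g) / len(g) for c in range(3)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


pixels = [(250, 0, 0), (254, 0, 0), (0, 0, 250), (0, 0, 254)]
print(sorted(kmeans_palette(pixels, 2)))  # [(0.0, 0.0, 252.0), (252.0, 0.0, 0.0)]
```

With two reds and two blues and a budget of two colors, the loop settles on one red and one blue, each the average of its group.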