Issue #144
April 2024

My penchant has always been for words. I care about them and I want them to look good on paper before I sing them. A love song written down just doesn’t look good to me.

—Mark E Smith, Jamming!, no. 22 (November 1984)

In late 2023, as I was preparing the release of my piece Vocal Trio for Blume Editions, Daniel Muzyczuk got in touch about writing an essay for e-flux journal that focused on language, music, and text. Vocal Trio was my seventh project in ten years centered on the voice and the use of typographical scores; Daniel’s invitation created a nice opportunity to revisit a few of these projects and to consider how my relationship to scoring with text has changed over the past decade.

Slipping Control: voice as control source, 2012

The first text score I created came about because I was searching for a way of producing systems music that was more prone to failure. I was aiming to destabilize the rhythmic structures of my compositions and wanted to develop a method for generating patterns and phrases that bypassed the mechanical logic of the synthesizers I had been using. I felt the human voice was a perfect carrier for this failing stream of control data and the typographical score seemed like the most direct way to program for the voice.

The project I developed, Slipping Control, began as an investigation into voice-controlled synthesis, but soon evolved into an intermedia exhibition consisting of video, sculpture, print, and so on, in which all the individual pieces used the same initial textual materials as their starting point. I was thinking of this process as building a kind of generative soft system with a nice, wobbly human stage tucked into the control path. In terms of the voice, the only content it needed to carry was rhythm and pattern: nothing linguistic, just its non-semantic acoustic properties. Simple phonetic language was perfect for communicating this information.

Creating streams of phonetic text felt almost like programming a drum machine—kick, snare, hi-hats, plosives for the drums, sibilance for the cymbals. I would sit and type out rhythms, watching the text form into repetitive patterns on the page. A few years later I would dig deeper into this mode of text programming to create a piece that utilized text-to-voice software, producing a sort of synthesized sound poetry, an uncanny and creepily hypnotic track called “Heteroglossic Riot.”
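To make the drum-machine comparison concrete, here is a minimal sketch of how a line of phonetic text could be read as a step pattern. The letter groupings and the function name `to_pattern` are assumptions made for illustration only; they are not the mapping used in Slipping Control.

```python
# Hypothetical letter groupings, chosen only for illustration.
PLOSIVES = set("pbtdkg")   # short, percussive consonants -> drum hits
SIBILANTS = set("sz")      # hissing consonants -> cymbal hits


def to_pattern(text):
    """Read phonetic text left to right as a sequence of drum-machine steps."""
    pattern = []
    for ch in text.lower():
        if ch in PLOSIVES:
            pattern.append("drum")
        elif ch in SIBILANTS:
            pattern.append("cymbal")
        elif ch.isspace():
            pattern.append("rest")
        # Any other character (vowels, punctuation) is skipped in this sketch.
    return pattern


if __name__ == "__main__":
    print(to_pattern("Tz tz"))
```

Typing a longer run of the same letters simply extends the pattern, which is roughly how a page of repeated phonetic text starts to look like a programmed rhythm.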


Video of performance by Tyondai Braxton. Part of Ben Vida’s solo exhibition “Slipping Control,” Audio Visual Arts, New York, 2013.

As work on Slipping Control progressed, I invited Tyondai Braxton and Sara Magenheimer into my studio to interpret the scores I had made. I gave them very little instruction and their performances were often significantly different from what I had been hearing in my head, yet each of their unique voicings of the text produced the desired effect of creating unstable rhythms. This was a big moment—to have a method of scoring that remained open enough to engage the creative intuition of my collaborators, but formal enough to elicit my desired compositional results.

Charting the evolution of these texts across mediums also proved informative. Watching how the function of the language changed as it traveled from the page to the video to the recording studio to the performance space was a way of testing the resilience of the text. One medium would inform the other, each creating another filter or sieve for the language to pass through. By the time I was performing this work live, the order of influence that the language had within the control paths had been totally inverted. The function of the score had come full circle and now needed to be reconsidered and reworked to serve in its next role.

Ben Vida, Tztztztzt Î Í Í … (Pink), 2013.

Speech Acts: text as image, 2015

While I was working on Slipping Control, I started playing around with the compositional possibilities of using type as image. I began experimenting with a new set of pieces that blend the semantic, visual, and phonetic elements of language together to produce a kind of narrative concrete poetry—a collection of text drawings that could be nudged into the service of performance. I titled these works Speech Acts.

I was influenced by how the conceptual poets had demonstrated the mutability of digitized text; through their work, language became a ready-made that could be repurposed in much the same fashion as an audio sample. Once sound was fixed to the medium of tape it took on a new life as a sample, a sound object that could be processed and transformed. Extracted from its source, the sample created an opportunity for the composer to reassign its signifiers through sonic transformation and recontextualization. Digitization expanded the malleability of this process exponentially, creating the possibility for a wild morphology.

Of course there is a history of “language as sample” that precedes conceptual poetry (the cut-up method, concrete poetry, the L=A=N=G=U=A=G=E writers’ games and processes), but the ability to copy the entirety of a digitized text in three keystrokes, and to push text around in a document as one would push around an audio clip in a DAW (digital audio workstation), was a liberation!

By decoupling phrases from their origin, my relationship to language went through the same recalibration as my relationship to music had when I first began listening to records in search of audio samples. Reading became a foraging expedition in which sentences or phrases might pop off the page, revealing in themselves the possibility of becoming the foundation for something altogether new.

I was also finding that syntactic forms could be repurposed as ready-made templates. Typographical layouts of preexisting texts could be emptied of their meaning and filled back in, Mad Libs style, with new content. Brackets, parentheses, dashes, quotation marks, ellipses, and other punctuation, all informing the rhythm, accent, and contour of each new phrase, prompted a call and response or dialogue between what had been removed from the existing text and what remained.

These procedural writing techniques proved to be both generative and playful. On the surface they could seem structuralist or coldly systematic but in practice they became a device for prompting a less self-conscious method of writing. By off-loading some of the basic structural elements of a text, I found the space to be more improvisational and looser with grammatical play. I liked that no specific outcome was sought in regard to plot or narrative—just the pleasure of discovery as words and phrases fell into place, revealing the form of each new work.

Graphic in their layout and opaque in their function, the Speech Acts did not, as scores, necessarily communicate any specific manner of interpretation. It is hard to say whether, as textual drawings, they pre-articulated any viable use as a performance tool at all, which made them fun to actually try to perform.

If on the page Speech Acts hinted at some sort of abstracted narrative, the live performances of these works, read by artist Mary Manning and myself, gave the impression of two spaced-out narrators stumbling through a dialogue of false starts, echoes, mimicries, and circuitous logics. Characters created out of a process are placed in a preexisting setting and talk in language borrowed from some other story. Their intertextual babble acts to create and obscure meaning moment by moment; the narrators seem to want to tell a story but are not quite able to figure out how to start.

Ben Vida, Speech Acts, 2015.

Reducing the Tempo to Zero: typographical road maps for group improvisations, 2018

Having produced two sets of typographical scores devised within a visual arts context, I shifted my focus back to live performance with my piece Reducing the Tempo to Zero. This was a durational work scored for fixed electronics and vocal ensemble and presented as a six-hour-long concert.

In 2010 I attended a performance of Morton Feldman’s String Quartet No. 2 at ISSUE Project Room in Brooklyn and was inspired by how that work utilized extended performance length to explore ideas around memory, absorption, and attention.

Around that time, I’d also been visiting La Monte Young and Marian Zazeela’s Dream House and attending Phill Niblock’s solstice performances; all of these experiences helped to evolve my understanding of compositional form in regard to duration. I wanted to use installation, staging, and site specificity as elemental compositional building blocks to create a performance that oscillated between sound installation and concert, where both the audience and the performers might begin to understand the “now” as an ever-changing point of multiple durations and potentialities. Ultimately, I wanted to see if it was possible to shift our perception of time from the linear to the spatial.

The mechanics behind these performances were simple: a lot of time was going to pass—up to six hours—and we, the vocalists, were going to be there for all of it, singing and breathing in unison, enveloped by the drones of the fixed backing track and navigating the emotions of engaging in something that is demanding both in focus and physicality. It was fun! … and also, kind of brutal. It was a lot of things at once, and that was the point. Anyone who has performed durational work will tell you that time’s elasticity is put into high relief when on the stage, and this experience can deeply imprint in you a new understanding of the possibilities of temporal slippage.

I composed the Reducing the Tempo to Zero scores using phonetic language, in much the same style as I had for Slipping Control. I was interested in vocal disfluencies and the cadence of speech patterns that become stuck. I felt that this linguistic noise, which obscures the speech signal, could be emphasized and used as the main verbal material of a score; all of the “uhs” and “ums” were organized into nonsensical rhythmic conversation.

The texts I composed helped guide the singers through the long performance and determined the materials and trajectories of our improvisations. I knew that an important aspect of these scores would be that they not completely short-circuit what the vocalists might be inspired to do intuitively. A certain amount of open-endedness was needed, and again the use of phonetic language served to communicate just enough information to keep us all relatively in sync but left space for the vocalists to express themselves and expand the timbre and feel of the group’s accumulative voice.

These concerts were intense—to sit together and sing for that duration of time, to experience our voices and breathing come together and break apart, to feel one’s attention drift and return. At each performance I learned something new. By the end of this project, I came to realize that I had created more of a momentary social construct that could fast-forward intimacy than a finely tuned musical composition—and that was interesting! I had inadvertently created a framework for spending time with people, sitting together, singing in unison, tuning our voices, and thus listening and adjusting to one another. We got into a shared rhythm and feel and engaged in the act of communal concentration.

Ben Vida, from a performance of Reducing the Tempo to Zero, Centro per l’Arte Contemporanea Luigi Pecci, Prato, Italy, 2016.

The Beat My Head Hit: the gathering and summing of ambient language, 2020

My first vocal work that included an acoustic ensemble and that called for the use of traditional notation was The Beat My Head Hit, written for Yarn/Wire and Nina Dante. I started composing this piece by writing and arranging all of the text. I wanted the formal compositional considerations to be determined by the language and the systems I had developed to organize that language. Which is to say, the composition had to work purely as a text piece first, and from there I could begin composing the music. I did end up producing a notated score for Yarn/Wire but once we started rehearsing, we quickly discarded my many superfluous pages and simply returned to the text, adding just a minimal number of notated cells in the margins of the typographical score. We all understood that the form of this work was in the language, so why not use that as the guide to navigate the piece?

At the time I started work on this project I had been living in New York for almost ten years and had found myself in the habit of tuning into and writing down the ambient language of my surroundings. This sounds like habitual aestheticized eavesdropping, but I think it was actually an attempt at keeping field notes in the hope of capturing an impression of the verbal and textual environment I was living in.

I had been thinking about the Burroughs/Gysin cut-up method and had started to see it as a way to subtly conjure phrases from the atmosphere—a process for summing disparate inputs and then outputting them as a single stream of information. I kept imagining a ring-modulator circuit with the objective ambient language acting as the input signal, my own subjective filtering as the carrier signal, and the resulting combination of these two sources as the final output.
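For readers unfamiliar with the circuit behind this image, a ring modulator multiplies its input by its carrier, so two sine tones yield new tones at the sum and difference of their frequencies. The sketch below illustrates that relationship under simple assumptions (pure sine tones, an arbitrary sample rate, and hypothetical function names); it models the circuit only, not the writing process it stands in for.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary value for this sketch


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Return n_samples of a unit-amplitude sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def ring_modulate(input_signal, carrier):
    """Ring modulation: multiply the two signals sample by sample."""
    return [x * c for x, c in zip(input_signal, carrier)]


def sum_and_difference(f_in, f_car, n_samples, sample_rate=SAMPLE_RATE):
    """The same output built directly from the difference and sum tones,
    using sin(a) * sin(b) = 0.5 * (cos(a - b) - cos(a + b))."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        difference_tone = math.cos(2 * math.pi * (f_in - f_car) * t)
        sum_tone = math.cos(2 * math.pi * (f_in + f_car) * t)
        out.append(0.5 * (difference_tone - sum_tone))
    return out


if __name__ == "__main__":
    modulated = ring_modulate(sine(440, 64), sine(100, 64))
    expected = sum_and_difference(440, 100, 64)
    # The two lists agree to within floating-point error.
    print(max(abs(a - b) for a, b in zip(modulated, expected)))
```

Neither original frequency survives in the output; only their sum and difference remain, which is what makes the circuit a useful picture of two sources combining into something that belongs to neither.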

We are always making sense of so much language, text, image, sound. The past/future/present are always there; our attention goes multidimensional and seems to rhizomatically mutate. So much accumulation, so much to parse, so much to try to figure out. I wanted to capture this fragmented density on the page to see what new information might resonate up and out of it, see what new impressionistic stories might be hiding in that flux.

While finishing production on the Speech Acts exhibition I came across an edition of John Cage’s Diary: How to Improve the World (You Will Only Make Matters Worse). Cage’s ability to flatten his personal interests, correspondences, the daily news, passing conversations, recollections, and storytelling into a single linear stream of text produced the effect of an abstracted memoir. Each entry at once comprises memories/forethoughts/aspirations; an ecology and an ambience arise out of a pool of collected phrases. I could see in this process a method of writing that puts an artist in conversation with their surrounding accumulative ambient language and text—a generative practice of foraging and responding, editing and rewriting.

Cage’s diary and Andy Warhol’s are both accumulations of seemingly impersonal details that, in their density, start to reveal the dimensions of each artist’s individual world (and worldview). This also feels true of David Markson’s novels and of the language that adds voice to Raymond Pettibon’s drawings. The origin of the text in these works is of less importance than the resulting imaginative space produced through the collecting and organizing of disparate thoughts into a new form.

The fragmented syntax of The Beat My Head Hit hints at a narrative but leaves the most foundational elements up to the listener to fill in. For me this has something in common with how a spectral composer will build a whole harmonic language off of the partials of a single pitch but suspend the use of that pitch’s fundamental, creating a logic to the dissonance that encourages the listener’s ear to fill in that missing harmonic information. I am beginning to see these texts I create as a kind of textual spectralism, a script that is filled with the higher harmonics and partials but leaves the fundamental aspects of the text open for any listener’s imagined plot.
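The acoustic idea behind this comparison can be shown with a little arithmetic: the partials of a pitch are whole-number multiples of its fundamental frequency, so a set of upper partials still implies the fundamental even when it is left out. The following is a minimal sketch with hypothetical function names and whole-number frequencies in hertz, assumed purely for illustration.

```python
from functools import reduce
from math import gcd


def partials(fundamental_hz, harmonic_numbers):
    """Frequencies of the chosen harmonics of a fundamental (whole Hz)."""
    return [fundamental_hz * h for h in harmonic_numbers]


def implied_fundamental(partials_hz):
    """The largest frequency that every partial is a whole multiple of.

    This recovers the omitted fundamental only when the harmonic numbers
    share no common factor (for example 2, 3, 4, 5).
    """
    return reduce(gcd, partials_hz)


if __name__ == "__main__":
    upper = partials(110, [2, 3, 4, 5])   # the fundamental itself is left out
    print(upper)
    print(implied_fundamental(upper))
```

The fundamental never appears in the list of sounding frequencies, yet it is the one value that makes sense of all of them.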

Ben Vida with Yarn/Wire and Nina Dante, from a performance of The Beat My Head Hit, ISSUE Project Room, Brooklyn, New York, 2020.

Vocal Trio: words as filters, 2022

In spring 2022 I was in Bremen, Germany, working with the choreographer Faye Driscoll. Inspired by her method of slowly developing a piece through rehearsal and having daily access to a spacious, interesting-sounding rehearsal space, I started to meet with two of the vocalists from the dance company to work out some ideas. It was a casual and exploratory situation. To create a bit of form for the three of us, I wrote out a page of text to set some parameters and to help guide our improvisations. With this and a few simple spoken instructions (favor 2nds and 7ths over 3rds, consider texture as much as pitch, and so on) we started to discover, in process, the dimensions of a new composition, and in time arrived at a performance of the piece that would become Vocal Trio.

Having spent the previous few years working on The Beat My Head Hit, for Vocal Trio I was happy to use language as a less concrete, less meaning-rich compositional material. As Amy Gernux, Lotte Rudhart, and I started to sing together, I began hearing in the blend of our voices something related to additive and subtractive synthesis. In those techniques, complex sounds are created through the combining of synthesized voices, and then these sounds are carved back into by filtering out harmonic information. If the quality of the phonetic language in Slipping Control functioned to create amplitude envelopes approximating rhythmic patterns, the language used for the Vocal Trio score determined the mouth shapes that would articulate the timbral shifts of the pitches we were voicing. We called this “Slow Singing.” We were stretching out the pronunciations of the words to such an extent that their meaning completely dissolved and all we were left with was sound, pitch, tone, and texture.

I recently came across Joan La Barbara’s description of her first time working with Alvin Lucier. It felt so similar to what I had just discovered by working with Amy and Lotte:

The first work I performed with composer Alvin Lucier, Still and Moving Lines of Silence in Families of Hyperbolas, is enigmatic. We rehearsed in the Merce Cunningham studio at Westbeth in 1972. Lucier explained that he would play sine tones from four audio speakers placed at the edges of the “stage” and that I was free to “play” within the situation. I moved onto the dance floor, closed my eyes, and continued moving until I felt I was in my acoustical center within the space, being “bombarded” equally by the sounds, and then began singing softly, matching exactly the pitch of the sine tones. I then began to move my pitch microtonally away from the unison, causing the waves to move away from me. I played with this situation for some twenty minutes, and then stopped and returned to where Lucier was seated. “Tell me what you were doing,” Alvin said. I explained, and he replied, “That’s the piece.” [1]

Creating Vocal Trio was much like this: a process of listening and refining that helped us to understand what the piece was simply through the act of the three of us spending time together hanging out and singing. And though this work was informed by my one page of words and the collaborative process of discovering our singing style, the final compositional form was only set once we hit record and sang through the piece together. In performing it we finished the composition.


The experience of hearing an opera is that you accumulate a lot of details that are not very significant in themselves. No one of the details in an opera (or in a novel) is a mind-boggling detail. But things just keep coming until you have a huge pile of them. That’s when they start meaning something.

—Robert Ashley, lecture at Mills College, 1989

I like the matter-of-factness of Ashley talking about his opera Perfect Lives. It resonates with how I’ve been thinking about using atmospheric language to create expanded narratives or large-scale assemblages. I, too, might use generative writing techniques to add details to the “huge pile,” letting meaning arise out of the accumulation.

In 2008 I got to see three of Ashley’s later operas performed at La MaMa in New York. Needless to say, the experience left a deep impression. I like reading through Ashley’s scores; I like looking at them as images. You can see the formal considerations of his pieces on the page, in the typography, and in the language and phrasing. When you see them performed live (all that density of language, the length of the phrases determining tempo, the chorus of voices slipping between the textural and the textual, the performers seated, turning page after page after page), a complexity of form is revealed that the page only hints at.

As you spend time with Ashley’s work the unique compositional possibilities of the typographical score come into focus. More than a memory device or a set of instructions, in his scores you see the potentiality of language as sound, text as image, and narrative as space. On the page is a proposition for a spatialized intertextual mode of storytelling, one that holds so many ideas and can generate so many more.


[1] Joan La Barbara, foreword to On Minimalism, ed. Kerry O’Brien and William Robin (University of California Press, 2023), xiii.

Ben Vida is a composer, improviser, writer, and artist. His practice encompasses works for voice and ensemble, musique concrète, text-based compositions, and electro-acoustic improvisation. In the mid-1990s he was involved in Chicago’s multi-faceted experimental music scene, co-founding the quartet Town & Country. In the mid-2000s he relocated to New York and began producing electronic and systems-based compositions that focused on psychoacoustics and advanced synthesis techniques, releasing records on Shelter Press and PAN. He also began exhibiting artworks in various media including video, text drawing, and sound installation. Since 2013, he has been composing pieces that combine his interests in group vocalization, durational performance, and typographical scores. He teaches in the Sonic Arts MFA at Brooklyn College.


