Construct Your Visual Voice – Jennifer Bernabe

With a digital visual voice, the goal is to create the best representation of self in typography using a mixture of verbal voice and handwritten visual voice elements. An innovation making strides in digital technology is talk-to-text, or digital dictation. This technology has been around for years, but it has been held back by the inability of voice recognition devices and software to handle varying dialects, timbres, and vocal pitches accurately. As verbal dictation makes typing optional on digital devices, it is wise to plan for a new way of communicating beyond tactile interaction. Responsive text can replicate a person’s voice with all of its individual quirkiness. It can be self-infused communication, a truly personal visual voice of self-typography. I have created a responsive text formula that analyzes verbal voice and re-creates it as a typographic experience. The representation showcases audible nuances such as inflection, dialect, volume, pitch, timbre, cadence, and accent to create a unique personal visual voice perfectly reflecting the audible voice. The formula couples verbal characteristics with typographic text to make a flexible self-typography. The formulas seem straightforward, almost simple, and build on our inherent understandings. Behind all of these factors are defaults and limits founded in traditional typographic guidelines to ensure readability and keep the content in balance with context. This experimentation in visual voice is not an artistic endeavor; it is the reintroduction and awakening of an awareness of what context can express in text.

The idea of letter structure being influenced by outside factors is not a new concept. Graphic designers are experimenting with typography, unleashing it from traditional uniformity to take advantage of new technological innovations. Form and structure have the potential to be fluid and to convey outside meaning. In the 34th Typography Club Directors Annual Book, Leland Maschmeyer coupled the number form of “34” to an algorithmic random formula. He digitally gathered daily information about the cities people named when they entered their addresses for invitations to the event, creating an interactive visual typographic display for the event. Maschmeyer explained that measurable data on the weather, traffic patterns, and location elements of a city were combined digitally to affect the rendering of each pixel in the two numerals. The visual language changed with each retrieval of new data, creating an ever-morphing degenerative typography.

In this digital generation of letterform, measurable data assembled meaning from context through a linkage to the structural algorithm.
(Maschmeyer)

Similar in concept, a personalized visual voice can be formed using the linguistic elements of an individual’s speech and predetermined dialect indicators. While it shares the practice of applying nontraditional information to the formation of letters, visual voice differs greatly in that it creates a flexible range that maintains legibility and keeps the integrity of the letters intact while allowing enough variation to display individuality. This difference keeps the typography from becoming only a work of art through the complete breakdown of the letters’ discernible fabric; instead, it displays the elegant differences each orator produces. By tying disruptive methodology to letterforms, typography can be a pure representation of verbal voice and hence self-typography.

Can you imagine talking into a digital device and seeing a graphic visual voice appear on the screen, a rendering of your words constructed solely from the singular aspects of your audible voice? It would be a pure graphic reflection of verbal self-expression and self-typography through responsive text. Not just a show of words, like dictation, but a person, a moment, and emotions captured graphically. Again, as the linguistic relativity theorist Edward Sapir said,

“If structure is at the heart of language, the variation defines the soul.”

I am here to define the soul of a personal typography by imbuing character in characters, sculpting a unique visual dialect by coupling linguistic phonetic phenomena with graphic typographic elements to manifest an embodiment of “the soul of a language,” the individuality of a voice. Communication is filtered, shifted, and altered as it is processed through the human body; the personal history and experiences of an individual affect the understanding and perception of the message. In turn, audible communication also varies linguistically because of the same variety of contextually shared events that influence phonology. Key phonetic distinctions, coupled with acoustic measurements, generate heuristic guidelines and developmental typographic formulas that give text the capability to reflect voice variations, creating a unique graphic experience fueled by phonetic individuality.

VOCAL PITCH / TYPE THICKNESS

Let’s wade into the ideation of creating a visual dialect by determining voice characteristics and setting their physical state to showcase visual variations. One of the simplest and most straightforward couplings is pitch denoted by letter stroke thickness or, as these ideas are symbiotic, stroke thickness determined by tonal voice pitch.

Vocal range is the range of pitches that a human voice can phonate. Its most common application is within the context of singing, where it is used as a defining characteristic for classifying singing voices into voice types. It is also a topic of study within linguistics, phonetics, and speech-language pathology, particularly in relation to the study of tonal languages and certain types of vocal disorders, although it has little practical application in terms of speech.
Webster’s Dictionary

Once the stroke thicknesses at the fringes of the range are identified, the middle, or medium, is calculated. With the graphic dynamic range formulated, the linguistic or human speech tonal range is matched to corresponding stroke weights. This can be done by setting a standard at the average tonal octave of the human voice.

The thickness of the font is based on human voice studies that provide information on tonal pitch ranges. A standard octave measurement is represented graphically by a certain font stroke thickness. A difference in octave from the standard then results visually in an alteration of the letters’ stroke thickness. The higher the octave, the thinner the stroke weight; in reverse, the lower the narrator’s voice, the thicker the stroke weight. The dynamic range of change would have to include a determined maximum and minimum. The maximum pitch would correspond to a stroke weight that is thin or light but not vanishing, and that does not lose the physical integrity essential for readability. At the minimum of the range, the concern is a stroke thickened to the point where the letters are scarcely legible or become indeterminable shapes. The stroke thickness research would begin with a study of expansive font families, like Futura and Caslon, and their capacity to offer a wide range of stroke thicknesses, from extra light to extra black, while maintaining legibility and physical presence.
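The pitch-to-thickness coupling described above can be sketched as a small calculation. This is only an illustration under stated assumptions: the reference pitch of 165 Hz, the numeric weight scale running from 100 (extra light) to 900 (extra black), and the change of 300 weight units per octave are all hypothetical placeholders, not values from the voice studies.

```python
import math

# All constants below are illustrative assumptions, not measured values.
REFERENCE_PITCH_HZ = 165.0   # assumed "standard" speaking pitch
STANDARD_WEIGHT = 400.0      # stroke weight drawn at the standard pitch
WEIGHT_PER_OCTAVE = 300.0    # how much the weight shifts per octave
MIN_WEIGHT = 100.0           # thinnest stroke that stays readable
MAX_WEIGHT = 900.0           # thickest stroke that stays legible


def stroke_weight(pitch_hz: float) -> float:
    """Map a voice pitch to a stroke weight: higher pitch, thinner stroke."""
    if pitch_hz <= 0:
        raise ValueError("pitch must be a positive frequency in Hz")
    # Distance from the standard, measured in octaves (doubling = +1 octave).
    octaves_from_standard = math.log2(pitch_hz / REFERENCE_PITCH_HZ)
    weight = STANDARD_WEIGHT - WEIGHT_PER_OCTAVE * octaves_from_standard
    # Clamp so extreme voices never produce vanishing or clogged letters.
    return max(MIN_WEIGHT, min(MAX_WEIGHT, weight))
```

Under these assumed constants, a voice one octave below the standard draws at weight 700, and any pitch more than an octave above the standard is held at the minimum weight so the stroke never vanishes.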

Beyond individual tonal voice characteristics, generalizations about gender could be drawn from the thickness, especially in the outer ranges, where males, it is conjectured, will make up the majority of what is considered the bass or baritone range. Females will, in turn, hold the majority in the mezzo-soprano and soprano ranges. As a result of this conjecture, women would on average generate visually thinner lettering in the visual dialect under the stroke calculation. This symbiotic coupling of vocal pitch and type thickness lays the methodological ground for formulating a visual voice by shaping typographic elements with linguistic factors. There are two other pairings similar to pitch/thickness in which voice characteristics, through acoustic measurements, form the flexible physical state of the typography.

SOUND DURATION / LETTER WIDTH

From stroke thickness bound to voice pitch, the conversation turns to physical letter width representing the duration of individual phonemes or letter pronunciations. This concept would connect rapidity, or rate of speech, with the physical body width of the letter, while creating a visual flow, rhythm, and cadence of personal speech. Deliberate speech, which is deemed slower and more methodical in progression, would therefore result in lettering that is wider in form. Functionally, certain letters, for instance the lowercase “i” and “l,” would be problematic; for this preliminary stage of ideation, however, the concept is limited to capitals, though the topic will be revisited in font styling.

The standard for rate of speech can be calculated using the research gathered in the linguistic study of William Labov and documented in The Atlas of North American English: Phonetics, Phonology, and Sound Change, which has thousands of speech samples from thousands of people polled across North America in a massive effort of geographic linguistic mapping. (Note: find the actual numbered data to report and supporting studies) With the range of speech rate determinable and the average or majority as the standard, this linguistic data could be connected graphically to the variable of letter width in a flexible typography, which provides an outlet for time, as the duration of pronunciation, in the visual voice formula. In relation to time as a developmental factor, letter emphasis also enters the conversation, as phonological formations obtain numerical amounts through acoustic measurements.

For this graphic concept, the stretch and compression factors require typographic finesse so that letters do not appear whimsical through over-exaggerated expansion or become so narrow as to inhibit letter recognition. The inherent perception of time represented as a visual span is a universal concept, evident in bar graphs and pie charts, where the majority has the greatest presence. With sound duration represented not by a simple line but by a letterform, each pronounced letter is metered in milliseconds and its duration marked by the letter itself, creating a visual pattern of the narrator’s cadence. Labov’s regional mapping studies produced geographic comparisons of rates of speech in rural areas and urban areas, or variances between regions of opposing population density. These results showed that speech timing in rural areas was longer than in urban centers in North America. In 1994, 39.1% of polled Americans were not living in the state where they were born; therefore some environmental aspect is directly influencing the rate of initial speech conditioning. This graphic product will also showcase emphasis on individual phonemes, phonological systems, and overall rate of speech in a visual manifestation of the pattern of dialect and personal voice, as well as regional indicators, cadence, and accent.
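A minimal sketch of the duration-to-width coupling follows, with each letter’s width scaled by how long its sound lasted relative to a standard. The 80-millisecond standard and the stretch limits of 0.6 and 1.6 are hypothetical placeholders chosen only for illustration; the real standard would come from the speech-rate data still to be gathered.

```python
# Illustrative assumptions only; none of these values come from the Atlas.
STANDARD_DURATION_MS = 80.0  # assumed average duration of one letter sound
MIN_WIDTH_SCALE = 0.6        # narrowest width that keeps letters recognizable
MAX_WIDTH_SCALE = 1.6        # widest width before letters look whimsical


def width_scale(duration_ms: float) -> float:
    """Return the width multiplier for a letter sound of the given duration."""
    ratio = duration_ms / STANDARD_DURATION_MS
    # Clamp the stretch and compression to the typographic limits.
    return max(MIN_WIDTH_SCALE, min(MAX_WIDTH_SCALE, ratio))


def cadence_widths(timed_letters):
    """Turn (letter, duration_ms) pairs into (letter, width multiplier) pairs."""
    return [(letter, width_scale(ms)) for letter, ms in timed_letters]
```

Read left to right, the resulting sequence of multipliers is the visual cadence: a slow, deliberate speaker yields a run of values above 1.0, while rapid speech compresses toward the lower limit.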

SOUND VOLUME / TYPE HEIGHT

If letter width is determined by the duration of pronunciation, then height is calculated by the volume, in decibels, of loudness. The inherent understanding of dynamic motion for sound is up and down, or vertically positioned, with the basic universal concept of increased height corresponding to increased volume.

Similar to the sound waves that measure and visually represent sound levels, each letter’s inflection in sound will create a patterned product that could represent accent in the undulating rhythm of speech. Everyone has a mixture of low and high rhythm in their speech that produces a distinct cadence or sound pattern. Again, the large pool of quantitative data in Labov’s Atlas of North American English: Phonetics, Phonology, and Sound Change provides a sampling from which to calculate the average level of speech. With sound volume referenced visually against the standard height, variations will sculpt steps and dynamic movement in the graphic. The English language is riddled with accent and inflection rules that place emphasis on proper pronunciation, and, notably in French and English, inflection can indicate whether a communication is a statement or an inquiry. In theory, therefore, a truly monotone narrator’s visual dialect would resemble traditional reading text, with all the letters level at the same height in capitals. A soft-spoken person’s lettering would be smaller than that of an elderly person who speaks loudly because of diminished hearing, whose lettering would tower to show more volume or emphasis. Sound as a variable could also reflect heightened emotion, where inflection frequently manifests, and dare to hint at dialect in speech patterns that are stylistically and socially stratified geographically.
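The volume-to-height pairing can be sketched the same way. Because decibels are already a logarithmic measure, this sketch simply scales height in proportion to the difference in decibels from a standard level. The 60 dB standard, the 2% height change per decibel, and the limits are assumptions made for illustration, not figures from the research.

```python
# Illustrative assumptions only.
STANDARD_LEVEL_DB = 60.0   # assumed average speaking level
SCALE_PER_DB = 0.02        # height change per decibel away from the standard
MIN_HEIGHT_SCALE = 0.7     # smallest capital height for soft speech
MAX_HEIGHT_SCALE = 1.5     # tallest capital height for loud speech


def height_scale(level_db: float) -> float:
    """Return the height multiplier for a letter spoken at the given level."""
    scale = 1.0 + (level_db - STANDARD_LEVEL_DB) * SCALE_PER_DB
    # Clamp so whispers stay readable and shouts stay on the page.
    return max(MIN_HEIGHT_SCALE, min(MAX_HEIGHT_SCALE, scale))


def is_monotone(levels_db, tolerance=1e-9):
    """True when every letter renders at the same height, like reading text."""
    heights = [height_scale(level) for level in levels_db]
    return max(heights) - min(heights) <= tolerance
```

A narrator whose every letter sits at the standard level produces one flat line of capitals, which is the monotone case described above; any inflection shows up as steps in the heights.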

With the objective of creating a visual voice, and in order to evaluate the height of lettering and discern differences, a straight baseline must be maintained to anchor each letter on one side, the bottom, and allow the other three sides to flexibly express the narrator’s unique speech in pitch, rate, and volume. With these three lone elements individually influencing each linguistic element, rhythm, cadence, and inflection in pronunciation will result in unique visual outcomes.

VISUAL VOICE COMPILATION

At this stage in the research, I have the basic system for constructing a unique visual form of voice that conveys personalized speech characteristics such as cadence, accent, and the resemblance of dialect through multiple visual vocal patterns.
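As a rough sketch of how the three measurements might be compiled, each capital can carry a stroke weight, a width, and a height, and be set on one shared baseline so that only the bottom edge is fixed. The class names, the `lay_out` function, and the spacing value are hypothetical; the sketch assumes the three values per letter have already been derived from pitch, duration, and volume.

```python
from dataclasses import dataclass

BASELINE_Y = 0.0  # every letter is anchored here along its bottom edge


@dataclass(frozen=True)
class Glyph:
    letter: str
    weight: float  # stroke weight, driven by pitch
    width: float   # body width, driven by sound duration
    height: float  # capital height, driven by volume


@dataclass(frozen=True)
class PlacedGlyph:
    glyph: Glyph
    left: float    # horizontal position of the left edge
    bottom: float  # always the shared baseline
    top: float     # baseline plus the letter's own height


def lay_out(glyphs, gap=0.1):
    """Place glyphs left to right on a straight baseline."""
    placed = []
    cursor = 0.0
    for glyph in glyphs:
        top = BASELINE_Y + glyph.height
        placed.append(PlacedGlyph(glyph, cursor, BASELINE_Y, top))
        # Advance by this letter's width plus a fixed gap between letters.
        cursor += glyph.width + gap
    return placed
```

Holding the bottom edge constant means every variation in the voice reads as movement along the top and sides of the line of text, while the line itself stays anchored for reading.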

PHASE II & III: LOWERCASE FORMULATION & DIALECT / TYPEFACES

Now the linguistic and typographic conversation turns more complex as the ideation of dialect is tossed into the formula. Dialect is an umbrella term referencing a range of ideas, as seen in Wikipedia: “One usage refers to a variety of a language that is a characteristic of a particular group of the language’s speakers. Under this definition, the dialects or varieties of a particular language are closely related and, despite their differences, are most often largely mutually intelligible, especially if close to one another on the dialect continuum. The term is applied most often to regional speech patterns, but a dialect may also be defined by other factors, such as social class or ethnicity. A dialect that is associated with a particular social class can be termed a sociolect, a dialect that is associated with a particular ethnic group can be termed an ethnolect, and a geographical/regional dialect may be termed a regiolect. According to this definition, any variety of a given language can be classified as ‘a dialect.’”

However, the dialect groundwork is already set by the linguistic geography from the sociophonetic research on supra-regionalization of the English language in North America by dialectology and sociolinguistics expert Professor William Labov, documented in his book, The Atlas of North American English: Phonetics, Phonology, and Sound Change. Working within 18 unique dialectal regions, key phonetic distinctions can be coupled with acoustic measurements to generate heuristic guidelines and developmental typographic formulas that give text the capability to reflect dialect and voice variations. From this, a new facet of the formula can be concocted.

At this point, the nuances of typography, along with linguistic understanding, are advanced to show dialects that are differentiated by key speech elements apparent in differing geographical areas of the United States. Labov, in his Atlas, and Wolfram identified four linguistic characteristics that could be found solely in these areas of the country, evolving from the historical development of the region and from the amount and type of outside influences that affected the area enough to imprint on the language. By pairing the understood visual perception of font styling with the history of the area, the visual dialect will showcase variations and a visual typographic feel.

These regions differ dialectally and are strong enough in linguistic elements to be differentiated.

Focusing on the South and the North as shown in the map below

THE SOUTH
The South encompasses the states of (???). These states were formed by land grants from the King of England to wealthy loyalist families. The South is noted linguistically for having a “purer” (???) colonial English dialect than present-day England, owing to the influx of foreign influences resulting from large-scale and broad colonization. The South has had fewer foreign ethnic influences beyond the African slave trade, which is noted by Labov as affecting only the coastal or lowcountry areas of the southern states, French in Louisiana, and Spanish reaching up from neighboring Mexico. Therefore, the font selection of a (??) serif embraces the English heritage of Roman capitals. Labov found four linguistic deviations that indicate what he terms the “southern drawl.”

When any of these linguistic phenomena is detected, the font styling is set to a created southern serif standard; further, according to the degree of dialect shown by these indicators, additional glyphs and typographic nuances become evident. Similar to the ideation of “swash” fonts that replicate handwriting, there are degrees of typographic embellishment. Four indicators denote four levels of southern font elements. These typographic elements would manifest in ears, noses, throats, arms, bowls, and legs that give the font character and emotional personification.

THE NORTH
The Northeast is typically characterized by its big cities, like New York City and Boston, and is historically considered the melting pot area, as Ellis Island is one of the gateways into America for immigrants from all over the world. The influx of many ethnicities made tremendous impacts on the language and on modern urban accents. With the Industrial Revolution and modernization being a large part of the history of the Northeast, sans serif styling suits the uniform, modern simplicity needed to represent many differing cultures.

Labov indicated four Northern linguistic elements.

With the Northern typography being a standard sans serif, the levels would again be typographic in nature; however, the lessening of embellishment would denote more of a uniform urban visual language. The levels would use creative glyphs and the switching of letterforms, like the “a” and “g,” that have two accepted shapes.

Determining standards in production values and specific typography allows deviations in linguistics and speech to be visually formed and graphically showcased. There will have to be a default standard typography that comes into play if none of the eight regional linguistic elements from the South or North is identified. This standard default would indicate a regional dialect that has not yet been evaluated and for which no typography has been set. It could be reflected by a neutral typography, a modern serif or a stylized script, that would not be confused with the Southern or Northern typography.
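The selection logic in this section can be sketched as a simple count of detected indicators. The indicator names below are hypothetical placeholders, since the four Southern and four Northern elements are not yet listed here, and sending a tie to the Southern serif is an arbitrary assumption made only to keep the sketch complete.

```python
# Placeholder identifiers; the actual eight linguistic elements are not
# specified in this draft.
SOUTHERN_INDICATORS = frozenset({"south_1", "south_2", "south_3", "south_4"})
NORTHERN_INDICATORS = frozenset({"north_1", "north_2", "north_3", "north_4"})


def select_typography(detected):
    """Return (style, level) for the set of indicators found in a voice.

    The level, from 0 to 4, is the number of matching regional indicators
    and stands for the degree of typographic treatment applied.
    """
    found = set(detected)
    south_level = len(found & SOUTHERN_INDICATORS)
    north_level = len(found & NORTHERN_INDICATORS)
    if south_level == 0 and north_level == 0:
        # No evaluated regional dialect detected: use the neutral default.
        return ("default", 0)
    if south_level >= north_level:  # tie goes to the South (assumption)
        return ("southern_serif", south_level)
    return ("northern_sans", north_level)
```

For example, a voice showing two of the Southern indicators would be set in the Southern serif at the second level of embellishment, while a voice showing none of the eight falls back to the neutral default.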