December 20, 2007

New book (edited volume) on CMC

A new collection of essays, edited by Jeannine Gerbault, has just been published by L'Harmattan. The volume brings together selected papers from the international colloquium La Communication Médiatisée par les Technologies de l'Information et de la Communication, held in May 2006 at the Université de Bordeaux III. It includes papers on social interaction in CMC environments, applied linguistics studies, and sociolinguistic analyses of computer-mediated discourse. The reference is:
Gerbault, J. (Ed.). (2007). La langue du cyberespace: de la diversité aux normes. Paris: L'Harmattan.
It will also be distributed by Les Presses de l'Université Laval in North America. It is scheduled to appear in bookstores in early January 2008. Check it out if you get a chance.

October 27, 2007

Language variation and CMC

Overview
Over the past few years, sociolinguists have begun to pay more attention to discourse produced in computer-mediated environments. While some scholars have described computer-mediated discourse (CMD) as an emerging oral-written hybrid or new variety of language and have proposed CMD typologies (Collot & Belmore, 1996; Crystal, 2001; Herring, 2007), others have focused specifically on language variation--in the Labovian sense--in CMD. But what kind of language variation can sociolinguists analyze in CMD?

Paolillo (1999, 2001) provided two of the first analyses of language variation in a corpus of Internet Relay Chat. Basing his work on the notions of social networks and tie strength as contributing factors in language variation (see Milroy & Milroy, 1992), he explored two variables: word-final orthography, where "z" can be used in place of an expected "s," and the use of vulgar language. He found that higher tie strength was positively correlated with orthographic variation and vulgar language.

Herring and Paolillo (2006) investigated gender and genre variation in weblogs (blogs). They tested the algorithm used by the website "GenderGenie," which is purported to identify masculine versus feminine discourse style based on linguistic and genre factors. They found that although gender and genre differences existed among blogs of different types, GenderGenie was only about 50% accurate.

van Compernolle and Williams (2007) provided a cross-type analysis of orthographic variation in French-language CMC. They reported that much variation existed on IRC, while it was almost nonexistent in moderated chat discussions. Discussion fora fell in between. They then compared certain orthographic variations thought to be mimetic of informal spoken French (e.g., t'es = tu es or Ø y a = il y a) with a small corpus of sociolinguistic interviews. They found that IRC resembled informal speech, whereas fora and moderated chat moved in the direction of the formal written language.

van Compernolle (forthcoming) analyzed the variable omission of French ne on IRC. He reported a ne retention rate of only 16.1%, which matched frequencies found in informal European French. VARBRUL analyses revealed that subject type (i.e., NP, pronoun, or [- overt subject]) and phonology were the most influential factors.
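
For readers curious about what a retention rate means in practice, here is a rough Python sketch of how one might count ne retention in a handful of chat lines. This is not the method used in the study cited above, and it does not reproduce the VARBRUL analysis. The function name ne_retention_rate and the sample lines are hypothetical and invented purely for illustration. The sketch only looks at negations built with pas and simply checks whether ne or n' shows up earlier in the same line, so it would miscount cases such as pas used as a noun.

```python
import re

# Matches the negative particle as a separate word ("ne") or elided ("n'", "n’").
NE_PATTERN = re.compile(r"\b(?:ne\b|n['’])", re.IGNORECASE)
PAS_PATTERN = re.compile(r"\bpas\b", re.IGNORECASE)


def ne_retention_rate(utterances):
    """Return (retained, total, rate) for utterances containing 'pas'.

    Simplifying assumption: every 'pas' is a negation, and 'ne' counts as
    retained if it appears anywhere before the first 'pas' in the line.
    """
    retained = 0
    total = 0
    for utterance in utterances:
        match = PAS_PATTERN.search(utterance)
        if match is None:
            continue  # no negation with 'pas' in this line
        total += 1
        text_before_pas = utterance[:match.start()]
        if NE_PATTERN.search(text_before_pas):
            retained += 1
    rate = retained / total if total else 0.0
    return retained, total, rate


# Invented toy lines, not real corpus data.
sample = [
    "je sais pas",
    "je ne sais pas",
    "j'ai pas faim",
    "c'est pas grave",
    "salut tout le monde",
]

retained, total, rate = ne_retention_rate(sample)
print(f"ne retained in {retained} of {total} negations ({rate:.1%})")
```

A real analysis would need hand-checked tokens and would code each one for factors such as subject type before running any statistics; the sketch only shows the basic counting step.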

Although not a variationist study, Williams and van Compernolle (2007) explored second-person pronoun use in French-language chat environments. They reported tu use as high as 99% when compared to vous-singular. In a secondary analysis, the authors addressed IRC participants with vous and found that IRC users see vous-singular as sociopragmatically inappropriate.


Limitations
The biggest limitation to sociolinguistic analyses of CMD is the lack of biographical information about informants (Paolillo, 2001). In studies of speech, researchers know the real or approximate age of informants, their gender, level of education, and socioeconomic status, among other things. This type of information is very difficult, if not impossible, to collect in CMC environments. Another limitation is related to our lack of access to information about messages before they are sent. Since CMD is predominantly "written" (i.e., typed), a message is only made available once the participant sends it. Even in synchronous chat, participants have the opportunity to revise their messages before we can see them (i.e., before they hit "Enter"). Therefore, we don't have access to false starts, typographical errors, or anything else CMC users might do before sending the message.


Advantages
There are several advantages to collecting data from text-based CMC environments. First of all, we have access to large amounts of data that do not need to be transcribed or interpreted. We can copy and paste data verbatim and then analyze them. Second, on-line discourse is "naturally" occurring language. In other words, we have access to discourse produced in authentic contexts, without the intimidating influence of the observer, which Labov (1972) labeled "the observer's paradox." Relatedly, the third advantage is that participants are addressing each other instead of a fieldworker. This allows us to observe a very different kind of interaction, especially if, for example, we wanted to analyze something like questions and responses. Since interviewees rarely address questions to fieldworkers, informant-produced questions are not part of that type of discourse or interaction. In CMC, just like in focus group-type interviews, informants direct the flow of discourse for themselves.


New directions
There are many areas of CMD that need to be explored by sociolinguists and applied linguists. One very interesting area is orthography. The Internet--and computers more generally--have changed the way we think about text. New writing systems have been developed by CMC users, but unlike Crystal's (2001) notion of "NetSpeak," discourse styles vary depending on the type of CMC used (e.g., chat, blogs, discussion fora, etc.).

Morphosyntax can also be analyzed. Many of the variations observed in written and spoken language exist in on-line communication as well. This area is particularly interesting for sociolinguists because we can track variation and change in new contexts of communication. We can also explore the interaction between writing, speech, and CMD through these types of analyses.


References

Collot, M., & Belmore, N. (1996). Electronic language: A new variety of English? In S. C. Herring (Ed.), Computer-mediated communication: Linguistic, social and cross-cultural perspectives (pp. 13-28). Amsterdam: John Benjamins.

Crystal, D. (2001). Language and the Internet. Cambridge, England: Cambridge University Press.

Herring, S. C. (2007). A faceted classification scheme for computer-mediated discourse. Language@Internet, 1/2007, [np].

Herring, S. C., & Paolillo, J. C. (2006). Gender and genre variation in weblogs. Journal of Sociolinguistics, 10, 439-459.

Labov, W. (1972). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.

Milroy, L., & Milroy, J. (1992). Social network and social class: Toward an integrated sociolinguistic model. Language in Society, 21, 1-26.

Paolillo, J. C. (1999). The virtual speech community: Social network and language variation on IRC. Journal of Computer-Mediated Communication, 4(4), [np].

Paolillo, J. C. (2001). Language variation on Internet Relay Chat: A social network approach. Journal of Sociolinguistics, 5, 180-213.

van Compernolle, R. A. (forthcoming). Morphosyntactic and phonological constraints on negative particle variation in French-language chat discourse. Language Variation and Change.

van Compernolle, R. A., & Williams, L. (2007). De l'oral à l'électronique: La variation orthographique comme ressource sociostylistique et pragmatique dans le français électronique. Glottopol, 10, 56-69.

Williams, L., & van Compernolle, R. A. (2007). Second-person pronoun use in French-language chat environments. The French Review, 80, 804-820.

October 24, 2007

Working bibliography

I'm putting together a working bibliography of applied linguistics research on computer-mediated discourse, which I'll post here on this blog (see the bottom of the page). For now, I'm looking in particular for entries related to sociolinguistics, pragmatics, and language variation in non-educational computer-mediated communication environments (other lists will be started soon). I've started a list, but if you have any articles, chapters, or other resources you'd like to add, leave the bibliographical information in a comment. Thanks!