
Assessing Social Media – Methods

I have written about various social media and web technologies as they relate to knowledge management (KM), and as they are discussed in the literature.  But I haven’t really touched on how the literature approaches measuring the application and success of such technologies in an organizational context.  Prusak notes that one of the priorities of KM is to identify the unit of analysis and how to measure it (2001, 1004).  In this review paper I will examine some of the readings that have applied this question to social media. For the sake of consistency, the readings I have chosen deal with the assessment of blogs for the management of organizational knowledge, but all of the methods discussed could be generalized to other emerging social technologies.

Grudin argues that most past attempts to develop systems for preserving and retrieving knowledge failed because digital systems required information to be represented explicitly, when most knowledge is tacit: “Tacit knowledge is often transmitted through a combination of demonstration, illustration, annotation, and discussion” (2006, 1). But the situation, as Grudin explains, has changed: “old assumptions do not hold…new opportunities are emerging” (ibid.). Digital storage is no longer sold at a premium, so the informal and interactive activities through which tacit knowledge spreads can now be captured and preserved, and emerging trends (blogs, wikis, the ever-increasing efficiency of search engines, and of course the social networks such as Twitter and Facebook that have come to dominate the Internet landscape) open up a multitude of ways in which tacit knowledge can be digitized.

In his analysis of blogs, Grudin identifies five categories (2006, 5):

Diary-like or personal blogs, which develop the skill of engaging readers through personal revelation;

A-list blogs by journalists and high-profile individuals, which serve as a source of information on events, products and trends;

Watchlists, which track references across a wide selection of sources and reveal how a particular product, organization, name, brand or topic is being discussed;

Externally visible employee blogs, which provide a human face for an organization or product and can offset the potential legal and PR risks for a corporation;

Project blogs, internal blogs that focus on work and serve as a convenient means of collecting, organizing and retrieving documents and communication.

Lee, et al. make a similar move in categorizing the types of public blogs used by Fortune 500 companies (2006, 319):

Employee blogs (maintained by rank-and-file employees, varying in content and format)

Group blogs (operated by a group of rank-and-file employees, focused on a specific topic)

Executive blogs (feature the writings of high-ranking executives)

Promotional blogs (promoting products and events)

Newsletter-type blogs (covering company news)

Grudin does not conduct any formal assessment of blogs, except to provide examples of project blogs and, drawing on his personal experience, to identify the technical and behavioral characteristics that allowed that particular sub-type to succeed (2006, 5-7). Lee, et al.’s approach to assessing blogs involves content analysis of 50 corporate blogs launched by the 2005 Fortune 500 companies (2006, 322-23). In addition to the categories above, Lee, et al. also identified five distinct blogging strategies in their findings, which broadly fall under two approaches (321):

Bottom-up, in which all company members are permitted to blog, and each blog serves a distinct purpose (not necessarily assigned by a higher authority)[1];

Top-down, in which only select individuals or groups are permitted to blog, and the blogs serve an assigned purpose that rarely deviates between blogs.

As the names suggest, the top-down approach exercises greater control over information, while companies adopting the bottom-up approach give their employee bloggers greater autonomy.

Huh, et al. developed a unique approach in their study of BlogCentral, IBM’s internal blogging system (2007).  The study combined interviews with individual bloggers about their blogging practices with content analysis of their blogs.  Based on these data, they were able to measure two characteristics of each blog: its content (personal stories / questions provoking discussion / sharing information or expertise) and its intended audience (no specific audience / specific audience / broad audience); a toy sketch of this coding scheme appears after the list below.  Their analysis yielded four key observations:

– Blogs provide a medium for employees to collaborate and give feedback;

– Blogs are a place to share expertise and acquire tacit knowledge;

– Blogs are used to share personal stories and opinions that may increase the chances of social interaction and collaboration;

– Blogs are used to share aggregated information from external sources by writers who are experts in the area.
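To make the two coding dimensions concrete, here is a minimal sketch of how posts coded for content and intended audience could be cross-tabulated; the sample data and labels are illustrative, not Huh et al.’s instrument:

```python
from collections import Counter

# Hypothetical coded posts: each post gets one content code and one audience code,
# loosely following the two dimensions described by Huh et al.
coded_posts = [
    {"content": "personal story", "audience": "no specific audience"},
    {"content": "sharing expertise", "audience": "broad audience"},
    {"content": "question provoking discussion", "audience": "specific audience"},
    {"content": "sharing expertise", "audience": "specific audience"},
]

# Cross-tabulate the two dimensions to see which combinations dominate.
crosstab = Counter((post["content"], post["audience"]) for post in coded_posts)

for (content, audience), count in crosstab.most_common():
    print(f"{content} / {audience}: {count}")
```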

Rodriguez examines the use of WordPress blogs in two academic libraries for internal communication and knowledge management at the reference desk (2010).  Her analysis measures the success of these implementations using diffusion of innovation and organizational lag theories. Rogers’ innovation diffusion theory establishes five attributes of an innovation that influence its acceptance in an organizational environment: relative advantage, compatibility, complexity, trialability, and observability (2010, 109). Organizational lag, meanwhile, identifies the discrepancy between the adoption of a technical innovation (the technology itself) and the corresponding administrative innovation (the underlying administrative purpose for implementing the technology, usually a change in workflow intended to increase productivity).  In analyzing the two implementations of the blogging software, Rodriguez finds that both libraries succeeded in terms of employee adoption of the technical innovation but failed with the administrative innovation.  This was due specifically to the innovation’s poor observability, “the degree to which the results of the innovation are easily recognized by the users and others” (2010, 109, 120). The initiators of the innovation in both cases did not “clearly articulate the broader administrative objectives” or “demonstrate the value of implementing both the tool and the new workflow process” (2010, 120). Had they done so, Rodriguez suggests, the blogs might have been more successful.

While each of these studies approaches blogging differently (project blogs, external corporate blogs, internal corporate blogs, internal group blogs) and measures different aspects of the technology (what it is, how it is used, whether it is successful), together they reveal a number of valuable approaches to studying social media in the KM context. Categorization, content and discourse analysis, interviews, and the application of relevant theoretical models are all compelling methods for assessing social media and web technologies.

 


[1] One of the valuable contributions of Lee, et al.’s study is its identification of the essential purposes for which corporate blogs are employed, including product development, customer service, promotion and thought leadership. The notion of ‘thought leadership’ in particular, as a finding of their content analysis, is worth exploring; it suggests that the ability to communicate innovative ideas is closely tied to natural leadership skills, and that blogs and other social media (by extension) can help express these ideas. Lee, et al.’s findings also suggest that ‘thought leadership’ in blogs builds the brand, or ‘human’ face, of the organization while acting as a control over employee blogs, evidenced by the fact that it appears primarily in blogs that employ a top-down strategy.


Bibliography

Grudin, J. (2006). Enterprise Knowledge Management and Emerging Technologies. Proceedings of the 39th Hawaii International Conference on System Sciences, 1-10.

Huh, J., Jones, L., Erickson, T., Kellogg, W.A., Bellamy, R., and Thomas, J.C. (2007). BlogCentral: The Role of Internal Blogs at Work. CHI EA 2007 (Computer/Human Interaction), April 28-May 3, San Jose, CA, 2447-2452. doi:10.1145/1240866.1241022

Lee, S., Hwang, T., and Lee, H. (2006). Corporate blogging strategies of the Fortune 500 companies. Management Decision, 44(3), 316-334.

Prusak, L. (2001). Where did knowledge management come from? IBM Systems Journal, 40(4), 1002-1007.

Rodriguez, J. (2010). Social Software in Academic Libraries for Internal Communication and Knowledge Management: A Comparison of Two Reference Blog Implementations. Internet Reference Services Quarterly, 25(2), 107-124.

Needling the Old Guard: XML in Prosopography

For the last few weeks we have been discussing the ongoing debate in the digital humanities between textual markup and databases. Reading K.S.B. Keats-Rohan’s “Prosopography for Beginners” on her Prosopography Portal (http://prosopography.modhist.ox.ac.uk/index.htm), I found it interesting that the tutorial focuses initially and primarily on markup. Essentially, Keats-Rohan outlines three stages to prosopography:
1. “Data modelling”—For Keats-Rohan, this stage is accomplished by marking up texts with XML tags “to define the group or groups to be studied, to determine the sources to be used from as wide a range as possible, and to formulate the questions to be asked.” It does far more than that, however, since the tags identify the particular features of the sources that need to be recorded. Keats-Rohan covers this activity extensively with eleven separate exercises, each with its own page (a toy sketch of this stage and the next appears after the three stages below).
2. “Indexing”—This stage calls for the creation of indexes based on the tag set or DTD developed in stage one. These indexes collect specific types of information, such as “names”, “persons” and “sources”. They are then massaged, with the addition of biographical data, into a “lexicon”, to which a “questionnaire” (i.e. a set of questions to query your data points) is applied. Ideally, it is suggested, this is done by creating a relational database with appropriately linked tables. A single page is devoted to the explanation of this stage, with the following apology:

It is not possible in the scope of this tutorial to go into detail about issues relating to database design or software options. Familiarity with the principles of a record-and-row relational database has been assumed, though nothing more complex than an Excel spreadsheet is required for the exercises.

…11 lengthy exercises for XML, but you’re assumed to appreciate how relational databases work by filling out a few spreadsheets?
3. “Analysis”—This is, of course, the work of the researcher once data collection is complete. This section of the tutorial consists of a page only slightly longer than stage two’s, with four sample exercises designed to teach users how prosopographical analysis can be conducted.
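Before saying what bothers me about this division of attention, here is a purely hypothetical miniature of stages one and two: a short passage marked up with invented tags, and a name index derived from that markup. None of the tag names below come from Keats-Rohan’s tutorial:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Stage one (in miniature): a source passage marked up with invented tags.
marked_up_source = """
<source id="charter-001">
  <person key="aelfric-1">Ælfric</person>, <office>abbot</office> of
  <place>Eynsham</place>, witnesses a grant to <person key="wulfstan-2">Wulfstan</person>.
</source>
"""

# Stage two (in miniature): walk the markup and build a simple name index.
root = ET.fromstring(marked_up_source)
name_index = defaultdict(list)
for person in root.iter("person"):
    name_index[person.get("key")].append(
        {"name": person.text, "source": root.get("id")}
    )

print(dict(name_index))
```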
It strikes me as incongruous that, for a research method that relies so heavily on the proper application of a relational database model, so little time is devoted to discussing its role in processing the data. Instead, Keats-Rohan devotes the majority of her tutorial to formulating an XML syntax that, when all is said and done, only adds an unnecessary layer of complexity to processing source data. You could quite easily do away with stage one entirely, create your index categories in stage two as database tables, and process (or “model”) your data at that point, simply by entering it into your database. What purpose does markup serve as a means of organizing your content if you are just going to reorganize it into a more versatile database structure?
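By contrast, here is a sketch of the shortcut I am describing: skip the markup stage and enter the same information directly into relational tables. The table and column names are invented, and SQLite stands in here only for illustration:

```python
import sqlite3

# Create the stage-two "index" categories directly as database tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source  (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE person  (key TEXT PRIMARY KEY, name TEXT);
CREATE TABLE mention (person_key TEXT REFERENCES person(key),
                      source_id  TEXT REFERENCES source(id),
                      role TEXT);
""")

# "Model" the data simply by entering it: no intermediate XML layer required.
conn.execute("INSERT INTO source VALUES ('charter-001', 'Grant to Eynsham')")
conn.execute("INSERT INTO person VALUES ('aelfric-1', 'Ælfric')")
conn.execute("INSERT INTO mention VALUES ('aelfric-1', 'charter-001', 'witness')")

# The "questionnaire" then becomes ordinary queries against the linked tables.
for row in conn.execute("""
    SELECT person.name, source.title, mention.role
    FROM mention
    JOIN person ON person.key = mention.person_key
    JOIN source ON source.id = mention.source_id
"""):
    print(row)
```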
Keats-Rohan’s focus on markup starkly emphasizes how much more highly humanities scholars value XML than databases. Since the two are useful for quite different purposes, and relational databases have so much to offer humanities scholarship—as prosopographies prove—I am baffled that such a bias persists.

The Implications of Database Design

In studying the database schema for the Prosopography of Anglo-Saxon England (PASE), several features of the design are immediately apparent[1].  Data is organized around three principal tables, or data points: the Person (the historical figure mentioned in a source), the Source (a text or document from which information about historical figures is derived), and the Factoid (the dynamic set of records associated with a particular reference in a source about a person).  There are a number of secondary tables as well, such as the Translation, Colldb and EditionInfo tables, which provide additional contextual data about the source, and the Event, Person Info, Status, Office, Occupation and Kinship tables, among others, which provide additional data for the Factoid table.  Looking at these organizational structures, it is clear that the database is designed to pull out information about historical figures based on Anglo-Saxon texts.  I admire the versatility of the design and the way it interrelates discrete bits of data (even more impressive when tested using the web interface at http://www.pase.ac.uk), but I can’t help recognizing an inherent bias in this structure.

Reading John Bradley and Harold Short’s article “Using Formal Structures to Create Complex Relationships: The Prosopography of the Byzantine Empire—A Case Study”, I found myself wondering about the choices made in the design of both databases.  The PBE database structure appears to be very similar, if not identical, to that of PASE.  Perhaps it’s my background as an English major rather than a History major, but I found the structure especially unhelpful in one particular instance: how do I find and search the information associated with a unique author?  With its focus on the historical figures written about in sources, rather than on the authors of those sources, the creators made a conscious choice to value historical figures over authors and sources.  To be fair, the structure does not necessarily preclude searching author information, which appears in the Source table, and there is likely something to be said about the anonymous and possibly incomplete nature of certain Anglo-Saxon texts.  In the PASE interface, the creators appear to have resolved this issue somewhat by allowing users to browse by source and listing the author’s name in place of the title of the source (which, no doubt, is done by default when the source document has no official title).  It is then possible to browse references within the source and to match the author’s name to a person’s name[2].

The decision to organize information in this way, however, de-emphasizes the role of the author and his historical significance, and reduces him to a faceless and neutral authority.  This may be intended to facilitate interpretation: Bradley and Short discuss the act of identifying factoid assertions about historical figures as an act of interpretation, in which the researcher must make a value judgment about what the source is saying about a particular person (8).  Questions about the author’s motives would only problematize this act.  The entire organization of the database, in fact, results in the almost complete erasure of authorial intent.  What this analysis of PASE highlights for me is how important it is to be aware of the implications of our choices in designing databases and creating database interfaces.
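To make that structural bias concrete, here is a rough sketch of the shape I am reading into the PASE schema. The table and column names are my simplifications, not the actual PASE definitions, and the sample rows are purely illustrative:

```python
import sqlite3

# A drastically simplified approximation of the three principal tables:
# Person, Source, and Factoid (a factoid ties a person to a reference in a source).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE source  (id INTEGER PRIMARY KEY, title TEXT, author TEXT);
CREATE TABLE factoid (id INTEGER PRIMARY KEY,
                      person_id INTEGER REFERENCES person(id),
                      source_id INTEGER REFERENCES source(id),
                      assertion TEXT);
""")
conn.execute("INSERT INTO person VALUES (1, 'Aldhelm 3')")
conn.execute("INSERT INTO source VALUES (1, 'De virginitate', 'Aldhelm')")
conn.execute("INSERT INTO factoid VALUES (1, 1, 1, 'named as author of the work')")

# Asking what the sources say *about* a person is what the structure is built for.
print(conn.execute("""
    SELECT person.name, factoid.assertion
    FROM factoid JOIN person ON person.id = factoid.person_id
""").fetchall())

# Asking about an *author*, by contrast, means going in through the Source table,
# where the author is just another text column rather than a first-class entity.
print(conn.execute(
    "SELECT title FROM source WHERE author = 'Aldhelm'"
).fetchall())
```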
The creators of PASE might not have intended to render the authors of their sources so impotent, but the decisions they made in the construction of their database tables, in the user interface, and in the approach to entering factoid data had that ultimate result.

Bibliography

Bradley, J. and Short, H. (n.d.).  Using Formal Structures to Create Complex Relationships: The Prosopography of the Byzantine Empire.  Retrieved from http://staff.cch.kcl.ac.uk/~jbradley/docs/leeds-pbe.pdf

PASE Database Schema. (n.d.). [PDF].  Retrieved from http://huco.artsrn.ualberta.ca/moodle/file.php/6/pase_MDB4-2.pdf

Prosopography of Anglo-Saxon England. (2010, August 18). [Online database].  Retrieved from http://www.pase.ac.uk/jsp/index.jsp


[1] One caveat: As I am no expert, what is apparent to me may not be what actually is.  This analysis is necessarily based on what I can understand of how PASE and PBE are designed, both as databases and as web interfaces, and it’s certainly possible I’ve made incorrect assumptions based on what I can determine from the structure; this is not unlike the assumptions researchers must make when identifying factoid assertions (Bradley & Short, 8).
[2] For example, clicking “Aldhelm” as a source will list all the persons found in Aldhelm, including Aldhelm 3, bishop of Malmesbury, the eponymous author of the source (or rather, collection of sources).  Clicking Aldhelm 3 brings up the Person record, or factoid: Aldhelm as historical figure.  The factoid lists all of the documents attributed to him under “Authorship”.  Authorship, incidentally, is a secondary table linked to the Factoid table; based on the structure, this information seems to be derived from the Colldb table, which links to the Source table.  All this to show that it is possible, but by no means evident, to search for author information.

Asimov Update: Gender and Otherness

I’ve been working on my encoding of Asimov’s robot stories. I have reworked the pr_ref tag to include attributes for the source’s gender and “otherness”, and generalized the source attribute values (phuman, shuman, probot, srobot, nvoice) so that they can be used when analyzing a corpus of different texts.

My encoding can now examine how human-robot interactions in the text relate to gender (e.g. do more female characters than male characters respond emotionally to the robots? Do male characters physically interact with the robots more often?).

I can also track which references portray the robot as “other” and which portray the robot as “same” in relation to the source factions in the text.  This otherness/sameness dichotomy is by no means a perfect science, but given a careful reading, most references in the text imply one or the other.  (Not unlike determining the difference between an emotive and an interactive reference, determining “otherness” relies on interpretation.)

As well, I have made it possible for the principal robot character to reference itself.  This is important in a text like “Someday”, where the robot “the Bard” tells a story about itself.
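Here is a simplified sketch of the kind of tally this encoding makes possible. The markup below is a toy example with stand-in attribute names rather than my actual encoding:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A toy passage using pr_ref elements with source, gender, and otherness attributes.
# This is illustrative markup, not the actual encoding of the story.
encoded = """
<story title="Someday">
  <p>Niccolo kicked at the machine.
     <pr_ref source="phuman" gender="male" otherness="other">the Bard</pr_ref></p>
  <p>The Bard began to tell a story about itself.
     <pr_ref source="probot" gender="none" otherness="same">the Bard</pr_ref></p>
</story>
"""

root = ET.fromstring(encoded)
refs = list(root.iter("pr_ref"))

# Tally references by the gender of the source and by otherness/sameness.
by_gender = Counter(ref.get("gender") for ref in refs)
by_otherness = Counter(ref.get("otherness") for ref in refs)

print(by_gender)      # which genders are the source of robot references
print(by_otherness)   # how often the robot is framed as "other" vs "same"
```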

Click on the screenshot below to see an example of how I’m using the Mandala browser to visualize these features.

[Screenshot: the Mandala Browser visualizing the Bard’s robot references and their “otherness”]

Interpretation and the Semantic Web

HuCo 500 – weekly questions

 

“Semantic interpretation deals with imprecise, ambiguous natural languages, whereas service interoperability deals with making data precise enough that the programs operating on the data will function effectively” (Halevy et al., 2009).

Halevy et al. make it very clear that there is a distinction between the Semantic Web and semantic interpretation.  The difference is obvious, but it also raises the question of why we call it the “Semantic Web” in the first place.  What is significant about the associations of the word “semantic”?  What makes ontologies for classifying data “semantic”?

 

Data is useful for identifying patterns (“temporary structures”), and patterns can be used to identify a problem, but can they be valuable in solving the problem, or does that require, as Moretti puts it, a qualitative explanans? If interpretation is required to draw conclusions based on pattern recognition, how do we quantify the interpretive act?  Returning to the article by Halevy et al., how do we situate the “semantic” in the Semantic Web?  How do we teach machines to interpret?  Is this even possible?

 

Readings:

Halevy, Alon, Peter Norvig, and Fernando Pereira. “The Unreasonable Effectiveness of Data.” IEEE Intelligent Systems, 2009.

Moretti, Franco. “Graphs.”  Graphs, Maps, Trees: Abstract Models for a Literary History. NY: Verso, 2005.