Content Analysis, Definition of

Content analysis is a widely used method in communication research and is particularly popular in media and popular culture studies. It is a systematic, quantitative approach to analyzing the content or meaning of communicative messages. As a descriptive approach to communication research, it is used to describe communicative phenomena. This entry provides an overview of content analysis, including its definition, uses, process, and limitations.

Definition and Uses of Content Analysis

Content analysis is a quantitative method for analyzing communicative messages that follows a specific process. In many communication studies, scholars determine the frequency of specific ideas, concepts, terms, and other message characteristics and make comparisons in order to describe or explain communicative behavior. Content analysis can be used to examine the manifest or latent content of communication, depending on the research question. Manifest content is the specific characteristics of the message itself, or what the communication literally says. For example, when a husband tells his wife, “You look fine, honey,” the manifest content of the message expresses that the wife looks adequate or appropriate. Latent content is the underlying message, or what the communication implies and how it is interpreted. When the wife hears, “You look fine, honey,” she might interpret it to mean that she does not look good but her husband is tired of waiting for her to get ready. Content analysis can study both types of content.
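
As a minimal sketch of counting manifest content, the following Python example tallies how often terms of interest appear in a set of messages. The function name count_terms, the sample messages, and the term list are illustrative assumptions, not part of any established procedure.

```python
import re
from collections import Counter

def count_terms(messages, terms):
    """Count how often each term of interest appears across messages."""
    wanted = {t.lower() for t in terms}
    # Start every term at zero so absent terms still show up in the result.
    counts = Counter({t: 0 for t in wanted})
    for message in messages:
        # Lowercase the text and pull out runs of letters as words.
        for word in re.findall(r"[a-z']+", message.lower()):
            if word in wanted:
                counts[word] += 1
    return counts

# Hypothetical messages used only for illustration.
messages = [
    "You look fine, honey.",
    "Fine, we can go now.",
    "You look wonderful tonight.",
]
term_counts = count_terms(messages, ["fine", "wonderful", "terrible"])
for term, n in sorted(term_counts.items()):
    print(term, n)
```

A count of this kind captures manifest content only. Latent content, such as whether "fine" signals impatience, would still require human coders working from defined rules.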

Scholars use content analysis to describe or explain communication; however, content analysis cannot be used to infer cause-and-effect relationships. Although it is primarily a way to describe communication, content analysis can be used in conjunction with other methods. It is useful as a starting point for studying the effects of particular messages through other research methodologies when understanding the content of communication is pivotal to examining the effect. Content analysis can be used in conjunction with experimental research when the dependent variable is message-related behavior. For example, researchers who study online civility on social media and message boards use content analysis to analyze posts. A researcher could design an experiment in which each participant is exposed to a series of comments written in a specific tone (civil or uncivil) and then adds a comment of his or her own. The participant’s comment could vary based on the messages to which he or she was exposed. The researcher would conduct a content analysis of the participants’ comments and compare them with the original comments to determine whether the tone of the original comments affected how participants responded.
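
One way to picture the comparison in this example is a cross-tabulation of the tone participants were exposed to against the tone coded for their replies. The sketch below assumes the replies have already been coded by human coders; the data, the crosstab and share helpers, and the two tone labels are hypothetical.

```python
from collections import Counter

# Hypothetical coded data: (tone the participant was exposed to,
# tone that coders assigned to the participant's own comment).
coded_replies = [
    ("civil", "civil"), ("civil", "civil"), ("civil", "uncivil"),
    ("uncivil", "uncivil"), ("uncivil", "civil"), ("uncivil", "uncivil"),
]

def crosstab(pairs):
    """Tally reply tone within each exposure condition."""
    table = {}
    for condition, tone in pairs:
        table.setdefault(condition, Counter())[tone] += 1
    return table

def share(table, condition, tone):
    """Proportion of replies in a condition that received a given tone code."""
    total = sum(table[condition].values())
    return table[condition][tone] / total if total else 0.0

table = crosstab(coded_replies)
for condition in sorted(table):
    uncivil_share = round(share(table, condition, "uncivil"), 2)
    print(condition, dict(table[condition]), uncivil_share)
```

The table itself only describes the coded comments. Any claim that exposure changed participants' tone rests on the experimental design and an appropriate statistical test, not on the content analysis alone.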

Content analysis, as a method, has several uses. First, content analysis is a flexible method used by scholars and practitioners; that is, it can be used in a wide variety of contexts. Second, content analysis can be used to characterize communication and make comparisons, such as comparing the types of persuasive messages used in beauty ads. Third, content analysis is useful for studying communication in nontraditional settings. While mass media communication is an obvious application of the method, content analysis can be used in a variety of settings, including digital communication, speech therapy, work groups, and the like.

Researchers agree that content analysis should meet three key criteria: objectivity, systematicity, and generality. First, content analysis must be objective. For the findings of a content analysis to have value, the method must be objective and free from bias. Different methods can be employed to ensure objectivity (e.g., using multiple coders and measuring intercoder reliability, using objective codes and procedures). For example, a researcher may hope to find something specific from his or her analysis, and that hope could affect how he or she interprets the data. One way to prevent researcher bias from affecting the results would be to use a second or third coder in the analysis.
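
Intercoder reliability can be quantified in several ways. The following sketch computes simple percent agreement and Cohen's kappa, a common chance-corrected agreement statistic for two coders. The choice of kappa is an assumption made for illustration, since the entry does not name a specific statistic, and the coder data are hypothetical.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of items to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must rate the same non-empty set of items")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same category
    # if each coded independently at his or her own base rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same eight comments.
coder_a = ["civil", "civil", "uncivil", "uncivil", "civil", "uncivil", "civil", "civil"]
coder_b = ["civil", "civil", "uncivil", "civil", "civil", "uncivil", "uncivil", "civil"]

print(round(percent_agreement(coder_a, coder_b), 2))
print(round(cohens_kappa(coder_a, coder_b), 2))
```

Here the coders agree on six of eight items, but kappa is noticeably lower than the raw agreement because some of that agreement would be expected by chance.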

Second, content analysis should be systematic. In identifying and interpreting content, using a particular system to determine what will and will not be included in the dataset and in the conclusions helps avoid researcher bias. Without a systematic approach, researchers could elect to include only the data that support the research question or hypothesis, thereby influencing the results, which in turn affects objectivity. Carefully defining the codes used to analyze the data and carefully training coders are important steps in this process.

Finally, content analysis should meet the criterion of generality; that is, the results of the content analysis should have theoretical relevance. Researchers agree that content analysis, as a method, should not be applied to a text simply because it can be; rather, its application should culminate in results that can answer a research question or test a hypothesis. Studying the curse words that contestants on a matchmaking show use to refer to one another might be racy and interesting, but ultimately knowing that information should have a greater purpose.

The Process of Content Analysis

As previously stated, it is critical that content analysis is conducted systematically. As such, scholars outline various step-by-step processes for utilizing the method. While the number of steps in the process differs by scholar, most agree on several key steps to conducting a content analysis.

Define the Population

First, researchers must define the population, or what is going to be studied. Carefully defining the population is an important step in the process. The population should be consistent with the research question, and should be narrow enough to be manageable. For example, a population for the research question, “What words do protagonists in romantic comedies use to describe their love interests to their social network?” would be romantic comedy films. However, the number of existing romantic comedies might be too vast. It could be more useful to focus on films within a specific time frame (e.g., 2005–2015), films with consistent protagonists (e.g., single 30-somethings in New York City), or films featuring female protagonists. The population should be clearly defined.
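
Narrowing a population can be thought of as applying explicit inclusion rules to a catalog of texts. The sketch below is illustrative only; the Film record, its fields, the define_population helper, and the film titles are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Film:
    title: str
    genre: str
    year: int
    protagonist_gender: str

def define_population(films, genre, first_year, last_year, protagonist_gender=None):
    """Keep only films that satisfy every inclusion rule."""
    selected = []
    for film in films:
        if film.genre != genre:
            continue
        if not first_year <= film.year <= last_year:
            continue
        if protagonist_gender is not None and film.protagonist_gender != protagonist_gender:
            continue
        selected.append(film)
    return selected

# Hypothetical catalog entries used only for illustration.
catalog = [
    Film("Hypothetical Film A", "romantic comedy", 2003, "female"),
    Film("Hypothetical Film B", "romantic comedy", 2008, "female"),
    Film("Hypothetical Film C", "romantic comedy", 2012, "male"),
    Film("Hypothetical Film D", "thriller", 2010, "female"),
]
population = define_population(catalog, "romantic comedy", 2005, 2015)
print([film.title for film in population])
```

Writing the rules down in this form makes it easy to report exactly which texts were eligible and why others were excluded.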

Select Coding Units

Once the population is defined, coding units, or units of analysis, are selected. Coding units are what is coded and counted from the population. Coding units are observable and measurable and are a consistent way of categorizing the text. Coding units can be words, phrases, amount of time or space utilized, paragraphs, full articles, speakers, characters, photographs, advertisements, television programs, and the like. Coding units should meet three criteria: They should be exhaustive, mutually exclusive, and rule-based. Coding units should be exhaustive and cover all possibilities; that is, all coded items should fit into a category. For this reason, content analysts will often include an “other” category. Not only should all coded items fit into a category, but the categories should also be mutually exclusive; that is, each coded item should fit into only one category. If a coded item can fit into multiple categories, the categories are not defined narrowly enough and should be refined. Finally, coding units should be rule-based. Before coding begins, rules should be established for what items will be coded and into which category an item will fit.
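
The three criteria can be illustrated with a small rule-based coder. In this sketch the codebook, its category names, and the code_unit function are hypothetical; an "other" category keeps the scheme exhaustive, and an error is raised when an item matches more than one category, which signals that the categories need refining.

```python
# Hypothetical codebook: each category is defined by an explicit rule,
# here a set of words that belong to it.
CODEBOOK = {
    "appearance": {"beautiful", "gorgeous", "cute"},
    "personality": {"funny", "kind", "smart"},
}

def code_unit(word, codebook=CODEBOOK):
    """Assign a word to exactly one category, or to "other"."""
    matches = [category for category, terms in codebook.items()
               if word.lower() in terms]
    if len(matches) > 1:
        # Categories overlap, so they are not mutually exclusive.
        raise ValueError(f"{word!r} fits several categories: {matches}")
    # Falling back to "other" keeps the scheme exhaustive.
    return matches[0] if matches else "other"

for word in ["Funny", "gorgeous", "tall"]:
    print(word, code_unit(word))
```

In practice, rules for latent content are rarely reducible to word lists, so a codebook of this kind is usually accompanied by written definitions and coder training.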

Select Sample of Messages

Once the population is defined and coding units have been selected, messages are sampled. Sampling is typically necessary because the population is too large to code in full. The sample should be large enough to allow meaningful analysis and to support the claim that it is representative of the larger population.

Researchers identify several options for sampling, including random, stratified, interval, and cluster sampling. In random sampling, every text in the population has the same chance of being selected for analysis. Stratified sampling identifies strata (e.g., time slot, geographical region, type of ad) and proportionately selects a sample within each stratum. Interval sampling involves drawing a sample based on regular intervals (e.g., every third broadcast, each Monday edition of a daily newspaper, every nth episode). Finally, in cluster sampling, groups fitting the specific population are sampled, and the elements within each group are coded. For example, a cluster sample could include all prime-time, network television shows airing Thursday evening from 7–10 p.m.
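
The four sampling options can be sketched in a few lines of Python using only the standard library. The function names, the fixed seeds, and the list of hypothetical broadcasts are assumptions made for illustration.

```python
import random

def simple_random_sample(population, k, seed=0):
    """Every item has the same chance of selection."""
    return random.Random(seed).sample(population, k)

def interval_sample(population, step, start=0):
    """Take every step-th item, beginning at index start."""
    return population[start::step]

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(population, cluster_of, chosen_clusters):
    """Keep every item that belongs to one of the chosen clusters."""
    return [item for item in population if cluster_of(item) in chosen_clusters]

# Hypothetical population: 28 broadcasts spread over days and time slots.
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
broadcasts = [
    {"id": i, "day": days[i % 7], "slot": "prime" if i % 2 == 0 else "daytime"}
    for i in range(28)
]

print(len(simple_random_sample(broadcasts, 5)))
print([b["id"] for b in interval_sample(broadcasts, 3)])
print(len(stratified_sample(broadcasts, lambda b: b["slot"], 0.5)))
print([b["id"] for b in cluster_sample(broadcasts, lambda b: b["day"], {"Thu"})])
```

Fixing the random seed, as done here, makes a sample reproducible, which supports the systematic documentation of how messages were selected.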

Coding, Analysis, and Interpretation

Once the population is sampled, messages within the sample are coded, analyzed, and interpreted. Messages are coded based on the coding units, and frequencies of codes are calculated. Coding is an important part of the process, and to address concerns of reliability, multiple coders will code the same messages. If coding inconsistency is high, the inconsistency should be reported and explained, and the unreliable data should be removed from the final dataset. Once codes are tabulated, data is analyzed and reported, most often by reporting descriptive statistics, including tables. Finally, the results are interpreted to answer the research question. Results should be analyzed considering how the results contribute to theory and how the results contribute to practical knowledge.
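
Once codes have been assigned, tabulating them is straightforward. The sketch below builds a simple frequency table with percentages; the frequency_table function and the coded data are hypothetical and stand in for whatever categories a given study defines.

```python
from collections import Counter

def frequency_table(codes):
    """Return (code, count, percent) rows, most frequent code first."""
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(code, n, 100.0 * n / total) for code, n in counts.most_common()]

# Hypothetical codes assigned to ten coding units.
codes = ["appearance"] * 5 + ["personality"] * 3 + ["other"] * 2

rows = frequency_table(codes)
for code, n, percent in rows:
    print(f"{code:<12} {n:>3} {percent:5.1f}%")
```

Descriptive tables like this one summarize what was found; interpreting the pattern in light of the research question and relevant theory remains the researcher's task.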

Limitations of Content Analysis

While content analysis is useful as a descriptive tool, it has limitations. First, while content analysis can describe communicative messages and trends, it cannot be used to infer cause–effect relationships. Second, content analysis faces challenges of generalizability; that is, sampling can be difficult for a variety of reasons, and it is often difficult to compile a representative sample. As a result, researchers often cannot generalize the results of a study beyond the sampled messages to other categories of content. Finally, content analysis is a complex, time-consuming, and meticulous process.

While content analysis as a method has limitations, ultimately it serves as a useful and heuristic tool. Content analysis is useful for describing communication phenomena and can be used as a starting point for future causal research. Because it can be used in a wide variety of contexts and for many purposes, the communicative messages that can be studied using content analysis are unlimited, provided that they are recorded and accessible. Content analysis provides a systematic, quantitative examination of communicative messages from which descriptive inferences can be drawn.