116 QUALITATIVE DATA ANALYSIS WITH ATLAS.TI categories. Otherwise, the numbering of all other codes in the category would have to be adjusted to properly sort the new subcategories. More on code word labels, quotations and numbers You have seen above that you can create a classification system to distinguish between various levels of codes in the code list. A code word label should not be too long, even if there is no maximum length imposed by the software. The margin offers only limited space and you don’t want to use up half of your screen space displaying the Code Browser or the Code Manager just to see the code labels. Code labels are also used in the Code Co-occurrence and the Code-Document Table. Because of this, you should accustom yourself to using abbreviations and to writing a longer and more detailed definition in the comment field of the Code Manager (or the inspector in the Mac version). The ‘right’ length for quotations As with other aspects of qualitative research, there are no set regulations, only rules of thumb. Consider the likelihood that you will not always work in the context of your data. You will create paper output from time to time, or export data to Excel tables that give you the quotations but not the context. A quotation should be long enough for you to understand its meaning without the context. If you know the data very well, quotations can be shorter. Three words might be enough to remind you of everything else that is going on in the data, but when you work in a team and the labor of coding is divided, you are less familiar with all the data and so quotations need to be longer. There could, for example, be a rule that quotations should be at least a full sentence. Quotation length also relates to the chosen analytic approach. Conversation analysts tend to code very short data fragments of only a few words. 
Therefore, the answer needs to be: it depends… For you to test whether the quotations are the correct length for your first coding experience, there are several ways to read the quotations of a selected code. I describe how to do this in Skills training 5.2. What to do with repeated occurrences Another question that is often asked is what to do when the same thing is mentioned several times in a longer paragraph. Can quotations be ‘interrupted’? Should the entire paragraph be coded or every segment separately? Since there is no ‘split quotation’, i.e. a quote that summarizes non-contiguous data segments in one quotation, I recommend either coding each instance individually or coding only the first one and linking the other data segments via hyperlinks. If you code the entire paragraph you cannot focus on the issue of interest, because there will be too much other information around it when retrieving the quotations later.
CREATING A CODING SCHEME 117 Figure 5.10 Hyperlinking repeated occurrences Coding only the first occurrence and ignoring the rest is not a satisfactory solution either, because there might be a reason a person is repeating something again and again. If the duplications include different nuances and, thus, are not mere repetitions, this is a good reason to code them. When there is no additional information, you can simply code the first instance, create quotations from the repeated instances and link them to the first one via hyperlinks (see Figure 5.10). This way, you do not lose anything. Everything that you do not code or mark in any way you will overlook in the later process of analysis. For more information about hyperlinks and how to create them, see Chapter 7. The ‘right’ number of codes I want to come back to the puzzle analogy. If your codes are sufficiently abstract, they describe a group of puzzle pieces that belong together thematically. If you code very close to the data, you will generate many codes and, thus, label almost every piece of the puzzle. If you already have 500 codes after coding the first few interviews, you have broken up the data into pieces by noticing interesting things in them. Collecting has so far only occurred very rarely. An indication is the total number of codes combined with the code frequencies in the Code Manager. If a project has a high number of codes, the code frequency will be
low for most of them. However, you do not have to worry if a few of your codes have a low frequency. That could be because there currently is too little data on the subject, or perhaps it’s only relevant to a small group of study participants. If this difference were to be subsumed under another code, it would no longer be apparent. Thus, just because a code has a low frequency does not automatically mean it needs to be merged. A general rule is that a developed project should show a healthy balance between the total number of codes and the frequency of each one.

Evaluate yourself after the first coding exercise. Are you a lumper, who uses more abstract terms for code tags, generating only a few codes that hold many quotations? Or are you a splitter, who works very close to the data, creating many descriptive codes, each having only a few quotations? Or are you somewhere in between? I have not conducted a randomized quantitative survey on coders and coding styles, but from experience and knowing that most things in this world are normally distributed, most coders will be somewhere in the middle, using a mix of abstract and descriptive codes, while a few will be at the extremes, creating either just a few abstract codes or a lot of descriptive codes (see also Guest et al., 2012: chapter 3, who offer similar guidance).

More on categories and subcategories

When you start coding, your codes may be at various levels of abstraction, as shown in Table 5.1. When creating a code system, you need to clarify which of your codes is a higher-order or lower-order code – in other words, a category or subcategory. With a code, we collect data segments (quotations) that are similar enough for us to unite them under the same label. When we look at all the codes, in the development of categories, we look for codes that are similar enough to be placed under a common higher-order label. These codes then become the subcategories of a category.
All subcategories within a category must be clearly distinguishable from each other, and at the category level, each category must be distinguishable from all other categories. This is described by Freeman (2017: 20) as a feature of categorial thinking: ‘most categories, by virtue of being, or made into, instances of something, are therefore at the same time not instances of something else. This quality of the category may seem self-evident but it is by determining its relationship to other items […] that the overall conceptual structure underlying the categories is made visible’. From time to time, users ask me what to do if a code fits into two categories. The answer is: you need to rethink your categories if that’s the case. An effective way to do that is to diligently write code comments. As you go through and define each code, describe what you mean and what you do not mean, and give an example of a data segment encoded with the code. You will soon see which codes are not clearly separated in their meaning. Possible actions are to merge codes of similar meaning, to change code tags or to partially recode data segments until you finally see how everything fits together. The goal of categorization is to sort and organize data so that they can be queried, so that categories can be contrasted and compared both across and within, and so that relationships between them can be found and established.
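The check for codes that sit in two categories can also be done on a draft of the code system written down outside the software. The following is a minimal sketch in Python, assuming the draft is typed in by hand as a mapping from category names to subcategory labels; the function name `codes_in_several_categories` and the label ‘less time’ are invented for illustration, and nothing here is an ATLAS.ti feature.

```python
from collections import defaultdict

def codes_in_several_categories(category_map):
    """Return every code label that is listed under more than one category."""
    seen = defaultdict(set)
    for category, codes in category_map.items():
        for code in codes:
            seen[code].add(category)
    # Keep only the labels that appear in two or more categories.
    return {code: sorted(cats) for code, cats in seen.items() if len(cats) > 1}

# Hypothetical draft scheme; 'less time' is an invented label.
draft_scheme = {
    "effects positive": {"personal growth", "improved relationships"},
    "effects negative": {"less time", "personal growth"},
}

overlaps = codes_in_several_categories(draft_scheme)
print(overlaps)
```

Any label that shows up in the result is a candidate for the rethinking described above: write or revise its code comment, then merge, rename or recode until it belongs to one category only.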
Let’s return to the dog example that I gave at the beginning of Chapter 4, where I stated that a dog could, for example, be categorized by breed, by its duties or in a relational way. The data segment about the dog can be coded by multiple codes. This is not a problem and often desirable as, for instance, you may be interested in the relation between a particular breed and the relational aspect, or a dog’s duties and what this means for the relation between the dog and the human. What is important to understand is that the codes within and across categories need to be mutually exclusive. This is illustrated in Figure 5.11. All subcategories for each of the three categories are mutually exclusive; each subcategory belongs to only one category. The data segment, however, has been coded with subcategories of each category. This is the prerequisite for cross-tabulating codes (see Skills training 6.3).

Figure 5.11 Developing mutually exclusive categories

There are no fixed rules on how to identify or construct a category. As Freeman (2017: 17) noted, ‘there is no agreed theory for determining the relationship between categories and concepts’ (here referred to as codes and subcategories). It all depends on how you frame it – whether you look for taxonomies like dog breeds, or relational aspects like friend, helper, foe or both. The conceptualization we choose may, among other things, depend on our academic background, our research questions, the literature we have been reading and our personal life experiences. It depends, too, on the things that we are able to notice; we must first be able to
perceive and identify them (Bruner et al., 1956; Goldstone and Kersten, 2003). I will return to this notion when discussing inter-coder agreement in Chapter 9. It is important to understand that we are making assumptions about what we are researching. And I would like to invite you to reflect on the interdependence between the categories you develop and your own background. As Freeman (2017: 19) put it, ‘The conceptual categories you use to make sense of the world are constructed out of experience and, in turn, are used to make sense of experience.’

Do I need to code everything?

This is a typical student question. After asking whether they need to transcribe everything, students’ next question is, ‘Do I need to code everything?’ My answer usually is, ‘Yes – with a few exceptions.’ One of those exceptions is if you need to go through a large corpus of data material that holds information you are interested in but also contains lots of information that is not relevant. In this case, you can use the project search and the automatic coding to find relevant sections to code them in more detail. There are methodological approaches where researchers only look at certain aspects of the data – for example, in phenomenological research. Phenomenologists are only interested in how something is experienced and not in any of the interpretations the research participant may provide. They, therefore, may code only those parts that are relevant for a phenomenological assessment. Coding the entire data corpus, however, will allow the same or other researchers to look at other aspects of the data to complement the findings. When you collect your own data, why go through all that effort of conducting interviews or focus groups only to analyze half of them?
Parts of an interview that you initially find uninteresting may prove to be very relevant if you look at them more closely or in conjunction with other aspects that are mentioned elsewhere in the interview. What is not said can at times be more interesting than what is said. At first, it may seem no more than a polite gesture when an interviewee asks whether you want some more coffee. Examining the text more closely, it may turn out that the interviewee was trying to avoid answering your question and looking for an opportunity to digress from the topic or gain time to think of an answer. You cannot know this initially and, therefore, I do not think it is wise to transcribe an interview only partly or code it only selectively. If you do not know from the outset whether a passage is important or not, you can invent a code name that reflects this, such as ‘not sure yet’, ‘maybe important’ or ‘review later’. These codes mean that you will not completely forget about these passages. Anything not coded has no chance of being reviewed later, and if these seemingly unimportant passages turn out to clarify what is going on in your data, then this outweighs the little extra work required.

BUILDING AN EFFICIENT CODE SYSTEM

In Figure 5.12, you can see how the charting of the landscape is taking shape. During first-cycle coding, only larger components such as flowers, animals, houses, etc. were marked.
Subsequently, the flowers were classified by color, the animals by species, the forest and the house were divided into different components and the various aspects of the lake were identified. This is like what I did above when I reworked the merged code list of the four coders. I identified the various positive and negative effects of having a child, different reasons for having or not having children and a few issues related to parenting. As the codes were only based on a short exercise and saturation of themes was not reached, several topics remained unsorted. When working on your own projects, you need to continue coding until you have the feeling that no new themes will emerge. If you start with a list of predefined codes that you enrich with codes derived from the data, the point of saturation is likely to be reached sooner. The proper time to start structuring your code system is the first saturation point, when you mainly use drag and drop to apply already existing codes.

Figure 5.12 Structured data landscape

You already know the ATLAS.ti tools and functions that you can use to structure your coding system – but as a novice it may not be obvious how to use them. Therefore, you can practice this in Skills trainings 5.3 and 5.4. You will learn:

• How to create subcategories from a larger group of quotations that were collected under one code label.
• How to build a main category from small pieces of data that are coded on a descriptive level.
But first I want to show you how to retrieve quotations for a code in different ways. You will need this skill for later exercises.

SKILLS TRAINING 5.2 RETRIEVING ALL QUOTATIONS OF A CODE

• Open the Code Manager and select a code.
• In the contextual ribbon for Codes, select Report.
• In the dialog that opens, choose Selected Items in the first section.
• As report options, choose Quotations and Content. If you do not select Content, only the quotation ID and name are included in the report.
• Read the quotations and decide whether you have chosen the right length.

Figure 5.13 Creating a report in the Code Manager

In the Quotation Manager, you can review quotations as follows:

• Open the Quotation Manager and select a code in the side panel. Now you only see the quotations of the selected code. Click on each quotation and read it in the preview area (Figure 5.14). (If you double-click you can read it in the context of the data.)
Figure 5.14 Reviewing quotations by code in the Quotation Manager

If you want to export the data to Excel:

• Click on Excel Export in the contextual ribbon of the Quotation Manager.
• Select Filtered Items in the filter section, and, as report options, ID and Quotation Content. As above, if you select Quotation, only the quotation name will be included in the report. This is a useful choice for audio, video and geo data, where you have no textual content.

Figure 5.15 Creating an Excel report
Starting with version 8.4, you will find a preview option in every quotation list. This is a convenient way to read the quotation content without the need to create reports.

SKILLS TRAINING 5.3 DEVELOPING SUBCATEGORIES

When you start coding, it is sometimes easier to first collect data segments with similar content under one main topic rather than thinking about potential subcategories immediately. As we have seen, coders who mainly code like this are referred to in the literature as ‘lumpers’ (Bernard and Ryan, 2010). If you find yourself being a lumper or lumping occasionally, this skills training is for you. The goal in developing subcategories is to achieve a good description of heterogeneity and variance in the data material. In principle, two approaches are possible: subcategories can be developed based on previous knowledge (i.e. known aspects from the theoretical literature) or generated empirically based on the data material. In the following I will explain the empirically based approach. This is what Bazeley and Richards (2000) refer to as ‘code-on to finer categories’.

• Download the project bundle file ch5_skills training 5.3_building sub categories from the companion website and import it (see Skills training 2.1 for more detail on how to import a project). The codes in this project are equivalent to the castle pieces of the jigsaw puzzle. They each hold a loose collection of data segments relating to the ‘but factor’ of having children, the effects of parenting, the variety of definitions for happiness and so on. No subordinated aspects have been coded for yet.
• Open the Code Manager and look at the list of codes. All the codes hold lots of quotations and there are too many things lumped into one code. Thus, you need to split them. I will explain the procedure based on the code ‘effects positive’. If you want to practice more, you can repeat the exercise by selecting any of the other codes in the list.
When working with your own data, you may already have some subcategories in mind when you start splitting a code. Since you are less familiar with the sample data, please first read all quotations that were collected under the code ‘effects positive’. There are several possibilities for this, as shown in Skills training 5.2.

• When reading the quotations, you will notice that some of them refer to the same aspect of the main topic and can be summarized under a subtopic. As soon as you notice some commonalities, start collecting these aspects. One possibility is to write them down on a piece of paper and run a tally. Figure 5.16 shows my notes. Given enough screen space, another way is to type your ideas into a memo.
Figure 5.16 Noticing and collecting in developing sub codes

Looking at the extracted terms helps to conceptualize them further. ‘Extended world view’ could be subsumed under ‘enriched life’, ‘fulfilment’ under ‘gives meaning to life’, and ‘appreciation’ and ‘joy/feel alive’ could be summarized under ‘positive for me’. The aim is to map the bandwidths of the topics mentioned and to find a common label for the aspects that are most similar. Each subcategory needs to be distinct from the others, but all need to be similar enough so that they can be united under one main heading. Once you have decided which subcategories you want to use, you can open the Split Code Tool (Figure 5.17). It is also possible to read the quotations directly in the Split Code Tool. After you work with ATLAS.ti for a while, you will find out which way you prefer. In part, this will depend on which device you are working with, a PC with connected monitor(s), where you have more space to review the quotations, or a laptop.

Figure 5.17 Split Code dialog

The Split Code Tool was first introduced with version 8 of ATLAS.ti and then revised based on user feedback. At the time of writing, the revised version of the tool is only available in the Mac version. Therefore, the screenshots below are from the Mac version (the tool’s implementation in Windows will look similar).

• Select the code ‘effects positive’ in the Code Manager, right-click and select Split Code from the context menu. Before you start splitting a code, think about which prefix you want to use. This will also become your category name. Always keep in mind that the limiting factor is the space on your screen. Do not choose long prefixes. This makes it difficult to work with spreadsheets when creating output. That is why I labeled the code ‘effects positive’, not ‘positive effects of parenting’. Also note that the code name begins with ‘effects...’ because I also have an ‘effects negative’ code. If I were to name these two codes ‘positive effects’ and ‘negative effects’, they would be listed under the letters P and N, based on the alphabetical order of the code list, rather than one following the other.
• Click on New Codes four times and enter the following codes as subcategories, replacing the numbers behind the prefix. The category name ‘effects positive:’ is automatically added as a prefix if you move to the next line.
• You can now begin to assign the quotations to the various subcategories.
• Go through the list of quotations and sort them into fitting subcategories. You can use either of the following two strategies for this task:
a. You can check one of the codes in the ‘Resulting Codes’ list on the right and begin to read the quotations on the left. If you find one that fits, select it and click on the Add button.
The number of the assigned code will be listed in the ‘Codes’ column behind each quotation. Figure 5.18 Adding codes as subcategories
b. The second strategy is to read a quotation, decide which subcategory to apply, select the code in the ‘Resulting Codes’ list and click on the Add button. Continue with the next quotation. If you follow this strategy, do not forget to deactivate the previously selected code. Otherwise, both codes will be assigned to the next selected quotation.

Figure 5.19 Assigning quotations to subcategories

You can add a quotation to two or more subcategories if you think the quotation has more than one aspect. It is quite common to mention several aspects in a sentence or an analysis unit. In order not to lose the context and for further analytical purposes, such as code co-occurrence analysis, it may be useful to apply multiple codes. There are exceptions, though. If you want to analyze inter-coder agreement (see Chapter 9) or use a content-analysis approach like the one by Schreier (2012), applying multiple subcategories of the same category will violate the rule of mutually exclusive coding. If you use the latter, you can activate the option ‘Mutually exclusive’. ATLAS.ti will then not allow you to select multiple subcategories (not yet implemented in the Windows version at the time of writing).

We have not yet looked at the settings at the top of the Split Code Tool. I suggest that you select the settings shown on the left: Keep the original code, but do not keep the quotations. There is no point in double-coding quotations with the category code and the subcategory codes. It a) clutters up the margin, and b) if you want to access all quotations from a category, it is much easier to use a code group for this. This means the original code stays in the list of codes, but it will have no quotations. All quotations will be linked to the subcategories. Whether you want to copy the comment from the original code to all subcategories depends on what you have written as a comment.
• After you have assigned all quotations, click on the Split Code button.
• In the Code Manager, you can now see the result of the splitting process. ATLAS.ti has created the subcategories and sorted all quotations into them. Unless you found some quotations that you could not sort into subcategories, the original code that you split shows a groundedness of 0 (zero).
• I suggest renaming the original code and changing the spelling to upper-case.
• Select all codes of the category and assign a color.
• Next, add all codes of the category to a code group of the same name. The name of the code group can be in lower-case. The code group can be used as a filter to quickly access the codes and all quotations of the category, and it will be used later for further analysis purposes (see Chapter 6).

Figure 5.20 Code list after splitting

Using upper-case letters for the category code has several purposes. ATLAS.ti offers me as a user the entity ‘code’. What I do with it is my choice. Since it makes sense for methodological and organizational purposes to distinguish between types of codes, I have developed a naming convention (see Table 5.2). Codes in upper-case represent categories, codes in lower-case with a prefix signify subcategories, codes with a hashtag are attribute codes and so on. The categories in upper-case also have the effect that they stand out like a title.

I am always asked what the purpose of the category code is because it is empty. As mentioned above, it serves as a title. As you proceed with coding, you can use the category code to capture quotations if none of the existing subcategories can be applied. If a data segment matches an existing subcategory, you can code it with the subcategory. If you encounter something that matches the overall category, but it does not fit any of the existing subcategories, then you can use the category code.
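The logic of splitting with the setting ‘keep the original code, but do not keep the quotations’ can be pictured as moving quotation IDs from one set into several others. The following is a minimal sketch in Python, not the way ATLAS.ti works internally: it assumes a code list held as a dictionary of code names and sets of quotation IDs, and the function name `split_code` and the quotation IDs are invented for illustration.

```python
def split_code(codes, original, assignments):
    """Move quotations from `original` into prefixed subcategory codes.

    codes:       dict mapping code name -> set of quotation IDs
    assignments: dict mapping quotation ID -> list of subcategory names
    Quotations without an assignment stay with the original code.
    """
    result = {name: set(quotes) for name, quotes in codes.items()}
    for quotation, subcategories in assignments.items():
        if quotation not in result[original] or not subcategories:
            continue
        for sub in subcategories:
            # The category name becomes the prefix of each subcategory.
            result.setdefault(f"{original}: {sub}", set()).add(quotation)
        result[original].discard(quotation)
    return result

# Invented quotation IDs; the subcategory labels come from the notes above.
codes = {"effects positive": {"1:1", "1:2", "2:1"}}
assignments = {
    "1:1": ["enriched life"],
    "1:2": ["enriched life", "positive for me"],  # one quotation, two aspects
    "2:1": ["gives meaning to life"],
}

after_split = split_code(codes, "effects positive", assignments)
groundedness = {name: len(quotes) for name, quotes in after_split.items()}
print(groundedness)
```

In this sketch the original code ends with a groundedness of zero because every quotation was assigned. A quotation left out of `assignments` would remain under the original code, which mirrors using the category code for segments that fit none of the subcategories yet.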
After collecting another five to ten instances, you can review them and decide whether to create a new subcategory or change existing ones. Therefore, it is important that the category code is empty before proceeding with the coding. Should it contain quotations, they should be only those that could not yet be assigned to a subcategory. When you review all
newly collected quotations, you may get an idea for a new subcategory or decide to rename an existing subcategory so that it encompasses the new data segments.

Consider the subcategories developed at this time as provisional; they can change. If you continue coding, it will become easier to find better matching code names. It could also be that you find that a subcategory is not suitable at all and that you need to integrate it elsewhere. You have only coded very few documents up to this point and this is not the last version of your code list. But you can already work with it, and the more structured your code list, the easier it will become to use it.

The Split Code Tool will be revised in a future update and will allow you to split a category code as often as needed. Knowing how to do it without the tool may, however, also be useful at times – for example, when you only want to reassign a few quotations (this means to manually assign quotations to existing subcategories replacing the main category code). This can be done by dragging the subcategory onto the category code in the margin area (see Skills training 4.5). Figure 5.21 shows a potential workflow. I opened the Code Manager in a new tab group on the right side of the document window and filtered the list by selecting a code group, so that I only see the codes of the category I am currently working on. Note the light-yellow bar on top of the code list. It shows that I have set a local filter. To retrieve its quotations, I double-clicked the category code and docked the list of quotes on the right side of the screen. This way, I can conveniently go through and review the quotes.

Figure 5.21 Manual splitting

SKILLS TRAINING 5.4 BUILDING CATEGORIES FROM DESCRIPTIVE LABELS

Are you a splitter who has already created a few hundred codes for your first two or three interviews? Then this exercise is for you.
• Import the project bundle file ch5_skills training 5.4_building categories from descriptive labels. • Open the Code Manager. This project holds 47 codes related to positive and negative effects of being a parent. Many of the code labels are based on the words the respondents used and, thus, are very close to
the data. I extended the length of the quotations to include more contextual information.

For this exercise, I want you to take a closer look at the 47 codes (Figure 5.22). The aim is to learn how to move the analysis from a descriptive to a conceptual level. Let me remind you once more of the puzzle analogy: most of the codes in this list represent only individual pieces of the puzzle. The frequency of each code is very low. If you leave this list unchanged without summarizing similar codes, you will inevitably end up in the code swamp (Figure 5.23).

At this stage, it often happens that analysts use code groups to collect their descriptive codes without further conceptualization. That is not a clever idea. It only results in the list of codes remaining unsorted and becoming very long. This complicates querying the data in the next phase of the analysis and the visualization of relationships in networks. It also becomes difficult to explain your coding scheme to a third person. Below, you will see that I use code groups as filters, so I can better focus on specific aspects. However, code groups do not replace the development of categories and subcategories in the code list itself.

Figure 5.22 List of codes that need to be aggregated

Figure 5.23 Avoiding the code swamp
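Whether a code list is drifting toward the code swamp can be estimated from the code frequencies alone. The following is a minimal sketch in Python, assuming the code names and frequencies have been copied by hand from the Code Manager into a dictionary; the function name `coding_profile`, the cutoff of three quotations and the sample frequencies are all invented for illustration and are not a rule from the methodological literature.

```python
from statistics import median

def coding_profile(code_frequencies, low_threshold=3):
    """Summarize a code list: number of codes, median frequency and the
    share of codes with fewer than `low_threshold` quotations
    (a hypothetical cutoff, to be adapted to the project)."""
    frequencies = list(code_frequencies.values())
    low = sorted(name for name, freq in code_frequencies.items()
                 if freq < low_threshold)
    return {
        "codes": len(frequencies),
        "median_frequency": median(frequencies),
        "share_low": len(low) / len(frequencies),
        "low_frequency_codes": low,
    }

# Invented frequencies for a handful of code labels
sample = {
    "a bit wiser": 1,
    "appreciating my own parents more": 2,
    "becoming a better person": 1,
    "personal growth": 8,
}

profile = coding_profile(sample)
print(profile)
```

A high share of low-frequency codes is only a prompt to look more closely, not a verdict: a rarely used code can still be worth keeping if it captures something that matters to a small group of participants.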
The codes in this project stand for things that I noticed in the data. So far, I have not collected much. To master the conglomerate of terms, I can now use the software to group similar codes. This allows me to reduce the code list using the filter function, so that I can focus on those codes that belong to one topic. If you look at the reduced list of codes, it is easier to see which codes are similar or even stand for the same meaning. The next step is to merge these codes. It does not make sense to keep all codes that describe only one or two segments of data. Analysts are often afraid of merging as they fear that they will lose something. Rest assured, you do not lose anything. Any quotations you have noticed will not disappear; they are just linked to a code at a higher aggregated level. What you lose are the descriptive labels that clutter your code list.

If you want to keep the project in its current state, make a snapshot (File/Snapshot). You can then revisit your first code list at any time. The name of the snapshot project consists of the original project name plus date and time: ‘ch5_skills training 5.4_building categories from descriptive labels (Snapshot current date & time)’. You can enter a different name if you prefer.

In what follows, I show how you can aggregate descriptive codes to create categories. If you look at the list of codes, you will quickly find that it is thematically about the positive and negative effects of parenting. Therefore, I propose to first look for codes that name a positive effect and to collect them in a code group. To do this:

• Select the first few codes in the list that you find are about positive effects by holding down the Ctrl key (e.g. the codes ‘a bit wiser’, ‘appreciating my own parents more’, ‘becoming a better person’, etc.). Drag these codes into the side panel to the left and enter the name ‘effects positive’.
Gradually add all the other codes that you think describe a positive effect. If you are not sure, look at the corresponding quote(s).
• Once you are done, click on the code group ‘effects positive’. The list of codes is now reduced to the codes that relate to the positive effects of parenting. According to my evaluation, these are 22 of the 47 codes (Figure 5.24). Since this project has only 47 codes, you may wonder why I recommend creating a group. If you code most of the data as in this example, you have not just 47 codes but 500 or more. Reducing them to 22 codes that are about one topic will be a tremendous help. This will allow you to better see which codes are more or less the same and, therefore, can be merged.
• Look at the reduced list of codes. The task is to find codes that belong to a common concept. The concept could already be represented by one of the codes in the list, such as personal growth. If this is not the case, you need to find abstract labels that you can use to combine similar codes. I suggest reducing the list further to the four following concepts:
  • Personal growth
  • Improved relationships
  • Meaningful life
  • Positive feelings
Figure 5.24 Filtered list of codes
• Select all codes that are related to personal growth by holding down the Ctrl key. Right-click and select Merge Codes from the context menu.
• In the Merge dialog, select personal growth as the code that you want to keep and click on the button Merge Codes.
Figure 5.25 Merge Codes dialog
• Look at the code list: it has become much shorter; and in my example project, the code ‘personal growth’ now holds eight quotations.
• Select all code labels that relate to improved relationships and merge those. As there is no code label ‘improved relationships’, you can merge the codes into any one of the selected codes. After merging, rename this code ‘improved relationships’. Right-click and select Rename from the context menu or press F2.
• Repeat the process for the concepts ‘meaningful life’ and ‘positive feelings’ (Figure 5.26).
Figure 5.26 List of reduced codes after merging
If you remove the filter at this time, any codes that you have identified as part of the higher-order concept ‘positive effects (of parenting)’ will no longer be displayed one below the other in the code list due to the alphabetical order. Therefore, you need to add a prefix to all these codes, so that they are united under a common heading. I suggest using the prefix ‘effects positive:’, so that all effects of parenting codes (positive and negative ones) are underneath each other later in the complete list.
• Add a prefix to all four codes by renaming them. When you enter the prefix for the first time, highlight it and press Ctrl+C to copy it. This will allow you to paste the prefix (Ctrl+V) into all other codes when renaming them.
• When you have finished renaming, remove the filter by clicking on the X on the right side of the light-yellow bar.
The last thing missing is the category code. Since you have built the category from the bottom up based on descriptive labels, it does not exist yet (unlike in Skills training 6.3).
• Create a new code with the name EFFECTS POSITIVE. Add it to the code group ‘effects positive’.
• Highlight all codes and give them a color (see Skills training 4.10).
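For readers who like to see the logic spelled out, the merge-and-prefix steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how ATLAS.ti stores or merges codes: the dictionary layout, the quotation ids and the helper names `merge_codes` and `add_prefix` are all assumptions made for the example.

```python
from collections import defaultdict


def merge_codes(codings, sources, target):
    """Re-link every quotation of the source codes to the target code."""
    merged = defaultdict(set)
    for code, quotes in codings.items():
        key = target if code in sources else code
        merged[key] |= set(quotes)
    return dict(merged)


def add_prefix(codings, codes, prefix):
    """Rename the given codes by putting a shared prefix in front of them."""
    return {
        (f"{prefix} {code}" if code in codes else code): quotes
        for code, quotes in codings.items()
    }


# Hypothetical toy project: code label -> set of quotation ids
codings = {
    "a bit wiser": {"3:1"},
    "becoming a better person": {"3:2", "5:7"},
    "personal growth": {"3:4"},
    "worry": {"5:9"},
}

merged = merge_codes(
    codings,
    sources={"a bit wiser", "becoming a better person", "personal growth"},
    target="personal growth",
)
renamed = add_prefix(merged, {"personal growth"}, "effects positive:")

# Alphabetical sorting now keeps all 'effects positive:' codes together.
for label in sorted(renamed):
    print(label, sorted(renamed[label]))
```

The sketch makes the point from the text visible: after merging, the number of quotations is unchanged; only the descriptive labels are gone.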
Voilà, there is your new category with subcategories (Figure 5.27).
Figure 5.27 Representation of a category with subcategories in the code list
It is now time to write a definition for the category and for each subcategory in the code comment field.
• Repeat the above process for negative effects of parenting starting by collecting all codes that stand for negative effects into a code group with the name ‘effects negative’.
• Click on the code group to filter the list of codes.
• Look for codes that can be subsumed under a more abstract label and merge them.
• Add prefixes. Remove the code group filter. Add the category code EFFECTS NEGATIVE and color the codes of the new category using the same color.
In the process of category development, you will notice that some of the quotes are not the right length or that some quotes do not fit into the category you are currently developing. It is about working not only with the code labels but also with the quotations at the same time. As you do so, you become more familiar with your data, learn more about its meaning and, over time, become more confident about which code fits where.
The first reordering may not at once lead to the final version of your code system. You may need to go through a second, third or even fourth cycle to sort all your codes. Analysis is an ongoing process and until you have completed the coding process, your code names may change and shift their positions. Suppose you analyze a total of 20 documents. After coding five documents, you begin to add more structure to your coding system. Then you continue to code. If you need to change it a lot, then you know that the coding system needs to be developed further. If you only need to make a few modifications – that is, if you don’t find many new things of interest or only a few new subcategories – then this is reassurance that your coding system
already fits the data well. An obvious precondition is that you need to stay alert to new, as yet unlabeled, phenomena in the data and that you do not force the developed coding system onto the data.

SKILLS TRAINING 5.5 DEFINING CATEGORIES ON THE ‘RIGHT’ LEVEL
In Skills training 5.4, you learned what to do to avoid the code swamp. There is another dangerous path that leads into the swamp if you are not careful. You come to this path when you mix two or more categories. Let’s take a look at the coded version of the example project that we used in Chapter 2: Children & Happiness sample project (chapter 2).
• Open the Code Manager. The codes at the top of the list that start with a hashtag (#) are so-called attribute codes. They encode characteristics of persons such as male, female, whether they have children or not, etc. Below this, you will find the various categories like ‘CHILDREN & HAPPINESS’, ‘DEFINITION HAPPINESS’, ‘EFFECTS NEG’ and ‘EFFECTS POS’.
Figure 5.28 Code list with categories and subcategories on two levels
Another choice could have been to convert the attribute ‘childless’ and ‘being a parent’ into a category and add all other aspects as subcategories, such as whether a childless person is male or female and what they wrote about the positive effects of parenting. The code list would then look like this:
Figure 5.29 The second way into the code swamp
This effect would continue for all other codes as well. Do you see what the problem is? Each of the subcategories for ‘effects pos’ appears four times in the list: in the code label for childless females, childless males, female parent and male parent. As a rule of thumb, every time you notice a repetition in a code name, something is wrong with the category level. In the example given here, aspects belonging to three categories have all been put together in one code. When this happens, the code label also becomes very long.
It could also be that you only have two levels and think that everything looks well structured and your coding system is OK, but that could be a deceptive conclusion. If there is repetition in parts of the code name, something is wrong here, too (see Richards, 2009; Richards and Richards, 1995; Araujo, 1995). The two code lists in Figure 5.30 show two real-world examples that I have modified a bit. In the code list on the left, different types of meats were used as categories; in the code list on the right, theoretical models were used as categories. The decision to use different types of meat and theoretical models as categories allows for nothing more than a descriptive analysis. You can count how many times something was said about the production or the profusion of each type of meat, or how many benefits and challenges have been mentioned for one of the theoretical models.
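The repetition problem is easy to quantify. The following minimal sketch compares the two naming strategies for the Children & Happiness example; the exact label strings are made up for illustration, while the four ‘effects pos’ subcategories and the ‘#fam’ and ‘#gender’ prefixes are taken from the text.

```python
from itertools import product

genders = ["female", "male"]
family = ["childless", "parent"]
effects_pos = [
    "personal growth",
    "improved relationships",
    "meaningful life",
    "positive feelings",
]

# Variant 1: three categories packed into a single code label
combined = [
    f"{f} {g}: effects pos: {e}"
    for f, g, e in product(family, genders, effects_pos)
]

# Variant 2: one code per aspect, applied together to the same segment
separate = (
    [f"#fam: {f}" for f in family]
    + [f"#gender: {g}" for g in genders]
    + [f"effects pos: {e}" for e in effects_pos]
)

print(len(combined), "combined labels vs", len(separate), "separate codes")
print(sum("personal growth" in label for label in combined),
      "labels repeat 'personal growth'")
```

With only four subcategories the combined list is already twice as long, and the gap widens with every category that is added, because the combined variant grows by multiplication and the separate variant by addition.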
Figure 5.30 Two code lists with repetitive labels
You would reach a different level of analysis if you were to correct these coding schemas. This would mean creating a category TYPE OF MEAT with the subcategories beef, chicken and turkey. This is like the attribute codes in the Children & Happiness project. All quotations referring to the repetitive sub codes ‘biology’, ‘location’, ‘processing’, etc. must be merged. These codes will then hold many quotations and need to be split. Most likely, the current subcategories will not automatically become categories. Respondents may have mentioned something about different types of locations, pros and cons of locations, production conditions, production-improvement strategies, outcomes and so on. All this is now hidden in the data and difficult to extract as you never get the complete overview of all aspects, as type of meat now divides them. The same applies to the second example with the theoretical models (Figure 5.31).
In terms of coding, the data segments need to be coded with multiple codes. The coded segments might be the same or overlapping. In the Children & Happiness sample project, I used the prefix ‘#fam’ for all family characteristics, and the prefix ‘#gender’ for male and female respondents. The hashtag has the effect of pushing all these codes to the top of the list and, therefore, they are sorted above the content codes. In addition, I colored them
gray so that the content codes and the attribute codes are easy to tell apart in the margin area next to the coded documents.
Figure 5.31 Suggestion on how to rearrange the codes so that the category is on the right level
Figure 5.32 shows how this looks in ATLAS.ti. Gender, as well as the fact of whether someone has children or not, are coded for separately. This allows more flexibility for further analysis when relating codes to each other. A third code captures that this segment is a blog entry as compared to other types of data. All other aspects that occur within this blog entry are coded wherever they fit.
Figure 5.32 Applying categories and subcategories on the correct level
The reason why aspects such as gender or family characteristics need to be coded in documents 3 and 5 of the Children & Happiness project is related to the fact that each document contains several respondents. In total, there are 139 comments from different people in the two documents. If you have case-based data, as is usual in interview studies or survey data, you must use document groups to capture family characteristics and gender.
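To see why coding each aspect separately gives more flexibility, consider the following minimal sketch. It cross-tabulates attribute codes against content codes by counting how often they were applied to the same segment. The quotation ids, the code ‘effects neg: less time’ and the helper `cross_tab` are hypothetical, and the sketch simplifies by assuming that co-occurring codes sit on exactly the same quotation; it is not a description of how the ATLAS.ti analysis tools are implemented.

```python
from collections import Counter

# Hypothetical coded segments: quotation id -> codes applied to it
quotations = {
    "3:1": {"#gender: female", "#fam: parent", "effects pos: personal growth"},
    "3:2": {"#gender: female", "#fam: childless", "effects pos: personal growth"},
    "3:3": {"#gender: male", "#fam: parent", "effects neg: less time"},
    "5:1": {"#gender: male", "#fam: parent", "effects pos: personal growth"},
}


def cross_tab(quotations, row_prefix, col_prefix):
    """Count segments on which a row code and a column code occur together."""
    table = Counter()
    for codes in quotations.values():
        rows = [c for c in codes if c.startswith(row_prefix)]
        cols = [c for c in codes if c.startswith(col_prefix)]
        for r in rows:
            for c in cols:
                table[(r, c)] += 1
    return table


by_family = cross_tab(quotations, "#fam", "effects")
by_gender = cross_tab(quotations, "#gender", "effects")

for (row, col), n in sorted(by_family.items()):
    print(f"{row} x {col}: {n}")
```

Because family status and gender are separate codes, the same data can be tabulated by either attribute without recoding anything. With combined labels such as those in Figure 5.29, each new comparison would require a different code list.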
SKILLS TRAINING 5.6 COMPARING THEMATIC TO INTERPRETIVE CODING
As explained in Chapter 2, I can only address a few methodological issues; I cannot provide a complete description of one or more research approaches, including their epistemological roots. In this skills training, I’d like to show you what other features ATLAS.ti offers to explore your data without the need for coding.
As you have learned by now, when you code a data segment, a quotation is created at the same time. You can also create free quotations, which means there are no codes associated with these. The equivalent to the paper-and-pencil method would be to mark some lines in a document with a marker. Unlike working on paper, the ATLAS.ti quotation has a name that defaults to the first 70 characters of the quotation or, for multimedia quotes, the name of the document. This default name can be changed, and you can replace it with a short summary description. You can also write a comment. Unlike short margin notes on paper, ATLAS.ti has no restrictions on the length of the quotation comment.
If you find that you code very close to the data and generate many codes, you should consider creating only free quotations at first. Instead of linking these quotations to codes that only describe the data segment, use the quotation name instead. This will prevent you from getting into the code swamp, from which it might be difficult to get out again. Instead of generating too many codes, which then must be merged again, you will initially only work with the Quotation Manager and the quotation list. Once you have ideas for codes on a more abstract level, you can begin to code these quotations. The added advantage is that merging is not necessary. All renamed quotations are preserved, and you do not lose the descriptive labels.
This is also a suitable approach if you choose an interpretive approach to analysis, as you can write your interpretations into the quotation comments while you go through your data. In the following exercise, I would like to invite you to do so. It is very interesting what happens and the types of insights you gain if you begin brainstorming about the words a respondent has used, asking yourself what they mean and why respondents used them. If you are not used to this style of working, you may be skeptical and even reject it as being subjective. Evaluate it again after you have completed this skills training.
• Open your own project that you have used for Skills training 5.1, where I asked you to code the data for content. Let’s take a look at the first paragraph in document 3:
‘I was happy before I had kids and am happy now. However, the first year of motherhood was rough. I was only 25 and becoming a mom forced me to grow up. All of a sudden, I had to become a lot less selfish and a lot more responsible, which is not easy in a culture that glorifies self-centeredness and irresponsibility. I’m a better person for becoming a mom, even if I’m not any happier.’
• Look at your coding. How have you coded this first comment? If you have not coded it yet, please do so now. You can see my version in Figure 5.33 below.
Figure 5.33 Tagging for content
• Now let’s take a closer look at the text. Which words strike you? Take these as a starting point for your interpretive writing. You could start with the word ‘happy’. What does it mean for you? What does it convey? What do we conventionally associate with it? What are ‘happy’ situations or circumstances? How does it feel to be happy? How does the opposite feel? What makes you feel happy or unhappy? What are the consequences? Can we do something to actively achieve happiness (or misfortune)? Can we do something to control it?
• I suggest you create a quotation for the entire text section and write your thoughts and ideas on these two paragraphs into a quotation comment. To do so, right-click on the bar in the margin area and select Edit comment from the context menu.
Figure 5.34 Opening the Comment Editor for quotations
• The next word you can write about is ‘kids’. Why is she using ‘kids’ rather than ‘children’? Interestingly, she is using the plural form. What are the associations that come to your mind when thinking about the word ‘children’ as compared to ‘kids’?
Here are a few more ideas that you can think and write about:
i In the second sentence, the respondent uses the word ‘motherhood’ and in her last sentence the word ‘mom’.
ii Look for flag words like ‘always’, ‘never’, ‘only’ and the like. The interviewee writes: ‘I was ONLY 25 and becoming a mom...’ The cultural context is important here. The data comes from a US blog.
iii Regarding the use of the words ‘motherhood’ and ‘mom’, the word ‘becoming’ is interesting to look at in this context as well.
iv Furthermore, consider the words ‘rough’, ‘forced me’ and ‘glorifies’. It is important to go beyond the context of the current study. This will help you to think more conceptually. What comes to your mind when you hear the word ‘rough’? Based on these first associations, ask further questions and continue with the brainstorming. Further below, I will offer my own interpretation.
v The words ‘suddenly’ and ‘responsible/irresponsible’ are likely to yield interesting insights as well, if you take a closer look.
Here is my interpretation. The use of the words ‘motherhood’ and ‘mom’ points to an outside–inside perspective. The term ‘motherhood’ is more distant and objective, whereas ‘mom’ is more personal and emotional. In connection with the word ‘becoming’, it appears that the respondent describes a process – in sociological terms, a role change towards the acceptance of the parent role.
‘Forced me’: to be forced into something is something that comes from the outside. It is usually not something you have chosen yourself. In combination with her feeling that she was only 25 when she became a mother, one can assume that she had an unplanned pregnancy. In other cultural contexts, women feel old if they do not have children at the age of 25. (Note to me: look up some statistics about the average age of first-time mothers in the US).
‘Rough’: my first association with the word ‘rough’ is a rough sea, wind, stormy weather, danger.
If the weather forecast announces rain and storm, then we can protect against it – for example, by wearing the right clothing or by staying in the house and avoiding the weather conditions (strategy). If we do not, we may get wet, catch cold or even get hurt (consequence). Another association could be: rough surface. On the positive side, rough surfaces are good because we do not slip and fall when we walk on them. On the negative side, when we drop a mobile phone onto it, the mobile phone could be scratched or damaged. What we can deduce from all these considerations are ideas and concepts for further coding and analysis. Besides the feeling that it is a rough process, how do other people describe it? If others also experience the process as rough, what factors are playing a role? And, vice versa, if the process is described as a positive experience, why is that? What makes the case different? If you find it hard to develop ideas, just type the term into a search engine like Google or Bing. There you will find enough inspiration including lists of synonyms. Synonyms for irregular surface: uneven, irregular, bumpy, lumpy, knobby, stony, rocky, rugged, rutted, etc.
Synonyms for rough, non-gentle behavior: violent, brutal, vicious, aggressive, belligerent, pugnacious, thuggish, boisterous, rowdy, disorderly, etc.
I also do this exercise in my workshops. One time, rough sex was mentioned as an association with ‘rough’. This took me by surprise for a moment. But even if it sounds a bit far-fetched at first, it fits in well with the overall interpretation and adds another twist. Rough sex could take place under duress. Like the weather, it is something the person cannot control. She may protect or defend herself. You can ask whether one can be prepared for such a situation. And if so, how? But rough sex could also be chosen freely if there is an agreement and all those involved enjoy it. You might also enjoy taking a walk in stormy weather. What are the commonalities and the differences? And what does this have to do with the Children & Happiness project or the data segment we are examining at the moment?
While going through this brainstorming exercise, we have discovered several continuums that help us to look more closely at the subject:
• voluntary – involuntary
• optional – forced
• being afraid of something – enjoying it
• protecting oneself – being unprotected
• being prepared – being unprepared
• avoiding – accepting
Parenthood can be planned or unplanned. One can accept the role or avoid assuming responsibility. Depending on one’s attitude and context, one can be well prepared for this role or not. This may affect how one experiences the transition. Some may be afraid of it; others are looking forward to it.
Here is another interpretation related to the three words ‘rough’, ‘forced me’ and ‘glorifies’: metaphorically speaking, these remind me of war, battle and victory. It was not mentioned directly by the respondent, but by using those words she might be expressing the struggle that took place in accepting her new role as a mother.
Other women may be experiencing something similar, like the ongoing dispute between the decision to stay at home with the children or to work. They may have an opinion about it but, due to societal or family pressure, they feel forced to do something they do not want. Or because of their decision, they constantly must defend themselves against hostile comments. This all might affect how they see the relationship between personal happiness and having children. When I coded the data to prepare the sample project, I took a content-analytic approach. If I were to continue the coding now, I would look for indications of role change and how other respondents dealt with it. Do they also describe a process? How did they experience and deal with it? What were the factors and contexts contributing to a successful role change? What were the hindrances? Do others write about planned vs. unplanned pregnancies, and if so, do their attitudes about the relationship between happiness and children differ? Even if you plan to perform a content analysis, it may still be useful to select some interesting segments of data at the beginning of your analysis and interpret them as shown above. I’m sure it will help you to see content that you would not otherwise have noticed.
Advantages of a well-sorted and structured code list
There are several advantages to organizing your code list in the way that I showed you above. Not only does it help you find your way around, but it also adds transparency to the research process and makes it easier for others to understand your code list. It also adds methodological rigor. When reviewing your list of codes, you need to think about which codes are still too close to the data and which, while abstract, collect too many data segments and are, therefore, too crude to be useful for further analysis. You need to consider whether and which codes are part of a category, and on what methodological level a code belongs. See also Fielding and Lee, 1998: 92ff.; Charmaz, 2014: chapter 3; Saldaña, 2009.
A well-structured code list is also important for further analysis, where you look for relationships and patterns in the data, with the goal of integrating all results to tell a coherent story. If, as in a survey, you only have questions with the answer categories ‘yes’ and ‘no’ in your questionnaire, your data will only consist of nominal variables. This means that the analysis is limited to a few test procedures, such as cross-tabs and chi-square tests, and does not go far beyond the descriptive level. This is like a code list that consists of a set of codes whose analysis level remains undefined.
An example of such a code list is shown in Figure 5.35. The total number of codes is not too bad: 168 for the entire project. Thus, the problem is not that too many codes have each been applied to only a few data segments. Rather, there are codes with low frequencies and codes with very high frequencies, of over 150 quotations. What is missing is the development of subcategories, on the one hand, and the aggregation of codes under a common denominator, on the other (i.e. the processes you learned about in Skills trainings 5.3 and 5.4).
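The review described here (which codes hold too many quotations, which hold too few) can be sketched as a simple triage over code frequencies. The code names and counts below are invented, and the thresholds are assumptions loosely based on the figures mentioned in the text (one or two segments at the low end, around 150 quotations at the high end); suitable values depend on the size of your project.

```python
def triage(groundedness, low=2, high=150):
    """Return (codes to consider splitting, codes to consider merging)."""
    split = sorted(c for c, n in groundedness.items() if n >= high)
    merge = sorted(c for c, n in groundedness.items() if n <= low)
    return split, merge


# Hypothetical code list: code label -> number of linked quotations
groundedness = {
    "topic A": 160,
    "topic B": 40,
    "detail x": 1,
    "detail y": 2,
}

split_candidates, merge_candidates = triage(groundedness)
print("consider splitting into subcategories:", split_candidates)
print("consider merging under a common label:", merge_candidates)
```

The list only tells you where to look. Whether a code is actually split or merged remains an analytic decision that requires reading the quotations.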
Figure 5.35 A fully developed code list should not look like this one
I can criticize this code list because it is my own, developed in 1999 for my dissertation research. Learning does not stop even after you get your PhD! From an analytic point of view, I did develop the ideas collected in the code list further. For example, I differentiated the code ‘Definition of impulse buying’, which holds 157 quotations, in a network. But this is not visible in the code list. The study was on impulse buying and I interviewed three groups of shoppers:
1 Utilitarian buyers: those who do not like to go shopping at all.
2 Compensatory buyers: those who do not mind going shopping and sometimes also shop for emotional reasons.
3 Addicted buyers: those who go shopping mostly for emotional reasons.
I asked everyone for their own personal definition of impulse buying. Figure 5.36 shows the networks of two of the surveyed buyer groups. On the right-hand side of each network are codes that I have derived from the literature. In addition, I have developed codes inductively from the data material; these are the psychological aspects of impulse purchases and financial considerations. As seen in the networks, not all codes were relevant to each subgroup. For example, ‘Reminder impulse buying’ was not part of the definition for addicted buyers. For them, the psychological aspects were far more important. Financial considerations were mentioned by all respondents, but these were different for each group. From the networks, one can presume that the code ‘Definition impulse buying’ might be a category and the other codes potential subcategories (Friese, 2000).
Figure 5.36 Differentiation of the code ‘Definition of impulse buying’ for two buyer groups
When I was preparing for the ATLAS.ti summer school recently, I decided to revisit my dissertation project and follow my own advice.
In reworking the code system, on the one hand, I looked for all codes with high frequencies with the aim of differentiating them and developing subcategories. On the other hand, I looked at all low-frequency codes to check which ones can be merged. As might have been expected, not all quotations coded with ‘Definition impulse buying’ belonged to the category of the same name, and not all subcategories shown in the
networks belonged to this category. As it turned out, some of the codes are subcategories of the categories ‘Reasons for impulse buying’ and ‘Triggers’.
Figure 5.37 Reworking the code list of my dissertation project in the current version of ATLAS.ti
When developing guidelines for building a code system, the available analysis tools also play a role. For example, knowing that one can create a cross-tab with codes of different categories helps to see why it is important either to apply multiple codes from various categories, or to code in an overlapping fashion. The recommendation is, therefore, to develop categories that contain only one level of subcategories (two if necessary), so you can flexibly combine the different aspects when querying the data and avoid unnecessarily long code lists and code labels. See also Guest et al. (2012: chapter 3); Bazeley (2013: 177ff).

Using syntax to distinguish between distinct levels and types of codes
As mentioned in the Introduction, CAQDAS offers its users the entity called code, node, keyword and the like (codes in ATLAS.ti). However, it does not tell us how to use it. It wants to be open methodologically and does not want to be a specific software for methodology X that uses terminology A and B. Thus, it is the user who must add methodological meaning to the entity called ‘code’ and decide whether it is a category, a subcategory, a semantic domain with subcodes, a class, a dimension, a property, etc. In Table 5.2, I give some suggestions of how you can add methodological meaning to your codes that will allow you to differentiate between different types and levels of codes.
Table 5.2 Syntax for different types and levels of codes
What | Naming in ATLAS.ti | Example | Level/type
Concept | Lower-case letters/black | personal growth; richer life | type
Category | Upper-case letters/colored | EFFECTS POS | level
Subcategory | Lower-case letters/same color as category code | Effects pos: personal growth; Effects pos: gives meaning | level
Unsorted concepts | Lower-case letters prefixed with an asterisk (*), no color | *scientific evidence; *advise | type
Dimension | Lower-case letters prefixed with a special character like a forward slash, colored by category | /TIME; /time: during; /time: after | type
Socio-demographics, names if speakers, actors, locations, organizations, etc. need to be coded | Lower-case letter prefixed with a hashtag (#) or @ for speakers, gray | #gender: male; #gender: female; @Tom; @Linda | type

I often find that users who know how to organize codes in ATLAS.ti at once start using prefixes, even though they are just beginning to code. If you do not use a deductive coding scheme, it is usually not possible to know after 20 minutes if a code is a higher- or lower-order code. Therefore, I recommend coding a few documents openly. Gather ideas before you think about how to sort and structure them. If you are using an interpretive method, you will need less data than if you are using a content-analysis approach, because you begin to think in abstract terms earlier (see Skills training 5.6).
As mentioned, it is an appropriate time to start organizing your codes once you have reached a first point of saturation. You have achieved this if you only occasionally create new codes but otherwise use existing ones. When you have reached this point, start reviewing your codes and quotes, as shown in Skills trainings 5.3 and 5.4. Split codes with a high groundedness, and group thematically similar codes with low groundedness.
The aim of grouping is to filter the list of codes, so you can better see what can be merged and which higher-order concepts can be developed from it. After the first round of restructuring your code system, it is a good time to export your project to keep it on file for later, when you write up your analysis. This way you preserve the initial coding phase and by comparing it with more developed versions of later stages, you can describe how you progressed from A to B to C in your project.
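The naming conventions suggested in Table 5.2 are regular enough to be read as a small set of rules. The sketch below is one possible reading, written in Python purely for illustration: the function `classify` and the type names it returns are assumptions, the color column of the table is ignored, and ATLAS.ti itself attaches no meaning to these prefixes beyond sorting.

```python
def classify(label: str) -> str:
    """Guess the type or level of a code from its label syntax alone."""
    if label.startswith(("#", "@")):
        return "attribute"          # socio-demographics, speakers, etc.
    if label.startswith("*"):
        return "unsorted concept"
    if label.startswith("/"):
        return "dimension"
    if label.isupper():
        return "category"           # e.g. EFFECTS POS
    if ":" in label:
        return "subcategory"        # e.g. Effects pos: personal growth
    return "concept"


examples = [
    "personal growth",
    "EFFECTS POS",
    "Effects pos: personal growth",
    "*scientific evidence",
    "/time: during",
    "#gender: male",
    "@Tom",
]

for label in sorted(examples):
    print(f"{label!r:35} -> {classify(label)}")
```

Sorting the example labels also shows the practical effect of the special characters: codes starting with `#`, `*`, `/` or `@` cluster together instead of being scattered through the alphabetical list.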
MOVING ON
If you are familiar with quantitative research procedures, it may help to compare the two levels of analysis to descriptive and inductive statistics. The equivalent of descriptive statistics in computer-assisted analysis is the development of the coding system. When set up as explained in the previous chapters, the coding system provides an overview of what is in the data. The codes on the category level can be regarded as equivalent to the variables in a survey; the subcategories indicate the variations within the code, like the values of a variable.
Once the data are coded, you have a good overview of your material and can describe it. Maybe you can already see some patterns. You can then take the analysis a step further by querying the data. The tools that can be used include the Code Co-occurrence Table and Explorer, the Code-Document Table, the Query Tool and the networks. The goal is to delve deeper into the data and find relationships and patterns.
Writing memos is very important at this stage as much of the analysis does not just happen because you apply a tool. The insights come when reading the data resulting from a query and when writing summaries and interpretations. I recommend using a special type of memo for this, something that I call research-question memos (see Skills training 5.7). At the outset, your research-question memos may simply contain descriptions and summaries of your data. Depending on your research approach and level of methodological knowledge, you may decide not to develop your research-question memos further. A thorough description may be all that you want and need. If you take the analysis a step further, some of the descriptive memos may serve as building blocks for more abstract and theoretically rich memos (see also Bazeley and Richards, 2000: part 6; Corbin and Strauss, 2008: 119). To see the benefits, you really have to do it and not just read about it.
If you do, you will find out that a lot of things happen when you enter this second phase of the analysis. It is now that you begin to see how the various codes relate to each other; you develop and test new questions, and you may find answers that you did not expect to find. In time, the research-question memos will become more abstract as you get ideas for further questions, add new research-question memos and basically take it one step further at a time, gaining more and more understanding, exploring more and more details of your data landscape and starting to see relations between them. You are then ready to make more extensive use of the network function, explained in Chapter 7.

SKILLS TRAINING 5.7 WRITING RESEARCH-QUESTION MEMOS
You may begin to collect ideas in research-question memos as you code, but more extensive writing occurs after all the data have been coded and you start to use the analysis tools which the software provides. In qualitative research, you generally do not start with hypotheses, but in most cases, you have some research questions. When using an inductive approach, researchers develop questions and hypotheses along the way. If you begin your project with some research questions in mind, you can create research-question memos very early on in the analytic process, adjust and modify them and add some more with progressive analysis.
148 QUALITATIVE DATA ANALYSIS WITH ATLAS.TI A fully developed research-question (RQ) memo: • has a proper title that tells you what it is all about; • can be identified as an analytic memo by its type; • begins with a well-written research question, possibly followed by sub questions; • includes the query (!); • contains an answer to the query – when you work on a memo over a prolonged period, you can also insert the date and time to mark your progress; • contains your answer and interpretation; • is linked to quotations that illustrate the points you are making in the memo and supports your analytic thoughts. Let’s get started with what you already know by creating a research-question memo for RQ1 (research question 1): • Open the Memo Manager and create a new memo. • Enter the title ‘RQ1: Definition of happiness – comparing childless respondents and parents’. The new memo opens in a tab next to the Memo Manager. • Change the memo type to ‘Analysis’: go back to the Memo Manager. Select the memo. In the ribbon of the Memo Manager, select Set Type. • The type ‘analysis’ does not yet exist. Type it into the memo type field. It will be added as a new type. Click on the Set Type button (see Figure 5.38). Figure 5.38 Set memo types To fill the memo with content, you can either go back to the memo editor or type the content of the memo into the pane on the lower left-hand side next to the comment field.
• Start by typing the full research question: ‘Who writes something about the definition of happiness? And of those who write something, what do they write? Are there differences between parents and childless people?’
Next, you must think about how you can find an answer to the research question. To make your research transparent, I suggest that you write down how you queried the data. As we haven’t discussed the advanced analysis tools, you don’t know yet what you need to do to answer this question. Here’s how it might look:
Figure 5.39 Research-question memo
• Write down the query. Run the query. Include outputs if applicable.
I have seen many memos that contained a lot of good interpretation, but when I asked the analysts how they came up with the ideas, they could not remember. It just takes a minute or two to spell out the query, and the effort pays off many times over. It adds a lot to your analysis in terms of trustworthiness, credibility, transparency and dependability – in other words, the quality criteria by which good qualitative research is recognized (see, e.g., Seale, 1999).
• Interpret the results and read the quotations that result from the query. Write down your interpretation.
When reading through the quotations that result from a query, you can link the memo with selected quotations. It works much like drag-and-drop coding.
• If the Memo Manager is in a tab group other than the document you are browsing, you can drag the memo from the Memo Manager onto a quotation or vice versa. Another possibility is to drag the memo from the Project Explorer onto a quotation or vice versa (Figure 5.40).
Figure 5.40 Linking a memo to a quotation in a document
• If you click on a cell in the Code-Document Table, all quotations that belong to the cell are displayed. From there, too, you can link quotations with a memo (Figure 5.41). Each time you link a memo to a quotation, the groundedness of the memo increases.
Figure 5.41 Linking a quotation to a memo from the Code-Document Table
When you begin to write your report, you can export these memos with all their linked quotations. Thus, research-question memos become the building blocks for your research report as they include the research questions, the query, the results, your interpretation and some example quotes. See Chapter 8 for further detail.
Recommendations for organizing research-question memos
After the data material has been coded, research-question memos help you approach the data analysis in a systematic way, by formulating precise queries for each individual question. The crucial point is that the memos are thematically related to contents and data segments instead of being spread out in bits and pieces all over the place. My suggestion is to create one memo for each research question. Broader questions need to be divided into sub topics, and it is probably best to create a memo for each sub topic. As you have done for codes, use letters and special characters to create abbreviations to structure and organize the list of memos. For example:
RQ1: title
RQ2: title
RQ3: title
Or:
RQ1a: title
RQ1b: title
RQ1c: title
RQ2a: title
Special characters such as an asterisk (*) are useful if you want to place certain memos, such as the research journal, at the top of the list.
If you know the research questions right from the beginning, you can add research-question memos right away when setting up the project. In this way, you can already store reflections and initial ideas on a specific topic in the correct container during the coding process. If you develop further research questions and hypotheses during the course of the analysis, you can add more memos.
REVIEW QUESTIONS
1 Explain the puzzle analogy and how it relates to qualitative data analysis.
2 Are there any rules regarding the length of a coded segment?
3 What options do you have to structure the list of codes in ATLAS.ti?
4 What are code groups useful for?
5 How would you go about developing categories?
6 How would you go about developing subcategories?
7 How can the code swamp be avoided? What are common pitfalls?
8 After having gained some personal experience with coding, what type of coder are you? Are you a lumper or a splitter? Or do you mix both styles?
9 How would you explain the difference between a tag and a code in a methodological sense?
10 What are the advantages of a well-sorted and structured code list?
11 Explain thematic and interpretive coding.
12 What is the difference between a memo and a comment in ATLAS.ti? When would you write something into a comment field? For what purposes do you create a memo? Why is it important to know this difference? You might remember that I asked you this question already at the end of Chapter 3. In the meantime, you have gained some more experience in working with the software. What are your answers now?
13 Thinking about your own research project, what kind of memos would be useful to you and why?
Test your understanding of computer-assisted N–C–T analysis:
1 Explain the N–C–T model and how it can guide the process of developing a code system.
2 Collecting is an important aspect of the N–C–T model. What role does ‘collecting’ play in the process of developing a code system and why is it important?
3 After reading this chapter and putting it into practice, can you describe how to apply the process of noticing, collecting and thinking when working on your own data set?
4 How is solving a jigsaw puzzle like qualitative data analysis?
5 How is qualitative data analysis different from solving a jigsaw puzzle?
6 What is a WASGIJ puzzle and what can be learned from it regarding qualitative data analysis?
FURTHER READING
Araujo, Luis (1995). Designing and refining hierarchical coding frames, in Kelle, Udo (ed.) Computer-aided Qualitative Data Analysis, chapter 7. London: Sage.
Auerbach, Carl and Silverstein, Louise B. (2003).
Qualitative Data: An Introduction to Coding and Analysis. New York: New York University Press.
Bazeley, Pat (2013). Qualitative Data Analysis: Practical Strategies, chapters 5–7. London: Sage.
Charmaz, Kathy (2014). Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis. London: Sage.
Evers, Jeanine C. (2011). From the past into the future: How technological developments change our ways of data collection, transcription and analysis [94 paragraphs]. Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 12(1), Art. 38, http://nbn-resolving.de/urn:nbn:de:0114-fqs1101381.
Evers, Jeanine C. (2016). Elaborating on thick analysis: About thoroughness and creativity in qualitative analysis. Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 17(1), Art. 6, http://nbn-resolving.de/urn:nbn:de:0114-fqs160163.
Gibbs, Graham (2007). Analysing Qualitative Data (Qualitative Research Kit). London: Sage.
Guest, Greg, MacQueen, Kathleen M. and Namey, Emily E. (2012). Applied Thematic Analysis, chapter 3. Thousand Oaks, CA: Sage.
Kelle, Udo and Kluge, Susann (2010). Vom Einzelfall zum Typus: Fallvergleich und Fallkontrastierung in der qualitativen Sozialforschung. Wiesbaden: VS Verlag.
Kluge, Susann (2000, January). Empirically grounded construction of types and typologies in qualitative social research. Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 1(1), Art. 14, www.qualitative-research.net/fqs-texte/1-00/1-00kluge-e.htm.
Morse, Janice M. (1994). ‘Emerging from the data’: The cognitive process of analysis in qualitative inquiry, in Morse, Janice M. (ed.) Critical Issues in Qualitative Research Methods, 22–43. Thousand Oaks, CA: Sage.
Richards, Tom and Richards, Lyn (1995). Using hierarchical categories in qualitative data analysis, in Kelle, Udo (ed.) Computer-aided Qualitative Data Analysis, chapter 6. London: Sage.
Seidel, John (1991). Methods and madness in the application of computer technology to qualitative data analysis, in Fielding, Nigel G. and Lee, Raymond M. (eds) Using Computers in Qualitative Research, 107–16. London: Sage.
Seidel, John (1998). Qualitative Data Analysis, www.qualisresearch.com/qda_paper.htm (originally published as Qualitative data analysis, in The Ethnograph v5.0: A Users Guide, Appendix E. Colorado Springs, CO: Qualis Research).
Seidel, John and Kelle, Udo (1995). Different functions of coding in the analysis of textual data, in Kelle, Udo (ed.) Computer-aided Qualitative Data Analysis, chapter 4. London: Sage.
Silver, Christina and Lewins, Ann (2014). Using Software in Qualitative Research: A Step-by-step Guide, 2nd edn, chapter 7. London: Sage.
Wolcott, Harry F. (1994). Transforming Qualitative Data: Description, Analysis and Interpretation. London: Sage.
Woolf, Nick and Silver, Christina (2018). Qualitative Analysis Using ATLAS.ti: The Five level QDA® Method. London: Routledge.
6 Querying the data and further steps in the analysis process
Ideally, you have begun to explore your own data landscape: developed some ideas, marked those things that are interesting and collected similar items in code containers. The next step is to query the data in a systematic way by going through the research questions and thinking about how to find an answer for each question in the data. ATLAS.ti offers several options to query the data. Thus, you need to attend some more skills training sessions as part of your journey. The Query Tool needs theoretical instruction first before you can apply and work with it. Other tools I can explain by way of example.
Querying data goes hand in hand with writing memos. Much of the analysis ‘happens’ when you write down your findings, not by clicking buttons in the software and outputting results. You need to take a closer look at the results, read the related quotes and then summarize and interpret what you see. You can do this using the research-question memos that I introduced in Skills training 5.7.
LEARNING OBJECTIVES
In this chapter, you will learn how to create queries to find answers to your research questions. First, I’ll introduce you to the operators you can use to build queries. They are also
QUERYING THE DATA AND WRITING MEMOS 155 the basis for the code co-occurrence tools. Then we’ll look at the two analysis options, making it possible to tabulate coded data. You will probably use these analysis functions first to get an overview. The Query Tool in comparison allows you to ask more specific questions, and you can use a combination of operators to formulate complex queries. Once you have an idea of how this works, we add an extra layer to the analysis – the use of global filters. For example, with global filters, you can compare and contrast groups or limit code co-occurrence requests to selected documents and codes. SKILLS TRAININGS Skills training 6.1: getting to know the operators Skills training 6.2: creating and working with smart codes Skills training 6.3: getting to know the Code Co-occurrence Table Skills training 6.4: getting to know the Code-Document Table Skills training 6.5: creating queries in the Query Tool Skills training 6.6: learning about code queries in combination with document attributes Skills training 6.7: creating smart groups Skills training 6.8: working with global filters SKILLS TRAINING 6.1 GETTING TO KNOW THE OPERATORS Knowing about the available operators is important when using the code co-occurrence tools, when building queries in the Query Tool and when creating smart codes and smart groups. ATLAS.ti offers three sets of operators: • Boolean operators allow combinations of keywords according to set operations. They are the most common operators used in information-retrieval systems. • Proximity operators are used to analyze the spatial relations (e.g. distance, embeddedness, overlapping, co-occurrence) between coded data segments. • Semantic operators exploit the network structures that were built from the codes. Boolean operators You probably know and have used at least two of them (OR and AND) from literature or Internet searches. For the operator OR you usually enter a vertical bar: ‘|’. 
The following operators are available: OR, AND, ONE OF and NOT:
Figure 6.1 Boolean operators
You may remember set theory from your math classes in school. Let’s go back a bit in your memory to retrieve some of that knowledge. Below, the four Boolean operators found in ATLAS.ti are explained and for each a Venn diagram is shown. Your task is to mark the area that results from the query for each operator. You will find the solutions at the end of the chapter in Figure 6.42.
OR (or Any). The operator OR is the non-exclusive OR. The term ‘A OR B’ is true if A, B or both are true. Using this operator, the search system will return all quotations that are linked to at least one of the codes in the search term. In other words, the Query Tool looks for quotations that match at least one of the specified arguments. When you want the quotations of several codes, then it is easier to create a code group rather than use the Query Tool. A code group basically consists of codes linked with the OR operator: (C1 | C2 | C3 | C4 | C5).
• In Figure 6.2, mark the area that results from the search A OR B.
Figure 6.2 A OR B
ONE OF (exactly one of the following is true). ONE OF is the exclusive OR (XOR). ‘A XOR B’ is true if either A or B is true, but it is false if both A and B are true. ONE OF represents the colloquial ‘either–or’: ‘You can have either chocolate or ice cream.’ Thus, you can have one of the two but not both. In contrast, the OR operator allows you to have both.
• In Figure 6.3, mark the area that results from the search A XOR B.
Figure 6.3 A XOR B
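In set terms, the four Boolean operators listed for this skills training (OR, ONE OF, AND and NOT) can be sketched in a few lines of Python. This is only an illustration of the set logic, not of how ATLAS.ti works internally; the quotation IDs, the variable names and the function names are all invented for the example.

```python
# Hypothetical data: each code is represented by the set of quotation IDs
# it is linked to. The IDs are made up for illustration.
quotes_a = {1, 2, 3, 4}           # quotations coded with A
quotes_b = {3, 4, 5}              # quotations coded with B
all_quotes = {1, 2, 3, 4, 5, 6, 7}  # every quotation under consideration


def op_or(a, b):
    """OR (Any): quotations linked to at least one of the codes."""
    return a | b


def op_one_of(a, b):
    """ONE OF (XOR): quotations linked to exactly one of the codes."""
    return a ^ b


def op_and(a, b):
    """AND (All): quotations linked to both codes."""
    return a & b


def op_not(a, universe):
    """NOT: everything in question minus the quotations of the code."""
    return universe - a


print(sorted(op_or(quotes_a, quotes_b)))      # [1, 2, 3, 4, 5]
print(sorted(op_one_of(quotes_a, quotes_b)))  # [1, 2, 5]
print(sorted(op_and(quotes_a, quotes_b)))     # [3, 4]
print(sorted(op_not(quotes_b, all_quotes)))   # [1, 2, 6, 7]
```

Note that NOT needs a universe to subtract from, which is why it works with a single code, whereas the other three operators combine two or more codes.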
AND (or ALL). The operator AND will find only those quotations that meet all conditions of the search term. It is very selective and finds only those segments that you have coded with both codes. Remember Skills training 5.5 on defining categories on the ‘right level’ and how to avoid the code swamp? My recommendation was to create different codes for different content areas like theme and dimension. Let’s assume you come across a data segment that contains an opinion that children make you happier and that parenting is hard but rewarding at the same time. Instead of having one code like ‘Children > happiness_hard work but rewarding’, it is better to code the text segment with ‘Children > happiness’ and ‘hard work but: rewarding’. Coded like that, you can use the AND operator in the Query Tool or the Quotation Manager to find all data segments where both codes have been applied.
• In Figure 6.4, mark the area that results from the search A AND B.
Figure 6.4 A AND B
NOT. This operator is used to check if a condition is not applicable. Its formal meaning is that all results of the negated term are subtracted from all data segments in question. You only need to select one code to use this operator. Usually it is used to exclude the quotations of one or more codes: all A or B but not C.
• In Figure 6.5, mark the area that results from the search NOT B.
Figure 6.5 NOT B
Proximity operators
Proximity describes the spatial relation between quotations. Quotations can be embedded in one another, one can enclose the other, one can overlap the other or be overlapped by the other quotation. The operators in this section exploit these relationships. They require two operands as their arguments. They differ from the other operators in one important aspect: you need to observe the place where you insert them in a query. While ‘A OR B’ is equal to ‘B OR A’, this does not hold for any of the proximity operators: ‘A WITHIN B’ is not equal
to ‘B WITHIN A’. When building a query, always enter the expressions in the order in which they appear in their natural language manifestation.
Figure 6.6 Proximity operators
Finding embedded quotations
If you are interested in reading everything about friendship in a larger segment that you have coded ‘childhood’, you use the WITHIN operator: friendship WITHIN Childhood. If you enter Childhood WITHIN friendship, you will find nothing as such a constellation does not exist.
Figure 6.7 An example for the use of the WITHIN operator
An example for the use of the ENCLOSES operator is: find all blog posts that contain information about sources of happiness (Figure 6.8).
Figure 6.8 An example for the use of the ENCLOSES operator
Finding overlapping quotations
The next two operators describe quotations that overlap one another: overlaps and overlapped by.
A OVERLAPS B retrieves all quotations coded with A that overlap quotations coded with B. One could also say: quotations overlapping at start.
A OVERLAPPED BY B retrieves all quotations coded with A that are overlapped by quotations coded with B. One could also say: quotations overlapping at end.
The ability to ask exactly whether A overlaps B or vice versa is valuable when working with video data, in which the order of events is often more important than in interview data. For example, consider a classroom situation. The teacher stands at the blackboard explaining something (A). The door opens, and a student comes in (B). Does the teacher continue with the lesson (A ENCLOSES B), or does he turn to the pupil who comes in and stop teaching (A is overlapped by B)?
ATLAS.ti can currently only retrieve quotations and not the intersection of the overlapping segments or the area where both codes apply, as neither is a quotation. This might be possible in the future.
Figure 6.9 Overlap operators
Finding co-occurring quotations
Often when you’re interested in the relation between two or more codes, you don’t really care whether something overlaps or is overlapped by or is within it or encloses it. If this is
the case, you simply use the COOC (code co-occurrence) operator. This operator is a shortcut for a combination of the four proximity operators discussed above, plus the operator AND. AND is a set operator but also finds co-occurrence, namely all coded segments that overlap 100%.
Figure 6.10 Example of a co-occurrence query
The query shown in Figure 6.10 retrieves all quotations that are coded with the codes of the code group ‘effects of parenting positive’ that co-occur with quotations coded with the code ‘#fam: 2 or more children’ in whatever way. The possible combinations are:
Figure 6.11 Five ways in which quotations can co-occur
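The proximity operators discussed in this section can be made concrete by treating each quotation as a start and end position in a document. The sketch below is a simplified model under stated assumptions, not ATLAS.ti's implementation: the `Quotation` type, the positions and the function names are hypothetical, and the direction chosen for OVERLAPS and OVERLAPPED BY is my reading of the classroom example (A starts first and B cuts into its end for ‘A OVERLAPPED BY B’).

```python
from typing import Callable, List, NamedTuple


class Quotation(NamedTuple):
    """Hypothetical quotation: a segment from start to end (start < end)."""
    start: int
    end: int


def within(a: Quotation, b: Quotation) -> bool:
    """A WITHIN B: A lies completely inside B."""
    return b.start <= a.start and a.end <= b.end


def encloses(a: Quotation, b: Quotation) -> bool:
    """A ENCLOSES B: B lies completely inside A."""
    return within(b, a)


def overlapped_by(a: Quotation, b: Quotation) -> bool:
    """A OVERLAPPED BY B (assumed reading): B starts inside A and runs past A's end."""
    return a.start < b.start < a.end < b.end


def overlaps(a: Quotation, b: Quotation) -> bool:
    """A OVERLAPS B (assumed reading): A starts inside B and runs past B's end."""
    return overlapped_by(b, a)


def cooccurs(a: Quotation, b: Quotation) -> bool:
    """COOC: the two segments share at least some data, in whatever way."""
    return a.start < b.end and b.start < a.end


def retrieve(op: Callable[[Quotation, Quotation], bool],
             quotes_a: List[Quotation],
             quotes_b: List[Quotation]) -> List[Quotation]:
    """Return the quotations of A that stand in relation `op` to any quotation of B."""
    return [a for a in quotes_a if any(op(a, b) for b in quotes_b)]


# Invented positions: one long 'childhood' segment, two 'friendship' segments.
childhood = [Quotation(0, 500)]
friendship = [Quotation(100, 180), Quotation(600, 650)]

print(retrieve(within, friendship, childhood))  # the friendship segment at 100-180
print(retrieve(within, childhood, friendship))  # empty: order of operands matters

# Classroom example: teacher explains (A), student enters before A ends (B).
teacher, student = Quotation(0, 100), Quotation(80, 150)
print(overlapped_by(teacher, student), encloses(teacher, student))
```

As in the text, the result is always a list of whole quotations of the first operand, never the intersection of the two segments.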
The more general co-occurrence operator is quite useful when working with transcripts. In interviews, people often jump back and forth in time or between contexts, and, therefore, it often does not make much sense to use the very specific embedding or overlap operators. With other types of data, however, they are quite useful. If you have coded longer sections in your data like biographical time periods in a person’s life and then did some more fine-grained coding within these time periods, the WITHIN operator comes in very handy. The same applies when working with pre-coded survey or focus group data where all questions/speakers are automatically coded by ATLAS.ti. Using the WITHIN operator you can ask, for instance, for all quotations coded with ‘topic X’ WITHIN ‘question 5’ or by ‘speaker Y’.
Finding adjacent quotations
The adjacency operators describe a sequence of disjoint quotations.
Quotations following quotations: A FOLLOWS B retrieves all quotations coded with A that follow quotations coded with B.
Quotations preceding quotations: A PRECEDES B retrieves all quotations coded with A followed by quotations coded with B.
When selecting either of the two operators, you can specify a maximum distance. Possible base units are characters and paragraphs for text, milliseconds for audio files, frames for video data and pixels for images. I will not go into these operators in this book because at the time of writing this part of the Query Tool was still in development. The interested reader can consult the user manual.
Semantic operators
Semantic operators retrieve data based on transitive links between codes. Available operators are SUB, UP and SIBLINGS. Transitive links between codes are created with the help of the network function (see Chapter 7). A relation is transitive if whenever an element A is related to an element B and B is related to an element C then A is also related to C.
Figure 6.12 The logic of transitive relations
There may be projects in which transitive relationships for all codes are conceptually relevant. So far, I have not had such a project, which does not mean it will not happen in the future. Some ATLAS.ti users use transitive links to build a hierarchical coding system. If so, the Code Forest in the Navigator on the left side of the main editor can be used to access codes. However, I recommend building the code system as described in Chapter 5 so that the various code levels and code types are visible in the Code Manager rather than using transitive links and the Code Forest for the display of code lists. If you do the latter, it becomes difficult to use networks for linking data across categories as shown in Figure 6.13. It requires a different way of thinking when using network links for organizational purposes. This is shown in the video tutorial ‘What’s behind the network view function’ that you can find on my YouTube channel and on the companion website. The structural links will always be in the way when you want to create conceptual links. Furthermore, all codes in all code lists remain unsorted.
Figure 6.13 Using the network for organizing a code list vs. conceptual linking
For the sake of completeness, here is an explanation of how the three semantic operators retrieve data. Maybe you will have a need for them in your project.
The UP operator collects all quotations of all codes at higher levels – thus, of all parents of a code.
The DOWN operator (listed above as SUB) traverses the network from higher to lower concepts, collecting all quotations from any of the subcategories.
The SIBLINGS operator retrieves all quotations of the selected code plus all quotations of codes on the same level.
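To see what following transitive links means in practice, the three semantic operators can be sketched over a small code hierarchy. Everything here is hypothetical: the code names, the quotation IDs, the `parent` mapping and the function names are invented, and the sketch follows the wording above literally (UP and DOWN return only the quotations of the parents or subcategories, while SIBLINGS includes the selected code itself). The actual behavior of the software may differ in such details.

```python
# Hypothetical hierarchy built from transitive links: child -> parent.
parent = {
    "effects: positive": "effects",
    "effects: negative": "effects",
    "effects: positive: joy": "effects: positive",
}

# Hypothetical quotation IDs linked directly to each code.
quotes = {
    "effects": {1},
    "effects: positive": {2, 3},
    "effects: negative": {4},
    "effects: positive: joy": {5},
}


def ancestors(code):
    """All codes above `code`, following parent links transitively."""
    found = []
    while code in parent:
        code = parent[code]
        found.append(code)
    return found


def descendants(code):
    """All codes below `code`, following child links transitively."""
    children = [c for c, p in parent.items() if p == code]
    found = list(children)
    for child in children:
        found.extend(descendants(child))
    return found


def siblings(code):
    """Other codes that share the same parent as `code`."""
    p = parent.get(code)
    if p is None:
        return []
    return [c for c, q in parent.items() if q == p and c != code]


def collect(codes):
    """Union of the quotations of the given codes."""
    return set().union(*(quotes[c] for c in codes))


def op_up(code):
    return collect(ancestors(code))


def op_down(code):
    return collect(descendants(code))


def op_siblings(code):
    return quotes[code] | collect(siblings(code))


print(sorted(op_up("effects: positive: joy")))  # [1, 2, 3]
print(sorted(op_down("effects")))               # [2, 3, 4, 5]
print(sorted(op_siblings("effects: positive")))  # [2, 3, 4]
```

The transitivity shows in `op_up`: ‘joy’ is linked only to ‘effects: positive’, yet the quotation of ‘effects’ is collected as well.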
EXPLORING THE DATA TERRAIN FURTHER – THE JOURNEY CONTINUES
I will now take you on an excursion to explore the Children & Happiness data terrain in a bit more detail. We will look at various research questions and how to find an answer to them using the analysis tools that ATLAS.ti provides. These examples prepare you to explore more of the terrain on your own and to transfer this knowledge to investigations of other data landscapes in the form of your own projects.
For the following exercises, please download a specially prepared version of the Children & Happiness project from the companion website. It is a project bundle file with the name Children & Happiness_analysis (chapters 6 to 8).
• Import the project bundle file.
• Open the Code Manager to get a feeling for what was coded and what kind of categories and code groups are available.
SKILLS TRAINING 6.2 CREATING AND WORKING WITH SMART CODES
Smart codes are a convenient way to store queries. They are very similar in look and feel to normal codes, with one important difference: instead of ‘hardwired’ connections to quotations, smart codes store a query to compute their virtual references whenever needed (see Figure 6.14). They automatically adjust in the course of the analysis. If you have a smart code based on a query like ‘(Code A | Code B) COOCCUR Code C’ and you add or delete quotations linked to either Code A, B or C, then the smart code will always deliver the current results. Depending on the chosen analytic approach, smart codes can be regarded as ‘stored’ hypotheses and, thus, be used to test hypotheses based on newly collected data.
Figure 6.14 Difference between a smart code and a regular code
Smart codes are symbolized in the program by a gray dot at the bottom left of the code icon. You can select smart codes just like other codes in the Code Manager or in networks and display their virtual connections to quotations. Smart codes can be part of a code group.
They can also be an argument in other queries and, thereby, support the creation of highly complex queries. What you cannot do with smart codes is code data. Another aspect that you need to be aware of is that smart codes do not remember filters. If you create a smart code while a global filter is set, this will not be considered when creating the smart code. The list of results always displays the quotations from the entire project once the global filter is removed again. You will learn more on global filters in Skills training 6.8.
Creating smart codes
The first step in creating a smart code is the formulation of a query. For this purpose, it is irrelevant whether the query returns any results or not. Smart codes are ‘intentional’, meaning you can also create them based on a query without results. This saves the trouble of having to reformulate the same query later in the analytic process. Simple queries based on the AND and OR operators can be formulated in the Quotation Manager. For more complex queries you will need to use the Query Tool.
In the following, I show you how to create a smart code in the Quotation Manager. We will need this smart code for further analysis in the other tools. Take a look at the attribute codes in the Code Manager:
Figure 6.15 Attribute codes in the sample analysis project
Fifteen comments were written by people who do not want children, 75 by people with children and five by people who do not have children yet. There are more comments than these 95, but it was not obvious from all comments whether the authors had children or not. If we want to compare comments by those with and without children, we need to work with quite uneven group sizes. To improve this a little bit, we can combine all comments of those who do not want children and of those who do not have children yet. What they have in common is that they have no experience with children of their own, albeit for different reasons. Thus, we need to run an OR query.
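The difference between a regular code and a smart code, and the OR combination planned here, can be illustrated with a short Python sketch. It is a conceptual model only: the `Code` and `SmartCode` classes and the quotation IDs are invented, and only the group sizes (15 and 5) and the code labels come from the sample project.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class Code:
    """Regular code: 'hardwired' links to quotation IDs."""
    name: str
    quotations: Set[int] = field(default_factory=set)


@dataclass
class SmartCode:
    """Smart code: stores a query and recomputes its results on demand."""
    name: str
    query: Callable[[], Set[int]]

    @property
    def quotations(self) -> Set[int]:
        return self.query()


# Invented quotation IDs; only the counts (15 and 5) follow the sample project.
dont_want = Code("#fam: don't want children", set(range(1, 16)))
not_yet = Code("#fam: no children yet", set(range(16, 21)))

# OR (ANY) query stored as a smart code.
no_children = SmartCode(
    "#fam: don't have children",
    lambda: dont_want.quotations | not_yet.quotations,
)

print(len(no_children.quotations))                    # 20
print(len(dont_want.quotations & not_yet.quotations))  # 0, the ALL query is empty

# Coding one more comment updates the smart code without touching it.
not_yet.quotations.add(21)
print(len(no_children.quotations))                    # 21
```

Because the smart code holds a query rather than a list of quotations, it can also be defined before any matching data exist, which is what ‘intentional’ means above.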
Thus, we need to run an OR query. • Open the Quotation Manager.
• Hold down the Ctrl key and select the two codes ‘#fam: don’t want children’ and ‘#fam: no children yet’ in the side panel on the left.
• Check the filter setting in the light-yellow bar above the list of quotations. It should be set to ANY (read: show quotations that are coded with any of the selected codes).
Figure 6.16 Simple OR query in the Quotation Manager
• Switch between the ANY and the ALL operator, just for practice. If you select ALL, there will be no quotations in the list as these two codes are mutually exclusive and never applied together.
• Right-click on one of the two codes in the side panel and select Create Smart Code from the context menu. Or click on the Smart Code button in the ribbon.
• ATLAS.ti creates the smart code and uses the query as the default name: ‘#fam: don’t want children|#fam: no children yet’. Rename the code to ‘#fam: don’t have children’ (you can do this in the side panel: right-click and select Rename). The code has 20 quotations: 15 + 5.
In version 8.4 and higher, you can also create smart codes using the OR operator in the Code Manager: select two or more codes, right-click and select Create Smart Code from the context menu.
Editing smart-code queries
If you get into the habit of clicking queries and develop a taste for complicated queries that you want to store as smart codes, you have the option to edit the query directly.