This R Markdown file accompanies the one-hour workshop for Simple Qualitative Administration For File Organization and Development (SQAFFOLD) created by Szilvia Zörgő. The script below integrates the directory structure and functionality of the R package {rock}, which implements the Reproducible Open Coding Kit (ROCK) standard. For more ROCK functions and materials, see: https://rock.science. Below, “{rock}” is used to refer to the R package, while “ROCK” refers to the standard or to both the R package and the standard simultaneously.
Resource type | URL
---|---
Open Science Framework repository | https://osf.io/v7c9x |
Website | https://www.sqaffold.org |
Git repository | https://gitlab.com/sqaffold/1-hour-workshop |
License | CC0 1.0 Universal |
Rendered version of script | https://sqaffold.gitlab.io/1-hour-workshop |
SQAFFOLD Main repository | https://gitlab.com/sqaffold/sqaffold-main |
This is the one-hour workshop for SQAFFOLD. For a more detailed script, please download SQAFFOLD Main at https://gitlab.com/sqaffold/sqaffold-main.
SQAFFOLD is a directory system that aims to facilitate the organization of materials generated in qualitative and unified research projects. The subdirectories can be flexibly rearranged or extended to suit your current project. SQAFFOLD also contains plain text files: some are to be filled with project content (if that makes sense for your project), while others contain ideas, tools, and links to resources that may be useful at various stages of your research process.
Through {rock} functionality, SQAFFOLD can also be used for various tasks within a qualitative or unified research project, such as cleaning qualitative data, adding unique utterance identifiers to data segments, coding and segmenting data, and tabularizing the coded data. To access these functions, you need to use the R script below.
Supporting slides for this workshop are accessible here:
https://osf.io/xfudc
The first slides are on the fundamentals of Open Science (OS) and its various domains, followed by a discussion on how OS can apply to qualitative and unified research methods.
The discussion ends with an example of a preregistration template: the Preregistration Template for Qualitative and Quantitative Ethnographic Studies, which can be downloaded here:
https://osf.io/fchqr
This section of the workshop closes with downloading a zip of SQAFFOLD at: https://gitlab.com/sqaffold/sqaffold-main. Unzip it on your computer and take a look at the directory.
The next section of the workshop focuses on using the R programming language to leverage the script found in SQAFFOLD.
If you haven’t yet, please create an account on Posit Cloud
(formerly: RStudio Cloud; https://posit.cloud) and then go to the URL for the
project for this workshop:
https://posit.cloud/content/6308975
Once it has loaded, click “Save a permanent copy” at the top.
This will store the project in your account’s workspace, so that your changes are preserved and you can always return to it. If you do not save a permanent copy, you will be ejected from the temporary project after a while and will have to start over.
The script below contains R commands (in the gray sections called “chunks”), which can be run individually by pressing the green “play” button in the chunk’s upper right corner. Note that you will only see this option if you open the script in Posit/RStudio; otherwise, this file, like every other file in SQAFFOLD, merely contains plain text.
Below are some basic {rock} functions you can run from SQAFFOLD. To
access full {rock} functionality, see: https://rock.opens.science.
Go to the script within the Posit Cloud project (click on the .rmd file within the scripts directory).
Run this chunk every time you start a session! The chunk below will install all R packages needed to run the commands in the script. It also contains default options for {rock} and paths to subdirectories. Run it by clicking on the green play button in the top right corner of the chunk.
### package installs and updates
packagesToCheck <- c("rock", "here", "knitr", "writexl");
for (currentPkg in packagesToCheck) {
  if (!requireNamespace(currentPkg, quietly = TRUE)) {
    install.packages(currentPkg, repos = "https://cran.rstudio.com");
  }
}
knitr::opts_chunk$set(
echo = TRUE,
comment = ""
);
rock::opts$set(
  silent = TRUE,
  idRegexes = list(
    cid = "\\[\\[cid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]",
    coderId = "\\[\\[coderid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]"
  ),
  sectionRegexes = list(
    sectionBreak = "---<<([a-zA-Z][a-zA-Z0-9_]*)>>---"
  ),
  persistentIds = c("cid", "coderId")
);
### Set paths for later
basePath <- here::here();
dataPath <- file.path(basePath, "data");
scriptsPath <- file.path(basePath, "scripts");
resultsPath <- file.path(basePath, "results");
Three plain text files containing data (i.e., “sources”) have been placed into the “010---raw-sources” subdirectory located within the data directory. In addition, some attributes of the mock data providers are listed in the file called “case-attributes”.
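If you want to check that the raw sources are where the next chunk expects them, you can list the contents of that subdirectory. This is an optional sketch that uses only base R and the dataPath defined in the setup chunk:

# Optional: list the raw source files before cleaning them
list.files(file.path(dataPath, "010---raw-sources"));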
The cleaning command places each sentence in your data on a new line. The {rock} package enables you to code data line by line, and recognizes newline characters as indicators of this lowest level of segmentation. The chunk below will write the cleaned sources found in “010---raw-sources” into the subdirectory “020---cleaned-sources”.
rock::clean_sources(
input = file.path(dataPath, "010---raw-sources"),
output = file.path(dataPath, "020---cleaned-sources")
);
If it makes sense for your project, you may choose to add a unique identifier to each line of data (i.e., to each “utterance”). This is helpful, for example, if you want to merge different versions of the coded sources into a single source that contains all codes applied by multiple researchers. The chunk below will write the sources with UIDs into the subdirectory “030---sources-with-uids”.
rock::prepend_ids_to_sources(
input = file.path(dataPath, "020---cleaned-sources"),
output = file.path(dataPath, "030---sources-with-uids")
);
Please visit the rudimentary graphical user interface, iROCK (available at: https://i.rock.science). This interface allows you to upload your sources, as well as codes and section breaks (for higher levels of segmentation), then drag and drop those into the data.
Click the ‘Sources’ button at the top to load a source. It will show you a dialogue similar to that shown in Figure 3. To load the example source, copy-paste the following URL into the field as shown in Figure 3 and press [ENTER].
Then repeat that to load the example codes and section breaks, this time copy-pasting these two URLs:
Example deductive codes to use: https://gitlab.com/sqaffold/1-hour-workshop/-/raw/main/codes/codes.txt
Example breaks to use: https://gitlab.com/sqaffold/1-hour-workshop/-/raw/main/codes/section-breaks.txt
When you have loaded all three files into the right places, you should see something similar to what is shown in Figure 4.
You can now start coding and segmenting. To use one of the codes or section breaks you loaded, drag them from the right-hand panel and drop them where you want them in the source. If you make a mistake, simply click the section break or code to delete it again.
When you are done coding, you can download the coded source by clicking “Download”. Normally it is vital not to forget this step, but in this workshop you will be working with pre-added coded sources.
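To give you an idea of what a downloaded coded source contains, here is a purely illustrative fragment; the utterance text, UID values, and the section break identifier are made up for this example, but the marker patterns follow the conventions used in this workshop (codes in double square brackets, UIDs prepended to utterances, section breaks matching the regex set in the setup chunk):

[[uid=7d3qhzn1]] I usually plan my week on Monday mornings. [[CodeA]]
---<<sectionBreak>>---
[[uid=7d3qhzn2]] Then I just start writing and see how far I get. [[CodeB]] [[CodeD]]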
Run this chunk every session during which you want to employ the functionality below (e.g., inspecting fragments, code frequencies, heatmaps)! This command will assemble all your coded sources and attributes into an R object that can be employed to run analyses and other commands below. Note, coded sources and attributes have been pre-added for your convenience.
dat <-
rock::parse_sources(
dataPath,
regex = "_coded|attributes"
);
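If you are curious about what this parsing step produced, you can peek at the resulting object with base R. Note that the exact element names (such as mergedSourceDf, which the export chunk at the end relies on) may differ between {rock} versions, so checking names(dat) first is the safest route:

# Optional: inspect the object produced by rock::parse_sources()
names(dat);
# If present, mergedSourceDf holds the utterance-by-variable table exported later:
# head(dat$mergedSourceDf);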
If you’d like to collect and inspect coded fragments for only certain codes, you can use the command below by changing the code labels “CodeA” and “CodeB” to the codes you’d like to inspect. You can modify the amount of context you wish to have around the coded utterance by changing “2” to any other number.
rock::inspect_coded_sources(
  path = here::here("data", "040---coded-sources"),
  fragments_args = list(
    codes = "CodeA|CodeB",
    context = 2
  )
);
(Chunk output: the matching coded fragments are listed here, grouped under the sources they come from: 001_Source_cleaned_withUIDs_coded.rock, 002_Source_cleaned_withUIDs_coded.rock, and 003_Source_cleaned_withUIDs_coded.rock.)
With this command, {rock} creates a code tree, which can be flat or hierarchical depending on the codes employed. In this workshop, we use a flat code structure.
rock::show_fullyMergedCodeTrees(dat)
This command will show you a bar chart of the code frequencies within the various sources in which they were applied. The command also produces a legend at the bottom of the visual to help identify the sources by color.
rock::code_freq_hist(
dat
);
Code co-occurrences can be visualized with a heatmap. This representation will use colors to indicate the code co-occurrence frequencies. Co-occurrences are defined as two or more codes occurring on the same line of data (utterance). The console will also show you the co-occurrence matrix from which the visualization was generated.
rock::create_cooccurrence_matrix(
dat,
plotHeatmap = TRUE
);
CodeA CodeB CodeC CodeD
CodeA 6 1 2 1
CodeB 1 8 1 3
CodeC 2 1 4 0
CodeD 1 3 0 11
This command will generate a tabularized version of your dataset, which, for example, can be employed to further process your data with software such as Epistemic Network Analysis (https://www.epistemicnetwork.org), or “merely” to represent your coded data in a single file. In this table, rows correspond to utterances and columns to variables such as identifiers, codes, and attributes. The file will be an Excel workbook called “mergedSourceDf.xlsx”, located in the results subdirectory.
Beware: when re-generating the qualitative data table, the {rock} default is to prevent overwriting, so either allow overwriting within the script or delete the old Excel file before you run this chunk. (The Posit Cloud version of this script allows overwriting.)
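If you prefer deleting the old file over changing any settings, the optional sketch below removes a previously exported workbook before the export chunk is run; it uses only base R and the resultsPath defined in the setup chunk:

# Optional: remove an earlier export so the chunk below can write a fresh file
oldExport <- file.path(resultsPath, "mergedSourceDf.xlsx");
if (file.exists(oldExport)) {
  file.remove(oldExport);
}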
rock::export_mergedSourceDf_to_xlsx(
dat,
file.path(resultsPath, "mergedSourceDf.xlsx")
);
It is very important to remember that this script is for the SQAFFOLD
workshop; a complete script with more functions and more elaborate
explanations is available for download at:
https://gitlab.com/sqaffold/sqaffold-main
For more on ROCK terminology, see: https://sci-ops.gitlab.io/rockbook/vocab.html.
Thank you for participating in this workshop. If you have any
questions or would like to make suggestions on how to improve SQAFFOLD
or ROCK, feel free to write to: info@rock.science.