Author

Silvia Canelón

Published

May 12, 2020

R script R Markdown JSON example file

Airtable is a user-friendly and powerful tool that until recently I’d been using for personal projects (i.e. document organizing, apartment hunting, etc.). A couple of weeks ago I leaned on Airtable to create a base designed for the Philadelphia Reproductive Freedom Collective to support our COVID-19 mutual aid efforts.

Shout out to the Silicon Valley Democratic Socialists of America who provided inspiration and support! Their mutual aid network is extensive and operates at an impressive scale. Get connected with the South Bay Mutual Aid network here: https://www.aidnetwork.org/

Having fallen in some Airtable deep-work I figured maybe it was time to retire my Trello boards in favor of some task bases. Airtable accepts CSV files from Trello but, alas, my free Trello account only gave me the option to print as a PDF or export as JSON. I decided this would be a good opportunity to learn how to parse JSON data and export it as a CSV ready for import to Airtable.

What to expect

The code provided in the R Markdown file requires the tidyverse, JSONlite, and lubridate packages as well as a JSON file containing your Trello data. It will generate the following:

  • tidy_cards.csv file ready to import to Airtable
  • attachments folder containing any content you’ve attached to your cards using a URL.
  • attachment_errors.csv file containing the card name, or task, as well as the URL/file path of any attachments that failed to download (i.e. links to files stored locally), so that you exactly which links you’ll need to locate manually.

By the end of the tutorial you’ll have imported a CSV file into Airtable and configured Airtable to match your Trello board. Specifically, you’ll be able to migrate labels, lists, attachments, card names, card descriptions, card due dates, date last modified, and whether the cards were marked “complete” or archived.

What not to expect

  • Rendered markdown from your card descriptions. Airtable does support markdown in its “long text” field type, but it does not automatically render from imported text.

Choose your own adventure

  1. Download the R Markdown file or R script linked above and run with it. Jump down to section Formatting in Airtable for pointers on how to get your Airtable base configured like your Trello board.
  2. Keep reading for a code-through of the R Markdown file, using the example featured in the picture above, with tips and resources along the way.

Follow along with me!

The Trello board used for this code-through is this Program Management Template created by Soniya Ahuja. To download the Trello board data, go to Menu > More > Print and Export and then right-click on “Export as JSON” to save the JSON file in your working directory. Or save a little time and download the program-mgmt.json file linked above .

You may also choose to read through Making Sense of Trello’s JSON Export.

Last but not least – we’ll be keeping things tidy, with the help of the tidyverse collection of R packages, to ensure each variable is in its own column, observations are in rows, and values are in cells. You can read more about tidy data in Chapter 12 of R for Data Science.

Importing JSON file

# loading libraries
library(jsonlite)
library(lubridate)
library(tidyverse)
trello <- stream_in(file("program-mgmt.json")) # produces a data frame
glimpse(trello)

A lot of information about the Trello board is contained in JSON data. We’ll be parsing information about the cards as well as the lists and then joining them together by a common identifier id or idShort.

Parsing information

Parsing information from JSON data was not something I was familiar with and found myself coming back to Kan Nishida’s post Working with JSON data in a very simple way time and time again throughout this process.

Cards

The first step is to extract information about the Trello cards themselves. This information is contained within a list of data frames and requires flattening which makes the nested hierarchical data structure into a flatter structure by assigning each of the nested variables its own column as much as possible. Then, the most important variables are selected as cards_trim before moving on to extracting label information.

# selecting cards information
cards <- trello$cards # list of 1

# flattening
cards_flat <- flatten(cards) #list of 37

# tibble time
cards_flat_tbl <- as_tibble(cards_flat) # 32 obs of 37 variables
glimpse(cards_flat_tbl)

# selecting wanted variables
cards_trim <- cards_flat_tbl %>%
  select(id, idShort, idList, dateLastActivity, name, desc, dueComplete, due,
         labels, attachments, shortUrl, closed) %>%
  arrange(desc(dateLastActivity))

Labels

Relevant information about the labels is selected and the unnest function is used to flatten because labels is a list of data frames. Again, I found Kan’s post helpful here! Particularly for saving the label details as a character list, which is helpful later on. Once we get to Airtable it’ll be important that label information for each card be structured as a simple list of words (i.e. label1, label2, label3). We get close once the labels are contained within a character list labelList, but there are still “c”s and parentheses that need to be removed. String manipulation is something I’m still learning about so the code below is far from elegant!

# extracting labels details
labels_info <- cards_trim %>%
  select(id, idShort, labels) %>%
  unnest() %>% # no arguments because the nested items don't have names
  rename(labelName = name) %>%
  select(id, idShort, labelName) %>%
  group_by(id, idShort) %>%
  summarize(labelList = list(labelName)) %>%
  mutate(labelList = as.character(labelList)) %>%
  mutate(labelList_tidy = str_remove_all(labelList, pattern = "\"")) %>%
  mutate(labelList_tidy = str_remove_all(labelList_tidy, pattern ="c\\(")) %>%
  mutate(labelList_tidy = str_remove_all(labelList_tidy, pattern ="\\)")) %>%
  unique()

knitr::kable(labels_info %>% head(n = 3L))

# joining back with main cards data frame
ct_labels <- left_join(cards_trim %>% select(-labels), labels_info %>% select(-labelList))

Attachments

The next step is to download all of the items attached to the cards onto a local folder. I found this StackOverflow post really helpful. When I tried this out on my own Trello board I also found that I couldn’t download the few attachments I had made from my local drive. This StackOverflow post helped me figure out how to flag and catch these download errors so that I could create a list of the urls with “attachment errors” that I could follow up with manually.

# expanding the attachment lists into separate url records
att_urls <- ct_labels %>%
  select(idShort, attachments) %>%
  unnest() %>%
  select(idShort, url) %>%
  mutate(url = as.character(url),
         attachmentError = 'FALSE')
knitr::kable(att_urls %>% head(n = 3L))

# creating directory for attachments
dirAttachments <- "attachments/"
dir.create(dirAttachments)

# downloading urls and checking for errors using try()
for (i in 1:length(att_urls$url)){
  locAttachments <-
        paste(dirAttachments, "/", att_urls$idShort[i], "_", basename(att_urls$url[i]), sep = "")
  step_to_try <- try(attachment_check <- download.file(att_urls$url[i], destfile = locAttachments))
  if("try-error" %in% class(step_to_try)) {
    cat("Error row: ", i, "\n", "Error message: ", step_to_try[1], sep = "")
    att_urls$attachmentError[i] = 'TRUE'
  }
}

The following selects the attachment records with errors, renames somes variables, and exports the data frame as a CSV.

# preparing data frame for export to CSV
attachment_errors <- att_urls %>%
  filter(attachmentError == TRUE) %>%
  rename(Task_Id = idShort, Attachment_URL = url, Attachment_Error = attachmentError)

# exporting to CSV
write.csv(attachment_errors, file = "attachment_errors.csv")

Aside: If you have a lot of attachments per card, you may want to create a directory folder for each card. This for loop will get you there – use it instead of the one above:

# creates individual directory folders for each card id
for (i in 1:length(att_urls$url)){
  dirAttachments <- paste(dirFiles, "attachments", att_urls$idShort[i], sep = "/")
  dir.create(dirAttachments) # creates directory for each unique card id
  locAttachments <- paste(dirAttachments, basename(url[i]), sep = "/")
  download.file(url[i], destfile = locAttachments)
}

Records in the main cards data frame are labeled “TRUE” within the attachments column if they have attachments and “FALSE” if they don’t.

# converts the attachment column to a categorical variable in the main cards+labels data frame
ct_labels <- ct_labels %>%
  mutate(attachments = ifelse(idShort %in% att_urls$idShort, TRUE, FALSE))

Lists

The lists information is extracted similarly to the cards information, but flattening is a little more straightforward because it involves only one data frame. With more data frames, the unnest function is a better choice.

# selecting lists information
lists <- trello$lists # list of 1 data frame
glimpse(lists)

# flattening
lists_flat <- lists[[1]] # 17 obs of 9 variables

# selecting wanted variables
lists_trim <- lists_flat %>%
  select(id, name, closed) %>%
  rename(idList = id, nameList = name, closedList = closed)
knitr::kable(lists_trim %>% head(n = 3L))

# joining back with main cards+labels data frame
ct_labels_list <- left_join(ct_labels, lists_trim) %>%
  select(id:shortUrl, labelList_tidy:nameList, closed, closedList)

Exporting to CSV

Data prepping

Columns in the new ct_labels_list data frame are given new names, and the lubridate package is used next to convert the date fields. This resource was helpful in understanding date conversions and formatting.

# changing variable names
tidy_cards <- ct_labels_list %>%
  select(-id, -idList, -closedList) %>%
  rename(Task = name, Task_ID = idShort, Notes = desc, Done = dueComplete, Date_Due = due,
         Labels = labelList_tidy, Trello_List = nameList, Trello_Last_Modified = dateLastActivity,
         Trello_Url = shortUrl, Trello_Attachments = attachments, Archived = closed) %>%
  select(Task, Task_ID, Notes, Done, Date_Due, Labels, Trello_List, Trello_Last_Modified,
         Trello_Url, Trello_Attachments, Archived) %>%
  mutate(Trello_Last_Modified = as_datetime(Trello_Last_Modified, tz = ""),
         Date_Due = as_datetime(Date_Due, tz = ""),
         Done = ifelse(is.na(Date_Due) == TRUE, 'NA', Done)) # ensures only undone tasks assigned a due date get marked as "FALSE"

The last step before exporting the final data frame tidy_cards is to check the unique number of tasks to make sure it matches the number of records in the data frame (i.e. one task per observation).

# determining the number of unique tasks
length(unique(tidy_cards$Task_ID))

# final look at tidy_cards
glimpse(tidy_cards)

Data exporting

write.csv(tidy_cards, file = "tidy_cards.csv")

Formatting in Airtable

Import CSV into Airtable as a new base. Add a base > Import a spreadsheet > CSV file. Upload the cards_tidy.csv file. Pick a name, icon, and color for your brand new base! More detailed instructions can be found in this Airtable article.

Task

  1. Because we know each observation in our table is unique, we can copy and paste Task into the first column and hide/delete the original column. Task is now the primary field.

Task_ID

  1. Convert Task_ID field type to “number”.

Notes

  1. Convert Notes field type to “long text” and enable rich text formatting. This gives us the option of using Markdown in the future, but sadly doesn’t automatically recognize fully formatted Markdown in the imported text.

Labels

  1. Change Labels field type to “multiple select” so that it turns each item in each list into a label.

  2. Optional: Create a new Projects column next to Labels and use the labels to guide you in creating Project labels/categories: Group the records by Labels field and add to Projects field as appropriate.

    I recommend creating an NA project from the NA labels so that these tasks aren’t marked as “uncategorized” in the Projects column. Having records with an “empty” assignment gets in the way whenever you want to group by that category. To that end, it’s helpful to group by Project and make sure any “empty” records get assigned to the “NA” Project.

  3. Delete from Labels any labels that were converted to Projects and ungroup the records.

Trello_Lists

  1. Convert Trello_Lists column field type into “single select”. This gives us the option of using the Kanban style we were used to in Trello.

    Every record should already be associated with a Trello_List.

  2. If you want to replicate the Trello kanban layout, change the order of the single select options in Trello_Lists to match the order from left-to-right of your Trello board

Trello_URL

  1. Convert the Trello_URL field type to “URL” and then hide it if you don’t think you’ll reference it often.

Trello_Attachments

  1. Convert Trello_Attachments field type to “single select”

  2. Create a new Attachments column with field type “attachment”. This is where you’ll upload your downloaded attachments.

  3. Filter your records by Trello_Attachments so only show “TRUE” results

  4. Sort records by Task_ID and simplify your view by temporarily hiding all columns except for Task_Name (primary field), Task_ID, Trello_Attachments, and the new Attachments column.

  5. Open your local attachments folder and drag and drop the files to their corresponding Attachments field according to their Task_ID in the filename.

    Increase the height of the records for this step. It’ll make it easier to make sure you’re dragging and dropping to the correct record

  6. If you encountered errors downloading some of your attachment URLs, now is the time to check your local attachment_errors.csv file for the URLs with errors during the download process. These are attachments you’ll have to find elsewhere and upload to the Attachments field as needed.

  7. Remove the filter to your view and unhide any columns you wish to remain visible.

Done & Archived

  1. Convert the Done and Archived field types to “Checkbox” and it will automatically assign a “check” to all records marked “TRUE” and leave the ones marked “FALSE” or “NA” unchecked. So easy!

  2. There is no direct option to “archive” tasks that have been completed like you can do in Trello, but you can apply a filter to your table view to hide the tasks that are complete. To do this, set the filter so that the Done and Archived fields are unchecked.

    This must be repeated for each saved View of your records.

Moving forward

Dates

  1. Modify Date_Due and Trello_Last_Modified field types to “Date” with time.
  2. You can sort the records by Trello_Last_Modified if that’s helpful, but otherwise you can hide the column and keep it for historical reference.
  3. Create a new column Last_Modified with field type “Last modified time” and select all columns you want to track changes to on a date/time basis moving forward.

Record Views

  1. Select Kanban from the Views options and group by Trello_List to see your tasks similar to how you saw them in Trello, complete with attachment covers! A bonus is that if you have multiple images attached to a card, you can view them without expanding the card by just hovering over the attachment cover!
  2. Move, collapse, and delete stacks as you see fit. Customize cards with as little or as much information as you want.

Tasks vs Subtasks

There are probably many ways to parallel the checklist option Trello gives you within a card.

  • The most straightforward is to use the basic checklist formatting within the Description field to create lists
  • Another is to think of your primary field Tasks instead as ‘subtasks’ and create a new column to serve as the umbrella ‘task’. This new ‘task’ column would be field type “single selection”, then you could group your records by ‘task’.
Back to top

Reuse

Citation

For attribution, please cite this work as:
Canelón, Silvia. 2020. “Migrating from Trello to Airtable: Working with JSON Data in R.” May 12, 2020. https://silviacanelon.com/blog/2020-05-12-trello-to-airtable.