library(tidyverse)
library(lubridate) # attached by tidyverse 2.0.0 and later; loaded explicitly for older versions
load_file_list <- function(folder) {
  # Load the list of files into a tibble called 'files'
  files <- tibble(filename = list.files(folder))
  # Extract the dates from the file names into their own column
  # Each pattern covers a different way of formatting dates;
  # the unescaped '.' matches any single separator character
  files <- files |>
    mutate(
      filedate = case_when(
        str_detect(filename, "\\d{4}-\\d{2}-\\d{2}") ~ ymd(str_extract(filename, "\\d{4}-\\d{2}-\\d{2}")),
        str_detect(filename, "\\d{2}.\\d{2}.\\d{4}") ~ mdy(str_extract(filename, "\\d{2}.\\d{2}.\\d{4}")),
        str_detect(filename, "\\d{2}.\\d{1}.\\d{4}") ~ mdy(str_extract(filename, "\\d{2}.\\d{1}.\\d{4}")),
        str_detect(filename, "\\d{1}.\\d{1}.\\d{4}") ~ mdy(str_extract(filename, "\\d{1}.\\d{1}.\\d{4}")),
        str_detect(filename, "\\d{8}") ~ mdy(str_extract(filename, "\\d{8}")),
        TRUE ~ NA_Date_
      )
    )
  # Keep only the row(s) for the most recent report date, used later for file naming
  # na.rm = TRUE so that file names without a recognizable date don't turn the maximum into NA
  files |>
    filter(filedate == max(filedate, na.rm = TRUE)) |>
    distinct()
}
This function looks at the files inside a specified folder and finds the most recent file. I often do regular reporting and, depending on the project, I’ll organize my files in one of two ways:
- By date I ran the data (e.g., 2025-06-12 Report Data/2025-06-12 File 1.csv)
- By file type (e.g., Address Data/2025-06-12 Address Data.csv)
In the case of the latter, I want to be able to get the most recent file without manually typing the date into my code. Typing it by hand all but guarantees that I will forget to adjust the date the next time and end up with the wrong results.
The code for this is the load_file_list() function shown above.
How do I use it? The first thing I usually do is extract the date of the most recent file. I use this when naming my output files so that the date of my file matches the dates of the data.
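As a small illustration of that first step, the sketch below pulls the date out of a few file names and keeps the newest one. The file names are made up for this example, and it assumes every name contains a YYYY-MM-DD date; the full function handles more formats.

```r
library(stringr)
library(lubridate)

# Hypothetical file names, made up for this illustration
filenames <- c(
  "2025-04-10 Address Data.csv",
  "2025-06-12 Address Data.csv",
  "2025-05-08 Address Data.csv"
)

# Pull the YYYY-MM-DD portion out of each name and parse it as a date
filedates <- ymd(str_extract(filenames, "\\d{4}-\\d{2}-\\d{2}"))

# Keep the newest date and the file it belongs to
latest_date <- max(filedates)
latest_file <- filenames[filedates == latest_date]

latest_date
latest_file
```

Because the date comes from the file name itself, nothing in the script needs to change when a newer file lands in the folder.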
The here package is incredibly helpful for referencing files relative to the project you’re working in, and it prevents things from breaking when switching machines or sharing code.
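For anyone who hasn't used it, here is a minimal sketch of what here() does. The folder names are the ones used in this post; the printed path depends on where your own project lives, so none is shown.

```r
library(here)

# Build a path starting from the project root instead of hard-coding
# a machine-specific location such as a user's home directory
input_folder <- here("data - input", "Address Data")

# The result is an absolute path that ends with the folders given above
input_folder
```

Each argument becomes one level of the path, so the same call should resolve correctly on any machine that has the project folder.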
# Get the most recent file(s) from the specified folder
address_data <- load_file_list(here::here("data - input", "Address Data"))
# Extract the most recent date from the filedate column
address_date <- pull(address_data, filedate)
# Ensure the date is formatted as a date
address_date <- ymd(address_date)
# Read in the most recent file, using the file name found above
address_df <- read_csv(here::here("data - input",
                                  "Address Data",
                                  pull(address_data, filename))
)
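To close the loop on naming output files, here is a rough sketch of reusing that date in an output file name. The report name "Address Summary" and the "data - output" folder are hypothetical, and the date is hard-coded only so the example runs on its own.

```r
# Stand-in for the date pulled from the most recent file
address_date <- as.Date("2025-06-12")

# "Address Summary" is a hypothetical report name
output_name <- paste0(format(address_date, "%Y-%m-%d"), " Address Summary.csv")

# "data - output" is a hypothetical folder; file.path() is base R,
# so this sketch runs without any extra packages
output_path <- file.path("data - output", output_name)

output_path
```

In a real project, here::here() could take the place of file.path() so the output path is anchored to the project root as well.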