MS Data Base Connection with R

Connecting and importing data from Microsoft Access Data Bases with R

Microsoft Access Data Bases may contain several tables of interest. Using the odbc and DBI packages we are able to import all desired tables to the R environment with just a few steps.


Connecting to multiple mdb files from R

First, let´s list all the files we want to work with:

  # Full file path to Access DB
    
      files_path <- list.files("./mdb", full.names = TRUE, recursive  = TRUE)

Opening a ‘mdb’ or ‘accdb’ file creates a temporary ‘Microsoft Access Record-Locking Information (laccdb)’ file in the same folder. So, before moving on, you must close all ‘laccdb’ files to allow a remote connection from R.

  # Install and load packages 

    library("odbc", quietly = FALSE) # ODBC for MS Access files
                                  
    library("DBI",  quietly = FALSE) # read tables from connections

  # Pass MS Access file path to connection string in loop

    connections <- as.list(files_path)
      
       for (i in seq(files_path)) {
          
          connections[[i]] <- dbConnect(drv = odbc(),
                                        .connection_string = paste0("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=",
                                                                    files_path[i], ";"))
       }

The .connection_string parameter indicates both ‘mdb’ and ‘accdb’ extensions.


We should retrieve connections names. Later on, we might use it to identify the tables.

  # Retrieve connections names
    
    conn_ls <- list()
    
    for (i in seq(connections)) {
      
      conn_ls[i] <- connections[i][[1]]@info$dbname
      
    }

If we wish to select some of the tables from multiple data sets, we can list the names in the following way.

    # List tables

      all_tables <- lapply(connections, dbListTables)

Let´s create a vector of names to subset from the list.

    slct_tables <- c("Table 1", "Table 2", "Table 3")

Now, we can read the selected tables to a list, assigning names based on connections and original tables names. We access each connection, and check for the existence of any of the tables names using dbExistsTable(). If the table exists in that connection the function uses assign() to determine the name and valueof the table. It uses the conn_lsobject created before, so that we can identify from which MSdb it came from. The last lines simply print a note so that we can identify the function progress.

    # Read tables to list
        
      for (i in seq(connections)) {
            
        for (j in seq(slct_tables)) {
          
          if(dbExistsTable(connections[[i]], slct_tables[j]) == FALSE) next
              
              assign(x = paste0(slct_tables[j], "_", conn_ls[[i]]),
                     value = dbReadTable(connections[[i]], slct_tables[j]))
              
              print(paste0(" extraction for ", "'", slct_tables[j], "'", " table",
                           " in connection nº ", i))
            }
          }

Before moving on your code, close the connections, which will remove record locking (laccdb) files.

    # Disconnect from databases

      lapply(connections, dbDisconnect)
Rodrigo Abreu Carvalho
Rodrigo Abreu Carvalho
Bachelor Degree in Economic Sciences