Agents of Neoliberal Globalization Now in Print!

After many years of hard work I’m thrilled to announce that the collaborative research project Michael Dreiling and I began in 2000 has been published by Cambridge University Press! Our book, Agents of Neoliberal Globalization, can be purchased directly from Amazon, or with a discount using the promotional code in this flyer. Many thanks to the editors, reviewers, and friends who helped improve this book over many years and many drafts. Our thanks, also, to Athena Llewellyn, Creative Director of the Barat Foundation, for creating the excellent cover art!

At a time when trade policies like the TPP have emerged as a topic of major national and global interest, we hope that our work will inform public debate about the drivers of globalization and, looking ahead, the kinds of economic policies that will best support a more sustainable and peaceful future. The publisher’s summary follows:


Depictions of globalization commonly recite a story of a market unleashed, bringing Big Macs and iPhones to all corners of the world. Human society appears as a passive observer to a busy revolution of an invisible global market, paradoxically unfolding by its own energy. Sometimes, this market is thought to be unleashed by politicians working on the surface of an autonomous state. This book rejects both perspectives and provides an analytically rich alternative to conventional approaches to globalization. By the 1980s, an enduring corporate coalition advanced in nearly synonymous terms free trade, tax cuts, and deregulation. Highly networked corporate leaders and state officials worked in concert to produce the trade policy framework for neoliberal globalization. Marshaling original network data and a historical narrative, this book shows that the globalizing corporate titans of the late 1960s aligned with economic conservatives to set into motion this vision of a global free market.


Placement: An R package to Access the Google Maps API

A few months ago I set out to write an R package for accessing the Google Maps API with my employer’s (paid) Google for Work/Premium account. At the time, I was unable to find an R package that could generate the encrypted signature, send the URL to Google, and process the JSON returns in one fell swoop. Following Google’s directions for Python, however, I was able to create an R function that generates valid signatures for a URL request using the digest package’s implementation of the SHA-1 algorithm. Along the way I added a few additional features that are useful in our workgroup, including (1) a function to retrieve Google Maps’ distance and travel time estimates (via public transit, driving, cycling, or walking) between two places (drive_time), (2) a general-purpose function for stripping address vectors of nasty characters that may break a geocode request (address_cleaner), and (3) methods for accessing the Google API with a (free) standard account (see also the excellent ggmap package, which provides a similar facility for geocoding with Google’s standard API).
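
For the curious, here’s a rough sketch of the signing scheme Google documents (illustrative only, not the package’s internal code). It assumes the digest and base64enc packages; sign_url and its arguments are placeholder names:

# The private key Google issues is URL-safe base64: decode it, compute an
# HMAC-SHA1 over the URL path plus query string, then append the URL-safe
# base64 encoding of that signature to the request.
library(digest)
library(base64enc)

sign_url <- function(path_and_query, private_key){
	raw_key <- base64decode(chartr("-_", "+/", private_key))
	sig_raw <- hmac(raw_key, charToRaw(path_and_query), algo="sha1", raw=TRUE)
	sig     <- chartr("+/", "-_", base64encode(sig_raw))
	paste0(path_and_query, "&signature=", sig)
}

# Hypothetical usage:
# sign_url("/maps/api/geocode/json?address=New+York&client=gme-yourclient",
#          "your_url_signing_secret")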

In daily use I’ve seen few issues thus far, and I’ve used earlier versions of this package to geocode about a quarter million physical locations in North America. The placement package, which includes examples, can be viewed on Github and installed in the usual way:

library(devtools)
install_github("DerekYves/placement")
library(placement)

Here are a few examples using the standard (free) API (see here to get a free API key from Google; supplying a key gives you higher quota limits than passing an empty string):

# Get coordinates for the Empire State Building and Google
address <- c("350 5th Ave, New York, NY 10118, USA",
			 "1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA")

coordset <- geocode_url(address, auth="standard_api", privkey="",
            clean=TRUE, add_date='today', verbose=TRUE)
## Sending address vector (n=2) to Google...
## Finished. 2 of 2 records successfully geocoded.
# View the returns
print(coordset[ , 1:5])
##        lat        lng location_type
## 1 40.74871  -73.98566       ROOFTOP
## 2 37.42233 -122.08442       ROOFTOP
##                                      formatted_address status
## 1                 350 5th Ave, New York, NY 10118, USA     OK
## 2 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA     OK

Distance calculations (note that some transit options are not accessible with the standard API):

# Bike from NYC to Google!
address <- c("350 5th Ave, New York, NY 10118, USA",
			 "1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA")

# Google allows you to supply geo coordinates *or* a physical address 
# for the distance API. In this example, we will supply coordinates
# from our previous call. Google requires a string format of: 
#   "lat,lng" (with no spaces) for coordinates.

start <- paste(coordset$lat[1],coordset$lng[1], sep=",")
end   <- paste(coordset$lat[2],coordset$lng[2], sep=",")

# Get the travel time by bike (a mere 264 hours!) and distance in miles:
howfar_miles <- drive_time(address=start, dest=end, auth="standard_api",
						   privkey="", clean=FALSE, add_date='today',
						   verbose=FALSE, travel_mode="bicycling",
						   units="imperial")

# Get the distance in kilometers using physical addresses instead of lat/lng:
howfar_kms <- drive_time(
     address="350 5th Ave, New York, NY 10118",
		dest="1600 Amphitheatre Pkwy, Mountain View, CA",
		auth="standard_api", privkey="", clean=FALSE,
		add_date='today', verbose=FALSE, travel_mode="bicycling",
		units="metric"
		)

with(howfar_kms,
	 cat("Cycling from NYC to ", destination,
	 	 ":\n", dist_txt, " over ",
	 	 time_txt, sep=""))
## Cycling from NYC to 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA:
## 5,232 km over 11 days 13 hours

Address cleaning function:

# Clean a "messy" or otherwise incompatible address vector:
address <- c(" 350 5th Ave. ½, New York, NY 10118, USA ",
			 "  ª1600  Amphitheatre Pkwy, 
			 Mountain View, CA 94043, USA")

# View the return:
address_cleaner(address)
## 	* Replacing non-breaking spaces
## 	* Removing control characters
## 	* Removing leading/trailing spaces, and runs of spaces
## 	* Transliterating latin1 characters
## 	* Converting special address markers
## 	* Removing all remaining non-ASCII characters
## 	* Remove single/double quotes and asterisks
## 	* Removing leading, trailing, and repeated commas
## 	* Removing various c/o string patterns
## [1] "350 5th Ave.  1/2, New York, NY 10118, USA"           
## [2] "a1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA"

If you would like to apply this function to multiple address fields stored in separate columns (e.g., only “street 1” and “city”), you might try something like:

address[] <- sapply(address, placement::address_cleaner)
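
For instance, here’s a sketch with a made-up data.frame that cleans just the street and city columns, leaving the rest untouched:

# 'addr_df' and its column names are hypothetical:
addr_df <- data.frame(street = c(" 350 5th Ave. ", "  1600 Amphitheatre Pkwy"),
					  city   = c("New York ", " Mountain View"),
					  zip    = c("10118", "94043"),
					  stringsAsFactors = FALSE)

cols <- c("street", "city")
addr_df[cols] <- lapply(addr_df[cols], placement::address_cleaner)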

Using your Google for Work account obviously requires a client ID and API key, but the steps for doing so are well documented in the package help files. Feel free to shoot me an email if you run into any issues!

Rsurveygizmo: An R package for interacting with the SurveyGizmo API

Several years ago our team began using SurveyGizmo for our online surveys (and, actually, a bunch of other projects as well, from polls to data entry templates). At the time, SurveyGizmo provided a nice balance between cost and customization when compared to similar products from, e.g., Qualtrics and SurveyMonkey. Over the years SurveyGizmo has greatly expanded the kinds of user customization and tweaking that are possible, particularly in the area of API calls. Because we mostly work in R, I decided to write a package that accesses the SurveyGizmo API directly so that survey and email campaign data can be pulled directly within a project script (as opposed to manually downloading the data from the webpage).

Some usage examples for this package follow. To really test them out you will need to supply your private SurveyGizmo API key and a valid numeric survey id. For those who’d like to learn more, the package help files outline many more function options than are presented below.

# Download a "regular" survey with no email campaign data,
# keeping only complete responses:
api <- "your_api_key_here"
a_survey <- pullsg(your_survey_id_here, api, completes_only=T) 

# Download all email campaign data for a particular survey:
a_campaign <- pullsg_campaign(your_survey_id_here, api) 

# Combine the previous steps in one function:
# 1. download email campaign 
# 2. merge it, where possible, with a survey response 
a_survey_with_campaign <- pullsg(your_survey_id_here, api, mergecampaign=T)

If you’d like to give the package a spin you can visit the Github repository or install directly within R:

library(devtools)
install_github(repo="DerekYves/rsurveygizmo")

I hope this package is helpful to somebody, and feel free to drop me an email or post to the repository if you have any questions or suggestions for improvement! Many, many thanks to Ari Lamstein for teaching me the ropes of R package development and the wonders of Roxygen.

Sharing R code in a workgroup of Mac/Windows/Linux users

In recent years there’s been a great deal of interest in, and work toward, creating more “reproducible” statistical code. I think this is a fantastic development. Looking at code I wrote in the late ’90s and early aughts, it’s clear how much my work from that period would have benefited from the coding habits I’ve developed over the last five or so years.

Working in R, a simple way to make code portable is to locate all data, functions, and scripts within a single directory (and its subdirectories), declare that directory at the head of the script, and use relative paths in the lines that follow. This is the standard practice for projects you might find on Github.
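
A minimal sketch of that pattern, with made-up paths and file names:

# Declare the project root once, then use relative paths everywhere else:
proj_root <- "~/projects/my_analysis"   # hypothetical location
setwd(proj_root)

source("R/helpers.R")                   # scripts and functions under R/
dat <- readRDS("data/cleaned.Rds")      # data under data/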

But this is not always possible. Working in a corporate environment with protected/sensitive client data, I’m often forced to separate data from scripts; specifically, to keep certain sensitive data isolated on a particular server. I’m also not allowed to have any PII-containing data in my Git repositories. While I could exclude certain directories or file types from Git using, e.g., a .gitignore file, I’ve found it’s easier to just keep data in one place and scripts/functions in another. One more wrinkle: I work with Windows users, so a simple reference like “/data” will have a different meaning if the Windows user is working from, say, the “D:\” drive.

I’ve tried and discarded a number of approaches to creating a workflow that allows the same master code to run on a variety of computers that may have data stored in different places. Along the way I’ve refined my approach and have, in the last year or so, arrived at a process that lets me work with others and with different operating systems in a reliable manner that requires little ongoing maintenance.

An added advantage of the approach I’ve developed: if I want to move ALL of my main data sets, say from /data to /var/data, I can do this by changing one line of code in one file. After this change, all of my hundreds of scripts and markdown reports gracefully adapt. They adapt because they all depend upon the same file to set up “the lay of the land” before any analysis. This requires a little extra work up front but, I find, saves a lot of headaches down the road, especially when working with other data scientists.

In short, my process is basically this:

1. The same (hopefully source-controlled) file is sourced at the top of each analysis script. At the point of sourcing, a few parameters are declared that do the correct thing depending on whether the host is Linux, Mac, or Windows. Here’s an example:

## Load host-dependent directory environment
winos <- ifelse(grepl("windows", Sys.info()['sysname'], ignore.case=T), 1, 0)
if(winos==1) source("C:/data/projects/scripts/R/functions/file_dir_params.R")
if(winos==0) source("~/projects/scripts/R/functions/file_dir_params.R")
rm(winos)

2. The next step is to build your version of the file/directory parameter file that was sourced in step 1 (in this post I’ll call this file “file_dir_params.R”).

#--begin file_dir_params.R script--#

# Make a new environment:
fdirs <- new.env()

3. Now add this function to “file_dir_params.R”; it saves a simple string indicating the host’s OS type:

# Function to standardize host OS name
get_os <- function(){
	sysinf <- Sys.info()
	if (!is.null(sysinf)){
		os <- sysinf['sysname']
		if (os == 'Darwin')
			os <- "osx"
	} else {
		os <- .Platform$OS.type
		if (grepl("^darwin", R.version$os))
			os <- "osx"
		if (grepl("linux-gnu", R.version$os))
			os <- "linux"
	}
	tolower(os)
}
fdirs$computeros <- get_os()

4. With our new environment loaded and knowledge of the computer’s OS, we’re ready to build platform-agnostic variables that point to the most important shared directories in your group:

## Flags used below: print each path as it's defined (show) and create missing folders (build)
show  <- 1
build <- 1

## Declare the root project and data directories:
if(grepl("windows", fdirs$computeros)==F){
	fdirs$prjdir <- "~/projects/"
	fdirs$prjdta <- "/your_data/"
}else{
	fdirs$prjdir <- "C:/projects/"
	fdirs$prjdta <- "D:/your_data/"
}

5. Depending on your setup you may have multiple project or data folders you want to declare. To keep my life simple on my development machine, I try to make all data a subdirectory of “prjdta” and all scripts a subdirectory of “prjdir”. Once you have these set using the platform-agnostic pattern outlined in step 4, you’re ready to build out all your subdirectories. Because every subdirectory is a child of the root directories defined in step 4, make sure to always build new directory variables from one of the root directories (in my example, “prjdta” or “prjdir”). This is the key to making the code work across different platforms: it’s what allows the variable representing a folder to seamlessly shift between “C:/projects/some_folder” and “~/projects/some_folder” depending on the host that calls the script. One added and very useful bonus: if I move, say, my main data folder, I don’t have to rewrite 100 variable names in dozens of scripts. I just change the root data folder (in this example, “fdirs$prjdta”) once in one file and everything else takes care of itself in my batch environment. Here are some examples:

# Add some child objects to the fdirs environment:
fdirs$dqrptsrc   <- paste0(fdirs$prjdta, "data_quality/source/")
if(show==1 & interactive()) cat("\ndqrptsrc =", fdirs$dqrptsrc)
# dir.create (unlike a shelled-out "mkdir -p") works on both Windows and Unix-alikes:
if(build==1) dir.create(fdirs$dqrptsrc, recursive=TRUE, showWarnings=FALSE)

# Define some colors for (say) ggplot2 using your organization's color codes:
fdirs$com_ppt_orange <- "#FF6122"
fdirs$com_cr_blue    <- "#30812E"
fdirs$com_cr_red     <- "#D0212E"
fdirs$com_cr_green   <- "#8EB126"

6. Notice how, in the lines above, I reference the “show” and “build” flags we declared near the top of the “file_dir_params.R” script. What these do, respectively, is print the directory path to standard output and create the folder if it does not exist. This can be useful but is not strictly necessary.

7. Our last step is to attach the “fdirs” environment (or safely reload it if it’s already attached):

# Attach the new environment (and safely reload if already attached):
while("fdirs" %in% search()) detach("fdirs")
attach(fdirs)

Final Thoughts

Attaching the environment as we did in step 7 is a great time saver because you can omit the “fdirs$” prefix when you reference variables in your scripts. For example, to load a file in a data folder I now just write:

x <- readRDS(paste0(dqrptsrc, "somefile.Rds"))

As opposed to:

x <- readRDS(paste0(fdirs$dqrptsrc, "somefile.Rds"))

But, as with everything, there’s a drawback: you’ll want to make sure the names of your variables don’t overlap with functions or other objects you declare in a script. For this reason I use names like “prjdta” rather than “data”, and I avoid the overly concise names that appear in a lot of example code, e.g., “x”, “y”, or “z”.
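
If you want a quick sanity check, something like the following sketch (not required by the workflow above, just a convenience) reports any names in fdirs that would mask objects already on the search path:

# Report names in 'fdirs' that already exist elsewhere on the search path:
clashes <- intersect(ls(fdirs),
					 unlist(lapply(setdiff(search(), "fdirs"), ls)))
if(length(clashes) > 0){
	warning("These fdirs names mask existing objects: ",
			paste(clashes, collapse=", "))
}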

I hope some or all of the above is helpful to somebody, and feel free to drop me an email if you have any questions!