Preface

Why read this book

There are many books that cover various topics in fundraising. There are many books that explore the R programming language, data analytics, and machine learning. However, there are few books that explore how to use R to build data-driven solutions for fundraising problems. This was one of the key motivations for writing this book.

At the time of writing, a quick web search for “fundraising analytics” produces 1,850,000 results (in 0.46 seconds), which suggests there is appetite, curiosity, and demand for useful material on this subject. Fundraising analytics articles, products, and webinars tend to focus on technical tools and approaches, such as data visualization and predictive modeling, across many fundraising areas, including, to name a few, prospect identification, donor segmentation, pipeline management, direct response marketing, and fundraising forecasts.

As lifelong learners, we strongly encourage you to search, study, and explore as much as you can about the art of fundraising and the science of data, visualization, and machine learning because the future of fundraising, like many other industries, will continue to be shaped and strongly influenced by advancements in technology. And why do institutions and firms (for-profit and non-profit) invest in technology? To gain competitive advantage. To gain efficiency. To increase results.

Contact reports are no longer dictated or hand-typed on a typewriter. In fact, some customer relationship management (CRM) software and donor databases allow you to enter contact reports via voice entry on your mobile device using speech-to-text technology. Speech-to-text and natural language processing software is far from perfect, but it’s still a useful step forward to a technology-driven future. And what’s the point of all these bells and whistles? In the case of contact reports, one could argue that minimizing the effort required to enter contact reports saves time and energy (scarce resources), which fundraisers could apply towards other important areas, such as fundraising strategy, donor outreach, qualification, cultivation, and solicitation.

Why read this book?

Because you’re curious. Because you’ve already read many articles about the potential of analytics to improve your fundraising programs, but you want a clear path and practical methods for analyzing your data and distilling actionable insights from it. Also, because you’re ready. As Rodger’s beloved piano teacher, Katherine, says, “I can teach you something in 10 minutes that took me 10 years to figure out.” Many of you reading this book may have heard of or been interested in data science, analytics, and machine learning for years, but never had an opportunity to dive in. After reading the book and executing the code, you will have learned roughly 80 to 90% of what there is to know about applied data science, specifically as applied to fundraising. You’re lucky, because it took us more than 10 years to acquire that knowledge.

This book will help get you going on your data science journey. So, no more waiting, rainy days, or “maybe someday” stalling. This is the future, and the time is now.

The value proposition of this book is simple: There are countless resources on data science and its applications, but this one book will take you from a beginner to an advanced level of understanding. From there, you must continue to sharpen your skills to attain mastery.

Structure of the book

This book is organized into 17 chapters. Chapters 1 through 4 introduce analytics, analytics adoption, and fundraising application examples. Chapters 5 through 9 introduce the R programming language and provide a series of recipes to help familiarize you with basic tasks such as loading, cleaning, manipulating, and exploring data for patterns.

Chapter 10, Data Visualization, explores a broad set of data visualization methods and best practices that you can incorporate into your own solutions. Chapter 11 explores Recency, Frequency, Monetary (RFM) modeling, which is a descriptive analytics technique drawn from direct marketing and is commonly used in fundraising. Chapter 12 covers machine learning concepts (including both supervised and unsupervised learning methods) and provides R recipes (coding scripts) that you can explore with an example donor file to demystify the application of machine learning techniques to donor data.

Chapter 13 explores predicting next gift size, including simple forecasting and some of the challenges with this task. Chapter 14 focuses on text mining, which is a form of data analytics focused on gleaning patterns from text-based data. Chapter 14 also introduces you to some powerful R packages you can use to generate reports, acquire web-based data, and enhance your analysis by blending data sources and generating comparative insights.

Chapter 15 explores social network analysis, which uses graph theory to visualize and analyze information as networks of connections or relationships. Chapter 16 explores the concept of finding prospects, including some popular use cases that are frequently discussed, researched, and debated. Chapter 17 concludes with a survey of new trends and applications that highlight future fundraising applications and beyond.

Not Technical?

This book does not assume any previous programming knowledge, experience, or background. A familiarity with basic statistical concepts is helpful but not required, as additional reading and resources are introduced and suggested in each chapter.

If you are a manager or leader and not interested in directly learning data science, we encourage you to get a copy of this book for your data-driven staff or curious colleagues who would benefit from learning the best-in-class machine learning methods to build solutions and add value to your organization.

Software information and conventions

All the code in each chapter needs to be run in order, as examples tend to build on previous examples.

We encourage you to explore the book’s content according to your interest and need, but advise against jumping into the middle of a chapter and expecting your code results to match without running the code introduced earlier in the chapter.

Please note the following information and coding conventions used within this book:

This is a note.

Any text displayed within this text box is intended to be an important note, concept, or takeaway.

This is a quote.

Any text displayed within this text box is intended to be an important quoted reference.

This is an exercise.

Any text displayed within this text box is designed to be a challenge or exercise to help you learn how to customize the solution or recipe to explore your own questions.

If you’re reading an electronic copy of this book, you may have difficulty copying and pasting the code directly. All the code and data used in each chapter are available as standalone files. You can download them from http://www.nandeshwar.info/ds4fundraisingcode.

Every time you see the library command along with a library name, such as library(tidyverse), we assume that the package is installed on your computer. If not, use the install.packages("<package_name>") command to install the library; note that the package name must be in quotes. We use the words “library” and “package” interchangeably.
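As a minimal sketch of this convention, the snippet below installs a package only when it is missing. The helper name ensure_package is our own illustration, not a function from R or from any package, and installing assumes you have an internet connection and a CRAN mirror configured.

```r
# Hypothetical helper (not part of R or any package): install a package
# only if it is not already available, then report whether it can be loaded.
ensure_package <- function(pkg) {
  # requireNamespace() returns FALSE, rather than stopping, when pkg is missing
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call never triggers an installation
ensure_package("stats")
```

You could then call ensure_package("tidyverse") once before running library(tidyverse).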

You will notice that the code chunks are in a grey box with the R code highlighted, and the output is shown in orange. The output in the book always begins with the #> characters. Since # denotes the beginning of a comment in R, you can safely paste the output into your console along with the code. (The output in your own console will show up without the #> characters.)
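To illustrate the output convention, here is a small example using made-up values (the gift_amounts vector is hypothetical and does not come from the book's data). If you paste all three lines into your console, R treats the last line as a comment and ignores it.

```r
# Three hypothetical gift amounts, made up for this illustration
gift_amounts <- c(10, 20, 30)
mean(gift_amounts)
#> [1] 20
```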

Although all the code will run on the base R application/console, we highly recommend using RStudio, an integrated development environment (IDE), as your editor. RStudio makes coding faster and comes with many wonderful features, which you will uncover as you follow the recipes in this book.

The R ecosystem is continuously evolving. Although we made sure that all the code in this book runs on multiple computers, it is likely that some libraries will be out of date or will cause errors. Feel free to reach out to us so that we can help you and update the code in the book. The last compilation of this book used this configuration:

#> R nickname: Kite-Eating Tree
#> R version 3.4.3 (2017-11-30)
#> On platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running: macOS Sierra 10.12.6
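If you want to compare your own setup with the configuration listed above, a minimal sketch using only base R looks like this. Your output will differ depending on your installation, so none is shown here.

```r
# Print the R version string and release nickname of your own installation
R.version.string
R.version$nickname

# List the platform, operating system, and attached packages
sessionInfo()
```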

If you can’t reproduce the output using the code from the book, try setting the seed to 777 with this command: set.seed(777).
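As a brief sketch of why setting the seed helps, the example below draws the same random sample twice. The variable names first_draw and second_draw are our own, chosen for illustration.

```r
# Resetting the seed makes "random" draws repeatable
set.seed(777)
first_draw <- sample(1:100, 5)

set.seed(777)
second_draw <- sample(1:100, 5)

# Both draws started from the same seed, so they match exactly
identical(first_draw, second_draw)
#> [1] TRUE
```

Note that R 3.6.0 changed the default method used by sample(), so draws made under a newer R version may differ from those produced by the R 3.4.3 configuration listed above, even with the same seed.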

Acknowledgments

Ashutosh Nandeshwar

I’m thankful to many people for various reasons. There are some who have helped me every time I’ve stumbled. Then there are those who helped me before I could stumble. There are some who showed me the light. There are a few who paved the way for me. Some helped me without knowing. Let me list a few names, but this list isn’t exhaustive, and I apologize for my omissions.

Dr. B. R. Ambedkar for creating opportunities for all the marginalized communities of India. I would be far from writing this if it were not for him.

Mom and Dad for believing in me, especially when I failed, and setting the path for my future. Unfortunately, my father passed away before he could see this book. Thank you for everything, Baba. I miss you.

Mr. Prabhu, my math tutor in high school, for building the belief that I could do math (my brother will tell you how hopeless I was). Dr. Rakesh Chandran for helping me out through the tough years of graduate school. Dr. Tim Menzies for teaching not only machine learning, but also the value of simplicity. Mike Sperko for giving me a chance to do data analysis for a living. Karen Isble for taking a chance on me and letting me apply analytics to fundraising.

All my terrific colleagues at USC. Tracey Vranich for seeing the value of data science for fundraising and letting me run free with my ideas. Doug Byers for helping me in immeasurable ways. Al Checcio for his vision and belief in data-driven fundraising.

Chris Sorenson for helping me spread my ideas and for writing the fantastic foreword.

My wonderful wife, Utpalvarna, for taking care of everything while I focused on this book on weekends and evenings. Asanga and Dinnaga, my amazing children, for teaching me so many things, including the value of spending time with your loved ones. They understood when I could not attend their soccer and Taekwondo practice.

Many thanks to everyone who provided feedback on the book: Andrew Schultz, Michie Spradling, and Dibyendu Mondal. Thanks also to our editors: Madhusudan Uchil and Kathy Osborn.

Lastly, huge thanks to the R community. It would have been impossible to do anything without the work of every package builder and every Stack Overflow [r] answerer. A few R developers deserve special mention: Hadley Wickham, author of “R for Data Science” and ggplot2, dplyr, devtools, and others, and Yihui Xie, author of the bookdown and knitr packages (we wrote this book using bookdown). Because of RStudio and all the people behind this great product, literate programming and interactive analysis have never been this easy. Thanks #rstats!

And, of course, my co-author Rodger! :)

Rodger Devine

This book would not have been possible without the support and encouragement of many friends, family, and mentors over the years, including:

Lada Adamic, Prabhanshu Agrawal, Kaleena Bajo, Reginald Becton, Kerri Bennett, Donald Boettner, Russell Brown, Bob Burdenski, John and Susan Butler, Carly Capitula, Cathleen Conway-Perrin, Kevin Corbett, Gin Corden, Dondi Cupp, “Lo” de Janvry, Hui Cha “Kim” Devine, Lindsay Devine, Rodger A. Devine, Terri Devine, Ryan Donnelly, Susan Engel, Nathan Fay, Nicole Ferguson, Josh Fields, Tyson FitzGerald, Patrick Franklin, Christina Frendo, Knekoh Frugé, Jennifer Dunn Greenspan, Renee Haraburda, Kelli Harrington, Yuping He, Christina Hendershaw, Salijo Hendershaw, Chere Hooks, Dianna Gladstone, Mike Glier, Risa Gotlib, Lorri Grubaugh, Nathan Gulick, Karen Isble, Josh Jacobson, Kim “Bella” Jacobson, Sonya Vanhoof Jimenez, Caitlin Johnson, Bob Jones, Kevin and Liz Jones, Sam Jones, Justin Joque, Frieda Kahn, Nick Kennedy, Jim King, Jennifer Kranz, Margaret Krebsbach, Dan Kugler, Andrew Kulpa, Brett Lantz, Karen Latora, Deborah Lennington, Jessie Lipkowitz, Shalonda Martin, Kim “McData” McDade, Qiaozhu Mei, Jaime Miranda, Xiomara Moncada, Chandra Montgomery, Andrew Mortensen, Tadd and Nayiri Mullinix, Ashutosh Nandeshwar, Leah Nickel, Kathy Osborn, Todd Osborn, Linda Pavich, Michael Pawlus, Jayne Perilstein, Andrea Perry, Joseph “Cacaww” Person, Matthew Pickus, Fabian Primera, Cynthia Radecki, Christopher Rael, Janiece Richard, Matthew Rizner, Jane Roach, Rose Romani, Jesse Ruf, Shahan Sanossian, Steve Sarrica, Eddie Sartin, James Sinclair, Jeff Sims, Alison Sommers-Sayre, Chris Sorensen, Katherine Teves, Henry Tyler, Jarrod Van Kirk, Tyler Varing, Tracey Vranich, Andrea Waldron, Tom Wamsley, Kathy Welch, Aaron Westfall, Hanah Wilkins, William Winston, Paul Worster, Jeff Wright, Julie Wright, Jing Zhou, R Community, Stack Overflow, WLAP Group, U-M Information and Technology Services, U-M MIDAS, U-M Ross School of Business Development and Alumni Relations, U-M School of Information, USC Dornsife Advancement, USC University Advancement, APRA, CASE, @DRIVE, Innovation Enterprise, MOTM, and all of my other friends, colleagues and teachers who have been a positive influence. Thank you for your support and encouragement. Onward and upward!

Many thanks to everyone who provided feedback on the book and, of course, special thanks to our editors, Kathy Osborn and Madhusudan Uchil. Lastly, thank you to Lada Adamic for making me learn R, Qiaozhu Mei for teaching me to develop intuition and apply my knowledge to real-world problems, Katherine Teves for inspiring passion and discipline, Eddie Sartin for his vision and leadership, and Todd Osborn for always keeping it real. I am forever grateful.

If you like this book, consider sharing it with your network by running source("http://arn.la/shareds4fr") in your R console.

Ashutosh R. Nandeshwar
Rodger Devine
February 2018

Data Science for Fundraising: Build Data-Driven Solutions Using R