Preface
Why read this book
There are many books that cover various topics in fundraising. There are many books that explore the R programming language, data analytics, and machine learning. However, there are few books that explore how to use R to build data-driven solutions for fundraising problems. This was one of the key motivations for writing this book.
At the time of writing this, a quick web search of “fundraising analytics” produces 1,850,000 results (in 0.46 seconds), which clearly suggests there is appetite, curiosity, and demand for useful material regarding this subject. Fundraising analytics articles, products, and webinars tend to focus on technical tools and approaches such as data visualization and predictive modeling across many fundraising areas, including, to name a few, prospect identification, donor segmentation, pipeline management, direct response marketing, and fundraising forecasts.
As lifelong learners ourselves, we strongly encourage you to search, study, and explore as much as you can about the art of fundraising and the science of data, visualization, and machine learning because the future of fundraising, like that of many other industries, will continue to be shaped and strongly influenced by advancements in technology. And why do institutions and firms (for-profit and non-profit) invest in technology? To gain competitive advantage. To gain efficiency. To increase results.
Contact reports are no longer dictated or hand-typed on a typewriter. In fact, some customer relationship management (CRM) software and donor databases allow you to enter contact reports via voice entry on your mobile device using speech-to-text technology. Speech-to-text and natural language processing software is far from perfect, but it’s still a useful step forward to a technology-driven future. And what’s the point of all these bells and whistles? In the case of contact reports, one could argue that minimizing the effort required to enter contact reports saves time and energy (scarce resources), which fundraisers could apply towards other important areas, such as fundraising strategy, donor outreach, qualification, cultivation, and solicitation.
Why read this book?
Because you’re curious. Because you’ve already read many articles about the potential of analytics to improve your fundraising programs, but you want a clear path and practical methods to analyze your data and distill actionable insights from it. Also, because you’re ready. As Rodger’s beloved piano teacher, Katherine, says, “I can teach you something in 10 minutes that took me 10 years to figure out.” Many of you reading this book may have heard of or been interested in data science, analytics, and machine learning for years, but never had an opportunity to dive in. After reading the book and executing the code, you will have learned about 80 to 90% of applied data science, specifically as applied to fundraising. You’re lucky because it took us more than 10 years to acquire that knowledge.
This book will help get you going on your data science journey. So, no more waiting, rainy days, or “maybe someday” stalling. This is the future, and the time is now.
The value proposition of this book is simple: There are countless resources on data science and its applications, but this one book will take you from a beginner to an advanced level of understanding. From there, you must continue to sharpen your skills to attain mastery.
Structure of the book
This book is organized into 17 chapters, with chapters 1 through 4 beginning with an introduction to analytics, analytics adoption, and fundraising application examples. Chapters 5 through 9 introduce the R programming language and provide a series of recipes to help familiarize you with basic tasks such as loading, cleaning, manipulating and exploring data for patterns.
Chapter 10, Data Visualization, explores a broad set of data visualization methods and best practices that you can incorporate into your own solutions. Chapter 11 explores Recency, Frequency, Monetary (RFM) modeling, which is a descriptive analytics technique drawn from direct marketing and is commonly used in fundraising. Chapter 12 covers machine learning concepts (including both supervised and unsupervised learning methods) and provides R recipes (coding scripts) that you can explore with an example donor file to demystify the application of machine learning techniques to donor data.
Chapter 13 explores predicting next gift size, including simple forecasting and some of the challenges with this task. Chapter 14 focuses on text mining, which is a form of data analytics focused on gleaning patterns from text-based data. Chapter 14 also introduces you to some powerful R packages you can use to generate reports, acquire web-based data, and enhance your analysis by blending data sources and generating comparative insights.
Chapter 15 explores social network analysis, which uses graph theory to visualize and analyze information as networks of connections or relationships. Chapter 16 explores the concept of finding prospects, including some popular use cases that are frequently discussed, researched, and debated. Chapter 17 concludes with a survey of new trends and applications that highlight future fundraising applications and beyond.
Not Technical?
This book does not assume any previous programming knowledge, experience, or background. A familiarity with basic statistical concepts is helpful, but not required, as additional reading and resources are suggested in each chapter.
If you are a manager or leader and not interested in directly learning data science, we encourage you to get a copy of this book for your data-driven staff or curious colleagues who would benefit from learning the best-in-class machine learning methods to build solutions and add value to your organization.
Other reading
This book will not satisfy all readers because of omissions, intentional or not. Let’s go over the intentional omissions. We don’t cover all the issues you will face when you’re just getting started with R. We don’t discuss any big data tools or techniques. We don’t explain the machine learning algorithms. If we had done so, this book would never have been completed. We want to help you reach the 80 to 90% level of knowledge in a short time. We highly recommend these books to fill any gaps as well as to sharpen your skills.
- R Cookbook by Paul Teetor, 2011: http://a.co/bMrC9ot. This book covers many of the basics of R using recipes. An easy to follow book and a must read.
- R for Data Science by Hadley Wickham et al., 2017: http://a.co/4luF7t. Learn from a master. Not only is this book well written, but it also teaches you efficient ways to complete complicated tasks.
- Machine Learning with R by Brett Lantz, 2015: http://a.co/6pRrjGb. Rodger and I have both been fortunate to know and work with Brett. He’s written elegantly and explained complicated algorithms in this book.
- Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten et al., 2011: http://a.co/ifmkIvh. This book will help you easily understand many machine learning techniques and concepts. Its simple language and explanations will speed up your understanding.
- Code Complete by Steve McConnell, 2004: http://a.co/9f27xQc. If you want to write code that you can understand even after many years, you must read this book. Steve provides hundreds of recommendations on writing good code.
- Making Things Happen by Scott Berkun, 2008: http://a.co/5hWXU18. Analysis is wasted if it’s unused. In this book, Scott explains how to manage projects to make things happen. Every analyst should read this book.
- Information Dashboard Design (1st edition) by Stephen Few, 2006: http://a.co/bVGINJ5. If you are just getting started in data visualization, you should read this book. You will never see charts the same way again.
- Confessions of a Public Speaker by Scott Berkun, 2011: http://a.co/5Ym5nqV. Another Berkun book, and another gem. Ashutosh got his start in public speaking after reading this book, but more importantly, he learned the skills he needed to explain his analysis.
- Presentation Zen by Garr Reynolds, 2011: http://a.co/e9OduPx. After all your hard work, don’t let your presentation fail you. Read this book to create beautiful slides and effectively present your points.
Software information and conventions
All the code in each chapter needs to be run in order as examples tend to build on previous examples.
We encourage you to explore the book’s content according to your interest and need, but advise against jumping into the middle of a chapter and expecting your code results to match without running the code introduced earlier in the chapter.
Please note the following information and coding conventions used within this book:
This is a note.
Any text displayed within this text box is intended to be an important note, concept, or takeaway.
This is a quote.
Any text displayed within this text box is intended to be an important quoted reference.
This is an exercise.
Any text displayed within this text box is designed to be a challenge or exercise to help you learn how to customize the solution or recipe to explore your own questions.
If you’re reading an electronic copy of this book, you may have difficulty copying and pasting the code directly. All the code and data used in each chapter are available as standalone files. You can download them from http://www.nandeshwar.info/ds4fundraisingcode.
Every time you see the library command along with a library name, such as library(tidyverse), we assume that the package is installed on your computer. If not, use the install.packages(<package_name>) command, with the package name in quotes, to install the library. Note that we are using the words “library” and “package” interchangeably.
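As a minimal sketch of the install-then-load pattern described above, the helper below installs a package only when it is missing and then loads it. The function name load_or_install is hypothetical and is not part of the book's code; it relies only on the base R functions requireNamespace, install.packages, and library.

```r
# Hypothetical helper (not from the book): install a package only if it is
# not already available, then attach it to the current session.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # the package name is passed as a quoted string
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call loads it without downloading anything.
load_or_install("stats")
```

Calling load_or_install("tidyverse") would work the same way, but it downloads the package from CRAN the first time it runs, so it needs an internet connection.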
You will notice that the code chunks are in a grey box and R code is highlighted, and the output is shown in orange. The output in the book always begins with #> characters. Since # denotes the beginning of comments in R, you can safely copy the output into your console. (The output in your console will show up without the #> characters.)
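To illustrate this convention, here is a small sketch using made-up gift amounts (the values and the variable name gifts are illustrative only and do not come from the book's data). The line starting with #> is output; because R treats it as a comment, the whole chunk can be pasted into the console as is.

```r
# Made-up gift amounts, for illustration only
gifts <- c(25, 50, 100)

# Total of the gifts; the line below the call shows the output
sum(gifts)
#> [1] 175
```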
Although all the code will run on the base R application/console, we highly recommend that you use RStudio, an integrated development environment (IDE), as your editor. RStudio makes coding faster and comes with many wonderful features, which you will uncover as you follow the recipes in this book.
The R ecosystem is continuously evolving, and although we made sure that all the code in this book runs on multiple computers, it is likely that some libraries will be out of date or cause errors. Feel free to reach out to us so that we can help you and update the code in the book. The last compilation of this book used this configuration:
#> R nickname: Kite-Eating Tree
#> R version 3.4.3 (2017-11-30)
#> On platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running: macOS Sierra 10.12.6
set.seed(777)
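If your results differ from the book's, it may help to compare your own setup with the configuration listed above. The sketch below prints the equivalent details using base R's R.version object. It also assumes, as is conventional, that the set.seed(777) call is there to make random results repeatable; the variable names first_draw and second_draw are illustrative and not from the book.

```r
# Print details of your own installation, for comparison with the
# configuration listed above
cat("R nickname:", R.version$nickname, "\n")
cat(R.version.string, "\n")
cat("On platform:", R.version$platform, "\n")

# Assumption: the seed fixes the random number generator, so the same
# random draws come back every time the seed is reset
set.seed(777)
first_draw <- sample(1:100, 5)
set.seed(777)
second_draw <- sample(1:100, 5)
identical(first_draw, second_draw)
#> [1] TRUE
```

Exact random values can still differ between R versions even with the same seed, so matching the R version matters as much as matching the seed.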
Acknowledgments
Ashutosh Nandeshwar
I’m thankful to many people for various reasons. There are some who have helped me every time I’ve stumbled. Then there are those who helped me before I could stumble. There are some who showed me the light. There are a few who paved the way for me. Some helped me without knowing. Let me list a few names, but this list isn’t exhaustive, and I apologize for my omissions.
Dr. B. R. Ambedkar for creating opportunities for all the marginalized communities of India. I would be far from writing this if it were not for him.
Mom and Dad for believing in me, especially when I failed, and setting the path for my future. Unfortunately, my father passed away before he could see this book. Thank you for everything, Baba. I miss you.
Mr. Prabhu, my math tutor in high school, for building the belief that I could do math (my brother will tell you how hopeless I was). Dr. Rakesh Chandran for helping me out through the tough years of graduate school. Dr. Tim Menzies for teaching not only machine learning, but also the value of simplicity. Mike Sperko for giving me a chance to do data analysis for a living. Karen Isble for taking a chance on me and letting me apply analytics to fundraising.
All my terrific colleagues at USC. Tracey Vranich for seeing the value of data science for fundraising and letting me run free with my ideas. Doug Byers for helping me in immeasurable ways. Al Checcio for his vision and belief in data-driven fundraising.
Chris Sorenson for helping me spread my ideas and for writing the fantastic foreword.
My wonderful wife, Utpalvarna, for taking care of everything while I focused on this book on weekends and evenings. Asanga and Dinnaga, my amazing children, for teaching me so many things, including the value of spending time with your loved ones. They understood when I could not attend their soccer and Taekwondo practice.
Many thanks to everyone who provided feedback on the book: Andrew Schultz, Michie Spradling, and Dibyendu Mondal. Thanks also to our editors: Madhusudan Uchil and Kathy Osborn.
Lastly, huge thanks to the R community. It would have been impossible to do anything without the work of every package builder and every Stack Overflow [r] answerer. A few R developers deserve special mention: Hadley Wickham, author of “R for Data Science” and ggplot, dplyr, devtools, and others, and Yihui Xie, author of the bookdown and knitr packages (we wrote this book using bookdown). Because of RStudio and all the people behind this great product, literate programming and interactive analysis have never been this easy. Thanks #rstats!
And, of course, my co-author Rodger! :)
Rodger Devine
This book would not have been possible without the support and encouragement of many friends, family, and mentors over the years, including:
Lada Adamic, Prabhanshu Agrawal, Kaleena Bajo, Reginald Becton, Kerri Bennett, Donald Boettner, Russell Brown, Bob Burdenski, John and Susan Butler, Carly Capitula, Cathleen Conway-Perrin, Kevin Corbett, Gin Corden, Dondi Cupp, “Lo” de Janvry, Hui Cha “Kim” Devine, Lindsay Devine, Rodger A. Devine, Terri Devine, Ryan Donnelly, Susan Engel, Nathan Fay, Nicole Ferguson, Josh Fields, Tyson FitzGerald, Patrick Franklin, Christina Frendo, Knekoh Frugé, Jennifer Dunn Greenspan, Renee Haraburda, Kelli Harrington, Yuping He, Christina Hendershaw, Salijo Hendershaw, Chere Hooks, Dianna Gladstone, Mike Glier, Risa Gotlib, Lorri Grubaugh, Nathan Gulick, Karen Isble, Josh Jacobson, Kim “Bella” Jacobson, Sonya Vanhoof Jimenez, Caitlin Johnson, Bob Jones, Kevin and Liz Jones, Sam Jones, Justin Joque, Frieda Kahn, Nick Kennedy, Jim King, Jennifer Kranz, Margaret Krebsbach, Dan Kugler, Andrew Kulpa, Brett Lantz, Karen Latora, Deborah Lennington, Jessie Lipkowitz, Shalonda Martin, Kim “McData” McDade, Qiaozhu Mei, Jaime Miranda, Xiomara Moncada, Chandra Montgomery, Andrew Mortensen, Tadd and Nayiri Mullinix, Ashutosh Nandeshwar, Leah Nickel, Kathy Osborn, Todd Osborn, Linda Pavich, Michael Pawlus, Jayne Perilstein, Andrea Perry, Joseph “Cacaww” Person, Matthew Pickus, Fabian Primera, Cynthia Radecki, Christopher Rael, Janiece Richard, Matthew Rizner, Jane Roach, Rose Romani, Jesse Ruf, Shahan Sanossian, Steve Sarrica, Eddie Sartin, James Sinclair, Jeff Sims, Alison Sommers-Sayre, Chris Sorensen, Katherine Teves, Henry Tyler, Jarrod Van Kirk, Tyler Varing, Tracey Vranich, Andrea Waldron, Tom Wamsley, Kathy Welch, Aaron Westfall, Hanah Wilkins, William Winston, Paul Worster, Jeff Wright, Julie Wright, Jing Zhou, R Community, Stack Overflow, WLAP Group, U-M Information and Technology Services, U-M MIDAS, U-M Ross School of Business Development and Alumni Relations, U-M School of Information, USC Dornsife Advancement, USC University Advancement, APRA, CASE, @DRIVE, Innovation 
Enterprise, MOTM, and all of my other friends, colleagues and teachers who have been a positive influence. Thank you for your support and encouragement. Onward and upward!
Many thanks to everyone who provided feedback on the book and, of course, special thanks to our editors, Kathy Osborn and Madhusudan Uchil. Lastly, thank you to Lada Adamic for making me learn R, Qiaozhu Mei for teaching me to develop intuition and apply my knowledge to real-world problems, Katherine Teves for inspiring passion and discipline, Eddie Sartin for his vision and leadership, and Todd Osborn for always keeping it real. I am forever grateful.
If you like this book, consider sharing it with your network by running source("http://arn.la/shareds4fr") in your R console.
Ashutosh R. Nandeshwar
Rodger Devine
February 2018