Richard Berk is fighting crime and busting bad guys with an algorithm as his weapon of choice.
Berk, a Wharton statistics professor, has developed algorithmic software to forecast crime. Though the software is extremely versatile, Berk said, “you hand tailor [it] so you make sure that you provide the information that’s tuned to the particular problem.”
The variety of predictive applications ranges from domestic violence situations to dolphin killings and fugitive arrests. “Different question, different data, different outcome,” he said.
The California Department of Corrections and Rehabilitation was the first to call on Berk’s statistical expertise, in the 1990s. He was asked to create a method that could predict which inmates required a higher level of security. It was through this case that Berk developed his software. He concluded, “We did it, it worked, we’re still using it.”
The algorithm is fed basic core variables such as age, prior record, gender, the offense for which the person was convicted, age at first felony and the time elapsed since the most recent offenses.
Once these variables are entered, the algorithm generates hundreds of new variables. “That’s how you get the statistical power; you’re squeezing more information out of these core variables than people have done in the past,” Berk said.
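The article does not say how the software derives its new variables. As a rough illustration of the general idea only, the sketch below expands a handful of numeric core variables into squared and pairwise-product terms. The variable names, the values and the `expand_features` helper are all hypothetical and are not taken from Berk's software.

```python
from itertools import combinations

# Hypothetical core variables for one person. Names and values are
# illustrative assumptions, not fields from Berk's software.
core = {
    "age": 24.0,
    "prior_convictions": 3.0,
    "age_at_first_felony": 17.0,
    "years_since_last_offense": 1.5,
}


def expand_features(core_vars):
    """Derive extra variables from a small set of numeric core variables.

    Adds a squared term for each variable and a product term for each pair.
    This is one generic expansion scheme, assumed here for illustration.
    """
    derived = dict(core_vars)
    for name, value in core_vars.items():
        derived[f"{name}^2"] = value * value
    for (a, va), (b, vb) in combinations(core_vars.items(), 2):
        derived[f"{a}*{b}"] = va * vb
    return derived


features = expand_features(core)
print(len(core), "core variables ->", len(features), "variables")
```

Even this simple scheme grows quickly: the number of pairwise terms rises roughly with the square of the number of core variables, which suggests how a short list of inputs could turn into hundreds of derived ones.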
Berk admits that the techniques for his forecaster aren’t new. “We get this done using information that everybody else has, but we use it better,” he said.
However, those techniques make Berk forty times better at finding a group of individuals who are very high risk. The algorithm is a methodical way of searching for “a needle in the haystack.”
“Forecasting who’s going to be trouble is better done by a computer,” he said.
The data the algorithm uses needs to be “routinely available and in real time.” A probation officer has to make an immediate decision, so every application needs a setting where the data is on hand, he added.
Berk used classic algorithmic functions in his software construction. “They come up everywhere in statistics,” said Wharton statistics professor Nancy Zhang.
Berk said the forecaster is based on simple techniques and equations.
“KISS — Keep it simple, stupid. That applies to everything in life. It’s hard enough when it’s simple.”
Random Forests, a principal algorithm in the software, produces results that “people can understand,” Berk added.
Random Forests is widely used in a variety of models. It holds a “very good reputation in the field for being predictive,” said Wharton statistics professor Robert Stine. Though “not always very interpretable,” Stine said, “it works well in a variety of models.”
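For readers unfamiliar with the method, Random Forests train many decision trees, each on a random resample of the data with randomly restricted features, and then let the trees vote. The sketch below is a deliberately simplified stand-in, not Berk's implementation: it uses one-split "stumps" instead of full trees, and the toy records and function names are invented for illustration.

```python
import random

# Toy, made-up records: (age, prior_convictions) -> 1 if flagged high risk.
# These values are invented for illustration only.
DATA = [
    ((19, 4), 1), ((22, 5), 1), ((21, 3), 1), ((25, 6), 1),
    ((45, 0), 0), ((38, 1), 0), ((52, 0), 0), ((33, 1), 0),
]


def fit_stump(rows, feature):
    """Pick the threshold and direction on one feature with the fewest errors."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        for high_label in (0, 1):
            errors = sum(
                (high_label if x[feature] > threshold else 1 - high_label) != y
                for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, high_label)
    _, threshold, high_label = best
    return feature, threshold, high_label


def predict_stump(stump, x):
    """Apply a single one-split rule to one record."""
    feature, threshold, high_label = stump
    return high_label if x[feature] > threshold else 1 - high_label


def fit_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample and a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # sample with replacement
        feature = rng.randrange(len(rows[0][0]))    # random feature choice
        forest.append(fit_stump(sample, feature))
    return forest


def predict_forest(forest, x):
    """Return the majority-vote label and the share of stumps voting 1."""
    votes = sum(predict_stump(s, x) for s in forest)
    share = votes / len(forest)
    return (1 if share >= 0.5 else 0), share


forest = fit_forest(DATA)
label, share = predict_forest(forest, (20, 5))
print(label, round(share, 2))
```

The vote share returned alongside the label gives a rough sense of how divided the ensemble is on a given case. It should not be read as a calibrated probability.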
Berk doesn’t consider himself particularly skilled. Instead, he built on the foundational techniques of decades of computer scientists and statisticians.
“I like working with data, I like working with computers, I like this whole idea of rare things that nobody has seen.”
Berk added that the big success story in Philadelphia was the discovery that two-thirds of the city’s criminals under supervision were not actually high-risk.
“Everybody focuses on finding the bad guys, but when you find the bad guys, you also find the good guys,” he said.
With Berk getting calls about new opportunities every day, his model may soon go international.
His future plans for the model remain purely practical. “The next step is how to convey the uncertainty at the same time that you convey the forecast,” he said.
The Daily Pennsylvanian is an independent, student-run newspaper.