A well-functioning, professionalized bureaucracy is increasingly seen as a key determinant of economic development. Historically, the standard recipe for achieving such professionalization has been the enactment of civil service reforms introducing recruitment through competitive exams and employee protection from political dismissals. The promise of these reforms is that reducing politicians’ control over hiring and firing decisions would allow governments to attract and retain more qualified employees, which would, in turn, lead to improvements in bureaucratic performance.

Although civil service reforms have improved performance in some contexts, they appear to be no silver bullet: in an assessment of 71 reforms funded by the World Bank, only 42 percent were rated as successful by an independent agency. Moreover, there has been a recent push in some developed countries toward greater flexibility for hiring and removing public employees, often fueled by concerns that traditional civil service rules might result in a bureaucracy that is unresponsive to citizens’ needs. Ultimately, whether reducing political control over the civil service actually enables governments to hire and retain more qualified employees is an empirical question, and one for which there is remarkably limited direct evidence. Opening the black box of the bureaucracy is a necessary step to understand the mechanisms underlying civil service reforms’ varying degrees of success.

Our research studies the impacts of the 1883 Pendleton Civil Service Reform Act. This act, which introduced competitive exams for the selection of certain federal employees, is widely regarded as the first step toward a professionalized civil service in the United States. Our analysis focuses on the consequences of the act for the functioning of the Customs Service, a key government agency that, at the time of the reform, collected more than half of federal revenue. Although we find that the reform did improve targeted employees’ professional backgrounds and reduce turnover, we show that these changes did not translate into higher cost-effectiveness in customs revenue collection.

We show that the reform worked as its proponents expected along two main dimensions. First, it led to a sizable (25 percent) reduction in employee turnover. This reduction was larger for workers in positions subject to exams, and it was also larger during years in which control of the federal administration passed from one party to the other. Second, it led to an improvement in targeted employees’ professional backgrounds: new hires in positions requiring exams were 11 percentage points less likely to report working in an unskilled occupation prior to joining the Customs Service and 9 percentage points more likely to report working in a professional one. Since exams were aimed at testing practical knowledge relevant to positions in the Customs Service (rather than formal academic training), we interpret these changes in occupational background as reflecting a likely improvement in targeted employees’ actual qualifications for their jobs. Indeed, a shortage of workers with a professional background might have been a binding constraint in achieving cost-effectiveness: prior to the reform, there was a strong positive correlation between changes in the share of such employees and changes in districts’ revenue.

Ten years after the reform, nearly 60 percent of employees in reformed districts had been appointed through an exam. We next ask if this change led to increased cost-effectiveness in customs revenue collection. We expected improvements in this regard through three main channels. First, limiting the room for patronage could have curtailed personnel expenses by reducing the number of employees hired solely to reward political loyalty. Second, by creating a separation between bureaucrats and politicians, the reform could have reduced corruption, thereby increasing revenue. Third, to the extent that the reform increased bureaucratic expertise, workers might have been better equipped to interpret and enforce the customs laws. Indeed, the conventional wisdom among both practitioners at the time and modern scholars is that the reform improved the overall efficiency of the federal bureaucracy.

Surprisingly, however, we find that the reform had limited impacts on cost-effectiveness. First, the reform did not lead to a reduction in total expenses (or in districts’ total number of employees); our point estimates are close to zero and are statistically insignificant. Similarly, we find no statistically significant evidence that the reform led to increased customs revenue. Indeed, we see little evidence that would suggest an increase in revenue over time: the postreform estimated effects are sometimes positive and sometimes negative, lacking a clear pattern. Finally, as expected, given the limited effects on expenses and revenue, we also see no indication of an improvement in our main measure of cost-effectiveness, the revenue per dollar spent.
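As a concrete illustration of this measure, the short sketch below computes revenue per dollar spent for two districts. The district names and dollar figures are hypothetical, chosen only to show the arithmetic; they are not taken from the study's data, and the function name is our own.

```python
def revenue_per_dollar_spent(revenue: float, expenses: float) -> float:
    """Customs revenue collected per dollar of collection expenses."""
    if expenses <= 0:
        raise ValueError("expenses must be positive")
    return revenue / expenses


# Hypothetical districts (illustrative numbers, not from the study).
districts = {
    "District A": {"revenue": 1_200_000.0, "expenses": 60_000.0},
    "District B": {"revenue": 300_000.0, "expenses": 30_000.0},
}

# Compute the cost-effectiveness ratio for each district.
ratios = {
    name: revenue_per_dollar_spent(d["revenue"], d["expenses"])
    for name, d in districts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} dollars of revenue per dollar spent")
```

Because the measure is a ratio, a change that moved revenue and expenses in the same proportion would leave it unchanged.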

We also investigate possible reasons why the changes in personnel outcomes do not appear to have translated into higher cost-effectiveness in revenue collection. We first discuss the potential role played by the incomplete scope of the reform. As is typical of civil service reforms, the Pendleton Act targeted only a subset of employees. Specifically, it targeted employees in midtier positions but exempted those below a salary threshold as well as districts’ top managers (the collectors of customs). This incomplete scope could have been important for two reasons. First, by exempting employees below a salary threshold, the reform created incentives to hire additional workers for low-paid positions. Indeed, we document that the reform nearly doubled the share of workers in such positions. This shift was likely harmful to the performance of reformed districts because it distorted their hierarchical structures and because low-paid employees had weaker professional backgrounds. Second, to the extent that collectors mattered for districts’ outcomes, leaving the method for selecting them unchanged also likely limited the reform’s ability to improve cost-effectiveness.

Finally, we discuss three additional potential explanations for the reform’s inability to improve cost-effectiveness. First, we find limited evidence that the lack of detectable influence on cost-effectiveness was due to the reform spilling over to the nonreformed districts; proximity to a reformed district does not predict either increases or decreases in revenue in the postreform period. Second, the impact of the reform on cost-effectiveness was limited even over a 20-year horizon. This result is contrary to the hypothesis that the policy’s full benefits could only become apparent after at least 10 years (to fully replace employees hired through the old regime, for instance). Third, we consider the possibility that although employees hired through exams might have been of better quality, they might have also exerted less effort (or might have otherwise been less responsive) than patronage hires. Although we do find some suggestive evidence consistent with this explanation, we note that, unlike some modern civil service protections, the Pendleton Act did not provide tenure to employees. Hence, the disincentive effects of the reform might have been less prominent than in other contexts.

Our data do not enable us to establish whether the reform led to improvements in performance along margins other than revenue per dollar spent. For instance, reformed districts may have become faster at clearing imports or may have improved how closely they followed the tariff laws (which would not necessarily have led to higher revenue). Although revenue per dollar spent does not incorporate all dimensions of performance, it does capture an important one for an agency whose primary goal was revenue collection. Indeed, this measure was regularly discussed both in government publications and by proponents of civil service reform, who blamed patronage for the high cost of collection. Moreover, similar measures have been used by other scholars studying the performance of government units in charge of revenue collection.

We contribute to two main strands of the literature. First, we contribute to the literature on the recruitment and hiring of civil servants. A number of studies in this literature show some of the potential costs of hiring discretion. Our work analyzes the impacts of a commonly used (but understudied) tool for limiting such discretion: competitive civil service exams. Specifically, we show that such exams can improve employees’ qualifications but that these improvements might not translate into gains in overall performance, as introducing exams might trigger countervailing organizational responses.

Second, we contribute to the literature on civil service reforms. Existing studies find that, in the United States, state and local reforms reduced incumbent parties’ chances of reelection, reduced political budget cycles, and improved bureaucratic performance. Remarkably, however, there is very limited evidence on the effects these reforms have on the main objects they are intended to change (namely, the turnover rate and qualifications of bureaucrats), and the existing evidence casts doubt on whether these reforms actually generate these intended changes. Our data allow us to investigate how such a reform affects both the personnel outcomes and the overall organization and performance of reformed units. Doing so enables us to better unpack the factors mediating a reform’s overall success: the reform was binding, and it partially succeeded in improving personnel outcomes, yet it led to distortions in personnel structure by incentivizing hiring in exempted positions. We also focus on an important historical and policy context: the Pendleton Act, which is a landmark reform in U.S. history, and the ability to collect revenue, which is a key determinant of state capacity.

Note
This research brief is based on Diana Moreira and Santiago Pérez, “Civil Service Reform and Organizational Practices: Evidence from the Pendleton Act,” NBER Working Paper no. 28665, April 2021, http://www.nber.org/papers/w28665.