In the Public AI

How Governments Can Apply Responsible AI Principles to Artificial Intelligence for Public Service Delivery

The Partnership for Public Service is a nonpartisan, nonprofit organization that works to revitalize the federal government by inspiring a new generation to serve and by transforming the way government works. The Partnership teams up with federal agencies and other stakeholders to make our government more effective and efficient.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. We enable digital transformation for the era of an intelligent cloud and an intelligent edge, empowering government, from large federal agencies to small-town governments, to promote citizen well-being, manage communications and enhance crisis response. At Microsoft we are committed to developing solutions that enable our government stakeholders to capitalize on opportunities at the accelerated pace required to deliver modern mission outcomes.


Photo credit: Shutterstock

Artificial intelligence is increasingly part of our lives: voice assistants on our smartphones, chatbots on retail websites and algorithms that suggest the next television show we should watch. As AI becomes more common in our everyday interactions with private sector entities, it is also increasingly relevant for the delivery of public services by federal, state and local governments. Some government agencies have already incorporated artificial intelligence tools into their services, such as Federal Student Aid’s Aidan chatbot or Utah’s efforts to prevent pandemic unemployment fraud using AI.1 Others, however, are just beginning to explore whether and how AI tools can be incorporated into their service delivery.

What we mean by artificial intelligence
Artificial intelligence refers to computers and software performing tasks typically associated with people, such as recognizing speech or images, predicting events based on past information, or making decisions. AI tools use data to learn a task, and they continue to improve at functions such as transferring information from paper into computers, recognizing images, answering questions by quickly finding relevant information in databases or documents, detecting patterns in data, making decisions about simple queries and predicting someone’s behavior based on past conduct.2

The scale and speed of artificial intelligence tools give them enormous potential to enhance the efficiency of government service delivery, but also mean these tools must be employed carefully to avoid automating biased or inaccurate results. This is particularly important in the context of public service delivery, where governmental organizations have an obligation to provide trustworthy and equitable services to all possible customers.3 While a wide body of research exists on the potential risks of governmental use of AI in law enforcement and national security contexts,4 less work has been done to examine what is needed for public sector organizations to responsibly use artificial intelligence in service delivery.

The existing resources on standards for responsibly using artificial intelligence often focus on technical and data specifications.5 These are fundamental considerations, but such recommendations are often difficult to understand for those without deep technical knowledge. And while technical experts play an integral role in deciding whether and how to employ artificial intelligence, many more of the government leaders who contribute to this process—program managers, acquisition professionals, lawyers and frontline service providers—lack technical backgrounds.

This research brief from the Partnership for Public Service and Microsoft examines how principles of responsible artificial intelligence can apply to government service delivery and offers recommendations and considerations that non-technical government leaders should take into account as they decide whether and how to incorporate AI tools into their services. It also outlines recommendations for facilitating collaboration between technical and non-technical leaders, as both sets of perspectives are vital to ensuring responsible use of artificial intelligence.

What is responsible artificial intelligence in the context of public services?


Technological tools such as artificial intelligence “always have potential benefits and risks,” according to Terrence Neumann, an academic studying AI at the University of Texas at Austin. Many experts have created frameworks that aim to provide guardrails to help organizations take advantage of AI’s capabilities while avoiding its potential risks. One tally counts over 160 published frameworks—from organizations as diverse as the American Medical Association and the New York Times—delineating how automated decision-making tools can be developed and implemented ethically.

These frameworks often center on the concept of responsible artificial intelligence: the idea that AI tools must meet certain governance and ethical standards in their development, implementation and operation. Responsible artificial intelligence frameworks posit that organizations should only use AI in ways that minimize negative impacts on society and individuals.

Various public sector organizations in the United States also have begun work on frameworks for responsible artificial intelligence use. For example, the U.S. Agency for International Development’s Artificial Intelligence Action Plan and the University of California’s Responsible Artificial Intelligence report lay out standards and recommendations for future action to promote responsible AI use. The Government Accountability Office’s 2021 AI Accountability Framework describes how key practices in the areas of governance, data, performance and monitoring can assist public sector organizations in ensuring responsible AI use. And the October 2022 Blueprint for an AI Bill of Rights released by the Biden administration outlines principles that seek to protect the public in their interactions with automated systems.

Although they highlight many of the same principles, each of these frameworks addresses specific considerations for how to achieve responsible artificial intelligence in a particular context—for example, in the medical or legal fields. The experts with whom we spoke outlined how several core principles of responsible AI apply to public service delivery. The principles below should serve as examples for leaders of how to adapt established tenets of responsible artificial intelligence to the specific context of public services.



A key facet of responsible AI is understanding when AI is or is not well suited to address a specific problem. In their current form, AI tools are best suited to specific tasks that involve clear parameters, such as identifying whether an X-ray shows a broken bone. Many public services, on the other hand, involve complex decisions “where there’s a lot more room for uncertainty…and where there’s a really significant cost to getting it wrong,” one public service expert said. AI might be the right solution for an agency looking to answer website visitors’ simple questions via a chatbot, but might not be an appropriate choice to evaluate the likelihood an applicant will commit benefits fraud. AI is often not the right tool when an incorrect decision could seriously harm customers, for example by denying much-needed financial assistance, and when correct decision-making requires complex thinking and evaluation for which AI is not well suited.



The data used in, and outputs of, artificial intelligence models related to public service delivery should be representative of the eligible constituents of a particular public service. Particularly when certain constituent characteristics are underrepresented in data or form a small portion of a service’s customer base, public service leaders should ensure that the data underpinning AI tools accounts for everyone who might interact with a service. “If, for example, this service is serving a population that is underbanked, we know that many datasets out there have known gaps around this population,” said Taka Ariga, chief data scientist at the Government Accountability Office.

Without robust attention to representativeness, an AI model in this situation could fail to perform correctly and could even worsen service delivery. A model that relies on financial record data, for example, might interpret a lack of this history as an indicator of a customer’s ineligibility, denying them access to a service they qualify for because the data was not representative of the full range of potential customers.
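One simple way to make the representativeness concern concrete is to compare each group's share of the training data with its share of the eligible population. The sketch below is illustrative only: the function name, the group labels, the numbers and the 0.5 tolerance are all assumptions, not part of any framework cited in this brief, and a real review would need agency-specific definitions of the eligible population.

```python
from collections import Counter


def representation_gaps(training_groups, eligible_shares, tolerance=0.5):
    """Flag groups whose share of the training data is less than
    `tolerance` times their share of the eligible population.

    training_groups: one group label per training record.
    eligible_shares: {group: share of the eligible population}.
    Returns {group: (training_share, eligible_share)} for flagged groups.
    """
    counts = Counter(training_groups)
    total = len(training_groups)
    gaps = {}
    for group, eligible_share in eligible_shares.items():
        training_share = counts.get(group, 0) / total if total else 0.0
        if training_share < tolerance * eligible_share:
            gaps[group] = (training_share, eligible_share)
    return gaps


# Hypothetical numbers for illustration only.
training_groups = ["banked"] * 95 + ["underbanked"] * 5
eligible_shares = {"banked": 0.75, "underbanked": 0.25}

gaps = representation_gaps(training_groups, eligible_shares)
print(gaps)
```

In this made-up example the underbanked group makes up a quarter of eligible customers but only a small fraction of the training records, so it is flagged for follow-up. A flag of this kind is a prompt for human review of the data, not a verdict on the model.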



Equal treatment under the law is a core principle of democratic public service delivery, but agencies must be aware of the particular challenges that AI tools can present to the principle of non-discrimination. AI does well at eliminating certain types of bias (for example, unlike humans, it’s not susceptible to making different decisions because it is tired or hungry, or because the applicant is a family member), but it also faithfully reproduces any biases present in the data it was trained on. For example, if participants in a farm loan program have historically been primarily from certain states, an algorithm trained on that historical data may draw incorrect conclusions about who the program’s customers should be and reproduce existing patterns, unintentionally leading to the exclusion of eligible customers. “Generally, AI faithfully learns from the training data it’s fed but doesn’t automatically highlight qualitative issues that could contribute to skewed results,” said the GAO’s Ariga, noting that agencies must be particularly vigilant in building and training AI models to prevent automating biased outcomes.



When using artificial intelligence tools for service delivery, governments must be transparent with the public about why and how these tools are being used. Making information easily available can help build trust between agencies and the people they serve, and this is all the more important when customers may not have much familiarity with how these tools operate. However, public service leaders should focus on providing transparency in a way that is meaningful for the public, rather than providing technical information that is likely to create more confusion. “Explaining the algorithm itself is likely not sufficient,” said Vince Dorie, principal data scientist at Code for America. Instead, governments can demonstrate transparency by developing and adhering to a set of publicly available criteria that an AI tool must fulfill before it can be used for service delivery. Effective transparency requires agencies to “as much as possible, explain how the algorithm was evaluated, so that people understand…this is the standard it met,” Dorie said.



Like traditional processes that provide opportunities for members of the public to appeal government decisions, public services using AI must provide customers with due process and opportunity for redress if they are negatively impacted by a decision made by or reliant on AI. For example, if an AI tool decides an applicant is likely a fraud risk and denies a benefit, “do they have the ability to see that result and its accompanying confidence metrics and contest that result?” asked Ariga. According to Dorie, due process in this context also requires that an AI-enabled service use the same standards—for example, the same requirements for eligibility—that would be used if the service operated without an AI tool.

How can government leaders ensure their use of AI is responsible?


The standards above are among the many principles that can guide public sector leaders in ensuring their use of artificial intelligence in service delivery is responsible and contributing to the public good. But how should non-technical leaders apply these principles in their decision-making around AI?

Ensuring responsible AI use is not a one-time exercise, but a continual process that requires attention every step of the way, from the first contemplation of incorporating AI into a program to the routine use of a fully implemented tool. Below are some of the specific considerations and questions non-technical leaders should address at each stage of the process.


When considering using artificial intelligence for a program, leaders should:
  • Clearly define the problem they are hoping to solve. Before deciding whether AI is the right fit, leaders must define what they are hoping to achieve. Is it solving an efficiency problem to deliver services more quickly and reduce a backlog? Or is it tackling an effectiveness problem, where the service is not achieving its intended outcomes? Framing the problem as a specific question can be a useful starting point to help leaders more clearly evaluate whether they have the data they would need to answer that question using AI. Leaders should not begin by “sitting and looking at the technology” and then deciding what problem to apply it to, noted one state technology leader we interviewed.
  • Explore a range of possible solutions, including those with and without artificial intelligence. Once the need is clearly defined, leaders can begin to explore the types of artificial intelligence tools that could be appropriate. However, leaders should also consider whether there are non-AI solutions that might serve the same purpose. In many cases, “government would benefit from much less complex technological improvements” than AI, said one public service expert. An AI tool may end up being the right fit, but leaders should not exclude other types of solutions during the exploratory phase.
  • Solicit input and buy-in from a range of perspectives. Like any other significant change to a public service program, the decision to implement an AI tool requires input from a variety of people. Within their organization, leaders should be sure to consult technical and legal colleagues who can help weigh the privacy, security and legal implications of using AI, as well as any frontline employees whose duties might be affected. Leaders should also seek input from members of the public who would be affected by the system and must be deliberate in ensuring that the input reflects the views of the full range of the service’s customers.
  • Understand the data available and its suitability for AI use. Data is the backbone of any artificial intelligence tool. Before contracting for or building a particular tool, leaders should evaluate the data they have available and its suitability for AI use, paying particular attention to the data’s origins, quality, consistency and inherent biases. A thorough understanding of the data, and how it will be updated, will prepare leaders for the next phase of acquiring or building the AI tool itself.


When contracting for or building an artificial intelligence tool, leaders should:
  • Define clear objectives for the AI system and metrics that will be used to evaluate it. Just as they must clearly define the problem to be solved, leaders must set out parameters for what the AI system should be capable of and how it will be evaluated. These metrics should include both technical specifications and measures related to responsible AI principles, such as the tool’s security and ability to protect customers’ data or the consistency of its outputs across different types of demographic data.
  • Recognize the unique context of public service delivery. Even when they are providing similar services, public sector entities have different obligations than private sector companies, such as their need to serve all members of the public rather than a targeted customer base, or their duty to responsibly manage public resources. And in many cases, public services do not have private sector equivalents. This means that AI tools that have been successful for private sector uses are not always adaptable to the public service context. In some cases, therefore, an AI tool built by the agency itself may be a better fit. In other cases, an off-the-shelf tool may be the right choice, but leaders must still be mindful that there may be a need to adapt it to the public sector context.
  • Understand and develop the organization’s ability to build or evaluate AI systems. Whether building or contracting for an AI tool, agencies need the talent to be able to do so. It’s clear that to build an AI tool, agencies need data science and machine learning talent. But contracting for AI tools also requires specialized knowledge: acquisition professionals must be able to evaluate whether an AI tool is appropriate for the government context and meets the defined metrics. “It’s important for government to independently evaluate the plausibility” of vendors’ claims about AI tools, said Dorie of Code for America. When deciding if they should build, buy or use a mix of the two, leaders should evaluate their agency’s capabilities and, if needed, further develop those necessary for the chosen option.
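The metric mentioned above, consistency of a tool's outputs across demographic groups, can be expressed as a small calculation that evaluators could ask a vendor or an in-house team to report. The following is a minimal sketch under stated assumptions: the function names, the group labels "A" and "B" and the decision data are hypothetical, and the ratio shown is only one of many possible consistency measures, not a complete fairness evaluation.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Compute the approval rate for each group.

    decisions: iterable of (group, approved) pairs, where approved is a bool.
    Returns {group: share of that group's cases that were approved}.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Lowest group approval rate divided by the highest.

    A value of 1.0 means every group is approved at the same rate;
    values well below 1.0 suggest outputs differ across groups.
    """
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Hypothetical decisions for illustration only.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)

rates = approval_rates(decisions)
ratio = rate_ratio(rates)
print(rates, ratio)
```

What counts as an acceptable ratio is a policy decision for the agency, ideally set before procurement and written into the evaluation criteria. A low ratio does not by itself prove discrimination, since groups may differ in eligibility, but it signals where evaluators should look more closely.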


When implementing an AI solution in a public service program, leaders should:
  • Create mechanisms for transparency and public communication about the AI system. Before an AI system is fully implemented into service delivery, leaders should build mechanisms for transparency and communication with the public about the tool. The appropriate level of transparency is highly dependent on the tool and the context in which it is operating. Explaining exactly how a chatbot arrived at each of its answers might be unnecessary information for a customer, but more transparency may be needed when AI is making decisions about approving or denying food assistance benefits, for example. Input from frontline service providers and customer research can help ensure that transparency mechanisms are tailored to the communication needs and preferences of a service’s customer base.
  • Develop due process mechanisms for those impacted by the AI’s decisions. Governments face particular challenges “when AI is making decisions that have the opportunity to discriminate from the public’s perspective,” noted Teri Takai, senior vice president at the Center for Digital Government. To build public trust in services that use AI tools, governments must provide due process mechanisms for people to appeal or contest decisions made by or reliant on AI. Leaders should establish these mechanisms before the AI is operational and then further refine them as needed. Leaders should also establish internal mechanisms to ensure the tool’s decisions will be audited on a regular basis.
  • Ensure data used to train and operate the tool is suitable and high-quality. At every step of the process, leaders must pay attention to the quality of the data used for an AI tool. In the implementation phase, leaders should ensure that the data used to train the AI tool is appropriate given the data that will be used to operate it. This should include checking the data for built-in biases that could be replicated by the model and finding alternative data sources or mitigation strategies to prevent that replication. Leaders must also ensure that these operational data streams are well set up and ready to feed into the model, with particular focus on data completeness, accuracy and reliability.
  • Establish clear metrics of success. Building on the metrics used to initially evaluate the AI system, leaders should establish a framework to evaluate the performance of the AI tool. These measures should evaluate the AI’s contribution to solving the initial problem identified and include technical and responsible AI considerations. Data scientists, legal and privacy experts and program managers should all contribute to the determination of these metrics. Performance metrics should include measures at both the component level—each technical building block working as intended—and system level—all the components working well together as a whole.6


When routinely using AI to deliver public services, leaders should:
  • Regularly audit data inputs and model outputs to ensure consistency. Once an AI tool is operational, leaders should regularly audit the operational data as well as the outputs being produced. AI tools are dynamic, and their performance can vary over time. Regular audits allow needed adjustments to be made so that AI tools can operate consistently and achieve their intended purpose. To the extent practicable, public sector organizations should also provide opportunities for academics and other outside experts to independently audit the data and outputs.
  • Evaluate the tool’s performance against established metrics of success. At regular intervals, leaders should evaluate how the artificial intelligence tool is performing against the measures established for its success. These monitoring activities should be tracked and publicly documented as much as possible to promote transparency and trust in the system. Leaders should involve evaluation teams, as well as technical experts, in this process.
  • Understand when an AI system is no longer serving its purpose and upgrade or retire it accordingly. Even when they are successful, AI models may reach a point when they are no longer serving their intended purpose, particularly when there are changes in the operating environment. Based on their regular evaluations, leaders should recognize when AI tools are no longer working as intended, and then make needed changes or remove the tool from use, as appropriate.
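As one illustration of what a routine audit of data inputs might check, the sketch below compares how often each field is missing in recent operational data against a baseline period and flags fields whose missing rate has risen sharply. Everything here is assumed for the example: the function names, the field names, the records and the 0.10 threshold are hypothetical, and a real audit would cover many more aspects of data quality and of model outputs.

```python
def missing_rates(records, fields):
    """Share of records in which each field is absent or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for record in records if record.get(field) is None) / total
        for field in fields
    }


def audit_missing_data(baseline, current, fields, max_increase=0.10):
    """Flag fields whose missing rate rose by more than `max_increase`
    between the baseline period and the current period.

    Returns {field: (baseline_rate, current_rate)} for flagged fields.
    """
    base = missing_rates(baseline, fields)
    curr = missing_rates(current, fields)
    return {
        field: (base[field], curr[field])
        for field in fields
        if curr[field] - base[field] > max_increase
    }


# Hypothetical records for illustration only.
baseline = [{"income": 30000, "zip_code": "00000"} for _ in range(10)]
current = (
    [{"income": 30000, "zip_code": "00000"} for _ in range(6)]
    + [{"income": None, "zip_code": "00000"} for _ in range(4)]
)

flags = audit_missing_data(baseline, current, ["income", "zip_code"])
print(flags)
```

A flagged field is a reason to investigate, for instance whether an intake form changed or a data feed broke, before the gap quietly degrades the tool's decisions. Sharing summaries of such checks is one way to support the public documentation of monitoring described above.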

How can non-technical and technical government leaders collaborate to ensure responsible AI use?


Responsibly evaluating, implementing and using artificial intelligence tools requires successful collaboration between technical and non-technical leaders. The data scientists building AI tools, the chief information officers operating them, the general counsels reviewing their privacy implications, the program managers interpreting their results and many others all need to collaborate for AI use to follow responsible artificial intelligence principles. The experts we interviewed highlighted key recommendations for how technical and non-technical leaders can productively collaborate to ensure responsible AI use:


Focus on the problem and intended outcomes. Focusing on the common reference point of the problem an AI tool is intended to solve can help technical and non-technical leaders communicate more effectively. Leaders can collaborate better when they focus on ensuring the tool is achieving intended outcomes rather than getting caught up in technical specifications or program management frameworks. According to Neumann of the University of Texas, “really thinking about the quality of the outcomes” is the key to more effective communication and collaboration.


Build a common foundation. Technical and non-technical leaders each bring important expertise to conversations around responsible AI, but this expertise is sometimes difficult to communicate across different frames of reference. Developing a baseline understanding of artificial intelligence and how it operates can prepare non-technical leaders for collaboration around responsible AI, while technical experts can benefit from learning more about the service being delivered and the service population’s needs and concerns. Ensuring that everyone has a common understanding of the technical and non-technical foundations can help leaders better understand each other and more productively collaborate.


Think about AI in context. “It’s often a very disconnected conversation” between the development of an AI tool and its implementation in the context of a program, noted Ariga of the GAO. Technical and non-technical leaders can improve their coordination by recognizing from the beginning that AI tools do not operate independently, but rather as part of a larger context. Considering questions such as “how will frontline employees interpret model outputs” and “what are the privacy implications of using this system” will help technical and non-technical leaders find concrete points of collaboration and ensure the AI tool is well-integrated into the broader system.


Recognize when AI is not the answer. Successfully collaborating on responsible AI requires technical and non-technical leaders to not only work together to implement AI tools, but also to mutually recognize when AI is not an appropriate solution. Program managers may have their heart set on improving their service delivery by incorporating an AI tool that has worked well in other contexts, or data scientists may be eager to implement an innovative prototype, but each group must heed concerns from the other when they arise. Open discussions and a commitment to responsible AI principles can help leaders build understanding about why colleagues may believe AI is not appropriate in a certain context and come to agreement about when to pursue and when to abandon artificial intelligence solutions.

Conclusion: Building blocks for responsible AI


Different agencies and levels of government have widely varying experience with using artificial intelligence in public service delivery. Some have already begun deploying AI tools in their services, while others have not yet adopted AI. Whatever their level of experience, public sector organizations must put responsible AI principles at the center of their decision-making. But to successfully apply these principles, agencies need to have in place the building blocks that create an environment that fosters responsible AI use: data, talent and governance structures.

High-quality data is fundamental to successful and responsible artificial intelligence tools. Organizations that have a rigorous, holistic approach to cleaning and storing data will be better positioned to responsibly use artificial intelligence. However, some organizations are addressing questions of AI and data quality separately rather than as intertwined considerations. “I am just beginning to see that connection [between data quality and readiness for AI] happen in a meaningful operational way in state and local governments,” said Takai of the Center for Digital Government. Robust data quality is an important consideration regardless of an agency’s intent to use AI, but agencies should take particular care to have this foundation in place if they are interested in using artificial intelligence tools to deliver public services. An explicit connection to AI readiness can also help drive data quality initiatives if data is recognized as a necessary precursor to many potential uses of AI.

Talent is also a crucial building block for responsible AI use. Whether agencies are building their own AI systems or acquiring them from outside vendors, they should ensure they have sufficient expertise to evaluate and operate artificial intelligence tools. Public sector organizations should explore ways to develop technical and non-technical staff capacity to understand the risks, benefits and implications of using AI for service delivery. Some current efforts recognize this need and aim to assist agencies in developing expertise—the AI Training Act signed into law in October 2022 charges the Office of Personnel Management with developing a training program to help acquisition professionals better understand artificial intelligence and its potential risks and benefits.

Responsible use of artificial intelligence for public service delivery also requires strong governance structures that facilitate collaboration and agility. Before beginning AI initiatives, organizations should be sure they have in place processes that enable collaborative decision-making that takes into account the many perspectives needed for truly responsible AI use. “It’s important to say, does your organization have the governance structure to methodically bring these perspectives in throughout the [AI] lifecycle, and do they have enough authority in the matter?” said the GAO’s Ariga. Agencies also should consider how to establish governance processes that facilitate agility, so that they can adapt as circumstances change and continue to adhere to responsible AI principles.

Public sector organizations interested in using AI for service delivery can enhance their ability to deliver on responsible artificial intelligence principles such as non-discrimination and transparency through collaboration between technical and non-technical leaders and a focus on establishing strong data, talent and governance foundations.


For additional resources on responsible AI for public service delivery, take a look at our overview of the recommendations in this report and a set of guiding questions to help leaders work through how to responsibly use AI in their own specific scenarios.


  • 1. Center for Digital Government, IBM and NASCIO, “AI Meets the Moment,” 2021, 11.
  • 2. See also: Partnership for Public Service and Microsoft, “Into the Storm,” July 9, 2020, 1.
  • 3. Partnership for Public Service and Accenture Federal Services, “Government for the People: Designing for Equitable and Trusted Customer Experiences,” Nov. 16, 2021.
  • 4. For examples see: Hope Reese, “What Happens When Police Use AI to Predict and Prevent Crime?,” JSTOR Daily, Feb. 23, 2022; National Institute of Standards and Technology, “NIST Study Evaluates Effects of Race, Sex, Age on Face Recognition Software,” Dec. 19, 2019.
  • 5. Ibid.
  • 6. Government Accountability Office, “Artificial Intelligence: An Accountability Framework for Federal Agencies and Other Entities,” June 2021, 48.

Elizabeth Byers contributes to the Partnership’s portfolio of government effectiveness research, in particular projects on improving the customer experience with federal services. The daughter and granddaughter of public servants, she grew up with a deep respect for federal workers and their dedication to working on behalf of the public. Elizabeth’s favorite public servant is Carla Hayden, the first woman to be appointed Librarian of Congress and a strong advocate for open and equal access to public services.

The individuals listed below generously offered their input on how government leaders can apply responsible AI principles to the use of artificial intelligence in public service delivery. We greatly appreciate their time and counsel. The contents of this research brief do not necessarily reflect the views of those with whom we spoke, and the views of participating federal, state and local officials do not necessarily reflect positions or policies of federal, state or local governments or their agencies.

Taka Ariga

Chief Data Scientist and Director of Innovation Lab

Government Accountability Office

Nikhil Deshpande

Chief Digital Officer

Georgia Technology Authority

Vincent Dorie, Ph.D.

Principal Data Scientist

Code for America

Laurel Eckhouse

Quantitative Criminal Justice Researcher

Code for America

Terrence Neumann

Ph.D. Candidate

University of Texas at Austin

Steve Nichols

Former Chief Technology Officer

Georgia Technology Authority

Teri Takai

Senior Vice President

Center for Digital Government


Header photo credit: Shutterstock

Project Team
Partnership for Public Service
Elizabeth Byers
Associate Manager, Research and Analysis
Cheyenne Galloway
Bob Cohen
Senior Writer and Editor
Mark Lerner
Senior Manager, Technology and Innovation
Loren DeJonge Schulman
Vice President, Research, Analysis and Modernizing Government
Imani Miller
Samantha Donaldson
Vice President, Communications
Nicky Santoso
Digital Design Associate


Microsoft
Darcy Bellerjeau
National Lead, Influencer and Association Strategy, Public Sector
Dustin Ryan
Director, Data & AI, U.S. Education
Michael Mattmiller
Senior Director, State Government Affairs