Historically, the common conception of what it means to be a leader has been built on characteristics typically associated with white men. Understanding women’s experiences in federal leadership roles, and the barriers and challenges they face, is critical to creating a federal workforce that reflects the diversity of the United States and is better equipped to serve people with different backgrounds and needs.
Collecting and analyzing this information will enable organizations that support the federal government and its institutions—and our government itself—to address the barriers that contribute to gender disparities in federal leadership roles and to build a more effective federal workplace.
This is the first brief in the Partnership for Public Service’s LeadHERship series, which explores these issues in greater depth. For more information about this series, please review our introductory brief.
In 2019, the Partnership for Public Service launched the Public Service Leadership Model to set a new standard for effective federal leadership. The model is designed around two core values—commitment to public good and stewardship of public trust—and four key competencies—becoming self-aware, engaging others, leading change and achieving results—that leaders must demonstrate to lead in the federal government.
To support the application of this model for federal leaders, the Partnership developed the Public Service Leadership 360 assessment tool. This 360 assessment draws on evaluations from multiple raters, from managers to peers to the employees themselves, to gauge how well a federal leader is demonstrating the Public Service Leadership Model’s values, competencies and subcompetencies. The tool also provides us with unique data on the federal leadership experience across a variety of social identities.
In this brief—our first exploration of this 360 data—we examine federal employees’ leadership experiences based on gender, specifically for men and women serving as leaders at all levels across government.
As we outlined in the introduction to this series of research briefs, women remain underrepresented in certain federal leadership positions. This analysis—backed by our unique 360 data—both sheds light on and raises new questions about why this gap persists, contributing to ongoing research on the experiences of women in the federal government. Based on rating scores given to the 1,123 federal employees who completed the Public Service Leadership 360 assessment tool from December 2020 to April 15, 2022, we uncovered several trends.
First, most employees surveyed, regardless of gender, exhibited the model’s key leadership competencies most or all the time, according to both their self-assessments and their ratings by others. These findings reinforce that federal leaders possess the skills and perspectives to make an impact in government.
We also found that women scored statistically significantly higher than men on all the model’s key competencies and core values. While both men and women tended to rate themselves lower on these values and competencies than others did, women’s self-assessments and ratings by others were more aligned than they were for men. Research has shown that this alignment—between self-ratings and ratings by others—can be a predictor of leadership success and career advancement.
Nonetheless, we also discovered that women rated themselves lower than men on a few key competencies and subcompetencies: leading change—which requires leaders to help others navigate organizational transformations and support creative solutions to organization-wide challenges—as well as innovation and creativity and embracing risk and uncertainty. These findings suggest that, despite their high ratings on many key federal leadership competencies, the women in our sample may lack experience or confidence in some areas that prioritize riskier and less certain aspects of organizational management and change. It is impossible to identify the precise reason for this perceived lack of confidence in our data. However, it is possible that long-standing structural or organizational barriers to women in leadership positions may result in these lower self-ratings.
Overall, our data demonstrates that women in federal leadership positions (at least those employees in our sample) are perceived by others as effective leaders with diverse skill sets. As such, the broader gender gaps in certain federal leadership positions are unlikely to be based on actual competency or skill, and more likely reflect long-standing barriers or gender stereotypes that restrict career advancement for women. These notions and obstacles may affect women’s own sense of leadership competency in some areas, perpetuating women’s underrepresentation in some government leadership roles.
In 2019, the Partnership for Public Service launched the Public Service Leadership Model to set a new standard for effective federal leadership, and to help leaders assess their performance and identify opportunities to enhance their skills. The model is built around two core values and four key competencies. Each competency comprises five subcompetencies that are specific to public service leadership.
To collect meaningful data and allow for a robust assessment of their skills, individual leaders are asked to submit a minimum of three names per rater category and encouraged to provide five to six names outside their individual managers. Our previous analyses demonstrate that the tool is valid and reliable from an analytic perspective, and an effective resource that supports federal employees on their leadership journey.
Now that the tool has been used by almost 2,000 federal employees enrolled in our leadership development programs, we identified an opportunity to explore patterns in the data to help us better understand public service leaders.
For this analysis, our first exploration of this 360 data, we compared rating scores for men and women in federal leadership positions. While examinations of gender differences in the federal government are not new, the depth and breadth of our 360 data make this brief unique.
Our findings will contribute to ongoing research on gender gaps in federal leadership, help us refine our leadership development programming at the Public Service Leadership Institute and enable us to better support public servants in their growth. They will also enable greater awareness of the potential barriers, or societal or organizational systems, that may contribute to the underrepresentation of women leaders in the federal government. Having a federal workforce that better reflects the gender diversity of the United States, while understanding and removing any barriers to increasing the representation of women leaders, is important for everyone, as it will enable our government to better serve and meet the public’s needs.
Multi-rater assessments such as our 360 tool—assessments that rate employees based on feedback from different categories of individuals—have been used by organizations since at least the 1950s and 1960s. Some of the first research supporting the benefits of having multiple raters provide input on multiple leadership traits emerged in 1967.2
Multi-rater assessments have been demonstrated to be an important tool in leadership development programs designed to help individuals grow and develop and become better supervisors of people and more effective leaders of organizations.3 Using multiple raters to assess employee performance and experiences provides unique and valuable perspectives4, and research has long demonstrated a connection between likelihood of promotion and consistency between how employees rate themselves and how others rate them.5 However, using multiple raters is not without challenges, including possible implicit biases6, but understanding overall rating scores, as well as scores by different rater type, is key to supporting and tracking leader development.
Overall, we reviewed a total of 15,130 sets of rating scores across the model’s two values, four key competencies and 20 subcompetencies. These scores were completed by managers, direct reports, peers, and friends or family for 1,123 federal employees who completed the Public Service Leadership 360 assessment tool from December 2020 to April 15, 2022. Out of the 15,130 ratings7, 7% were self-ratings and 93% were ratings completed by others.8
We only have demographic data for the federal leaders who received ratings, not for individuals who provided the ratings. This is an area of potential future research and modification of the 360 tool.
Figure 2. Overview of gender data.
Figure 3. Overview of leadership level data.
Figure 4. Overview of supervisor status.
The Public Service Leadership Model identifies four key competencies that federal leaders need to exhibit to best serve our country—becoming self-aware, engaging others, leading change and achieving results. Additionally, the model outlines two core values—stewardship of public trust and commitment to public good—that federal leaders should demonstrate to live up to the highest ideals of public service.
According to our analysis of the data from the 360 assessments, federal leaders' average scores on these two core values and four competencies ranged from 6.33 to 6.86. The scores were issued on a scale of 1 to 7, with 1 being the lowest and 7 being the highest.9
These leaders scored highest on the two core values of stewardship of public trust (6.85) and commitment to public good (6.84). The lowest average score was on the competency of leading change (6.33), which requires leaders to initiate, sponsor and implement innovative solutions in their organizations.
Table 1. Average scores on core competencies or core values from 1 to 7. 1-never, 2-rarely, 3-occasionally, 4-about half the time, 5-frequently, 6-usually/most of the time, 7-always. See Appendix A for the number of responses.
Each of the four competencies in the Public Service Leadership Model includes five subcompetencies that research identifies as critical for leaders to demonstrate.10 Several examples include emotional intelligence, evidence-based decision-making, equitably engaging a diverse workforce, understanding the importance of technology, and encouraging innovation and creativity.
For the 20 subcompetencies that we analyzed, average scores ranged from 6.16 for self-reflection to 6.78 for integrity. This range demonstrates that the leaders in our sample are generally perceived as exhibiting the 20 subcompetencies usually or most of the time both in their self-ratings and their ratings by others.
The high scores on integrity and diversity, equity and inclusion (DEI) are particularly important. At a time when just four in 10 Americans trust government to do what is right at least some of the time,11 agencies need honest leaders who are committed to their oath of office and earn the faith of the public. In addition, the Biden administration has prioritized advancing the federal government’s support for racial equity and underserved communities. Leaders’ understanding of DEI will be critical as agencies work toward achieving more inclusive and equitable impact of their programs and services.
The three highest and the three lowest subcompetencies are displayed in the table below:
Table 2. The three highest and three lowest subcompetency average scores. See Appendix A for number of responses.
Interestingly, two of the lowest scoring subcompetencies (emotional intelligence and self-reflection) and the highest scoring subcompetency (integrity) are all within the becoming self-aware competency. That two of the lowest scoring subcompetencies focus on skills needed to become self-aware may not be a surprise given the gaps, discussed below, between how employees rate themselves and how others rate them.
However, a low score on self-reflection may simply indicate that leaders lack the time and space to address these gaps and build the emotional intelligence needed to become self-aware. This is something that future research and leadership development trainings might explore to better meet the needs of and strengthen public service leaders.
The other lowest scoring subcompetency, embracing risk and uncertainty, is in the leading change competency, which was also the lowest scoring of the four key competencies. As workplaces continue to move toward hybrid models because of societal shifts due in large part to the COVID-19 pandemic, and efforts to build more inclusive, equitable and accessible workplaces, it will be important to track data trends for this competency to better support leaders who need to manage evolving work patterns and expectations. Furthermore, the increasing complexity of work and the ambitious missions of agencies require leaders to embrace risk and uncertainty and to lead through change, making this an important trend to focus on.
We wanted to know whether the average scores on the key values, competencies and subcompetencies differed for men and women based on both their self-ratings and ratings by others—managers, direct reports, peers, and family or friends. To get at this answer, we performed a t-test—a statistical test that analyzes differences across average scores—to determine if these differences are meaningful or due to random chance.
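As a rough illustration of the comparison described above, the sketch below computes a two-sample t statistic for two groups of ratings. It assumes Welch's unequal-variance form of the test, which this brief does not specify. The function name `welch_t` and the sample ratings are hypothetical and are not data from the 360 assessment.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test.

    Assumption: the unequal-variance (Welch) form is used for illustration only.
    """
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Sample variance of each group divided by its size
    # gives that group's squared standard error.
    sq_se_a = statistics.variance(sample_a) / len(sample_a)
    sq_se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(sq_se_a + sq_se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (sq_se_a + sq_se_b) ** 2 / (
        sq_se_a ** 2 / (len(sample_a) - 1) + sq_se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical ratings on the 1-to-7 scale (illustrative only).
women_scores = [7, 6, 7, 7, 6, 7, 6, 7]
men_scores = [6, 6, 5, 7, 6, 5, 6, 6]

t_stat, dof = welch_t(women_scores, men_scores)
```

The larger the absolute value of the t statistic, judged against the t distribution with the returned degrees of freedom, the less plausible it is that the difference in averages arose by chance.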
Overall, we found that:
Some of these gender differences may be related to role congruity theory, which we discuss in our introduction. That theory posits that women are often rated lower in these areas due to stereotypical perceptions that they have less technical or professional expertise than men.12 Another factor may be raters giving lower scores to women based on external or situational factors, such as major organizational transitions or disruptions (e.g., COVID, budget shortfalls, staff turnover), challenging teams, or other known structural barriers affecting that leader’s performance, rather than on specific attributes of a particular leader.5
We also examined how individuals perceived their own leadership skills differently than others perceived them. We explored this difference across both genders and then for men and women separately.
Overall, we found that both men and women rated themselves statistically significantly lower than others rated them for all key competencies and core values. Differences between self-rating scores and others’ rating scores are common with 360 tools and can help leadership coaches provide leaders with more accurate and targeted feedback on areas for improvement. These analyses also help leaders better understand where they may lack self-confidence and enable greater self-awareness in terms of the competencies and skills they bring to the workplace.
Table 3. Average scores on key competencies and core values for self-ratings and ratings completed by others across both men and women. See Appendix A for number of responses.
We found that women consistently rated themselves slightly higher than men rated themselves for both core values and three of the four key competencies, with differences being statistically significant for the commitment to public good value and the becoming self-aware competency.
Women rated themselves lower on leading change than men did. While this difference is not statistically significant, it may suggest that women have less experience than men in positions that require certain competencies related to leading organizational transformation—or that women are not as confident as men in these areas due to persistent stereotypical notions that identify men as innately possessing certain leadership qualities. Women may also be more self-critical in this category and rate themselves lower, accordingly. It may also be that the leading change competency itself is more complex and harder to grasp than the others—leading both men and women to have less confidence that they exhibit this skill.
Figure 5. Average scores on core values and key competencies by gender (man vs woman) for self and other ratings. See Appendix A for number of responses.
Consistent with this finding, we also found that women rated themselves slightly lower than men on innovation and creativity and embracing risk and uncertainty—two leading change subcompetencies—although this difference was not statistically significant.
Overall, we found that women rated themselves statistically significantly higher than men on emotional intelligence, continuous learning, empowering others, collaboration and customer experience. Most of these subcompetencies fall under the engaging others or becoming self-aware competencies, which require strong interpersonal capabilities and are often mistakenly seen as “softer” and easier-to-master skills despite the high level of expertise and aptitude they require. Taken in context of our broader gender differences findings, it is possible, though not certain, that these ratings highlight that some women have internalized broader societal stereotypes that they possess these “softer” skills more than men. It is also possible that our sample of federal leaders simply has women outscoring men in these key leadership competencies. Future research might explore the causes of these self-rating differences.
Nevertheless, the finding that women rated themselves higher than men on most competencies is unique to our research. The limited research on 360 assessments that includes women and men typically suggests that women self-rate lower than men—perhaps due to gaps in self-awareness or stereotypical notions of leadership characteristics.13 This brief contradicts these findings and highlights opportunities to better understand why women self-rate lower than men on certain subcompetencies and not others.
Figure 6. Average scores on subcompetencies for self- and other ratings based on gender (man/woman).
In most cases, women’s ratings of themselves aligned with how others—their managers, direct reports, peers, and family or friends—rated them.
We found that these other raters scored women statistically significantly higher across three of the key competencies and both core values. While not statistically significantly different, women also scored higher than men on the fourth key competency: engaging others.
We also found that other raters scored women higher than men on all the subcompetencies except for embracing risk and uncertainty, where men outscored women by just 0.01 points. These differences were statistically significant except for conflict management, embracing risk and uncertainty, adaptability and tech savviness.
Altogether, women scored themselves higher than men—and were scored higher by others—across almost all the core values and competencies.
The notable exception is the leading change competency. When rating themselves on this competency, women scored themselves lower (5.80) than men scored themselves (5.82) and lower than others rated them (6.40). This is a potentially significant difference. Researchers have suggested that a discrepancy between how individuals rate themselves and how they are rated by others may relate to self-confidence and be one potential reason women are underrepresented in leadership positions across sectors.14 The gap may also be due to internalized stereotypes of leadership, resulting in women feeling that they do not have those skills even when they do, or it could be due to self-imposed higher expectations. We will explore this topic further in a future research brief on self-efficacy.
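The arithmetic behind this comparison can be sketched as follows, using only the leading change averages reported in this brief. The function name `self_other_gap` is hypothetical, and treating the gap as a simple difference of averages is an assumption made for illustration, not a description of the analysis method.

```python
def self_other_gap(self_score: float, others_score: float) -> float:
    """Return others' average minus the self-rating average, rounded to two decimals."""
    return round(others_score - self_score, 2)


# Leading change averages as reported in this brief.
women_self, men_self, women_others = 5.80, 5.82, 6.40

# Positive value: others rated women higher than women rated themselves.
women_gap = self_other_gap(women_self, women_others)

# Negative value: women rated themselves lower than men rated themselves.
self_difference = round(women_self - men_self, 2)
```

On these figures, the gap between women's self-ratings and others' ratings of them (0.60 points) is much larger than the gap between women's and men's self-ratings (0.02 points).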
Overall leadership performance of federal leaders in our sample
The top two scoring competencies or values for all rating scores regardless of gender were stewardship of public trust and commitment to public good. These two values are considered core to the Public Service Leadership Model and are unique to the skills required of leaders in public service versus other industries. We can conclude that the federal leaders in our dataset are high achieving on these two core values. This finding is important because it suggests that the federal employees in our sample are upholding the constitutional oath they took when they entered federal service, from which these values are derived.15
The top two subcompetencies were integrity, and diversity, equity and inclusion. Having federal leaders score highly on these subcompetencies is important for achieving the Biden administration’s focus on building a federal workforce that reflects the diversity of the United States16 and its efforts to advance support for racial equity and underserved communities.17 Scoring highly on these two subcompetencies suggests the federal leaders in our dataset may have the knowledge, skills, or aptitude to lead with integrity and inclusively, both of which will be necessary to achieve those areas of focus for the Biden administration.
The lowest scoring key competency was leading change, and the three lowest rated subcompetencies were embracing risk and uncertainty, emotional intelligence, and self-reflection. Emotional intelligence and self-reflection are both within the becoming self-aware competency. While it is difficult to determine from our data if the lower scores in these areas are due to deficits, these lowest scoring competencies do offer opportunities to improve among the federal leaders included in our dataset. Furthermore, if federal employees are provided more resources, including time and structure, for self-reflection, they may strengthen their emotional intelligence and increase their confidence for leading change.18
For perspective, the range in average scores across all competencies was between 6 and 7 on the 7-point rating scale. Having an average this high suggests that federal leaders in our sample are perceived as usually or most of the time performing the specific leadership skills in our 360 tool, even for the lowest rated competencies.
However, the range in scores is larger for self-ratings than for ratings by others. The largest range of scores for self-ratings was in the key competencies of becoming self-aware and leading change. This finding suggests opportunities for improvement for federal leaders in our sample in either their skills or their self-awareness in these areas, or both.
Furthermore, having a large range of scores and discrepancy between self and others’ ratings may, in and of itself, suggest a lack of self-awareness.19
Gender differences in leadership scores
While most of the research on 360 ratings focuses on men, studies on women suggest that they tend to self-rate lower than men. Authors hypothesize that this gap exists because women leaders may be less self-aware than men or because external factors—such as systematic stereotypical bias of leadership characteristics favoring men—could result in women self-rating lower and not advancing in the workplace.13
Contrary to much of this research, our data shows that women rated themselves—and were rated by others—higher than men with only a few exceptions. Specifically, our analysis suggests that women federal leaders have less confidence in their ability to lead change than men. We found that women self-rated lower than men on two key leading change subcompetencies: innovation and creativity and embracing risk and uncertainty.
We also discovered that there was only one subcompetency where men outscored women: embracing risk and uncertainty. Overall, contrary to other historical research, we found that when rated by others, women federal leaders scored higher on all core values, key competencies, and subcompetencies except four where they scored about the same as men: embracing risk and uncertainty, adaptability, conflict management, and tech savviness. These findings suggest that women in federal leadership positions (at least those employees in our sample) are perceived as strong leaders who maintain a diverse skill set. They also may have opportunities to improve their score on the leading change competency or address structural barriers (including stereotypes) that may result in the lower scores in these categories for women.
In addition, men and women federal leaders in our sample rated themselves significantly lower than others rated them on all the core values and key competencies, a sign that the becoming self-aware competency is an area of growth not just for women leaders but for men too. However, women tend to rate themselves more similarly to their raters than men do, suggesting a more accurate assessment of their own skills in the workplace. This alignment between how employees rate themselves and how others rate them has implications for the effectiveness of leadership training and for whether leaders continue to develop their skills over time.20
Some research supports the relationship between this scoring alignment and a leader’s effectiveness, but it is complex. Overall effectiveness appears to be highest for leaders who rate themselves at the same level as other raters. In contrast, leadership effectiveness tends to be lowest for those who significantly overestimate their performance.21 Harsher self-ratings may be related to an individual setting high goals or standards and a commitment to continually improve, however, and the alignment of self-ratings and ratings by others may not always predict leader effectiveness.22 Furthermore, it is possible these discrepancies in scores will reduce with time, driven by motivation to reduce the gap in self-perception and perceptions of others to reduce feelings of cognitive dissonance23, or by changing their self-ratings and changing their behavior over time.24
Considering that persistent gaps in representation by women exist in our federal leadership, our data is unique in providing some trends for understanding how federal leaders are perceived by themselves and by others on key leadership skills. Given that women scored higher than men across most competencies, there is clearly work to be done to remove barriers to entry for these highly qualified women into the elite leadership ranks, as well as more research to be done to better understand these trends.
The trends we uncovered present multiple areas of future exploration, including:
Depending upon the results of these future explorations, we plan to investigate two key interventions for supporting leadership growth and development, and improvement on 360 scores: coaching and leadership trainings. We propose ongoing evaluation and research to better understand the key drivers of leadership skill development and the alignment of self-ratings versus others’ ratings over time, specifically when it comes to federal leaders.
These future analyses will continue to advance our understanding of the barriers that may contribute to the ongoing gaps in representation of women leaders in the federal government. As previously mentioned, understanding and removing these barriers is important for everyone, as it is one way to help government better reflect the diversity of the country and meet the needs of the people.
Emily Kalnicky oversees and advances efforts at the Partnership to understand and improve overall program effectiveness and mission achievement through monitoring and evaluation data. Her appreciation of the importance of using data to support decisions and improve effectiveness has spanned her career in nonprofits, academia, and government agencies. Kalnicky holds a Ph.D. in Ecology and has led behavioral research and evaluation studies across the globe. Her favorite public servants are EPA scientists and staff committed to using data and an environmental justice lens to serve the mission to protect human health and the environment for all.
Loren DeJonge Schulman, Vice President, Research, Evaluation and Modernizing Government
Nadzeya Shutava, Research Manager
Andrew Marshall, Vice President, Leadership Development
Christina Schiavone, Director for Executive and Team Coaching
Barry Goldberg, Senior Writer and Editor
Samantha Donaldson, Vice President, Communications
Max Stier, President and CEO
Range: 5.68 to 6.66
Range: 5.68 to 6.43
Range: 5.65 to 6.05
Range: 5.97 to 6.23
Range: 6.16 to 6.81
Range: 6.25 to 6.69
Range: 6.24 to 6.48
Range: 6.49 to 6.62