2021 Federal Standard of Excellence
Data
Did the agency collect, analyze, share, and use high-quality administrative and survey data - consistent with strong privacy protections - to improve (or help other entities improve) outcomes, cost-effectiveness, and/or the performance of federal, state, local, and other service providers’ programs in FY21? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; open data policies)
Score
7
Millennium Challenge Corporation
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- In FY21, MCC is continuing to develop a strategic data plan. As detailed on MCC’s Digital Strategy and Open Government pages, MCC promotes transparency to provide people with access to information that facilitates their understanding of MCC’s model, MCC’s decision-making processes, and the results of MCC’s investments. Transparency, and therefore open data, is a core principle for MCC because it is the basis for accountability, provides strong checks against corruption, builds public confidence, and supports informed participation of citizens.
- As a testament to MCC’s commitment to and implementation of transparency and open data, the agency was again the highest-ranked U.S. government agency in the 2020 Publish What You Fund Aid Transparency Index, for the sixth consecutive Index. In addition, the U.S. government is part of the Open Government Partnership, is a signatory to the International Aid Transparency Initiative, and must adhere to the Foreign Aid Transparency and Accountability Act. All of these initiatives require foreign assistance agencies to make data easier to access, use, and understand, and they have created further impetus for MCC’s work in this area by establishing specific goals and timelines for adoption of transparent business processes.
- Additionally, MCC convenes an internal Data Governance Board, an independent group consisting of representatives from departments throughout the agency, to streamline MCC’s approach to data management and advance data-driven decision-making across its investment portfolio.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- MCC makes extensive program data, including financials and results data, publicly available through its Open Data Catalog, which includes an “enterprise data inventory” of all data resources across the agency for release of data in open, machine-readable formats. The Department of Policy and Evaluation leads the MCC Disclosure Review Board process for publicly releasing the de-identified microdata that underlies the independent evaluations on the MCC Evidence Platform, following MCC’s Microdata Management Guidelines to ensure appropriate balance in transparency efforts with protection of human subjects’ confidentiality.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- MCC’s new Evidence Platform offers a first-of-its-kind virtual data enclave for users to access and use public- and restricted-use data. The Platform encourages research, learning, and reproducibility and connects datasets to analytical products across the portfolio. In addition to the Evidence Platform, which links and provides access to all of MCC’s microdata from evaluation packages, MCC’s Data Analytics Program (DAP) enables enterprise data-driven decision-making through the capture, storage, analysis, publishing, and governance of MCC’s core programmatic data. The DAP streamlines the agency’s data lifecycle, facilitating increased efficiency. Additionally, the program promotes agency-wide coordination, learning, and transparency. For example, MCC has developed custom software applications to capture program data, established the infrastructure for consolidated storage and analysis, and connected robust data sources to end-user tools that power up-to-date, dynamic reporting and also streamline content maintenance on MCC’s public website. As a part of this effort, the Monitoring and Evaluation team has developed an Evaluation Pipeline application that provides up-to-date information on the status, risk, cost, and milestones of the full evaluation portfolio for better performance management.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- MCC’s Disclosure Review Board ensures that data collected from surveys and other research activities is made public according to relevant laws and ethical standards that protect research participants, while recognizing the potential value of the data to the public. The board is responsible for: reviewing and approving procedures for the release of data products to the public; reviewing and approving data files for disclosure; ensuring de-identification procedures adhere to legal and ethical standards for the protection of research participants; and initiating and coordinating any necessary research related to disclosure risk potential in individual, household, and enterprise-level survey microdata on MCC’s beneficiaries.
- The Microdata Evaluation Guidelines inform MCC staff and contractors, as well as other partners, on how to store, manage, and disseminate evaluation-related microdata. This microdata is distinct from other data MCC disseminates because it typically includes personally identifiable information and sensitive data as required for the independent evaluations. With this in mind, MCC’s Guidelines govern how to manage three competing objectives: share data for verification and replication of the independent evaluations, share data to maximize usability and learning, and protect the privacy and confidentiality of evaluation participants. These Guidelines were established in 2013 and updated in January 2017. Following these Guidelines, MCC has publicly released 117 de-identified, public use, microdata files for its evaluations and evidence studies. MCC also has 25 Disclosure Review Board-cleared, restricted data packages that it can make accessible on the new MCC Evidence Platform. MCC’s experience with developing and implementing this rigorous process for data management and dissemination while protecting human subjects throughout the evaluation life cycle is detailed in Opening Up Evaluation Microdata: Balancing Risks and Benefits of Research Transparency. MCC is committed to ensuring transparent, reproducible, and ethical data and documentation and seeks to further encourage data use through its new MCC Evidence Platform.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- Both MCC and its partner in-country teams produce and provide data that is continuously updated and accessed. MCC’s website is routinely updated with the most recent information, and in-country teams are required to do the same on their respective websites. As such, all MCC program data is publicly available on MCC’s website and individual MCA websites for use by MCC country partners, in addition to other stakeholder groups. As a part of each country’s program, MCC provides resources to ensure data and evidence are continually collected, captured, and accessed. In addition, each project’s evaluation has an Evaluation Brief that distills key learning from MCC-commissioned independent evaluations. Select Evaluation Briefs have been posted in local languages, including Mongolian, Georgian, French, and Romanian, to better facilitate use by country partners.
- MCC also has a partnership with the President’s Emergency Plan for AIDS Relief (PEPFAR), referred to as the Data Collaboratives for Local Impact (DCLI). This partnership is improving the use of data analysis for decision-making within PEPFAR and MCC partner countries by working toward evidence-based programs to address challenges in HIV/AIDS and health, empowerment of women and youth, and sustainable economic growth. Data-driven priority setting and insights gathered by citizen-generated data and community mapping initiatives contribute to improved allocation of resources in target communities to address local priorities, such as job creation, access to services, and reduced gender-based violence. DCLI’s impact is being extended through a new partnership in Côte d’Ivoire. MCC, Microsoft, and others are partnering to develop a Women’s Data Lab and Network program. The program will empower women-owned or women-led small and medium enterprises and female innovators and entrepreneurs with digital and data skills to effectively participate in the digital economy and grow their businesses.
Score
8
U.S. Department of Education
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- The ED Data Strategy, the first of its kind for the U.S. Department of Education, was released in December 2020. It recognized that ED can, should, and will do more to improve student outcomes through more strategic use of data. The ED Data Strategy goals are highly interdependent with cross-cutting objectives requiring a highly collaborative effort across ED’s principal offices. The Strategy calls for strengthening data governance to administer the data ED uses for operations, answer important questions, and meet legal requirements. To accelerate evidence-building and enhance operational performance, it requires that ED make its data more interoperable and accessible for tasks ranging from routine reporting to advanced analytics. The high volume and evolving nature of ED’s data tasks necessitate a focus on developing a workforce with skills commensurate with a modern data culture in a digital age. At the same time, safely and securely providing access for researchers and policymakers helps foster innovation and evidence-based decision making at the federal, state, and local levels.
- Goal 4 of the Data Strategy calls for ED to “Improve Data Access, Transparency, and Privacy.” Objective 1.4 under this goal is to “Develop and implement an Open Data Plan that describes the Department’s efforts to make its data open to the public.” Improving access to ED data, while maintaining quality and confidentiality, is key to expanding the agency’s ability to generate evidence to inform policy and program decisions. Increasing access to data for ED staff, federal, state, and local lawmakers, and researchers can help ED make new connections and foster evidence-based decision making. Increasing access can also spur innovations that support ED’s stakeholders, provide transparency about ED’s activities, and serve the public good. ED seeks to improve user access by ensuring that open data assets are in a machine-readable, open format and accessible via its comprehensive data inventory. ED will better leverage expertise in the field to expand its base of evidence by establishing a process for researchers to access non-public data. Further, ED will develop a cohesive and consistent approach to privacy and enhance information collection processes to ensure that Department data are findable, accessible, interoperable, and reusable.
- ED continues to wait for Phase 2 guidance from OMB to understand required parameters for the open data plan. In the meantime, ED continues to release open data; the Department soft-launched the Open Data Platform in September 2020 and publicly released it in December 2020.
- ED launched a public transparency portal in November 2020 disclosing expenditures and grantee performance data for the Education Stabilization Fund authorized under the CARES Act and subsequent authorities.
- ED’s FY18-22 Performance Plan outlines strategic goals and objectives, including Goal #3: “Strengthen the quality, accessibility and use of education data through better management, increased privacy protections and transparency.” This currently serves as a strategic plan for ED’s governance, protection, and use of data while it develops the Open Data Plan required by the Evidence Act. The plan includes a metric on the number of data assets that are “open by default” as well as a metric on open licensing requirements for deliverables created with Department grant funds.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- The Open Data Platform (ODP) at data.ed.gov is ED’s solution for publishing, finding, and accessing its public data profiles. This open data catalog brings together the Department’s data assets in a single location, making them available with their metadata, documentation, and APIs for use by the public. The ODP makes existing public data from all ED principal offices accessible to the public, researchers, and ED staff in one location. The ODP improves the Department’s ability to grow and operationalize its comprehensive data inventory while progressing on open data requirements. The Evidence Act requires government agencies to make data assets open and machine-readable by default. ODP is ED’s comprehensive data inventory satisfying these requirements while also providing privacy and security. ODP features standard metadata contained in Data Profiles for each data asset. Before new assets are added, data stewards conduct quality review checks on the metadata to ensure accuracy and consistency. As the platform matures and expands, ED staff and the public will find it a powerful tool for accessing and analyzing ED data, either through the platform directly or through other tools powered by its API. The 309 data profiles included in ODP, encompassing over 3,500 individual data sets, will add to the 619 entries the Department already has in the Federal Data Catalogue once ODP takes over the data inventory feed in Q1 of FY22.
- The ED Data Inventory (EDI) was developed in response to the requirements of M-13-13 and initially served as ED’s external asset inventory. It describes data reported to ED as part of grant activities, along with administrative and statistical data assembled and maintained by ED. It includes descriptive information about each data collection along with information on the specific data elements in individual data collections.
- Information about Department data collected by the National Center for Education Statistics (NCES) has historically been made publicly available online. Prioritized data is further documented or featured on ED’s data page. NCES is also leading a government-wide effort to automatically populate metadata from Information Collection Request packages to data inventories. This may facilitate the process of populating the EDI and the comprehensive data inventory.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- As ED collaboratively took stock of organizational data strengths and weaknesses, key themes arose and provided context for the development of the ED Data Strategy. The Strategy addresses new and emerging mandates such as open data by default, interagency data sharing, data standardization and other principles found in the Evidence Act and Federal Data Strategy. However, improving strategic data management has benefits far beyond compliance; solving persistent data challenges and making progress against a baseline data maturity assessment offers ED the opportunity to close capability gaps and enable staff to make evidence-based decisions.
- One of the first priorities for the ED Data Governance Board (DGB) in FY21 was to assess the current state of data maturity at ED. In early 2020, ED’s Office of the Chief Data Officer (OCDO) held “discovery” meetings with stakeholders from each ED office to capture information about successes and challenges in the current data landscape. This activity yielded over 300 data challenges and 200 data successes that provided a wealth of information to inform future data governance priorities. The DGB used the understanding of the ED data landscape gained during the discovery phase to develop a Data Maturity Assessment (DMA) for each office and the overall enterprise, focusing on data and related data infrastructure in line with requirements in the Federal Data Strategy 2020 Action Plan. Data maturity is a metric that will be measured and reported as part of ED’s Annual Performance Plan. Several of these activities have been supported by ED’s investment in a Data Governance Board and Data Governance Infrastructure (DGBDGI) contract.
- The Education Stabilization Fund (ESF) Transparency Portal at covid-relief-data.ed.gov collects and reports survey data from grantees receiving funds for emergency relief from the COVID-19 pandemic and connects it with administrative data from usaspending.gov, College Scorecard, IPEDS, Common Core of Data, and ED’s G5 grants administration system. In February 2021, ED’s OCDO completed its first collection of annual performance reports (APRs) from state agencies and institutions of higher education that received CARES Act grants. OCDO created a data collection doorway in the portal to enable grantees to submit APRs on funding to institutions of higher education, State Education Agencies, and Governor’s Offices. Working in partnership with the Office of Postsecondary Education (OPE) and the Office of Elementary and Secondary Education (OESE), OCDO was able to achieve a 99.7% response rate from almost 4,900 higher education grantees and 100% response rate from State Education Agency grantees and Governor’s Offices grantees, ensuring comprehensive data on the use of funds to support student learning during the pandemic. An in-depth data quality review resulted in new technical assistance to grantees to improve reporting. The portal was updated in June 2021 to include APRs and data quality flags. OCDO and ED program offices continue to work with grantees to resolve data quality issues. ED also created internal dashboards customized to OPE, OESE, and policy leader needs for monitoring grant performance and outcomes.
- ED has also made concerted efforts to improve the availability and use of its data with the release of the revised College Scorecard that links data from NCES, the Office of Federal Student Aid, and the Internal Revenue Service. With recent updates in Q1 of FY21, the College Scorecard team improved the functionality of the tool to allow users to find, compare, and contrast different fields of study more easily, access expanded data on the typical earnings of graduates two years post-graduation, view median parent PLUS loan debt at specific institutions, and learn about the typical amount of federal loan debt for students who transfer. OCDO facilitated reconsideration of IRS risk assumptions to enhance data coverage and utility while still protecting privacy. The Scorecard enhancement discloses for prospective students how well borrowers from institutions are meeting their federal student loan repayment obligations, as well as how borrower cohorts are faring at certain intervals in the repayment process.
- IES continues to make available all data collected as part of its administrative data collections, sample surveys, and evaluation work. Its support of the Common Education Data Standards (CEDS) Initiative has helped to develop a common vocabulary, data model, and tool set for P-20 education data. The CEDS Open Source Community is active, providing a way for users to contribute to the standards development process.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- IES is collaborating with an outside research team to conduct a proof of concept for secure multi-party computation (MPC). The Department’s general approach is to use the MPC framework to replicate an existing data collection that involves securely sharing PII across a number of partners.
- The Disclosure Review Board (DRB), the EDFacts Governing Board, the Student Privacy Policy Office (SPPO), and SPPO’s Privacy Technical Assistance Center and Privacy Safeguards Team all help to ensure the quality and privacy of education data. In FY19, the ED Data Strategy Team also published a user resource guide for staff on disclosure avoidance considerations throughout the data lifecycle.
- In FY20, the ED DRB approved 59 releases by issuing “Safe to Release” memos. The DRB is in the process of developing a revised Charter that outlines its authority, scope, membership, process for dispute resolution, and how it will work with other DRBs in ED. The DRB is also developing standard operating procedures outlining the types of releases that need to be reviewed along with the submission and review process for data releases. The DRB is currently planning to develop information sessions to build the capacity of ED staff focusing on such topics as disclosure avoidance techniques used at ED, techniques appropriate for administrative and survey data, and how to communicate with stakeholders about privacy and disclosure avoidance.
- In ED’s FY18-22 Performance Plan, Strategic Objective 3.2 is to “Improve privacy protections for, and transparency of, education data both at ED and in the education community.” The plan also outlines actions taken in FY18. ED’s Student Privacy website assists stakeholders in protecting student privacy by providing official guidance on FERPA, technical best practices, and the answers to Frequently Asked Questions.
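The multi-party computation proof of concept mentioned above can be illustrated with a minimal sketch of additive secret sharing, one common building block of MPC. This is not the protocol used by IES or its research partners, which the source does not describe; the function names `make_shares` and `secure_sum`, the modulus, and the sample values are all hypothetical. Each party splits its private value into random shares, so the total can be computed without any party seeing another's raw value.

```python
import secrets
from typing import List

# Illustrative modulus (an assumption); all arithmetic is done modulo this value.
MODULUS = 2**61 - 1

def make_shares(value: int, n_parties: int) -> List[int]:
    """Split a non-negative value into n additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_values: List[int]) -> int:
    """Simulate parties computing a total without revealing individual values."""
    n = len(private_values)
    # Each party i splits its value and sends share j to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Party j adds the shares it received; only these partial sums are published.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS

if __name__ == "__main__":
    # Hypothetical counts held by three partners.
    print(secure_sum([120, 340, 95]))
```

Any single share is a uniformly random number, so it reveals nothing about the underlying value on its own; only the combined partial sums recover the total.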
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- ED’s new Open Data Platform makes Department data easily accessible to the public. Data is machine-readable and searchable by keyword in order to promote easy access to relevant data assets. In addition, the ODP features an API so that aggregators and developers can leverage Department data to provide information and tools for families, policy makers, researchers, developers, advocates and other stakeholders. ODP will ultimately include listings of non-public, restricted data with links to information on the privacy-protective process for requesting restricted-use access to these data.
- ED’s Privacy Technical Assistance Center (PTAC) responds to technical assistance inquiries on student privacy issues and provides online FERPA training to state and school district officials. FSA conducted a postsecondary institution breach response assessment to determine the extent of a potential breach and provide the institutions with remediation actions around their protection of FSA data and best practices associated with cybersecurity.
- The Institute of Education Sciences (IES) administers a restricted-use data licensing program to make detailed data available to researchers when needed for in-depth analysis and modeling. NCES loans restricted-use data only to qualified organizations in the United States. Individual researchers must apply through an organization (e.g., a university, a research institution, or company). To qualify, an organization must provide a justification for access to the restricted-use data, submit the required legal documents, agree to keep the data safe from unauthorized disclosures at all times, and participate fully in unannounced, unscheduled inspections of the researcher’s office to ensure compliance with the terms of the License and the Security Plan form.
- The National Center for Education Statistics (NCES) provides free online training on using its data tools to analyze data while protecting privacy. Distance Learning Dataset Training includes modules on NCES’s data-protective analysis tools, including QuickStats, PowerStats, and TrendStats. A full list of NCES data tools is available on its website.
Score
8
U.S. Agency for International Development
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- USAID’s data-related investments and efforts are guided by its Information Technology Strategic Plan. This includes support for the Agency’s Development Data Policy, which provides a framework for systematically collecting Agency-funded data, structuring the data to ensure usability, and making the data public while ensuring rigorous protections for privacy and security. In addition, this policy sets requirements for how USAID data is documented, submitted, and updated. Guidance for USAID’s Open Data Policy may be seen in the User Guide, FAQs, and Help Videos.
- In 2020, USAID revised the Development Data Policy to require development activities to create and submit data management plans before collecting or acquiring data. The Development Data Library (DDL) is the Agency’s repository of USAID-funded, machine-readable data created or collected by the Agency and its implementing partners. The DDL, as a repository of structured and quantitative data, complements the Development Experience Clearinghouse (DEC), which publishes qualitative reports and information. The Agency’s data governance body, the DATA Board, is guided by annual data roadmaps that include concrete milestones, metrics, and objectives for Agency data programs. USAID also participates in and leads global compilations of data across the industry, including the Global Innovation Exchange and compilations in response to COVID-19. USAID also has a variety of stakeholder engagement tools available on the DDL, including Open Data Community Questions and video tutorials on using the DDL.
- People-level indicators for development data are normally disaggregated by sex (male, female), sometimes by age and occasionally by other demographic markers. Development data rarely includes transgender, gender non-conforming, or non-binary disaggregation. In many countries it may be politically complicated or potentially unsafe to collect this data or data that asks about racial or ethnic identity. However, data can often be disaggregated by geographic location, region, or state, which can be mapped with other demographic data to build a picture of geographic disparities. Country expertise can then be applied to analyze racial and ethnic equity dimensions, as described in ADS 205.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- Launched in November 2018 as part of the Development Information Solution (DIS), USAID’s public-facing Development Data Library (DDL) provides a comprehensive inventory of data assets available to the Agency. DDL has posted the Data Inventory as a JSON file since 2015. Following the passage of the Foundations for Evidence-Based Policymaking Act, and in preparation for the specific requirements expected in the upcoming Phase 2 guidance for the Act, USAID will make any necessary changes to its Comprehensive Data Inventory and continue reporting with quarterly updates as required. The DDL’s data catalog is also harvested via JavaScript on an ongoing basis for further distribution on the federal Data.gov website. Currently 456 USAID data assets are available to the public via USAID’s DDL, a 17% increase over last year.
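As a rough illustration of what a machine-readable inventory file of this kind makes possible, the sketch below tallies inventory entries by access level. It assumes the file follows the Project Open Data metadata layout (a top-level `dataset` list whose entries carry `title` and `accessLevel`); the sample entries and the function name `summarize_inventory` are hypothetical and are not drawn from USAID's actual inventory.

```python
import json
from collections import Counter

# Hypothetical sample in the Project Open Data style; not real USAID data.
SAMPLE_INVENTORY = """
{
  "dataset": [
    {"title": "Household survey (example)", "accessLevel": "public"},
    {"title": "Program monitoring (example)", "accessLevel": "public"},
    {"title": "Evaluation microdata (example)", "accessLevel": "restricted public"}
  ]
}
"""

def summarize_inventory(raw_json: str) -> Counter:
    """Count inventory entries by their declared access level."""
    catalog = json.loads(raw_json)
    return Counter(
        entry.get("accessLevel", "unspecified")
        for entry in catalog.get("dataset", [])
    )

if __name__ == "__main__":
    for level, count in summarize_inventory(SAMPLE_INVENTORY).items():
        print(f"{level}: {count}")
```

The same function could be pointed at a downloaded inventory file by reading its contents into a string first; entries that omit the field are grouped under `unspecified` rather than raising an error.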
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- The USAID Data Services team, located in the Management Bureau’s Office of the Chief Information Officer (M/CIO), manages a comprehensive portfolio of data services in support of the Agency’s mission. This includes enhancing the internal and external availability and ease of use of USAID data and information via technology platforms such as the AidScape platform, broadening global awareness of USAID’s data and information services, and bolstering the Agency’s capacity to use data and information via training and the provision of demand-driven analytical services.
- The Data Services Team also manages and develops the Agency’s digital repositories, including the Development Data Library (DDL), the Agency’s central data repository. USAID and external users can search for and access datasets from completed evaluations and program monitoring by country and sector.
- USAID staff also have access to an internal database of over 100 standard foreign assistance program performance indicators and associated baseline, target, and actual data reported globally each year. This database and reporting process, known as the Performance Plan and Report (PPR), promotes evidence building and informs internal learning and decisions related to policy, strategy, budgets, and programs.
- The United States is a signatory to the International Aid Transparency Initiative (IATI). The standard links an activity’s financial data to its evaluations. Partner country governments as well as other initiatives and websites can pull these data into their respective systems. This helps officials oversee the coordination and management of incoming foreign aid, and serves as an effective tool in standardizing and centralizing information about foreign aid flows within a country. These data can be ingested to reduce and streamline USAID’s own reporting efforts, freeing up resources for other endeavors. Further, by streamlining reporting to these partner country systems and other websites, USAID is promoting efficiency in data collection, improving the quality of data, reducing the time needed to publish updated information, and providing timely information to inform analysis, future decisions, and policy-making. USAID continues to improve and add to its published IATI data, and is looking into ways to use these data as a best practice, including using them to populate partner country systems, fulfill transparency reporting as part of the US commitment to the Grand Bargain, and make decisions internally, including decisions based on what other development actors are doing, as shown by the Development Cooperation Landscape tool. In FY20, USAID began reporting additional data to IATI in alignment with IATI’s COVID-19 reporting guidance in order to share financial and descriptive information about USAID’s COVID-19 activities.
- USAID continues to improve how it communicates data insights. USAID’s Geocenter uses programmatic and demographic data linked with geospatial data to inform decision-making, emphasizing mapping to identify gaps in service provision and guide resource allocation (for example, to compare gender-based violence (GBV) “hotspots” with access to relevant support services, and to identify geographies and communities disparately impacted by natural disasters).
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- USAID’s Privacy Program and privacy policy (ADS 508) direct policies and practices for protecting personally identifiable information and data, while several policy references (ADS303maz and ADS302mbj) provide guidance for protecting information to ensure the health and safety of implementing partners. USAID’s Development Data Policy (ADS Chapter 579) details a data publication process that provides governance for data access and data release in ways that ensure protections for personal and confidential information. As a reference to the Development Data Policy, ADS579maa explains USAID’s foreign assistance data publications and the protection of any sensitive information prior to release. USAID applies extensive statistical disclosure control on all public data before publication or inclusion in the DDL.
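Statistical disclosure control covers many techniques, and the source does not say which ones USAID applies. As one hedged illustration, the sketch below implements a basic k-anonymity check: it flags records whose combination of quasi-identifiers is shared by fewer than `k` records, since those are the rows most at risk of re-identification. The function name `risky_records`, the field names, and the sample rows are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical de-identified survey rows; not real data.
SAMPLE_RECORDS = [
    {"region": "North", "age_band": "20-29", "sex": "F"},
    {"region": "North", "age_band": "20-29", "sex": "F"},
    {"region": "North", "age_band": "20-29", "sex": "F"},
    {"region": "South", "age_band": "60-69", "sex": "M"},
]

def risky_records(
    records: List[Dict[str, str]],
    quasi_identifiers: Sequence[str],
    k: int = 3,
) -> List[Dict[str, str]]:
    """Return records whose quasi-identifier combination appears fewer than k times."""
    def key(record: Dict[str, str]) -> tuple:
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < k]

if __name__ == "__main__":
    flagged = risky_records(SAMPLE_RECORDS, ["region", "age_band", "sex"], k=3)
    for row in flagged:
        print("Needs suppression or coarsening:", row)
```

Flagged rows would typically be suppressed or have their values coarsened (for example, wider age bands or larger regions) before release; the right threshold and remedy depend on the dataset and applicable policy.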
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- While specific data on this is limited, USAID does invest in contracts or grants that provide support to build local organizational or governmental capacity in data collection, analysis, and use. In addition, to date, 526 USAID data assets are available to the public via USAID’s DDL, a 44% increase over last year. These assets include microdata related to USAID’s initiatives that provide partner countries and development partners with insight into emerging trends and opportunities for expanding peace and democracy, reducing food insecurity, and strengthening the capacity to deliver quality educational opportunities for children and youth around the globe. Grantees are encouraged to use the data on the DDL, which provides an extensive User Guide to aid in accessing, using, securing and protecting data. The Data Services team conducts communication and outreach to expand the awareness of websites with development data, how to access it, and how to contact the team for support. In addition, the Data Services team has developed a series of videos to show users how to access the data available. The [email protected] mail account responds to requests for assistance and guidance on a range of data services from both within the Agency and from implementing partners and the public.
- Starting in 2020, Data Services wove themes of equity and accessibility throughout its Data Literacy Training series, including: definitions of equitable data and accessible data; guiding principles for collecting data in ways that are equitable and inclusive; and guiding questions to determine whether shared/visualized data is equitable and accessible. These learning opportunities are designed for both internal and external audiences and will be available on public-facing web pages in late 2021 or early 2022.
Score
6
6
Administration for Children and Families (HHS)
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- ACF’s Interoperability Action Plan was established in 2017 to formalize ACF’s vision for effective and efficient data sharing. Under this plan ACF and its program offices will develop and implement a Data Sharing First (DSF) strategy that starts with the assumption that data sharing is in the public interest. The plan states that ACF will encourage and promote data sharing broadly, constrained only when required by law or when there are strong countervailing considerations.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- In 2020, ACF released a Compendium of ACF Administrative and Survey Data Resources. The Compendium documents administrative and survey data collected by ACF that could be used for evidence-building purposes. It includes summaries of twelve major ACF administrative data sources and seven surveys. Each summary includes an overview, basic content, available documentation, available data sets, restrictions on use, capacity to link to other data sources, and examples of prior research. It is a joint product of the Office of Planning, Research, and Evaluation (OPRE) in ACF, and the office of the Assistant Secretary for Planning and Evaluation (ASPE), U.S. Department of Health and Human Services.
- In addition, in 2019 OPRE compiled the descriptions and locations of hundreds of OPRE-archived datasets that are currently available for secondary analysis and made this information available on a single webpage. OPRE continues to regularly update this website with current archiving information. OPRE regularly archives research and evaluation data for secondary analysis, consistent with the ACF evaluation policy, which promotes rigor, relevance, transparency, independence, and ethics in the conduct of evaluation and research. This new consolidated web page serves as a one-stop resource that will help to make it easier for potential users to find and use the data that OPRE archives for secondary analysis.
- In 2020 ACF launched the ACF Data Governance Consulting and Support project, which is providing information gathering, analysis, consultation, and technical support to ACF and its partners to strengthen data governance practices within ACF offices, and between ACF and its partners at the federal, state, local, and Tribal levels. Initial work will focus particularly on data asset tracking and metadata management, among other topics.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- ACF has multiple efforts underway to promote and support the use of documented data for research and improvement, including making numerous administrative and survey datasets publicly available for secondary use and actively promoting the archiving of research and evaluation data for secondary use. These data are machine readable, downloadable, and de-identified as appropriate for each data set. For example, individual-level data for research is held in secure restricted use formats, while public-use data sets are made available online. To make it easier to find these resources, ACF released a Compendium of ACF Administrative and Survey Data and consolidated information on archived research and evaluation data on the OPRE website.
- Many data sources that may be useful for data linkage for building evidence on human services programs reside outside of ACF. In 2020, OPRE released the Compendium of Administrative Data Sources for Self-Sufficiency Research, describing promising administrative data sources that may be linked to evaluation data in order to assess long-term outcomes of economic and social interventions. It includes national, federal, and state sources covering a range of topical areas. It was produced under contract by MDRC as a part of OPRE’s Assessing Options to Evaluate Long-Term Outcomes (LTO) Using Administrative Data project.
- Additionally, ACF is actively exploring how enhancing and scaling innovative data linkage practices can improve our understanding of the populations served by ACF and build evidence on human services programs more broadly. For instance, the Child Maltreatment Incidence Data Linkages (CMI Data Linkages) project is examining the feasibility of leveraging administrative data linkages to better understand child maltreatment incidence and related risk and protective factors. Also, in August 2021, OPRE published a brief presenting findings from the 2019 TANF Data Innovation Needs Assessment. This survey of state TANF agencies was designed to understand state strengths and challenges in linking and analyzing administrative data for program improvement. Findings from the Needs Assessment informed technical assistance provided to states through ACF’s TANF Data Collaborative. Information from the brief may be helpful to states, policymakers, and other funders in helping to support states in linking data for the purpose of evidence building.
- ACF actively promotes archiving of research and evaluation data for secondary use. OPRE research contracts include a standard clause requiring contractors to make data and analyses supported through federal funds available to other researchers and to establish procedures and parameters for all aspects of data and information collection necessary to support archiving information and data collected under the contract. Many datasets from past ACF projects are stored in archives including the ACF-funded National Data Archive on Child Abuse and Neglect (NDACAN), the ICPSR Child and Family Data Archive, and the ICPSR data archive more broadly. OPRE has funded grants for secondary analysis of ACF/OPRE data; examples in recent years include secondary analysis of strengthening families datasets and early care and education datasets. In 2019 ACF awarded Career Pathways Secondary Data Analysis Grants to stimulate and fund secondary analysis of data collected through the Pathways for Advancing Careers and Education (PACE) Study, Health Professions Opportunity Grants (HPOG) Impact Study, and HPOG National Implementation Evaluation (NIE) on questions relevant to career pathways programs’ goals and objectives. Information on all archived datasets that are currently available for secondary analysis is available on OPRE’s website.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- ACF receives privacy and security guidance from both the ACF and HHS Office of the Chief Information Officer (OCIO). Between these two offices, there are several policies and practices in place to assure all ACF data are protected and all incidents are handled appropriately. The requirements are supported by auditing mechanisms and a privacy and security training program.
- In 2014, ACF developed a Confidentiality Toolkit that explains the rules governing confidentiality of ACF data connected to many programs, provides examples of how confidentiality requirements can be addressed, and includes sample memoranda of understanding and data sharing agreements. In 2020, ACF launched the ACF Privacy and Confidentiality Analysis and Support project. This project is currently updating the Toolkit to reflect recent changes in statute and to provide real-world examples of how data has been shared across domains, which frequently do not have harmonized privacy requirements, while complying with all relevant privacy and confidentiality requirements (e.g., FERPA, HIPAA). These case studies will also include downloadable, real-world tools that have been successfully used in the highlighted jurisdictions. In addition, the project is exploring creating and maintaining a compendium of existing privacy and confidentiality laws for use by ACF staff.
- ACF also takes appropriate measures to safeguard the privacy and confidentiality of individuals contributing data for research throughout the archiving process, consistent with ACF’s core principle of ethics. Research data may be made available as public use files when the data would not likely lead to harm or to the re-identification of an individual, or through restricted access. Restricted access files are de-identified and made available to approved researchers either through secure transmission and download, virtual data enclaves, physical data enclaves, or restricted online analysis.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- ACF undertakes many program-specific efforts to support state, local, and tribal efforts to use human services data while protecting privacy and confidentiality. For example, ACF’s TANF Data Innovation Project supports innovation and improved effectiveness of state TANF programs by enhancing the use of data from TANF and related human services programs. This work includes encouraging and strengthening state integrated data systems, promoting proper payments and program integrity, and enabling data analytics for TANF program improvement. Similarly, in 2020 OPRE awarded Human Services Interoperability Demonstration Grants to Georgia State University and Kentucky’s Department of Medicaid Services. These grants are intended to expand data sharing efforts by state, local, and tribal governments to improve human services program delivery, and to identify novel data sharing approaches that can be replicated in other jurisdictions. ACF anticipates awarding another round of Interoperability Demonstration grants in FY22. In addition, in 2019, OPRE, in partnership with ASPE, began a project to support states in linking Medicaid and child welfare data at the parent-child level to support outcomes research. Under this project, HHS will work with two to four states to enhance capacity to examine outcomes for children and parents who are involved in state child welfare systems and who may have behavioral health issues. Of particular interest are outcomes for families that may have substance use disorders, like opioid use disorder. Specifically, this project seeks to develop state data infrastructure and increase the available de-identified data for research in this area.
Score
6
6
AmeriCorps
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- AmeriCorps has three policies related to managing the agency’s data assets: Information Technology Management (policy 381), Information Technology Governance (policy 382) and Information Technology Data Management (policy 383).
- The CIO, the CDO, and the Director of Research and Evaluation/Evaluation Officer will be working together in FY21 to reconstitute and reconvene the agency’s Data Council and determine what kind of charter/agency policy may be needed for establishing the role of the Council with regard to managing the agency’s data assets. In essence, the role of the Council, under the direction of the CDO, will be to prioritize data asset management issues such as creating an annual Fact Sheet (so all externally facing numbers have a single authoritative source), creating a more user-friendly interface for the agency’s data warehouse/data inventory, and keeping the agency’s open data platform current.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- The agency’s Information Technology Data Management Policy addresses the need to have a current and comprehensive data inventory. The agency has an open data platform.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- AmeriCorps has a data request form and an MOU template so that anyone interested in accessing agency data may use the protocol to request data. In addition, public data sets are accessible through the agency’s open data platform. The agency’s member exit survey data was made publicly available for the first time in FY19. In addition, nationally representative civic engagement and volunteering statistics are available, through a data sharing agreement with the Census Bureau, on an interactive platform. The goal of these platforms is to make these data more accessible to all interested end-users.
- The Portfolio Navigator pulls data from the AmeriCorps data warehouse for use by the agency’s Portfolio Managers and Senior Portfolio Managers. The goal is to use this information for grants management and continuous improvement throughout the grant lifecycle.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- The agency has a new Privacy Policy (policy 153) that was signed in FY20 and posted internally. The Information Technology Data Governance Policy addresses data security. The agency conducts Privacy Impact Assessments, which are privacy reviews of each of AmeriCorps’ largest electronic systems; these assessments are then published online.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- AmeriCorps provides assistance to grantees, including governments, to help them access agency data. For example, AmeriCorps provides assistance on using the AmeriCorps Member Exit Survey data to State Service Commissions (many of which are part of state government) and other grantees as requested and through briefings integrated into standing calls with these entities.
Score
5
5
U.S. Department of Labor
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- DOL’s open government plan was last updated in 2016, and subsequent updates are being considered after the formal release of the Federal Data Strategy and the Evidence Act.
- DOL also has open data assets aimed at developers and researchers who desire data-as-a-service through application programming interfaces hosted by both the Office of Public Affairs and the Bureau of Labor Statistics (BLS). Each of these has clear documentation, is consistent with the open data policy, and offers transparent, repeatable, machine-readable access to data on an as-needed basis. The Department is currently developing a new API v3 which will expand the open data offerings, extend the capabilities, and offer a suite of user-friendly tools.
- The Department has consistently sought to make as much data available to the public regarding its activities as possible. Examples of this include DOL’s Public Enforcement Database, which makes available records of activity from the worker protection agencies and the Office of Labor Management Standards’ online public disclosure room.
- The Department also has multiple restricted-use access systems which go beyond what would be possible with simple open-data efforts. BLS has a confidential researcher access program, offering access under appropriate conditions to sensitive data. Similarly, the Chief Evaluation Office (CEO) has stood up a centralized research hub for evaluation study partners to leverage sensitive data in a consistent manner to help make evidence generation more efficient.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- The Department has conducted extensive inventories over the last ten years, in part to support common activities such as IT modernization, White House Office of Management and Budget (OMB) data calls, and the general goal of transparency through data sharing. These form the current basis of DOL’s planning and administration. Some sections of the Evidence Act have led to a different federal posture with respect to data, such as the requirement that data be open by default and considered shareable unless there is a legal requirement not to share them or their release would pose a disclosure risk. Led by the Chief Data Officer and DOL Data Board, the Department is currently re-evaluating its inventories and its public data offerings in light of this very specific requirement and revisiting this issue among all its programs. Because this is a critical prerequisite to developing open data plans, as well as data governance and data strategy frameworks, the agency hopes to have a revised inventory completed during FY21.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- DOL’s CEO, Employment and Training Administration (ETA), and Veterans Employment and Training Service (VETS) have worked with the U.S. Department of Health and Human Services (HHS) to develop a secure mechanism for obtaining and analyzing earnings data from the Directory of New Hires. Since FY20, DOL has entered into interagency data sharing agreements with HHS and obtained data to support 10 job training and employment program evaluations.
- Since FY20, the Department has continued to expand efforts to improve the quality of and access to data for evaluation and performance analysis through the Data Analytics Unit in CEO, and through new pilots beginning in BLS to access and exchange state labor market and earnings data for statistical and evaluation purposes.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- DOL has a shared services approach to data security. In addition, the privacy provisions for BLS and ETA are publicly available online.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- The State Wage Interchange System (SWIS) is a mechanism through which states can exchange wage data with other states in order to satisfy performance-related reporting requirements under the Workforce Innovation and Opportunity Act (WIOA), as well as for other permitted purposes specified in the agreement. The SWIS agreement includes the U.S. Department of Labor’s Adult, Dislocated Worker, and Youth programs (Title I) and Employment Service program (Title III); the Department of Education’s Adult Education and Family Literacy Act program (Title II) and programs authorized under the Carl D. Perkins Career and Technical Education Act of 2006 (as amended); and the Vocational Rehabilitation program (Title IV). The Departments have established agreements with all 50 states, the District of Columbia, and Puerto Rico.
- ETA continues to fund and provide technical assistance to states under the Workforce Data Quality Initiative to link earnings and workforce data with education data to support state program administration and evaluation. These grants support the development and expansion of longitudinal databases and enhance their ability to share performance data with stakeholders. The databases include information on programs that provide training and employment services and obtain similar information in the service delivery process.
- ETA is also working to assess the completeness of self-reported demographic data, to inform both agency level equity priorities and future technical assistance efforts for states and grantees to improve the completeness and quality of this information. ETA incorporated into FOAs the requirement to make any data on credentials transparent and accessible through use of open linked data formats.
- ETA has been working with the Department’s OCIO to build a new case management system for its national and discretionary grantees, known as the Grants Performance Management System (GPMS). In addition to supporting case management by grantees, GPMS supports these grantees in meeting WIOA-mandated performance collection and reporting needs and enables automation so that programs can continue to meet updated WIOA requirements. As programs onboard into GPMS, ETA is working to integrate GPMS into the Workforce Investment Performance System (WIPS) to seamlessly calculate and report WIOA primary indicators of performance and other calculations in programs’ quarterly performance reports (QPRs).
Score
6
6
U.S. Dept. of Housing & Urban Development
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- In FY21, HUD will develop a strategic data plan, which will include an open data policy. Currently, HUD’s open data program includes existing assets such as administrative datasets on data.hud.gov, spatially enabled data on the eGIS portal, PD&R datasets for researchers and practitioners, a robust partnership with the Census Bureau, U.S. Postal Service vacancy data, and health data linkages with the National Center for Health Statistics. HUD’s public datasets are designed to allow analysis by race/ethnicity, gender, and other equity-related characteristics to the extent possible given the nature of the data and privacy constraints.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- HUD has extensive data sharing processes including public sharing, interagency sharing, and internal sharing, with each mode requiring specific controls and documentation. In FY21, HUD will review its existing data inventory and update it accordingly to produce a comprehensive data inventory. HUD will also revisit its data inventory schedule to ensure the agency is performing the activities necessary to develop and maintain a comprehensive data inventory.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- HUD has extensively promoted data access and data linkage, including the following approaches:
- An updated list of open data assets; numerous PD&R-produced datasets for researchers and practitioners, including tenant public use microdata samples, Picture of Subsidized Households, FMR and Income limits, Comprehensive Housing Affordability Strategy (CHAS) special tabulations of the American Community Survey; and an eGIS portal providing geo-identified open data to support public analysis of housing and community development issues using GIS tools. The eGIS portal is a comprehensive data source, covering the majority of HUD’s programs and initiatives. New mappable data added to the eGIS portal in FY21 include locations of Federally Qualified Health Centers to support COVID-19 vaccination efforts related to public and assisted housing programs.
- Data linkage agreements with the National Center for Health Statistics and the Census Bureau to enhance major national survey datasets by identifying HUD-assisted households, with updates continuing in FY21; making available major program demonstration datasets in secure environments; and producing special open-access tabulations of census data for HUD’s partners. The agreement with the Census Bureau includes three-way data matching between HUD tenant data, American Housing Survey data, and American Community Survey data.
- HUD has created a repository of properties, units, and tenants that merges data across the various HUD rental assistance programs for use in research, evaluation, and reporting. This allows for standardization and greater access to socio-demographic characteristics of HUD’s clients.
- Engagement in cooperative agreements with research organizations, including both funded Research Partnerships and unfunded Data License Agreements, to support innovative research that leverages HUD’s data assets and informs HUD’s policies and programs. Data licensing protocols ensure that confidential information is protected.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- HUD’s Statistical Official supports the Evidence Officer on issues related to protection of confidential data and statistical efficiency. HUD’s Evaluation Policy specifies that HUD protects client privacy by adhering to the Rule of Eleven to prevent disclosure from tabulations with small cell sizes. PD&R’s data licensing protocols ensure that researchers protect confidential information when using HUD’s administrative data or program demonstration datasets.
- The Statistical Official collaborates with statistical agencies to create data linkages and develop data products that are machine-readable and include robust privacy protections. HUD has an interagency agreement with the Census Bureau to conduct the American Housing Survey and collaborates with Census staff to examine disclosure issues for AHS public use files and the potential for “synthetic” public datasets to support researchers in estimating summary statistics with no possibility of reidentifying survey respondents. Another interagency agreement allows the Census Bureau to link data from HUD’s randomized control trials with other administrative data collected under the privacy protections of its Title 13 authority. These RCT datasets are the first intervention data added to Federal Statistical Research Data Centers (RDCs) by any federal agency. Strict RDC protocols and review of all output ensure that confidential information is protected, and the open data and joint support for researchers are currently facilitating seven innovative research projects at minimal cost to HUD.
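As a rough illustration of the small-cell protection described above, the sketch below withholds any tabulated count that falls under a minimum cell size. It assumes the Rule of Eleven means that cells with fewer than 11 observations are not published; the threshold constant, function name, and sample categories are hypothetical and are not taken from HUD's Evaluation Policy. Real disclosure review typically also involves complementary suppression and other checks that this sketch omits.

```python
from typing import Dict, Optional

# Assumed threshold: cells with fewer than 11 observations are withheld.
MIN_CELL_SIZE = 11


def suppress_small_cells(
    counts: Dict[str, int], threshold: int = MIN_CELL_SIZE
) -> Dict[str, Optional[int]]:
    """Return a copy of a tabulation with small cells replaced by None.

    A value of None marks a suppressed cell; counts at or above the
    threshold are passed through unchanged.
    """
    return {
        category: (count if count >= threshold else None)
        for category, count in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical household counts by category, for illustration only.
    table = {"Category A": 25, "Category B": 7, "Category C": 11}
    for category, value in suppress_small_cells(table).items():
        print(category, "suppressed" if value is None else value)
```

Suppressed cells are returned as None rather than zero so that a downstream reader cannot mistake a withheld value for a true count of zero.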
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- HUD has an updated list of open data assets, an open data program, numerous PD&R datasets for researchers and practitioners, and an eGIS portal providing geo-identified data to support public analysis of housing and community development issues related to multiple programs and policy domains using GIS tools. For example, HUD supports local governments in assessing and planning for housing needs by providing summary data files about HUD-supported public and assisted housing and about local housing needs. These accessible data assets have privacy protections. Researchers needing detailed microdata can obtain access through data licensing agreements.
- HUDExchange offers numerous resources and training opportunities to help program partners use data assets more effectively. Additional technical assistance is offered through the program, a $91 million technical assistance program to equip HUD’s customers with the knowledge, skills, tools, capacity, and systems to implement HUD programs and policies successfully and provide effective oversight of federal funding.
- In FY21, HUD will produce a study of the feasibility of HUD producing a national database of evictions that would be available to all levels of government and the general public to track and assess evictions, including eviction trends by race, gender, disability status, ethnicity, and age. If the study determines that such a database is feasible and if Congress funds its development, such a database will be an important tool for analyzing equitable treatment of renters.
Score
8
8
Administration for Community Living (HHS)
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- As an operating division of a CFO Act Agency, the U.S. Department of Health and Human Services, ACL is not required to have its own strategic data plan and utilizes HHS’s data strategy. In 2016, ACL implemented a Public Access Plan as a mechanism for compliance with the White House Office of Science and Technology Policy’s public access policy. The plan focused on making published results of ACL/National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR) funded research more readily accessible to the public; making scientific data collected through ACL/NIDILRR-funded research more readily accessible to the public; and increasing the use of research results and scientific data to further advance scientific endeavors and other tangible applications. In 2019, ACL created a council to improve ACL’s data governance, including the development of improved processes and standards for defining, collecting, reviewing, certifying, analyzing, and presenting data that ACL collects through its evaluation, grant reporting, and administrative performance measures. In 2020, its first year, the ACL Data Council produced an annotated bibliography to provide essential background information about the topic, a Primer to detail best practices in data governance specifically as they apply to ACL, and a Data Quality 101 infographic to guide decision-making processes related to data quality. ACL also has an internal tracking sheet to measure ACL’s response to the Federal Data Strategy.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- ACL provides comprehensive public access to its programmatic data through its Aging, Independence, and Disability Program Data Portal (AGID). ACL also had two data inventories available to the public on the NARIC website: REHABDATA, a database of rehabilitation and disability literature, and the Online Program Directory, which contains NIDILRR’s previously funded, currently funded, and newly funded grants. ACL/NIDILRR has a public access plan that was first published in February 2016. Its purpose is to make available to the public peer-reviewed publications and scientific data arising from research funded in whole or part by ACL through the NIDILRR, to the extent feasible and permitted by law and available resources. The requirements outlined in this plan are being applied prospectively and not retrospectively. ACL is also creating an internal evidence inventory that staff will be able to use to search for relevant program performance and evaluation data by agency priority question.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- ACL’s Office of Performance and Evaluation has access to all of ACL’s performance and evaluation data and is able to link those data and advise programs about their availability and usability. In March 2019, ACL completed the ACL Data Restructuring (DR) Project to assess the data hosted on AGID, and to develop and test a potential restructuring of the data in order to make it useful and usable for stakeholders. In 2019, ACL awarded a follow-on contract to further integrate its datasets along the lines of conceptual linkages, and to better align the measures within ACL’s data collections across the agency. ACL funded several grants to promote data linkage, including the Grants to Enhance State Adult Protective Services, awarded in FY19 to increase intra- and inter-state sharing of information on Adult Protective Services (APS) cases, and the 2020 Empowering Communities to Reduce Falls and Falls Risk grants. The latter ask grantees to develop robust partnerships and a results-based, comprehensive strategy for reducing falls and fall risks among older adults and adults with disabilities living in their communities, and direct grantees to consider CDC opportunities to broaden and improve the linkage between primary care providers and evidence-based community falls prevention programs supported by ACL.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- As an operating division of the U.S. Department of Health and Human Services, ACL follows all departmental guidance regarding data privacy and security. This includes project-specific reviews by ACL’s Office of Information Resource Management (OIRM), which monitors all of ACL’s data collection activities to ensure the safety and security of ACL’s data assets. In FY19, ACL awarded a contract to stand up a “Data Council” to enhance the quality, security, and statistical usability of the data ACL collects through its evaluation, grant reporting, and administrative data collections, and to develop effective data governance standards. NIDILRR’s Model Systems’ data centers have extensive standard operating procedures that are designed to secure data and protect personal and confidential information. Below are a few illustrative examples from the Model Systems’ Data Centers:
- The Burn Model System National Data and Statistical Center has a Burn Model Systems’ procedures’ page that lists all of the Standard Operating Procedures that grantees contributing to this database must follow.
- The Traumatic Brain Injury Model Systems’ National Data and Statistical Center has a Standard Operating Procedures’ page that describes the procedures that all grantees contributing to this database must follow.
- The Spinal Cord Injury Model Systems’ National Data and Statistical Center has a page on Using the National Spinal Cord Injury Model Systems’ Database. Descriptions of what constitutes “de-identified data” can be found on this page.
- In addition to the Model Systems’ data centers referenced above, NIDILRR developed Part 2: Preparing Data and Documentation; this page and video are part of the larger training course that NIDILRR grantees must complete, entitled NIDILRR Data Archiving and Sharing Training. Additional guidance is available on the ICPSR web page entitled Resources for National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR) Grantees.
- In addition, each funding opportunity announcement states that “a data and safety monitoring board (DSMB) is required for all multi-site clinical trials involving interventions” (see for example the FOA for Disability and Rehabilitation Research Projects (DRRP): Assistive Technology to Promote Independence and Community Living (Development) HHS-2019-ACL-NIDILRR-DPGE-0355).
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- ACL data sets are made publicly available through its AGID system. ACL staff provide technical assistance through presentations and ACL’s technical assistance resource centers to grantees, including state, tribal, and local governments. The resource centers providing technical assistance include the National Resource Center on Nutrition and Aging (NRC), the Alzheimer’s Disease Supportive Services Program (ADSSP), and the University Centers for Excellence in Developmental Disabilities Education, Research, and Service. This technical assistance includes annual workshops and presentations at the Title VI National Training and Technical Assistance Conference; training available through the ACL-funded National Ombudsman Resource Center; and the Disability and Rehabilitation Research Program (DRRP), which funds capacity building for minority research entities. In addition, NIDILRR has a number of resources to help the public access its data responsibly. The National Spinal Cord Injury Statistical Center, for example, has a PDF document entitled Using the National Spinal Cord Injury Model Systems Database. This same center also has an online Data Request Form that requestors need to complete before gaining access to data. The National Data and Statistical Center for the Traumatic Brain Injury Model Systems has a web page entitled How to obtain a dataset from the TBIMS. The Burn Model Systems’ National Data and Statistical Center has a page with instructions on how to access Burn Model System data.
Score
5
5
Substance Abuse and Mental Health Services Administration (HHS)
5.1 Did the agency have a strategic data plan, including an open data policy? (Example: Evidence Act 202(c), Strategic Information Resources Plan)
- The SAMHSA Strategic Plan FY19-23 outlines five priority areas to carry out the vision and mission of SAMHSA, including Priority 4: Improving Data Collection, Analysis, Dissemination, and Program and Policy Evaluation. This Priority includes three objectives: 1) Develop consistent data collection strategies to identify and track mental health and substance use needs across the nation; 2) Ensure that all SAMHSA programs are evaluated in a robust, timely, and high-quality manner; and 3) Promote access to and use of the nation’s substance use and mental health data and conduct program and policy evaluations and use the results to advance the adoption of evidence-based policies, programs, and practices.
- CBHSQ recently updated its data transfer agreement for a uniform and protected sharing of data. SAMHSA has used a data use agreement (DUA) for data sharing for several years.
- SAMHSA’s Center for Behavioral Health Statistics and Quality (CBHSQ) is the lead Federal government agency for behavioral health data and research. As an OMB-recognized Federal Statistical Unit, CBHSQ adheres to all laws, regulations, and guidelines related to best practices of data dissemination and data stewardship, such as Statistical Policy Directive Number Four on data dissemination, the Confidential Information Protection and Statistical Efficiency Act, and the 2018 Evidence Act. SAMHSA also adheres to strict scientific guidelines, as set forth in Statistical Policy Directive Number One and its parent agency’s (HHS) Scientific Integrity Principles. Additionally, SAMHSA explicitly states the agency’s public commitment to scientific integrity.
- CBHSQ uses a data transfer agreement for a uniform and protected sharing of data. This was updated in FY21. Additionally, as the main center within SAMHSA that collects, stewards, and disseminates data, CBHSQ is in the process of developing a short-term and long-term Strategic Data Plan. SAMHSA has in the past had an agency-wide “Strategic Plan and Data Strategy” that matches important agency priorities to data collected by SAMHSA. SAMHSA is also working with HHS on a department-wide data strategy including a data maturity model and policies for data governance and sharing.
- Within CBHSQ, SAMHSA partners with the National Center for Health Statistics to offer individuals access to restricted-use data for research and evaluation purposes. This is a carefully controlled process designed to ensure that the data, and the individuals who provide the data, are protected.
- The Office of Behavioral Health Equity (OBHE) within SAMHSA coordinates efforts to reduce disparities in mental and/or substance use disorders across populations. OBHE is organized around four key strategies: data strategy, policy strategy, quality practice and workforce development strategy, and communication strategy.
5.2 Did the agency have an updated comprehensive data inventory? (Example: Evidence Act 3511)
- SAMHSA’s Report and Dissemination site identifies seven data collections: the National Survey on Drug Use and Health (NSDUH); the Treatment Episode Data Set (TEDS); the National Survey of Substance Abuse Treatment Services (N-SSATS); the National Mental Health Services Survey (N-MHSS); the Drug Abuse Warning Network (DAWN); Mental Health Client-Level Data (MH-CLD); and the Uniform Reporting System (URS). SAMHSA has also made numerous data collection and survey datasets publicly available at the Substance Abuse and Mental Health Data Archive (SAMHDA), which includes online analytic capabilities and downloadable datasets.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement? (Examples: Model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c))
- The Center for Behavioral Health Statistics and Quality (CBHSQ) oversees data collection initiatives and provides publicly available datasets so that some data can be shared with researchers and other stakeholders while preserving client confidentiality and privacy.
- In FY21, SAMHSA’s CBHSQ built internal technical capacity for data collections and began the process of modernizing them. For example, the N-SSATS and N-MHSS have been combined into the National Substance Use and Mental Health Services Survey (NSUMHSS) in an effort to decrease burden and duplication of responses. In addition, CBHSQ, in partnership with the Center for Mental Health Services, is now SAMHSA’s Substance Abuse and Mental Health Data Archive (SAMHDA), which contains substance use disorder and mental illness research data available from CBHSQ’s seven data collections for restricted and public use. SAMHDA promotes the access and use of SAMHSA’s substance abuse and mental health data by providing public-use data files and documentation for download and online analysis tools to support a better understanding of this critical area of public health.
- SAMHSA’s Substance Abuse and Mental Health Data Archive (SAMHDA) contains substance use disorder and mental illness research data available from CBHSQ’s seven data collections for restricted and public use. SAMHDA promotes the access and use of SAMHSA’s substance abuse and mental health data by providing public-use data files and documentation for download and online analysis tools to support a better understanding of this critical area of public health. In addition, SAMHSA partners with the National Center for Health Statistics (NCHS) to make restricted-use data available through the Research Data Center (RDC), which NCHS operates to give researchers access to such data. To gain access, researchers must submit a research proposal outlining the need for restricted-use data. In FY21, many of the procedures for the application process moved in-house from NCHS and a CBHSQ RDC website was created.
- Also, SAMHSA implements the Disparity Impact Statement (DIS). DIS is a Secretarial Priority from the Department of Health & Human Services’ Action Plan to Reduce Racial and Ethnic Health Disparities (2011). The objective is to “Assess and heighten the impact of all HHS policies, programs, processes, and resource decisions to reduce health disparities. HHS leadership will assure that: … (c) Program grantees, as applicable, will be required to submit health disparity impact statements as part of their grant applications.” The Secretarial Priority focused on underserved racial and ethnic minority populations, e.g., Black/African American; Hispanic/Latino; Asian American, Native Hawaiian and Pacific Islander; and American Indian/Alaska Native. SAMHSA’s Office of Behavioral Health Equity also includes LGBT populations as underserved, disparity-vulnerable groups.
- Through the SAMHSA Performance Accountability and Reporting System (SPARS), grantees and SAMHSA program staff monitor the performance of grantees and, when performance is below targets, provide technical assistance and support. This allows SAMHSA to support communities during the grant process. SAMHSA staff meet with grantees regularly to discuss progress and to examine data entered into SPARS, ensuring timely submission of data.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information? (Example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)
- The National Survey on Drug Use and Health (NSDUH), an annual national survey, uses a statistical disclosure control technique called MASSC to protect the confidentiality of the data. MASSC stands for Micro-Agglomeration, Substitution, Subsampling, and Calibration. It is a disclosure limitation methodology specifically developed for NSDUH to meet the requirements of CIPSEA. SAMHSA recognizes the inherent trade-off between disclosure risk and information loss. The goal of MASSC is to control the disclosure risks while minimizing the impact of the disclosure control measures on the quality of the data in a comprehensive and integrated manner. MASSC has been successfully used to create NSDUH public use files (PUFs) since 1999.
- In addition to having a Confidentiality Officer within CBHSQ who ensures staff complete training and sign a confidentiality statement, SAMHSA offers a certificate of confidentiality (CC) that protects grantees from legal requests for names or other information that would personally identify participants in the evaluation of a grant, project, or contract. CBHSQ trains all staff in good data stewardship, whether the data is covered by CIPSEA or the Privacy Act (5 U.S.C. 552a) and the Public Health Service Act (42 U.S.C. 290aa(n)).
- For the CBHSQ national data sets, SAMHSA uses multiple means to protect data and ensure the protection of personally identifiable information, including encryption, multifactor identification, statistical disclosure control (MASSC), and limiting access to data.
- SAMHSA’s Performance Accountability and Reporting System (SPARS) hosts the data entry, technical assistance request, and training system for grantees to report performance data to SAMHSA. SPARS serves as the data repository for the Administration’s three centers: the Center for Substance Abuse Prevention (CSAP), the Center for Mental Health Services (CMHS), and the Center for Substance Abuse Treatment (CSAT). In order to safeguard confidentiality and privacy, the current data transfer agreement limits the use of grantee data to internal reports so that data collected by SAMHSA grantees will not be available to share with researchers or stakeholders beyond SAMHSA, and publications based on grantee data will not be permitted.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- SAMHSA provides both public access and restricted use access to its datasets in a variety of ways. Specific examples are highlighted below.
- The Center of Excellence for Protected Health Information (CoE for PHI) is a SAMHSA funded technical assistance project designed to develop and increase access to simple, clear, and actionable educational resources, training, and technical assistance for consumers and their families, state agencies, and communities to promote patient care while protecting confidentiality.
- Data from CBHSQ’s various data collections are available (1) as pre-published estimates, (2) via online systems, and (3) as microdata files. A description of CBHSQ’s products can be found under the Substance Abuse and Mental Health Data Archive page (SAMHDA).
- SAMHSA partners with the National Center for Health Statistics (NCHS) to make restricted-use data available through the Research Data Center (RDC), which NCHS operates to give researchers access to such data. To gain access, researchers must submit a research proposal outlining the need for restricted-use data. The proposal provides a framework for CBHSQ to identify potential disclosure risks and how the data will be used.