Rapid scientific and technological advances have led to an explosion of research data.Reference Landau1 Researchers now commonly collect biospecimens for genomic analysis; real-time lifestyle and behavioral data from mobile devices; and information from electronic health records, in addition to other participant-reported data.Reference Collins, Varmus and Schadt2 This rich combination of data creates new opportunities for understanding and addressing important health issues, but also intensifies challenges to protecting research participants' privacy and confidentiality.Reference Rodriguez, Kulynych, Greely, Rothstein and Rothstein3
Unlike the uniform protection of personal data provided by the European Union's General Data Protection Regulation, in the United States, legal protections depend on how data are generated, who holds the data, and which state's law applies.Reference Tovino4 While federal laws, such as the Common Rule,5 the Privacy and Security Rules under the Health Insurance Portability and Accountability Act (HIPAA),6 and the Genetic Information Nondiscrimination Act (GINA),7 collectively impose some confidentiality obligations and limit some potential harms, they also have significant gaps that may or may not be filled by state law.Reference Wolf and Brown8
To understand better how and to what extent existing laws protect research participants in large-scale genomic research, we conducted empirical research that included two separate components: (1) interviews with a diverse group of nationally-recognized thought leaders to explore their views of confidentiality-related topics at the forefront of genome research, and (2) structured legal research assessing research-specific and general federal and state laws that may protect research participants' interests. The primary results of these two endeavors are reported elsewhere.Reference Beskow, Hammack, Brelsford, Beskow, Cohen, Lynch, Vayena, Hammack, Brelsford and Beskow9 Here, we integrate the findings and apply them to realistic research scenarios involving various privacy threats. By examining our legal findings alongside multidisciplinary expert perspectives, our goal is to illuminate the effect of law in practice and to elucidate the actual strengths and limitations of the “web” of legal protections available to research participants. Accordingly, we do not provide a normative analysis, but rather describe what the “web” of legal protections is, not what it should be.
Our analysis starts in the context of a hypothetical national gene-environment interaction study that incorporates standard confidentiality protections and does not plan to return individual research results. We describe the basic protections available for such a study (Scenario 1), including the Common Rule, HIPAA, and research project governance. We then consider the protections available if:
Researchers return individual results (Scenario 2), including analysis of the Common Rule, HIPAA, and GINA;
There is a database breach or hack (Scenario 3), including analysis of HIPAA, GINA, and research project governance; or
There is a legal demand (such as a subpoena or court order) for data access (Scenario 4), including analysis of Certificates of Confidentiality and HIPAA.
In this paper, we focus on risks and protections for the individual research participant, as laws typically do. However, it is important to note that the thought leaders interviewed also emphasized the risks to participants' biological relatives and to socially-identifiable groups.Reference Beskow10 Moreover, the likelihood of risks actually occurring and the severity of any resulting harm depend on numerous contextual factors, including characteristics of the individual participant, study design, and socio-cultural environment.Reference Beskow11
METHODS
Detailed methodologic information is available elsewhere.Reference Beskow, Beskow and Hammack12 We describe the methods briefly below.
Qualitative Interviews
We conducted in-depth interviews (n=60) with a diverse group of prominent experts and scholars in the areas of ethics, genome research, health law, historically-disadvantaged populations, informatics, and participant-centric perspectives, as well as government officials and human subjects protections leaders (Table 1). We identified prospective participants based on leadership positions in prominent organizations, institutions, and studies across the U.S., as well as authorship of highly influential papers on relevant topics.
* Primary perspective for which we identified thought leaders; many could easily have been recognized in two or more categories
We developed a semi-structured interview guide centered around privacy and confidentiality issues and solutions in a hypothetical “Million American Study” (MAS) (Box 1). Although the MAS has similarities to the “All of Us” (AoU) Research Program now being conducted by the National Institutes of Health (NIH)13, we did not design the MAS hypothetical to be identical to AoU. Interview topics included risks and potential benefits and harms; informed consent, including emerging models of dynamic and open consent; and the strengths and limitations of a range of general and specific approaches to protecting confidentiality.
The Million American Study (MAS) is a federally-funded, large-scale research endeavor to improve understanding of health and to find new ways to predict, detect, diagnose, treat, and prevent disease. Specifically, the aim is to compile comprehensive information from a cohort of one million Americans in a repository that will serve as a rich research resource for a wide variety of studies for decades to come.
MAS will seek to enroll a representative sample of U.S. adults reflecting diversity in terms of race and ethnicity, age, and sex. Those who agree to participate will give broad consent for:
Extensive characterization (including whole genome sequencing) of biospecimens, such as blood
Ongoing access to clinical data (such as medications, test results, and imaging) from electronic health records
Real-time monitoring of lifestyle and behavioral information, such as physical activity and environmental exposures, through mobile health devices
At the time of consent, participants will be offered choices about whether they are willing to be re-contacted for various purposes (e.g., to provide additional information or specimens). Participants will be able to withdraw consent for future use of their specimens and data, with the exception that data generated in past studies cannot be withdrawn, nor can specimens and data be withdrawn from studies already begun.
Specimens will be stored in coded form in a repository at a major academic medical center in one state, while the data will be held at the coordinating center in another state. A robust data security framework will be in place, including administrative, technical, and physical safeguards. There will be a centralized governance process, comprising participant representatives, researchers, health care providers, government officials, and other stakeholders to ensure overall accountability and responsible project management.
Multiple tiers of access to MAS data — from open to controlled — based on data type, data use, and user qualifications will be employed. For example, certain information, such as some aggregate results, will be publicly available. Access to other information will be available to qualified researchers from academic, non-profit, and for-profit entities, in the U.S. and around the world, through application to a Data Access Committee. For approved projects, Data Use Agreements will be used to ensure that data and specimens are used and shared for authorized purposes only, and that privacy and security safeguards are maintained.
Information will be publicly available concerning how MAS cohort data and specimens are being used, including information about ongoing studies and summaries of research findings.
Adapted from F.S. Collins and H. Varmus, “A New Initiative on Precision Medicine,” New England Journal of Medicine 372, no. 9 (2015): 793-795; M.J. Khoury and J.P. Evans, “A Public Health Perspective on a National Precision Medicine Cohort: Balancing Long-Term Knowledge Generation with Early Health Benefit,” JAMA 313, no. 21 (2015): 2117-2118.
Interviews were conducted by telephone between September 2015 and July 2016. Professional transcriptions of the audio recordings were uploaded into NVivo for coding and analysis using standard iterative processes.Reference MacQueen14 The Vanderbilt University and Georgia State University IRBs deemed this research exempt.
Legal Analysis
We conducted searches in Westlaw and Lexis-Nexis to identify state laws that had provisions adding to the protections federal laws afford to participants in genomic research. These included laws that would apply to genetic information, tests, and biospecimens, as well as other health information used and held by researchers and biobanks. These also included laws protecting against unwanted use of genetic and other health information by employers, insurers, or “any person” if such information were to be disclosed, breached, hacked, or returned to the participant or their health care provider.
We used formal search strategies and the “book browse” feature to identify enacted statutes and promulgated regulations in effect between January 1, 2015, and December 31, 2017, the period of our research. We then worked in pairs to select all relevant laws across all 50 states and the District of Columbia. Pairs independently coded the selected laws using NVivo and following the codebook the team developed. Any interpretive questions were identified and discussed among the faculty members.
To integrate the findings from these two components of our research, we analyzed each of the research scenarios to determine which federal and state laws offer protections against the risks identified by the thought leaders, including identifying any gaps in legal protection.
SCENARIO 1: THE MILLION AMERICAN STUDY
For thought leaders, the long-term, open-ended nature of the MAS raised concerns beyond those typically associated with more limited or well-defined kinds of research.Reference Beskow and Beskow15 In particular, they highlighted the risk that information could be used in ways the MAS permitted but were unanticipated and potentially objectionable to some participants because of the research topic, the researcher, or non-research use of the data (Table 2). The primary protections against these kinds of risks and harms include the Common Rule, the HIPAA Privacy Rule, and related state laws, plus research governance features such as data access committees and data use agreements.
The Common Rule and Related Protections
The federal Common Rule aims to protect against some of the risks and harms of research participation, principally through IRB review and individual informed consent.16 Nearly half of thought leaders found these requirements to be reassuring, primarily citing the Common Rule's commitment to respecting autonomy and privacy (even when it disadvantages research).Reference Hammack17
IRB Review
The Common Rule specifies IRB review criteria, including that risks are minimized and reasonable in relation to anticipated benefits, if any.18 Many thought leaders, however, noted variability among IRBs and said the protection actually afforded depends on IRB quality. They further commented on IRBs' limited abilities to provide meaningful ongoing oversight throughout a long-term study.Reference Hammack19 Although the Common Rule often obligates IRBs to provide continuing review,20 it would be unusual for an IRB to revisit a study's overall design unless a problem occurs.Reference Davis and Hoffman21
In addition to the concerns thought leaders raised, there are important gaps in the Common Rule's IRB review requirement. The Rule only applies to federally conducted or funded research.Reference Mervis22 Although the MAS is described as federally funded, secondary research using MAS data may not be.Reference Wolf23 For example, citizen scientists — members of the lay public who actively take part in planning and conducting researchReference Wolley and Eitzel24 — and other non-academic researchers may not be federally funded. In addition, some research by academic investigators may not be federally funded, though their institution may voluntarily apply the Common Rule.
The Common Rule also contains several exceptions, the most relevant of which would be secondary research using coded biospecimens and/or data that were collected by the MAS, with no access to identifiers. Such research is not considered to involve human subjects because the Common Rule defines a “human subject” in terms of intervention or interaction with the person or identifiability.25 In practice, many IRBs review study information to determine whether it meets the definition of “human subjects research,” although the regulations do not require it.26 We found only one state that has a genetic-specific law requiring at least limited IRB review for secondary research when the Common Rule does not.27
Informed Consent
Because the MAS involves interaction with participants and is described as prospectively collecting and retaining participants' identifiable biospecimens and private information, the MAS itself would not fall within one of the Common Rule's exceptionsReference Lynch, Wolf, Barnes, Lynch and Meyer28 and would require informed consent from participants. This consent requirement provides an opportunity for individuals to be apprised of the procedures, risks, and benefits and to make a voluntary decision about whether to participate in research. Thus, individuals who are generally risk averse, concerned about a specific risk, or feel particularly susceptible to harm can protect themselves by declining to participate.
Nevertheless, this protection may be limited. Many thought leaders noted that the Common Rule's consent requirements can be technically fulfilled despite insufficiencies often found in consent forms (e.g., complex language, excessive length). In other words, the protections provided depend on the quality of the consent materials and processesReference Hammack29 — including the extent to which they effectively communicate the information identified as most important to prospective participants' decisionmaking.Reference Beskow30
Moreover, in research endeavors like the MAS, participants consent broadly to their specimens and data being used in unspecified future research.Reference Grady31 Accordingly, their consent to each specific future study is not required, as long as the description provided when they give consent includes sufficient detail such that reasonable people would expect they were permitting the types of research conducted.32 Even if broad consent is not obtained, secondary research using only existing coded specimens/data, with no access to identifiers, is not considered “human subjects research” under the Common Rule. Thus, individual participants may not even be notified regarding when or how their materials are used.33
However, state laws may require informed consent when the Common Rule does not. A number of states require “any person” conducting genetic tests to obtain consent and define genetic testing broadly enough to apply to research.34 Because some of these laws refer to specific consent, consent to unspecified future research may not be sufficient. To the extent these laws are enforced (several explicitly include a private right of action), participants in these states could theoretically avoid uses they find objectionable by having greater control over each use of their specimens and data.
A critical aspect of informed consent afforded by the Common Rule is disclosure of the right to withdraw.35 However, as thought leaders noted, there are limits to this protection in endeavors like the MAS insofar as one's materials cannot be called back or removed once they have been shared with other researchers.Reference Beskow36 Some state laws explicitly require the withdrawal and/or destruction of samples when an individual withdraws consent, with penalties for failure to comply.37
HIPAA Privacy Rule and Related Protections
The HIPAA Privacy Rule and similar state laws may give research participants additional control by limiting uses and disclosure of their identifiable health information without individual authorization. The HIPAA Privacy Rule generally prohibits covered entities or their business associates from using or disclosing identifiable health information without individual authorization. However, there are several exceptions allowing for disclosure without authorization, such as for law enforcement purposes, pursuant to a court order or subpoena, or to public health or other governmental authorities.38 The HIPAA Privacy Rule includes genetic information in the definition of “health information.”39 HIPAA imposes certain obligations on covered entities and business associates, including breach notification, limits on marketing use or sale of protected health information, providing individuals a right of access to their own information, and maintaining the security of protected health information.40
The Privacy Rule requires each participant's authorization for the MAS to collect and use medical record data from his or her health care provider, a HIPAA-covered entity.41 However, this authorization provides little protection against the risk that the participant's information could be used for objectionable research. Despite regulatory language that the authorization “must include a description of each purpose of the requested use or disclosure,”42 agency guidance permits authorizations for unspecified future research if the description is “such that it would be reasonable for the individual to expect that his or her protected health information could be used or disclosed for such future research.”43 Because the requirement that an authorization contain an expiration date may be satisfied by stating “none” or “until authorization is revoked,” such authorization can be indefinite.44
Most thought leaders were not particularly reassured by the Privacy Rule's protections in the context of the MAS.Reference Hammack45 Although some highlighted the acute awareness and expectations surrounding HIPAA among researchers and institutions, others questioned whether the Privacy Rule would apply to the MAS and its research sites.
Indeed, HIPAA only applies to “covered entities” — primarily health care providers — and their “business associates” that handle identifiable health information.46 Some academic medical centers may elect to extend covered-entity status to their research activities (including biobanks, data coordination centers, and research sites), while others may not. In large-scale endeavors like the MAS, some research sites may not be covered entities at all.47 The applicability of HIPAA depends on the entity's status and does not “follow the data” (Figure 1). Thus, as MAS specimens and data are transferred to and from a centralized research laboratory (e.g., for genomic analyses) and to downstream researchers, HIPAA protections would only apply if the particular entity handling the materials is a covered entity or business associate. In contrast, we found a number of states that prohibit any person who holds genetic information, personal data, or medical data — which could include researchers — from disclosing the information without individual consent.48
For HIPAA-covered entities, the Privacy Rule's requirements establish strong standards and deterrents against unauthorized use and disclosure and may serve as an industry standard for non-covered entities. Some thought leaders were reassured by the Privacy Rule's high standards for de-identification, although many noted these standards are not infallible and that re-identification is possible.Reference Hammack49 Moreover, much of the MAS data used by researchers would be identifiable, even if in a limited dataset. A limited dataset excludes direct identifiers but, given the richness of data collected and generated by endeavors like the MAS (including genomic data), it may contain information that could be used in combination to identify individuals. Although limited datasets are not considered deidentified under the Privacy Rule, they may be used without authorization for research purposes pursuant to a data use agreement that contractually obligates the recipient to safeguard the data, refrain from reidentification or further disclosure except as provided by the agreement, and notify the covered entity of uses or disclosures not permitted by the agreement.50
The HIPAA Privacy Rule does, however, require an individual's authorization for uses or disclosures of their identifiable information for marketing, ameliorating some risks thought leaders raised regarding unwanted non-research uses (Table 2).Reference Beskow51 Non-covered entities that receive MAS data would not be bound by the Privacy Rule's proscriptions on uses or disclosures of the participant's data, but may be contractually bound to similar limitations under a data use agreement (Figure 1).
While some thought leaders referenced HIPAA's penalties as a potential deterrent against intentional or reckless violations, others noted the prominent role of human error, particularly when many people have access to the data.Reference Hammack52 Reported HIPAA breaches support this concern, evidencing both inadvertent breaches by people with authorized access and attacks by people without authorized access.53 If data are disclosed to entities that are not HIPAA-covered, HIPAA's protections do not apply.
A further limit on the Privacy Rule's protections is that it does not offer a private right of action to individuals whose information is disclosed without authorization.54 Aggrieved individuals' only recourse under HIPAA is to file a complaint with the Office for Civil Rights, which may conduct an investigation. Although this may result in corrective action or administrative penalties against the covered entity or business associate, it will not compensate the individual.
Research Project Governance
In addition to statutory and regulatory limits on data access, research platforms like the MAS often adopt rules and procedures to govern access to data and specimens, as well as to protect against misuse. This kind of research project governance, including data access committees and data use agreements, has a crucial role to play because participants who give broad consent are, in essence, entrusting decisions about and oversight of secondary research to these entities and processes.55 As described in Box 1, researchers would apply to use MAS specimens/data. If approved by a data access committee, the MAS would provide a limited dataset under a data use agreement. This agreement would give contractual protections against re-identification, disclosure, or misuse of participant data, whether or not the recipient of the dataset is a covered entity (Figure 1).
Thought leaders generally perceived data access committees and data use agreements as either weak, or helpful but insufficient.Reference Hammack56 They highlighted several limitations that echo concerns they expressed about HIPAA,57 including reliance on human behavior; barriers to monitoring, enforcement, and pursuing penalties; reactiveness (rather than prevention); and limitations associated with delegated decision making (i.e., entities making decisions about data access and use on behalf of research participants). Protections provided by data access committees and data use agreements rely on the integrity and commitments of the individuals involved.
Given the important role of research project governance in protecting participants and maintaining trust in the research enterprise, empirical research is needed to address thought leaders' concerns and identify and strengthen best practices.
SCENARIO 2: RETURNING INDIVIDUAL RESEARCH RESULTS
If the MAS did not contemplate return of individual research results, thought leaders described the risks of participation as low. They believed that a decision to return results could provide direct health benefit for a small proportion of participants — but would increase the risks and potential harms for the majority (Table 3).Reference Beskow58 They suggested that if the MAS were seeking to minimize risks and potential harms, it should either not return individual results or, alternatively, limit return to results that are clinically-actionable.59 They also discussed return of results as the mechanism by which information could eventually be used outside participants' control in ways that might be unanticipated and/or unwanted.60
The primary protections against these kinds of risks and harms include the Common Rule, the HIPAA Privacy Rule, and GINA, as well as related state laws and research project governance.
The Common Rule and Related Protections
The Common Rule, when it applies, requires researchers to disclose whether clinically relevant results will be returned.61 Presumably, the impact of returning results and plans for doing so would be incorporated into the IRB's assessment of risks when reviewing a project like the MAS.
Given their perceptions of risk, it is unsurprising that many thought leaders addressed the importance of not simply notifying participants, but providing them with the opportunity to decide whether or not they want to receive results.Reference Beskow62 Thus, during the consent process, participants who had concerns could decline to receive results or decline participation altogether.
Whether the potentially adverse consequences of receiving unwanted/unexpected results are in fact minimized depends on the quality of the IRB oversight and consent process. For example, some thought leaders foresaw participants saying “yes” without understanding that decision, or receiving results due to a perceived or actual duty on the part of the MAS to inform despite the participant saying “no.”63 To mitigate this concern, some suggested the MAS should establish a governance process to determine what types of results would be returned and detailed procedures for disclosure (e.g., providing consultation, education, referral).64
The HIPAA Privacy Rule and Related Protections
The HIPAA Privacy Rule provides individuals a right to access their own information held by covered entities in a “designated record set,” which may allow participants to access their genomic results, regardless of the approved research protocol.65 Unlike most data generated for research,66 laboratory test reports (including genomic sequence data) may fall within the definition of “designated record set” under the 2014 amendments to the CLIA regulations and the HIPAA Privacy Rule. If the research laboratory that conducts genome sequencing for the MAS were a HIPAA-covered entity, it would have to comply with the HIPAA right of access and provide the participant with his or her identifiable genomic data upon request (Figure 1).Reference Evans67 Typically, the research laboratory would work with coded specimens, so in practice, the requested access would be provided by the MAS data repository, which holds the key to link the coded test reports to the individual's identity. In other words, even if the MAS did not plan to return results as part of its design, allowing participants to access their genomic data may be required if the research laboratory used is a covered entity under HIPAA.Reference Evans68
In addition to HIPAA's access rights, a limited number of states create rights to access genetic information that could be used to override researchers' decisions about whether and what kinds of results to return.69 For example, one state's law explicitly applies to research participants and grants them the right to access their genetic information.70 Several other states broadly grant individuals the right to access their genetic information.71 Some have created “property rights” in DNA, although many of these do not make clear whether this includes the right to access genomic research results.Reference Roberts72
Genetic Information Nondiscrimination Act
Thought leaders also discussed several risks and possible harms arising from subsequent disclosures of research results that have been returned, including discrimination in employment and insurance (Table 3).Reference Beskow73
With respect to employment, GINA only prohibits large employers (≥15 employees) from requesting, requiring, or using genetic information for employment decisions, which leaves approximately 15% of all U.S. workers unprotected.74 In contrast, six state genetic discrimination laws apply to employers with five or fewer employees and eleven apply to those with only one employee.75 An individual can sue for employment discrimination under GINA after exhausting administrative remedies, but recovery is limited based on employer size.76 Some states have adopted provisions, such as treble damages, statutory minimum damages, and attorneys' fees and costs, which can facilitate pursuit of a claim.77 Other states explicitly authorize aggrieved individuals to bring a lawsuit, without damage limits.
In addition, under GINA, health insurers cannot deny coverage or charge different premiums on the basis of genetic information.78 The Affordable Care Act (ACA) greatly expanded these protections by prohibiting health insurers from charging more or denying coverage based on pre-existing conditions or other health status factors.79 The ACA does not, however, provide a private right of action to enforce these health insurance rules, which are largely left to government enforcement.Reference Monahan80 A few thought leaders cautioned against long-term reliance on the ACA's protections against discrimination in health insurance in the current political climate, and many of these concerns persist as ACA opponents attempt to repeal or roll back its protections.Reference Beskow and Rovner81
Thought leaders also noted that returning research results exposes participants to having to disclose genetic and other health information to life, disability, and long-term care insurers, for which GINA offers no protection (Table 3). One of GINA's known gaps is that it does not prevent life, disability, or long-term care insurers from making coverage or premium decisions based on genetic information. If these insurers ask about genetic test results, participants who have received their research results would be required to disclose them.
Unlike GINA, some states prohibit use of genetic information in underwriting in long-term care insurance, disability insurance, and life insurance.82 Some of these restrict use of genetic information in underwriting unless it is “based on sound actuarial principles or actual or reasonably anticipated claims experience.”83 As many research results will not have been validated and have uncertain clinical implications, these laws may limit long-term care, disability, and life insurers from using such information. Several states create broader protections against unwanted access to and uses of genetic information by any person, not just employers or insurers. For example, several states have criminal laws that penalize acquiring medical information (which is defined broadly enough to include genetic information) without authorization.84 These laws may provide a mechanism for redress should the harm thought leaders identified be realized.
Many thought leaders were reassured by GINA; although they acknowledged gaps in protection, some perceived the risk of genetic discrimination in health insurance coverage as low and/or mostly theoretical.Reference Beskow85 Those who were less reassured pointed to the gaps as well as enforcement challenges, given the difficulty of people knowing — much less proving — they have been discriminated against in employment or insurance decisions based on genetic information. These thought leaders variously described GINA as aspirational, misleading, or promoting genetic exceptionalism.Reference Hammack86
SCENARIO 3: UNINTENDED RELEASE OF DATA WITH POTENTIAL FOR RE-IDENTIFICATION
Thought leaders considered unintended release of data that have potential for re-identification to be an important risk of the MAS (Table 4).Reference Beskow87 Such releases can result from an internal failure (i.e., a breach), such as a lost laptop, or from an external attack (i.e., a hack). In either case, the concern is that the multi-faceted richness of MAS data makes them susceptible to being re-identified and used in ways that harm participants. Thought leaders foresaw this risk growing over time, given advances in “big data” science and increases in the availability of data that would enable sophisticated triangulation.
The HIPAA Security Rule and Related Protections
The HIPAA Security Rule,88 which prescribes technological, physical, and organizational requirements for maintaining the security of protected health information, serves as the primary legal tool for protecting against data breaches and hacks. Like the Privacy Rule, the Security Rule does not apply to a researcher or biobank that is not a covered entity or business associate and does not apply to non-electronic information (e.g., biospecimens).89 Nevertheless, the HIPAA Security Rule is seen as establishing an industry standard for securing sensitive electronic data that non-covered entities may follow to reduce the chance of unintended release.Reference Cohen90
Thought leaders generally described technical data security measures, such as those required by the HIPAA Security Rule, as necessary but insufficient.Reference Beskow91 They noted several limitations, such as relying on humans to understand, implement, and enforce them and on mechanisms like audit trails that discover violations only after the fact.92 Moreover, widespread data sharing, which is encouraged (and often required) for scientific purposes, increases the number of times data are transmitted, people who have legitimate access, and places data are stored — with correspondingly increased opportunities for unintended access and potential harm. The likelihood of harm depends on actors' motives, for example, criminal intent versus “white hat” researchersReference Beskow93 (although participants may be concerned regardless of the actor), and the quality of technical security measures and oversight.Reference Hammack94
Thought leaders did not address specific protective measures after a breach or hack, perhaps because they are limited. The HIPAA Privacy Rule requires covered entities to notify affected individuals of a breach and offers the possibility of administrative penalties against the covered entity or business associate who experienced the breach.95 Some states have laws that allow individuals to sue for violation of their genetic privacy.96 These provide a mechanism to seek relief from the entity from which data were released as well as any third party who misappropriates, rediscloses, or misuses genomic data — particularly when the laws also establish statutory minimum damages.97 In addition, one state prohibits re-identification or attempts at re-identification of individuals based on their protected health information.98 Another has an identity theft law that specifically includes genetic information that could provide a mechanism for redress.99
As noted in Scenario 2, GINA would prevent larger employers and health insurers from discriminating against an individual based on genetic information that had been released via a breach or hack, but it would offer no protection against genetic discrimination by other types of entities or insurers.
Research Project Governance
For researchers who receive data from platforms like the MAS, data use agreements may limit the risk of unintended release by setting standards of behavior. In the event of a breach, non-covered entity researchers would not be required to engage in breach notification under HIPAA, but could be contractually obligated to notify the MAS under the data use agreement (Figure 1).100 Such agreements would be theoretically enforceable against researchers who sign them to receive MAS data — although thought leaders were skeptical whether and how enforcement would occur — and could also be used as evidence of standards of care.Reference Hammack101
SCENARIO 4: SUBPOENA, COURT ORDER, OR OTHER LEGAL DEMAND
Thought leaders anticipated government and law-enforcement interest in MAS data, leading to the potential for legal harm (e.g., surveillance, criminal tracking, immigration, national security).Reference Beskow102 In particular, data amassed by the MAS — through collection of existing information (e.g., from medical records) as well as generation of new information (e.g., survey questions, genomic analysis) — could be the subject of a legal demand or a request from law enforcement (Table 5). The latter possibility has gained prominence since ancestry DNA databases were used to solve the “Golden State Killer” and other cold cases.Reference Fuller, Swenson, Swenson, Wolf and Beskow103 Although none of these cases involved a research databank, it is not difficult to imagine law enforcement requesting access to research data, especially as rich a resource as the MAS.
In addition to law enforcement, there may be other legal interest in data from endeavors like the MAS. Most legal demands for research data have occurred in civil matters, such as personal injury (including environmental exposures) or family law cases.Reference Wolf104 As described by thought leaders, access for these kinds of purposes could lead to consequences ranging from legal jeopardy to psychological distress (Table 5).Reference Beskow and Beskow105
Certificates of Confidentiality
Certificates of Confidentiality are congressionally authorized legal tools that provide protection against compelled disclosure of sensitive, identifiable research data “in any Federal, State, or local civil, criminal, administrative, legislative, or other proceeding.”106 Historically, researchers had to apply for this protection. Although the study did not have to be federally funded to receive a Certificate, issuance was discretionary and not guaranteed. As discussed in more detail below, the 21st Century Cures Act (enacted after our thought leader interviews were conducted) changed some of these provisions.Reference Wolf and Beskow107
Thought leaders described Certificates as an extra layer of protection but no guarantee against compelled disclosure, noting uncertainty about their legal effect.Reference Hammack108 They were especially uncertain of Certificates' protections in the context of multi-site research, including whether the protections apply to data once they are shared and whether all research sites would enforce the protections.
These concerns are partially supported by our previous research. The few written court opinions involving challenges to Certificates reveal mixed success in protecting identifiable research data. People v. Newman provides the strongest support for avoiding data disclosure in a scenario like the Golden State Killer; the Newman court refused to compel disclosure of patient photographs to identify a potential murderer because of the Certificate's protections.Reference Wolf109 However, other cases have allowed disclosure of research data, including a case involving a criminal defendant seeking data to dispute the prosecution's case and another arising in the context of child abuse and neglect.110 This variation in outcomes may reflect judges' and attorneys' unfamiliarity with Certificates and the conflict between Certificates' protections and the typically liberal discovery rules in civil litigation and criminal defendants' Constitutional rights.111
The 21st Century Cures Act implemented several changes to the Certificate authorizing statute that address some of the thought leaders' concerns.112 Issuance of a Certificate is now mandatory for federally-funded research, although it remains discretionary for non-federally funded research. Thus, NIH now automatically issues Certificates for research involving human subjects that it funds. Protections also now extend to all copies of the data in perpetuity, such as MAS data that are shared widely for research. The new provisions prohibit protected data from being admitted in evidence or otherwise used in any legal proceeding. However, due to automatic issuance, researchers who have not applied for a Certificate may be unaware of the protections and, thus, may not assert them when necessary.
The disclosure prohibition of the Certificate statute does not apply to disclosures required by federal, state, and local law. NIH discussed this provision in the context of compliance with mandatory public health reporting laws,113 but this exception is not limited to these circumstances. Given the myriad of federal, state, and local laws, there are likely to be other required disclosures, such as to protect vulnerable populations, the environment, or public safety. Once data leave the research realm in accordance with this exception, it is unlikely that the Certificate's protections, including the provisions about admissibility, apply.114
HIPAA Privacy Rule and Related Protections
For research that, unlike the MAS, does not have a Certificate, the next level of protection — provided by the HIPAA Privacy Rule and state privacy laws — is thin.115 These laws generally allow disclosure to law enforcement to help identify a suspect or in response to a legal demand, without an individual's authorization.116 Although such laws do not require disclosure, it is not difficult to imagine that, absent another legal obligation (e.g., a Certificate), researchers would want to disclose information that could help identify a notorious murderer like the Golden State Killer. HIPAA permits disclosure of limited information to help identify a suspect without an individual's authorization or a legal demand.117 Under certain circumstances, it also allows covered entities to disclose protected health information in response to a civil legal demand for information, such as in a family law or a workplace injury case, without an individual's authorization.118
CONCLUSION
Our analysis of the “web” of protections created by federal and state laws shows that there are areas of strength — particularly where federal protections are further reinforced by state laws — but also gaps where neither federal nor (most) state law protect. Accordingly, researchers and IRBs need to be aware of those protections and gaps to be able to determine the impact of research design on the risks a study presents, as well as what information ought to be conveyed to participants during the consent process. Their task is complicated because there is uncertainty about which state laws apply in the context of national endeavors like the hypothetical MAS, where participants, researchers, and data may be located in different states that potentially afford substantively different protections and fill in gaps in the federal protections.119 Clearly and accurately conveying the information participants need or want to know so as not to provide false reassurance is challenging.Reference Catania and Check120
The thought leaders we interviewed were generally well aware of the protections federal laws provide and the limitations of those laws. They rarely addressed state laws, but this may reflect the primary focus of our interview guide on federal law. Additional research is needed to identify the extent to which stakeholders are aware of state laws and how they are implemented in practice, as well as the ways that stakeholders are anticipating, addressing, and resolving choice of law questions that arise in research settings. Such research could help others navigate these complex issues, as well as provide insights into crafting consent forms that take into account differences in state law.
Regardless of the apparent strength of the protections afforded by law, such protections ultimately depend on humans to understand, implement, obey, and enforce them. Thought leaders frequently commented on this reliance as the weak link. Additional research may be required to elucidate how well individuals with access to research data understand their legal obligations to protect it and how best to enforce those laws to maximize compliance. There may be educational, technological, oversight, governance, or other mechanisms to make fulfilling and enforcing these obligations easier, potentially decreasing the reliance on individuals and increasing consistency. Efforts to identify and implement effective measures and best practices are necessary to realizing the full scope of protections the laws are intended to provide.
Acknowledgments
This work was supported by a grant from the National Human Genome Research Institute (R01-HG-007733, PI: Laura M. Beskow). Professor Wolf's time was supported in part by a grant from the National Human Genome Research Institute and National Cancer Institute (R01-HG-008605, PIs: Susan M. Wolf, Ellen Wright Clayton, and Frances Lawrenz). The content is solely the responsibility of the authors and does not necessarily represent the official views of NHGRI, NCI, or NIH. The authors thank Barbara Evans and the peer reviewers for their helpful feedback.