What is Data Integrity and Why is it so Important?

A pharmaceutical company boasts about the safety of its new miracle drug. However, when the FDA inspects the offshore production facility, work is immediately halted; critical quality control data is missing. Unfortunately, this is not an isolated case of compromised data integrity. Data accuracy and consistency issues exist across all industries and can cause everything from minor inconveniences to major business problems.

Data health has become a pressing issue in this era of big data, when more pieces of information are processed and stored than ever before — and implementing measures that preserve the integrity of the data that's collected is becoming increasingly important. The first step in keeping data safe is to understand the fundamentals of data integrity and how it works.

What is Data Integrity?
The overall accuracy, completeness, and consistency of data are referred to as data integrity. Data integrity also refers to the security and safety of data in terms of regulatory compliance, such as GDPR compliance. It is upheld by a set of processes, rules, and standards that were put in place during the design phase. When data integrity is secure, the information stored in a database remains complete, accurate, and reliable regardless of how long it is stored or how frequently it is accessed. 

The importance of data integrity in preventing data loss or leakage cannot be overstated: to keep your data safe from malicious outside forces, you must first ensure that internal users handle data correctly. With appropriate data validation and error checking in place, sensitive data is far less likely to be miscategorized or stored incorrectly, either of which would expose you to risk.

Data Integrity Types 
Maintaining data integrity necessitates an understanding of the two types of data integrity: 
  1. Physical 
  2. Logical. 
Both are groups of processes and methods used to ensure data integrity in hierarchical and relational databases.


1. Physical Integrity 
  • Physical integrity is the safeguarding of data's completeness and accuracy as it is stored and retrieved. Physical integrity is jeopardized when natural disasters strike, power goes out, or hackers disrupt database functions. Human error, storage erosion, and a variety of other issues can also make obtaining accurate data difficult for data processing managers, system programmers, applications programmers, and internal auditors.

2. Logical Integrity
  • Logical integrity keeps data unchanged as it’s used in different ways in a relational database. Logical integrity protects data from human error and hackers as well, but in a much different way than physical integrity does. 
  • There are four types of logical integrity:

Entity integrity
  • Entity integrity relies on primary keys, the unique values that identify each row in a table, to ensure that no record is listed more than once and that no primary key field is null. It’s a feature of relational systems, which store data in tables that can be linked and used in a variety of ways.


Referential integrity
  • Referential integrity refers to the series of processes that make sure data is stored and used uniformly across related tables. Rules embedded into the database’s structure govern how foreign keys are used: a non-null foreign key value must match an existing key in the table it references, so that only appropriate changes, additions, or deletions of data occur. 
  • Rules may include constraints that eliminate the entry of duplicate data, guarantee that data entry is accurate, and/or disallow the entry of data that doesn’t apply.


Domain integrity
  • Domain integrity is the collection of processes that ensure the accuracy of each piece of data in a domain. In this context, a domain is a set of acceptable values that a column is allowed to contain. It can include constraints and other measures that limit the format, type, and amount of data entered.

User-defined integrity
  • User-defined integrity involves the rules and constraints created by the user to fit their particular needs. Sometimes entity, referential, and domain integrity aren’t enough to safeguard data. Often, specific business rules must be taken into account and incorporated into data integrity measures.
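
The four kinds of logical integrity can be illustrated with a small schema. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and the "a released result needs a reviewer" business rule are all hypothetical and chosen only for illustration.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create an in-memory database whose schema encodes the four rule types."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE batches (
            batch_id TEXT PRIMARY KEY NOT NULL              -- entity integrity
        );
        CREATE TABLE results (
            result_id INTEGER PRIMARY KEY,
            batch_id  TEXT NOT NULL
                      REFERENCES batches(batch_id),         -- referential integrity
            ph        REAL NOT NULL
                      CHECK (ph BETWEEN 0 AND 14),          -- domain integrity
            status    TEXT NOT NULL
                      CHECK (status IN ('pending', 'released')),
            reviewer  TEXT,
            -- user-defined (business) rule: a released result needs a reviewer
            CHECK (status <> 'released' OR reviewer IS NOT NULL)
        );
    """)
    return conn

def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, a duplicate or null batch_id, a result that points at a nonexistent batch, a pH of 15, or a released result without a reviewer are each rejected by the database itself rather than by application code.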

Data Integrity Risks
  • An assortment of factors can affect the integrity of the data stored in a database. A few examples include the following:

Human error: When individuals enter information incorrectly, duplicate or delete data, don’t follow the appropriate protocols, or make mistakes during the implementation of procedures meant to safeguard information, data integrity is put in jeopardy.

Transfer errors: When data can’t successfully transfer from one location in a database to another, a transfer error has occurred. Transfer errors happen when a piece of data is present in the destination table, but not in the source table in a relational database.

Bugs and viruses: Spyware, malware, and viruses are pieces of software that can invade a computer and alter, delete, or steal data.

Compromised hardware: Sudden computer or server crashes, and problems with how a computer or other device functions are examples of significant failures and may be indications that your hardware is compromised. Compromised hardware may render data incorrectly or incompletely, limit or eliminate access to data, or make information hard to use.

  • Risks to data integrity can be minimized, and in some cases eliminated, by doing the following: 
  1. Limiting access to data and changing permissions to restrict changes to information by unauthorized parties
  2. Validating data to make sure it’s correct both when it’s gathered and when it’s used
  3. Backing up data
  4. Using logs to keep track of when data is added, modified, or deleted
  5. Conducting regular internal audits
  6. Using error detection software
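
As one possible illustration of the validation step in the list above, the sketch below checks a single reading before it is stored. The field names (sample_id, value, recorded_at) and the allowed range of 0 to 14 are assumptions made up for this example, not requirements from any standard.

```python
from datetime import datetime

def validate_reading(reading: dict) -> list:
    """Return a list of problems; an empty list means the reading passed."""
    errors = []

    # Required identifier must be present and non-blank.
    if not str(reading.get("sample_id", "")).strip():
        errors.append("sample_id is missing")

    # Value must be numeric (bool is excluded on purpose) and within range.
    value = reading.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        errors.append("value must be a number")
    elif not 0 <= value <= 14:
        errors.append("value is outside the allowed range 0-14")

    # Timestamp must parse as ISO 8601.
    try:
        datetime.fromisoformat(str(reading.get("recorded_at", "")))
    except ValueError:
        errors.append("recorded_at is not an ISO 8601 timestamp")

    return errors
```

Returning every problem at once, instead of stopping at the first, lets the person entering the data correct the whole record in a single pass.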

General Instructions:
  • The overall goal of any data integrity measure is to ensure that data is recorded exactly as intended and that, upon later retrieval, it is the same as when it was originally recorded. 
  • Data integrity intends to prevent unintentional changes to information. There must be adequate controls to prevent the manipulation of data. 
  • Any unintended change to data as the result of a storage, retrieval, or processing operation, whether caused by malicious intent, unexpected hardware failure, or human error, is a failure of data integrity. If the change is the result of unauthorized access, it may also be a failure of data security.
  • Controls and systems must be in place to ensure that data is secure and not fraudulent, that it cannot be manipulated, and that changes that occur are easy to detect. 
  • The requirements with respect to data integrity include among others the following: 
  1. The backup data shall be exact and complete. In addition, the backup data shall be secured from alteration, inadvertent erasures, or loss. 
  2. The data shall be stored to prevent deterioration or loss. 
  3. Activities shall be documented at the time of performance (contemporaneously recorded). 
  4. Records shall be retained as original records, true copies or other accurate reproductions of the original records.


  • Records shall be complete: complete data obtained from all tests, a complete record of all data, and complete records of all tests performed, including the audit trail. All data created as part of a cGMP record must be evaluated by Quality Assurance as part of the release criteria. To exclude data from the release criteria, a valid scientific justification must be documented. 
  • Administrator rights for electronic systems shall rest with an independent authority, preferably the IT department. 
  • Throughout the data life cycle, the custodian of each document shall be determined and assessed. 
  • Appropriate and approved review procedures shall be in place to ensure the accuracy and integrity of data. 
  • All electronic systems administrators must have appropriately defined access and responsibilities with respect to data review and release. 
  • Appropriate and controlled storage and retrieval procedure shall be available for both paper and electronic records. 
  • All records shall be in a durable format that can be made readily available whenever required. 
  • There shall be adequate controls to prevent the manipulation of data. 
  • Computerized systems exchanging data electronically with other systems shall include appropriate built-in checks for the correct and secure entry, processing, and storage of data, in order to minimize the risks. 
  • Any unintended change to data as the result of a storage, retrieval, or processing operation, whether caused by malicious intent, unexpected hardware failure, unauthorized access, or human error, is a failure of data assurance and reliability and must be investigated.
  • Electronic system controls shall include the use of secure, computer-generated, time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records (with all permissible actions by users controlled by system access controls). 
  • Audit trail documentation shall be retained along with the appropriate data throughout its life cycle. 
  • Controls/procedures shall be in place, defined and protected from unauthorized access, and also tested as part of computer system validation. 
  • Linkage/cross-reference between two hard copies and/or electronic data and hard copies shall be made available and recorded on documents. 
  • Traceability of metadata, equipment used, and material used shall be made available on records. 
  • A second individual must check the data and the reportable values to confirm that procedures were followed and that the data are accurate and complete.

 

  • A (Attributable): Who performed an action and when? If a record is changed, who did it and why? Link to the source data. Check: Who did it? (Source data)
  • L (Legible): Data must be recorded permanently in a durable medium and be readable. Check: Can you read it? (Permanently recorded)
  • C (Contemporaneous): The data shall be recorded at the time the work is performed, and date/time stamps shall follow in order. Check: Was it done in “real time”?
  • O (Original): Is the information the original record or a certified true copy? Check: Is it an original or a true copy?
  • A (Accurate): No errors or editing performed without documented amendments. Check: Is it accurate?
  • Complete: All information that would be critical to recreating an event is important when trying to understand the event. The level of detail required for the information set to be considered complete would depend on the criticality of the information. A complete record of data generated electronically includes relevant metadata. Example: All data including repeat or reanalysis performed on the sample.
  • Consistent: Good Documentation Practices should be applied throughout any process without exception, including deviations that may occur during the process. Example: Consistent application of data time stamps in the expected sequence.
  • Enduring: Part of ensuring records are available is making sure they exist for the entire period during which they might be needed. This means they need to remain intact and accessible as an indelible/durable record. Example: Recorded on controlled worksheets, laboratory notebooks or electronic media.
  • Available: Records must be available for review at any time during the defined retention period, accessible in a readable format to all applicable personnel who are responsible for their review whether for routine release decisions, investigations, trending, annual reports, audits or inspections. Example: Available/accessible for review/audit for the lifetime of the record.




Data Integrity Expectation
Attributable: 
Attributable means that information is captured in the record so that it is uniquely identified as having been executed by the originator of the data (e.g., a person or computer system).

For paper-based records,
  1. The person performing the activity shall enter their initials or full signature along with the date and time of the activity (as applicable).
  2. The use of a scribe to record activity on behalf of another operator shall be considered only on an exceptional basis and shall only take place where the act of recording places the product or activity at risk, e.g., documenting line interventions by aseptic area operators. In such cases, the supervisory recording shall be contemporaneous with the task being performed and shall identify both the person performing the observed task and the person completing the record.

For electronic data records,
  1. Individual Login ID shall be assigned.
  2. Authorizations shall be defined that link the user to actions that create, modify, or delete data.
  3. An audit trail shall capture user identification (ID), date/time stamps, and the action performed.
  • Do not use stored digital images of a person's handwritten signature to sign a document.
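
The electronic-record requirements above (individual IDs, defined authorizations, and an audit trail that captures who did what and when) can be sketched in a few lines. This is an illustration only, not a validated system: the roles, the PERMISSIONS mapping, and the class names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Tuple

# Hypothetical roles and the actions each may perform (an assumption for this sketch).
PERMISSIONS = {
    "analyst": {"create", "modify"},
    "supervisor": {"create", "modify", "delete"},
}

@dataclass(frozen=True)  # frozen: an entry cannot be edited after it is written
class AuditEntry:
    user_id: str
    action: str
    record_id: str
    reason: str
    timestamp: datetime

class AuditTrail:
    """Append-only list of audit entries; callers get copies, never the list itself."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, user_id: str, role: str, action: str,
               record_id: str, reason: str = "") -> AuditEntry:
        # Authorization links the user's role to the actions they may take.
        if action not in PERMISSIONS.get(role, set()):
            raise PermissionError(f"{role} may not {action} records")
        # Changes must say why they were made.
        if action in {"modify", "delete"} and not reason.strip():
            raise ValueError("a reason is required when a record is changed")
        # The timestamp is generated by the system, not supplied by the user.
        entry = AuditEntry(user_id, action, record_id, reason,
                           datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def entries(self) -> Tuple[AuditEntry, ...]:
        return tuple(self._entries)
```

A real system would also need to persist the trail in storage that ordinary users cannot write to; the in-memory list here only shows the shape of the data.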

Legible: 
The terms legible, traceable, and permanent refer to the requirements that data are readable and understandable and allow a clear picture of the sequencing of steps or events in the record, so that all GxP activities conducted can be fully reconstructed by people reviewing these records at any point during the defined record retention period.

For paper records
  1. Good documentation practices for recording of data and results shall be followed as per SOP for Good Documentation Practice.
  2. Controlled issuance and archival shall be established for logbooks/bound books, formats, and procedures. All logbooks must be in place and controlled, have numbered pages, and provide adequate traceability.

For electronic records
  1. When electronic records are archived, the archiving process shall be controlled to preserve the integrity of the records.
  2. System access (admin) permissions shall be granted only to personnel with system maintenance roles (e.g., IT and engineering) who are fully independent of the personnel responsible for the content of the records (e.g., laboratory and production analysts and management).
  3. Electronic data shall be saved at the time of recorded activity and before proceeding to the next step of the sequence of events.
  4. Audit trails shall be secured, time-stamped, and attributable to individual activities. Data overwriting shall not be allowed. 
  5. The backup of electronic data shall be validated for disaster recovery. 
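
One simple way to check that a backup is exact and complete is to compare a cryptographic digest of the copy with a digest of the original. The sketch below uses SHA-256 from Python's standard library; the function names and the idea of copying a single file into a backup directory are assumptions for illustration, and a full disaster-recovery validation would also test restoring the data.

```python
import hashlib
import shutil
from pathlib import Path

def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and confirm the copy is byte-for-byte identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps where possible
    if file_digest(source) != file_digest(target):
        raise IOError(f"backup of {source} does not match the original")
    return target
```

Storing the digest alongside the backup would allow the same comparison to be repeated later to detect deterioration or alteration of the stored copy.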

Contemporaneous
Contemporaneous data are data recorded at the time they are generated or observed. This documentation shall serve as an accurate attestation of what was done, or what was decided and why, i.e. what influenced the decision at that time.


For paper records
  1. Actions shall be recorded contemporaneously in paper records, with data and information entered at the time of the activity directly in official controlled documents (e.g., logbooks, batch records, analytical worksheets). 
  2. Documents shall be appropriately designed to ensure that manual activities are recorded as they occur. 
  3. The date and time of activities shall be recorded using synchronized time sources (facility and computerized system clocks).

For electronic records
  1. Actions shall be recorded contemporaneously in electronic records: data held in temporary memory shall be committed to durable media or permanent storage upon completion of each step or event and before proceeding to the next, so that each step or event is permanently recorded at the time it is conducted. 
  2. Electronic data shall be secured with time/date stamps that cannot be altered by any user/personnel. 
  3. Ensure time/date stamps are synchronized across the GxP operations. 
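
A reviewer can check that time stamps follow in the expected order with a short script. The sketch below assumes the stamps are ISO 8601 strings listed in the order the steps were recorded; the function name is hypothetical.

```python
from datetime import datetime

def out_of_sequence(timestamps: list) -> list:
    """Return the indexes of entries stamped earlier than the entry before them."""
    parsed = [datetime.fromisoformat(stamp) for stamp in timestamps]
    return [i for i in range(1, len(parsed)) if parsed[i] < parsed[i - 1]]
```

An empty result means every step was stamped no earlier than the one before it. A non-empty result is only a prompt for investigation, since unsynchronized clocks can produce the same symptom as a backdated entry.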

Original
Original data include the first or source capture of data or information and all subsequent data required to fully reconstruct the conduct of the GxP activity. The GxP requirements for original data include the following: 
  • Original data shall be reviewed. Verification checks must be established to ensure that the people performing/checking the action were present at that time. 
  • Original data and/or true and verified copies that preserve the content and meaning of the original data shall be retained. 

For paper records
  • Controls shall ensure that personnel conduct an adequate review and approval of original paper records, including those used for the contemporaneous capture of information. 
  • Data review procedures shall describe the review of relevant metadata; the review shall be justified with evidence and made available when required. 
  • Data corrections or clarifications shall be done as per SOP, providing visibility of the original record and traceability of the corrections made. 
  • The original paper record shall always be reviewed by a second competent person. 
  • Controlled and secure storage areas including archives shall be provided for the storage of paper data. 
  • Handling and retention of paper records shall be done as per the SOP for Document Distribution, Control, Storage and Disposal.
  • Records shall be retained as original records, true copies or other accurate reproductions of the original records.
  • Records shall be indexed to permit ready retrieval.


For electronic records
  • Controls shall ensure that personnel conduct an adequate review of original electronic records, including source data.
  • Any changes in electronic data or metadata shall be documented in audit trails or history fields, justified, and available.
  • Audit trail review shall be part of the routine data review/approval process.
  • Data corrections or clarifications shall provide visibility of the original record and traceability of the corrections made through audit trails or history fields. 
  • Controls/procedures shall be in place, defined and protected from unauthorized access, and also tested as part of computer system validation. 
  • The original electronic record shall always be reviewed by a second competent person. 
  • Data shall be retained in a non-editable format (for example, PDF) to maintain the integrity of the original data.
  • Archived records shall be locked so that they cannot be altered or deleted without detection and an audit trail entry.
  • Electronic data shall be automatically saved permanently after each separate entry.
  • Back-up copies of original electronic records shall be stored in another location as a safeguard in case of disaster.
  • The archival and backup process shall be validated. 
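
The requirement that records cannot be altered without detection, while corrections stay traceable to the original, can be illustrated with a hash chain. This technique is not named in the requirements above; it is one assumed approach among several, in which each entry's hash covers the previous entry's hash so that editing any earlier entry breaks every later link. The function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()

def append_entry(chain: list, payload: dict) -> None:
    """Add a record (or a correction) to the end of the chain; never edit in place."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": _entry_hash(prev_hash, payload),
    })

def chain_is_intact(chain: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry fails the check."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A correction is appended as a new entry that refers to the original, so the original value remains visible and the change is traceable. The chain detects tampering but does not prevent it, so it complements access controls and does not replace them.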

Accurate
  • Accurate means that data are correct, truthful, complete, valid and reliable. For paper and electronic records, adequate procedures, processes, systems and controls shall be in place to ensure the accuracy of data.
  • When the activity is time-critical, printed records shall display the time/date stamp.
  • Activity-based doer and checker concepts shall be in place to ensure that activities are done accurately.
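
The doer and checker concept can be enforced in software by refusing a check from the person who performed the activity. The sketch below is a minimal illustration; the field names performed_by and checked_by are assumptions.

```python
def second_person_check(record: dict, checker_id: str) -> dict:
    """Return a copy of the record marked as checked by a second person."""
    doer_id = record["performed_by"]
    # The checker must be named and must not be the person who did the work.
    if not checker_id or checker_id == doer_id:
        raise ValueError("the checker must be a different person from the doer")
    # Return a new dict so the original record is left unchanged.
    return {**record, "checked_by": checker_id}
```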

Data Integrity and Security Assessment
  • Data integrity assessment audits can be performed along with scheduled internal quality audits. A separate or additional data integrity audit of any function or department may be conducted by site QA or Corporate QA if an observation related to data integrity is made during a regulatory inspection, customer audit, or periodic self-inspection, or during routine operations. (Refer: SOP for Internal Audit)
  • Data assessment and review shall be performed periodically as per the “Checklist for Data Integrity assessment” in accordance with the data integrity requirements. 
  • The assessment shall not be limited to the checklist and shall identify improper practices, breaches of data integrity, and potential sources of a probable breach of data integrity. 
  • Identified breaches of data integrity shall be assessed for potential impact on the product or process. 
  • Any confirmed data integrity issue shall be documented and investigated as a deviation as per SOP for Handling of Deviation.
  • Identified sources of a breach of data integrity shall be eliminated through the appropriate procedure. A breach of data integrity shall be rectified immediately, followed by an assessment of the risk related to the identified issue. 
  • The investigation of a deviation for inaccuracies in data records and reporting should include, but is not limited to:
  1. Interviews of current and former employees to identify the nature, scope, and root cause of data inaccuracies 
  2. Determination of the scope, extent, and timeframe of the incident 
  3. A comprehensive retrospective evaluation of the nature of the testing and manufacturing data integrity deficiencies, and the potential root cause(s). 
  4. A risk assessment of the potential effects of the observed failures on the quality of the batches involved.
