How is data linked for research?
Creation of Linkage IDs
To allow data about the same person to be linked across different data collections, data linkers within a Data Linkage Unit (DLU) create unique Linkage IDs (each a random string of numbers and letters).
To do this, the data custodians provide the DLU with the personal information portion and the local Record ID of each record in their data collections. The data custodian requires approval from a Human Research Ethics Committee (HREC) before providing the data. The remaining portion of each record, which contains the health, education or other data (known as content information), stays with the data custodians, so the data linkers never have access to it.
Upon receiving the personal information and Record IDs at the DLU, the data linkers assign a Linkage ID to each person. These Linkage IDs are stored on secure computer servers and can only be accessed by authorised DLU staff.
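As a rough illustration of this step, the sketch below generates random alphanumeric Linkage IDs and keeps a table from each (custodian, Record ID) pair to its Linkage ID. The ID length, the alphabet, the function names and the sample values are all assumptions made for the example, not details of any DLU's system, and the records are assumed to have been grouped by person already.

```python
import secrets
import string

# Assumed format: 12 characters drawn from uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits
ID_LENGTH = 12


def new_linkage_id(in_use):
    """Return a random ID that is not already in the `in_use` set."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))
        if candidate not in in_use:
            return candidate


def assign_linkage_ids(people):
    """Give each person one Linkage ID shared by all of that person's records.

    `people` is a list of persons; each person is a list of
    (custodian, record_id) pairs already judged to belong together.
    Returns a dict mapping (custodian, record_id) to a Linkage ID.
    """
    link_table = {}
    in_use = set()
    for records in people:
        linkage_id = new_linkage_id(in_use)
        in_use.add(linkage_id)
        for key in records:
            link_table[key] = linkage_id
    return link_table


# Hypothetical input: two people, the first with records in two collections.
people = [
    [("hospital", "H-001"), ("education", "E-417")],
    [("hospital", "H-002")],
]
link_table = assign_linkage_ids(people)
```

Because the IDs are random, they carry no information about the person; the only way to relate a Linkage ID back to a record is through the table held at the DLU.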
Data custodians provide regular updates of the personal information and Record IDs to the data linkers. The data linkers then use a statistical probability method to check the new data against the existing personal information and see whether these records already have Linkage IDs.
For each record that is determined to be for a new person in the system, the data linkers create a new Linkage ID, which is then added to the DLU's Linkage ID collection.
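The text does not name the statistical method used. One widely used family is probabilistic record linkage in the Fellegi-Sunter style, where each compared field adds an agreement or disagreement weight to a total score. The sketch below is a simplified illustration of that idea only: the field names, the m and u probabilities, the threshold and the function names are invented for the example.

```python
import math
import secrets

# Hypothetical per-field probabilities (m, u):
#   m = chance the field agrees when two records are the same person
#   u = chance the field agrees when they are different people
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "given_name": (0.90, 0.05),
    "dob": (0.98, 0.003),
    "postcode": (0.85, 0.02),
}
MATCH_THRESHOLD = 8.0  # assumed cut-off on the total weight


def match_weight(a, b):
    """Sum log2 agreement/disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if a.get(field) is not None and a.get(field) == b.get(field):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def link_record(record, known_people):
    """Return the Linkage ID for `record`, creating one if no match is found.

    `known_people` maps Linkage ID -> personal information dict and is
    updated in place when a new person is added.
    """
    best_id, best_score = None, float("-inf")
    for linkage_id, person in known_people.items():
        score = match_weight(record, person)
        if score > best_score:
            best_id, best_score = linkage_id, score
    if best_id is not None and best_score >= MATCH_THRESHOLD:
        return best_id
    new_id = secrets.token_hex(8)  # new person: issue a fresh random ID
    known_people[new_id] = dict(record)
    return new_id


# Hypothetical data: one known person, then two incoming records.
known_people = {
    "A1": {"surname": "Smith", "given_name": "Jane",
           "dob": "1980-01-01", "postcode": "6000"},
}
moved = {"surname": "Smith", "given_name": "Jane",
         "dob": "1980-01-01", "postcode": "6150"}
stranger = {"surname": "Nguyen", "given_name": "Minh",
            "dob": "1975-06-30", "postcode": "6008"}

moved_id = link_record(moved, known_people)
stranger_id = link_record(stranger, known_people)
```

In this sketch a changed postcode alone does not prevent a match, because agreement on the other three fields outweighs it. Real systems typically also handle spelling variants, missing values and clerical review of borderline scores, none of which is shown here.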
Provision of linkable data to researchers
Researchers wishing to access the data that data custodians hold must undergo a stringent application process. This requires approval from each data custodian and from an HREC, which certifies that the study is valid and in the public interest.
Once a project is approved, the data custodians and the DLU staff work together to determine which records the study requires, so that the researcher receives only the minimum amount of information needed. The data linkers then use the Linkage IDs to create Project Linkage IDs that are specific to the approved study, and send these Project Linkage IDs to the data custodians along with the Record IDs of the required records.
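How the Project Linkage IDs are derived is not described here. One possible approach, shown purely as an assumption, is a keyed hash (HMAC) of the Linkage ID with a secret key generated per project, so that the same person gets the same ID within a project but unrelated IDs across projects. The function names, the 16-character truncation and the sample table are hypothetical.

```python
import hashlib
import hmac
import secrets


def project_linkage_id(linkage_id, project_key):
    """Derive a project-specific ID from a Linkage ID and a secret key."""
    digest = hmac.new(project_key, linkage_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_custodian_lists(link_table, required, project_key):
    """Build, per custodian, the (Record ID, Project Linkage ID) pairs to send.

    `link_table` maps (custodian, record_id) -> Linkage ID.
    `required` is the set of (custodian, record_id) pairs the study needs.
    """
    lists = {}
    for custodian, record_id in sorted(required):
        linkage_id = link_table[(custodian, record_id)]
        pair = (record_id, project_linkage_id(linkage_id, project_key))
        lists.setdefault(custodian, []).append(pair)
    return lists


# Hypothetical link table held at the DLU.
link_table = {
    ("hospital", "H-001"): "K7Q2M9ZT41XB",
    ("education", "E-417"): "K7Q2M9ZT41XB",  # same person as H-001
    ("hospital", "H-002"): "P03LDW8RYA6C",
}
required = {("hospital", "H-001"), ("education", "E-417")}

key_a = secrets.token_bytes(32)  # secret key for one approved study
key_b = secrets.token_bytes(32)  # secret key for a different study
lists_a = build_custodian_lists(link_table, required, key_a)
lists_b = build_custodian_lists(link_table, required, key_b)
```

Each custodian receives only its own list, and the Linkage IDs themselves never leave the DLU. A randomly generated lookup table per project would serve the same purpose as the keyed hash.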
Using the Record IDs, the data custodians extract the required records from their collections and replace the personal information of each record with its matched Project Linkage ID. Each data custodian then provides the researcher with the content information of each record and its corresponding Project Linkage ID.
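The custodian's side of this exchange can be sketched as follows. The record layout, the set of fields treated as personal information and the function name are assumptions made for the example.

```python
# Assumed names of the fields that hold personal information.
PERSONAL_FIELDS = {"name", "dob", "address"}


def prepare_extract(collection, id_pairs):
    """Return content-only rows tagged with their Project Linkage IDs.

    `collection` maps Record ID -> full record (personal + content fields).
    `id_pairs` is the list of (Record ID, Project Linkage ID) pairs
    received from the DLU.
    """
    extract = []
    for record_id, project_id in id_pairs:
        record = collection[record_id]
        # Keep only the content fields, dropping the personal information.
        row = {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}
        row["project_linkage_id"] = project_id
        extract.append(row)
    return extract


# Hypothetical hospital collection keyed by local Record ID.
hospital = {
    "H-001": {"name": "Jane Smith", "dob": "1980-01-01",
              "address": "1 Example St", "diagnosis": "J45", "los_days": 2},
    "H-002": {"name": "Minh Nguyen", "dob": "1975-06-30",
              "address": "9 Sample Rd", "diagnosis": "I10", "los_days": 1},
}
id_pairs = [("H-001", "3f9a1c0b7e2d4a68")]

extract = prepare_extract(hospital, id_pairs)
```

Only the records named in the list are extracted, and the rows handed to the researcher contain neither the personal fields nor the local Record ID.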
Using the Project Linkage ID, the researcher can determine which records from different datasets belong to the same person and create a merged dataset for their analysis, all without having access to the personal information.
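On the researcher's side, merging amounts to grouping rows by Project Linkage ID. The sketch below assumes each custodian's extract is a list of dictionaries with a `project_linkage_id` key; that layout, the dataset names and the sample values are hypothetical.

```python
from collections import defaultdict


def merge_by_person(extracts):
    """Group content rows from several datasets by Project Linkage ID.

    `extracts` maps dataset name -> list of rows, each row a dict that
    includes a "project_linkage_id" key.
    Returns {project_linkage_id: {dataset name: [content rows]}}.
    """
    merged = defaultdict(lambda: defaultdict(list))
    for dataset, rows in extracts.items():
        for row in rows:
            content = {k: v for k, v in row.items() if k != "project_linkage_id"}
            merged[row["project_linkage_id"]][dataset].append(content)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {pid: dict(by_dataset) for pid, by_dataset in merged.items()}


# Hypothetical extracts received from two custodians.
extracts = {
    "hospital": [
        {"project_linkage_id": "3f9a1c0b7e2d4a68", "diagnosis": "J45"},
        {"project_linkage_id": "b81e55d2c0a9f417", "diagnosis": "I10"},
    ],
    "education": [
        {"project_linkage_id": "3f9a1c0b7e2d4a68", "year_level": 12},
    ],
}

merged = merge_by_person(extracts)
```

Here the person with ID `3f9a1c0b7e2d4a68` has rows from both datasets, while the other person appears only in the hospital data. The researcher never needs a name or date of birth to establish that.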
Access to and use of research datasets are strictly controlled and managed.