Data linkage, costs and timeframes
Data is linked via a four-step process. Read “How is data linked?” to understand the process.
There are three groups involved in data linkage and access processes:
- Data custodians are the people who look after the data collections. Data custodians work within an organisation or agency (such as a government department) and are responsible for the secure collection, use and disclosure of data. Data custodians collect and store personal information (e.g. name, address, date of birth) and also content information (e.g. health information such as diagnosis and treatment details).
- Data linkers are the people who create Linkage IDs which allow data to be linked within and between data collections. Data linkers work in a Data Linkage Unit that is either within, or associated with, a government agency.
- Researchers are the people who use the data for the purpose of analysis and research. This is only possible after an extensive application process and approval by all relevant data custodians and a Human Research Ethics Committee (HREC).
PHRN is addressing Australia’s grand challenges in health services, new therapeutics, healthy ageing, social and environmental impacts on health, and prevention of emerging diseases.
Our clients use linked data to:
- Investigate the distribution, origin, associated conditions and outcomes of disease
- Evaluate policies and services
- Assess the health and wellbeing of Australians
- Better identify issues of population health importance, and plan services and interventions to address these issues
- Monitor and evaluate the effectiveness of vaccines, services, treatments and interventions
There is no single answer to how long it takes to get linked data, but some known factors influence project timeframes. These factors also affect costs. The following table summarises the known factors that affect project timeframes and costs.
In Australia, all research projects using linked data must first be submitted to the data linkage unit to be assessed for technical feasibility. Formal approval must then be obtained from:
- the data custodian responsible for each data set; and
- all relevant Human Research Ethics Committees (HRECs).
Additional approvals may be required, depending on the data collection(s) involved and/or the jurisdiction. The approvals required for each data collection are listed in the PHRN Metadata Platform.
The process of obtaining approvals and the time involved will vary between data linkage units, data custodians and HRECs.
Data linkage unit charges usually apply for client services, linkage, geocoding or extraction requests that fall outside core activities. Each data linkage unit within the PHRN has its own independent fee structure. Funds derived from these charges assist in supporting staff and equipment used for the ongoing development of the linkage system.
The cost of linkage varies for each project. The fee is based on a number of factors, commonly including:
Data quality
- Quality of person identifiers in data collections being linked
- Format of data provided for linkage
Size of request and amount of linkage required
- Number of records that require linkage for a project
- Time period requested
Complexity of the request
- Number of jurisdictions involved
- Number of data collections involved
- Types of datasets requested
- Study design
See costs and timeframes summary for further detail on the above factors.
If you would like a quote for a grant or to scope your single jurisdiction project, please contact the Client Services Officer of the Data Linkage Unit(s) which covers the jurisdiction(s) that your data will come from. For a cross- or multi-jurisdictional linked data project, a quote can be requested via the PHRN Online Application System.
Data storage and access charges for using SURE were introduced on 1 January 2013. The fees support the operational costs of maintaining SURE and providing support services to users. Estimates of SURE access charges are calculated based on project duration, user numbers and computing resources required by users for the project. Fees indicated are exclusive of GST.
If you are preparing a grant application for a project, please send a short email to SURE with the project title, number of investigators on the research team, the expected commencement date and the expected duration of the project. The SURE Team will be happy to provide an estimate of access charges, even if your study is only in the early stages of planning. However, please note that estimates of access charges may be subject to change depending on project commencement date and finalisation of project set-up requirements.
Other secure environments
Single jurisdiction projects may be required to store and analyse linked data in a secure virtual environment nominated by the jurisdiction. There may be fees associated with data access and storage. For more information, please contact the relevant Client Services Officer of the Data Linkage Unit(s).
See also the PHRN Access and Pricing Policy.
Applying for linked data
Who can apply for linked data depends on the data linkage unit and jurisdiction. Generally, data must be stored and accessed from within Australia by a researcher employed by or affiliated with an Australian institution or organisation.
Thousands of researchers, health professionals, government policy-makers and planners have used the PHRN’s resource of over 300 linked datasets to answer important research and evaluation questions. Our clients represent the following sectors:
- Universities and medical research institutes
- State/territory and Commonwealth government departments and agencies
- Non-government organisations
- Industry
Researchers proposing to use linked data must provide a detailed data security plan to the data linkage unit, the relevant Human Research Ethics Committees and the data custodian of each data collection from which they are requesting data. The plan must outline how personal information, and the privacy of the people whose information is being used, will be protected.
Personal information is defined in s6(1) of the Privacy Act 1988 (Cth) as:
“…information or an opinion (including information or an opinion forming part of a database), whether true or not, and whether recorded in a material form or not, about an individual whose identity is apparent, or can be reasonably ascertained, from the information or opinion”.
Researchers should incorporate into their data security plan any standards, guidelines and policies on security specified by their organisations or the data custodians from which they are requesting data. Researchers will also need to be aware of relevant legislation that imposes obligations in relation to personal information security. Researchers are required to develop and implement a data security plan to eliminate, minimise or manage risks associated with data linkage projects.
For a quick overview, please see the Creating a strong application infographic, check out tips for a successful application, or read more about the application process for in-depth guidance.
Data analysis and publication
All jurisdictions recommend, and some require, the use of a secure environment for data storage, access and analysis.
Secure environments have two key features:
- Curated gateway: a checkpoint for data and files moving in and out of the environment
- Contained environment: project data, tools and software are kept in a single space, separated by project
Secure environments
As a researcher, you will receive only the Project Person Numbers (PPN), Project Event Numbers (PPE) and their associated content variables, as listed in your approved application.
The amount of data researchers receive and how it’s structured depends on the number of data files and variables requested, the temporal scope, and the size of the requested cohort.
Depending on the data linkage unit involved, the data may be provided to the researcher already merged. In most cases the researcher will receive the data as multiple files and will be required to merge the data themselves. A separate file is usually provided for each data collection in each year. For example, a researcher applying for data from the birth registry, perinatal data collection and admitted patient data collection, for the date range 2000-2009, would typically receive 30 files in total.
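The file count in the example above follows from one file per data collection per year: 3 collections over the 10 years from 2000 to 2009 inclusive gives 30 files. The sketch below illustrates that arithmetic and one simple way a researcher might group records from several files under each Project Person Number. It is only an illustration under stated assumptions: the column name `ppn`, the collection labels and the sample rows are hypothetical, and the actual file layout depends on the data linkage unit.

```python
import csv
import io
from collections import defaultdict


def expected_file_count(collections, first_year, last_year):
    """One file per data collection per year, counting both end years."""
    return len(collections) * (last_year - first_year + 1)


def merge_by_ppn(files):
    """Group rows from several tab-delimited files under each PPN.

    `files` maps a collection label to an open text handle whose header row
    contains a 'ppn' column (an assumed column name, not a PHRN standard).
    """
    merged = defaultdict(lambda: defaultdict(list))
    for label, handle in files.items():
        for row in csv.DictReader(handle, delimiter="\t"):
            ppn = row.pop("ppn")  # remaining keys are content variables
            merged[ppn][label].append(row)
    return merged


# Hypothetical collection labels matching the example in the text.
collections = ["birth_registry", "perinatal", "admitted_patient"]
n_files = expected_file_count(collections, 2000, 2009)

# Tiny in-memory stand-ins for delivered files (made-up rows).
births = io.StringIO("ppn\tbirth_year\n001\t2001\n002\t2003\n")
admissions = io.StringIO("ppn\tadmit_year\n001\t2005\n001\t2007\n")
linked = merge_by_ppn({"birth_registry": births, "admitted_patient": admissions})
```

Grouping by PPN first, rather than joining files pairwise, keeps one-to-many relationships (one person, several admissions) intact until the analysis decides how to handle them.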
The data will be delivered in a variety of formats, depending on the data linkage unit and data collection involved. Some data linkage units may deliver the data in a standardised format that can be easily read into any statistical analysis software, e.g. tab-delimited text files. The format and standards relating to core datasets (those that are routinely linked) can be accessed here. In addition to the data files, researchers may also be given metadata for each corresponding data collection, including a data dictionary. The data dictionary provides coding information to assist researchers in interpreting the data.
Most data linkage units provide reports to researchers about data quality and the linkage processes used to produce their linked data.
These reports differ by data linkage unit, but typically describe the following:
- Datasets that were used in the linkage
- The identifiers used for linkage
- The quality of the data (e.g. proportion of missing values)
- How the data has been edited and standardised before linkage
- The type of linkage performed/linkage method
- The level of matching achieved
- A statement about the level of matching achieved and some assessment of the level of error (type I and type II) in the output
These reports are important to allow researchers to consider and assess the factors that may have influenced the linked data quality, and develop strategies that can mitigate the impact of linkage error in their analyses.
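As a rough illustration of how the error assessment in such a report might be summarised, the sketch below computes two rates from counts of links. It assumes the common reading in which a type I error is a false match (two records linked that belong to different people) and a type II error is a missed match (two records for the same person left unlinked). The function name, the rate definitions and the counts are hypothetical, and individual data linkage units may define and report these measures differently.

```python
def linkage_error_rates(true_links, false_links, missed_links):
    """Summarise linkage error from counts of links.

    true_links   : links made that are correct
    false_links  : links made that are wrong (assumed type I error)
    missed_links : correct links that were not made (assumed type II error)
    """
    links_made = true_links + false_links
    links_that_exist = true_links + missed_links
    if links_made == 0 or links_that_exist == 0:
        raise ValueError("counts must include at least one link")
    return {
        # Share of the links made that are wrong.
        "false_match_rate": false_links / links_made,
        # Share of the links that should exist but were not made.
        "missed_match_rate": missed_links / links_that_exist,
    }


# Made-up counts for illustration only.
rates = linkage_error_rates(true_links=950, false_links=10, missed_links=50)
```

The two rates usually trade off against each other: stricter matching rules tend to lower the false match rate and raise the missed match rate, which is one reason to check both before choosing a mitigation strategy.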
For more information about how to assess the quality of data linkage, you may like to undertake Researcher Training or contact the relevant data linkage unit Client Services team.
Prior to publishing research findings from linked data research, it is important to ensure that all the relevant people and organisations are aware of any publications and have had the opportunity to review them. See more information.
The names of the datasets used in your research must be listed in your publication. Some datasets have DOIs. These can be found on the Metadata Platform.
Data security
The PHRN has been developing a robust data information security program designed to offer the highest level of protection to data involved in linkage and research.
Information security controls used by the PHRN’s Data Linkage Units (DLUs) can be divided into four key categories:
- Physical security – DLUs must ensure strict security barriers and entry controls are in place at all locations where data records are stored
- IT security – stand-alone networks, firewalls, password protection, antivirus software and encryption for data transfer must be standard practice at all PHRN DLUs
- Personnel security – access to data is limited to those personnel whose work responsibilities specifically require it
- Administrative security – extensive work has been completed on a range of approved written policies, procedures, standards, guidelines, security training and risk assessments that help guide the ongoing security management of all PHRN DLUs. External reviews have also been completed.
A number of security measures are also undertaken to ensure the data remains safe once it has been provided to approved researchers. These include:
- Approval of security plans from Human Research Ethics Committees and data custodians
- Legally binding contracts and confidentiality agreements with data custodians
- Successful completion of compulsory online researcher training covering privacy and security
- Receipt of data from custodians in encrypted format, or access via a dedicated secure access environment
Secure data access
Data can be stored, accessed and analysed via a secure environment.
Consequences of Breaching Researcher Responsibilities
The consequences of breaching your responsibilities as a researcher will depend on the nature of the breach, the applicable law and your contractual obligations. Action may be taken in respect of one or all of the following: your research conduct; ethical conduct; and legal responsibilities. The types of action which may be taken include:
Australian Code for the Responsible Conduct of Research
- Disciplinary issues are matters for the institution. The Guide to Managing and Investigating Potential Breaches of the Australian Code for the Responsible Conduct of Research (2018) provides a framework to assist institutions to manage, investigate and resolve complaints.
Legislation
- The consequences depend on the applicable law, but in some cases a breach can result in criminal conviction, jail time or substantial fines.
Other contractual arrangements
- The consequences depend on the specific contractual arrangements, but there can be financial penalties and/or loss of future access to data for the individual and/or their institution.
Miscellaneous
The PHRN was implemented through the National Collaborative Research Infrastructure Strategy framework, an initiative of the Australian Government. Commonwealth, state and territory government agencies and academic institutions make significant cash and in-kind contributions to PHRN activities.