Bala Nair is the Enterprise Architect at CRIO. With over 20 years of experience in the technology industry, Bala has shaped several cutting-edge engineering developments, from AOL’s Instant Messenger to software that delivered over 50% of video-on-demand content to U.S. households. At CRIO, Bala partners with other strategic owners to manage CRIO’s back-end systems. Bala holds a B.S. in Physics from the University of Massachusetts.
As the research site industry matures, larger networks are incorporating analytics capabilities to review and act upon the wide variety of data flowing through their systems. Many have set up dedicated analytics teams, along with their own Business Intelligence tools. As a result, many of our clients ask how they can access their data. Often their first question is “Can you give us an API?”, asked before they have specified what they actually need.
At CRIO, we have an open Recruiting API that our clients can use to send and retrieve data. It’s part of a broader strategy to create a full Open API approach. Additionally, we offer clients another option: direct data access. Each approach has its pros and cons, but not everyone is familiar with the difference. In this blog post, we’ll explain the key features of direct data access, how it differs from an API, and why direct data access is much better suited for advanced analytics.
What is Direct Data Access?
To provide Direct Data Access, CRIO uses BigQuery, an offering from Google. In technical terms, it’s a planet-scale, serverless enterprise data warehouse, backed by the Google Cloud network. In lay terms, it’s a separate reporting database that’s updated in real time and is optimized for fast querying.
The Differences between Direct Data Access and an Open API
The primary differences between the two approaches center around the intent of the data provider (in this case, CRIO) and the data access methods. By design, an API is narrowly scoped and supports bi-directional data movement between the provider and the client. APIs are meant to be “opinionated” interfaces: they define specific ways to approach a business process and limit access to data in a manner prescribed by the provider. A database like BigQuery, on the other hand, is designed to hold the raw data within a defined schema, and it lets the client decide how to access the data, and on what terms. Using industry-standard SQL queries, the client can either query the data directly (i.e., without copying it over) or bulk transfer it to their own database, and both routes are highly efficient. Direct data access is meant to be read-only. Thus, it’s perfect for clients that want to build powerful analytics and have very customized requirements about what data they wish to access, and how often.
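To make the contrast concrete, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for a reporting database, so it runs anywhere; it is not BigQuery and not CRIO's schema. The table `subject_visits`, its columns, and the functions `api_completed_visits` and `run_query` are all hypothetical names invented for illustration.

```python
import sqlite3

# Stand-in reporting database (assumption: hypothetical table and columns,
# not CRIO's actual schema; SQLite is used only so the sketch is runnable).
conn = sqlite3.connect(":memory:")
with conn:
    conn.execute("CREATE TABLE subject_visits (study TEXT, site TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO subject_visits VALUES (?, ?, ?)",
        [
            ("STUDY-A", "Boston", "completed"),
            ("STUDY-A", "Boston", "missed"),
            ("STUDY-A", "Austin", "completed"),
            ("STUDY-B", "Austin", "completed"),
        ],
    )

# Mimic read-only access: SQLite rejects writes once query_only is on.
conn.execute("PRAGMA query_only = ON")


def api_completed_visits(study):
    """API-style access: one fixed question, chosen by the provider."""
    row = conn.execute(
        "SELECT COUNT(*) FROM subject_visits "
        "WHERE study = ? AND status = 'completed'",
        (study,),
    ).fetchone()
    return row[0]


def run_query(sql, params=()):
    """Direct-access style: the client writes whatever SQL it needs."""
    return conn.execute(sql, params).fetchall()


# A question the hypothetical API never anticipated: completion by site.
completion_by_site = run_query(
    """
    SELECT site,
           SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed,
           COUNT(*) AS total
    FROM subject_visits
    GROUP BY site
    ORDER BY site
    """
)

print(api_completed_visits("STUDY-A"))
print(completion_by_site)
```

The API-style function can only answer the one question it was built for, while the direct query can be reshaped freely as reporting needs change. Against a real warehouse the SQL would be sent through that warehouse's own client library rather than sqlite3.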
The table below shows the differences. This table is written to discuss the differences from a global perspective, and is accurate with respect to CRIO’s instances as well.
Both Direct Data Access and an API clearly have their roles and purposes. At CRIO, we give both options to our clients. Many use our Recruiting API so they can utilize their own CRM systems to manage patient recruitment and scheduling. This API sends leads into CRIO for clinical teams to enroll subjects into a trial and collects data to send back. Others use our Direct Data Access to support customized reporting and analytics programs. The freedom and flexibility offered by Direct Data Access enable much more powerful reporting. Clients can keep iterating on and refining their reports as their needs evolve and as they become more knowledgeable about the ways they can read data and identify critical signals.