iTestResults

Test Result Storage: PostgreSQL, ClickHouse, BigQuery

Most teams treat test results like a checkbox: green is good, red is bad, ship or block. The interesting signal lives in everything that happens between those two states — runtime variance, retry counts, the same five tests showing up in every postmortem. That signal is where engineering decisions actually get made. Understanding this signal is crucial for improving software quality and development velocity.

As organizations scale, the complexity of systems increases, creating challenges in storing and interpreting test results effectively. This article addresses how to efficiently store test results using PostgreSQL, ClickHouse, and BigQuery. By leveraging these tools, teams can extract insights beyond binary pass/fail metrics, enabling more informed decision-making.

By the end of this article, you'll understand the specific strengths and weaknesses of each storage option, allowing you to implement them in your CI/CD pipeline for improved test result analysis. This is increasingly important in today's environment, where distributed systems and microservices architectures demand robust, scalable storage solutions capable of delivering actionable insights.

The recent evolution in data storage technologies and the shift towards cloud-native architectures make it essential to revisit how we store and process test data. The ability to analyze vast amounts of data efficiently can significantly impact the way teams handle software quality and reliability.

What This Actually Is

Test result storage is a critical component of modern CI/CD pipelines, providing a foundation for understanding the quality and reliability of software releases. It is not merely about storing pass/fail outcomes; it involves capturing comprehensive data about test executions, including runtime, environment, and historical trends.

PostgreSQL, ClickHouse, and BigQuery serve distinct roles in this space. PostgreSQL offers robust transactional support and complex querying capabilities, making it ideal for applications requiring strict data integrity and relational data models. ClickHouse is designed for real-time analytics and high-speed querying of large datasets, which is perfect for teams needing immediate insights from their test data.

BigQuery, with its serverless architecture, excels in handling massive datasets with ease, providing on-demand scalability and powerful analytical capabilities. It is particularly suited for organizations that require flexible, cloud-native solutions that can scale with their testing needs. Each of these tools fits into a modern test architecture by enabling data-driven decision-making, facilitating continuous improvement and efficiency in the development process.

How To Implement It

Implementing an effective test result storage system requires a clear understanding of the specific needs of your testing and development environment. For teams that prioritize transactional integrity and complex, relational queries, PostgreSQL is a strong choice. The following setup demonstrates a fundamental table structure in PostgreSQL:

CREATE TABLE test_results (id SERIAL PRIMARY KEY, test_name VARCHAR(255), status VARCHAR(10), runtime NUMERIC, execution_environment JSONB, timestamp TIMESTAMP);

This schema captures essential details such as test name, execution status, runtime, and environment details in JSONB format, allowing for flexible queries over structured and semi-structured data.

For real-time analytics, ClickHouse provides impressive capabilities. Its columnar storage format and data compression techniques optimize performance for large datasets. Here's an example of setting up a ClickHouse table:

CREATE TABLE test_results (id UInt32, test_name String, status Enum('pass', 'fail', 'error'), runtime Float32, execution_environment String, timestamp DateTime) ENGINE = MergeTree() PARTITION BY toYYYYMM(timestamp) ORDER BY (test_name, timestamp);

This setup allows for efficient querying by test name and timestamp, enabling fast access to recent test results and historical trends. One team reported reducing their average triage time from 22 minutes to under 4 minutes by integrating ClickHouse with their Grafana dashboards, allowing for rapid identification of flaky tests.

BigQuery is well-suited for teams with highly variable workloads and a need for extensive scalability. Its serverless nature eliminates the need for infrastructure management, allowing teams to focus on analysis rather than maintenance. Here is an example query to analyze average runtime for failed tests:

SELECT test_name, AVG(runtime) as avg_runtime FROM `project.dataset.test_results` WHERE status = 'fail' GROUP BY test_name ORDER BY avg_runtime DESC;

This query helps identify tests that consistently run longer when failed, highlighting potential areas for optimization. By leveraging BigQuery's powerful querying capabilities, teams can gain insights into test performance patterns and make data-driven decisions to improve their CI/CD processes.

Common Pitfalls

One common pitfall teams face is underestimating the volume and velocity of data generated by a comprehensive test suite. Teams often opt for a simple SQL database, only to encounter performance bottlenecks as data volume increases. Choosing a solution like ClickHouse or BigQuery can mitigate these issues, as they are designed to handle large-scale data efficiently.

Another mistake is failing to adequately index test results, especially in relational databases like PostgreSQL. Without proper indexing, even the most powerful databases can suffer from slow query performance under heavy loads. Regularly reviewing and optimizing index usage as the test suite evolves can significantly improve query efficiency.

Security and access control are often overlooked aspects of test data management. Ensuring that sensitive data within test results is adequately protected is crucial. Implement robust access controls in PostgreSQL and BigQuery, and utilize ClickHouse's role-based access control (RBAC) to limit data access to authorized personnel only, safeguarding your data against unauthorized access.

What Most Teams Get Wrong

A widespread misconception is that pass/fail rates alone provide an adequate measure of software quality. In reality, understanding nuanced metrics such as runtime variance and retry frequencies offers deeper insights into test suite health and reliability, revealing latent issues that simple pass/fail metrics may miss.

Coverage is often misinterpreted as a direct indicator of quality. While high coverage can suggest thorough testing, it does not guarantee the effectiveness of the tests themselves. Focus on the quality and relevance of the tests executed rather than merely achieving a high coverage percentage.

Flakiness is frequently perceived as an insurmountable issue. However, with the right data storage and analysis setup, patterns in flaky tests can be identified and addressed systematically. By using tools like ClickHouse for real-time insights and BigQuery for trend analysis, teams can isolate and rectify the causes of flaky tests, improving overall test reliability and reducing false positives in CI/CD pipelines.

Choosing the right test result storage solution is crucial for extracting actionable insights from your CI/CD pipeline. As you implement PostgreSQL, ClickHouse, or BigQuery, be sure to continuously monitor and iterate on your setup. Once implemented, consider focusing on the mean-time-to-first-signal for production incidents to further enhance your engineering insights and improve incident response times.

Note: This article is for informational purposes only and is not a substitute for professional advice. If you need guidance on specific situations described in this article, consider consulting a qualified professional.

Understanding how systems actually work is the first step toward navigating them effectively.

Browse all articles