
Datasaur

Automated Statistical Analysis Engine

A comprehensive survey data platform that automates complex statistical workflows, from raw data aggregation to advanced hypothesis testing and visualization.

My Role

Lead Architect & Creator

Stack

Python, Flask, MongoDB, Pandas, SciPy

Impact

Automated Stat-Testing • Multi-Format ETL • Self-Hosted Architecture


System Architecture Log

Legend: Traffic Flow · Service Node

```mermaid
graph LR
    subgraph Client_Layer [User Interface]
        A[Vanilla JS / Browser]:::traffic
    end
    subgraph Server_Layer [Application Logic]
        B[Caddy Reverse Proxy]:::node
        C[Flask / Python Monolith]:::node
    end
    subgraph Processing_Engine [Data Science Core]
        D[Pandas ETL]:::node
        E[SciPy / Pingouin Stats]:::node
        F[XlsxWriter Export]:::node
    end
    subgraph Storage [Data Persistence]
        G[MongoDB Atlas]:::node
    end

    A <-->|HTTPS| B
    B <-->|WSGI| C
    C <-->|Query/Write| G
    C ==>|Dataframes| D
    D --> E
    D --> F

    %% Styles %%
    classDef traffic fill:#2563eb,stroke:#3b82f6,color:#fff
    classDef node fill:#16a34a,stroke:#22c55e,color:#fff
```

PROJECT LOG // ALGORITHMIC // STATISTICAL PROCESSING

The Engineering Story

Datasaur was born out of the need to bridge the gap between raw survey data and academic-grade statistical insight. The challenge wasn't just displaying data, but architecting a system capable of performing complex mathematical computations on the fly.

Statistical Automation Pipeline

The core of the application is a robust processing engine built on Pandas and SciPy. I implemented automated workflows for non-parametric tests like Kruskal-Wallis and Mann-Whitney U, ensuring that the platform could intelligently suggest and execute the correct statistical test based on the data distribution.
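As a rough sketch of that selection logic (the function name and decision rule here are illustrative, not Datasaur's actual code), the engine can branch on the number of groups: two independent samples route to Mann-Whitney U, three or more to Kruskal-Wallis:

```python
import numpy as np
from scipy import stats

def suggest_and_run_test(groups, alpha=0.05):
    """Pick and run a non-parametric test for independent samples.

    `groups` is a list of 1-D numeric arrays, one per survey segment.
    Hypothetical helper; the production selection logic also weighs
    distribution shape before committing to a test.
    """
    if len(groups) == 2:
        # Two independent samples -> Mann-Whitney U
        stat, p = stats.mannwhitneyu(*groups, alternative="two-sided")
        name = "Mann-Whitney U"
    else:
        # Three or more samples -> Kruskal-Wallis H
        stat, p = stats.kruskal(*groups)
        name = "Kruskal-Wallis"
    return {"test": name, "statistic": stat,
            "p_value": p, "significant": p < alpha}

rng = np.random.default_rng(0)
a, b, c = (rng.normal(loc, 1.0, 50) for loc in (0.0, 0.1, 2.0))
print(suggest_and_run_test([a, b, c])["test"])  # prints "Kruskal-Wallis"
```

Returning the test name alongside the p-value keeps the API honest: the caller always knows which procedure produced the result.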

Data Visualization & Export

To translate these numbers into insights, I built a visualization layer supporting everything from standard histograms to box-and-whisker plots. Using XlsxWriter, I developed a custom export engine that lets users pull processed data directly into professional-grade spreadsheets with pre-formatted statistical summaries.
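A minimal version of that export path can be sketched with pandas' `ExcelWriter` on the `xlsxwriter` engine (the helper name, sheet names, and formatting choices below are illustrative assumptions, not the production code):

```python
import pandas as pd

def export_with_summary(df: pd.DataFrame, path: str) -> None:
    """Write raw data plus a pre-formatted statistical summary sheet.

    Hypothetical helper: one sheet for the raw rows, one for
    df.describe() with bold headers and fixed-precision numbers.
    """
    summary = df.describe().transpose()
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name="Data", index=False)
        summary.to_excel(writer, sheet_name="Summary")
        book = writer.book
        header_fmt = book.add_format({"bold": True, "bg_color": "#DDEBF7"})
        num_fmt = book.add_format({"num_format": "0.000"})
        ws = writer.sheets["Summary"]
        ws.set_row(0, None, header_fmt)  # bold the header row
        # Widen the numeric columns and apply 3-decimal formatting
        ws.set_column(1, len(summary.columns), 12, num_fmt)

df = pd.DataFrame({"score": [3.2, 4.1, 2.8, 5.0], "group": list("AABB")})
export_with_summary(df, "report.xlsx")
```

Doing the formatting server-side means the downloaded workbook is presentation-ready, with no manual cleanup in Excel.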

Infrastructure & Monolithic Integrity

The project follows a classic monolithic architecture, which proved highly efficient for keeping memory-intensive dataframes close to the processing logic. Today, the platform is self-hosted using a Caddy reverse proxy and MongoDB Atlas, demonstrating the longevity and stability of a well-architected Flask ecosystem.