Sales reporting was done manually in Excel. Someone had to pull data, format it, build charts, and email it — every cycle. It was slow, prone to errors when someone was busy or on leave, and inconsistent enough that people had started questioning the numbers.
Repeatability was the core design principle — everything else followed from that.
Built in Python using Pandas for data processing and OpenPyXL for Excel output. Running the pipeline twice on the same data produces identical output. No manual formatting steps that someone might do differently each week.
class SalesReportPipeline:
def run(self, period: str) -> Path:
# 1. Extract
raw_data = self.extract(period)
# 2. Validate
self.validate(raw_data) # fails loudly
# 3. Transform
processed = self.transform(raw_data)
# 4. Calculate trends
with_trends = self.add_trends(processed)
# 5. Generate Excel
output_path = self.generate_excel(with_trends)
# 6. Deliver
self.send_email(output_path)
return output_path
# Same input = same output. Always.Redesigned what the reports showed. Old reports were raw numbers — totals, counts. Rebuilt to highlight trends and patterns: week-over-week movement, outliers, where performance was above or below baseline. That is what makes a report useful in meetings rather than just filed away.
Report goes out on schedule without anyone having to remember to send it. If data quality issues are found, the pipeline flags them in a separate report rather than silently producing bad output.
Reporting became consistent and on-time regardless of who was available. Reports shifted from something people filed away to something people referenced in meetings.
"The shift from here are the numbers to here is what the numbers mean changed how reports were actually used."