Narrative

Sales Reporting — From Manual Excel to Automated Pipeline

Manual effort: from hours per cycle to zero.

Python · Pandas · OpenPyXL · Automation · Analytics

What Was Broken

Sales reporting was done manually in Excel. Someone had to pull data, format it, build charts, and email it — every cycle. It was slow, prone to errors when someone was busy or on leave, and inconsistent enough that people had started questioning the numbers.

How It Was Built

Repeatability was the core design principle — everything else followed from that.


Repeatability-first pipeline

Built in Python using Pandas for data processing and OpenPyXL for Excel output. Running the pipeline twice on the same data produces identical output. No manual formatting steps that someone might do differently each week.

report_pipeline.py

```python
from pathlib import Path


class SalesReportPipeline:
    def run(self, period: str) -> Path:
        # 1. Extract
        raw_data = self.extract(period)

        # 2. Validate
        self.validate(raw_data)  # fails loudly

        # 3. Transform
        processed = self.transform(raw_data)

        # 4. Calculate trends
        with_trends = self.add_trends(processed)

        # 5. Generate Excel
        output_path = self.generate_excel(with_trends)

        # 6. Deliver
        self.send_email(output_path)

        return output_path

    # Same input = same output. Always.
```
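The "same input = same output" claim can be checked mechanically: serialize the result deterministically and compare hashes across runs. A minimal sketch of that idea — the `transform` function here is an illustrative stand-in, not the production step:

```python
import hashlib

import pandas as pd


def transform(raw: pd.DataFrame) -> pd.DataFrame:
    # Stand-in for the pipeline's transform step: aggregate sales by region.
    # groupby sorts keys by default, so row order is deterministic.
    return raw.groupby("region")["amount"].sum().reset_index()


def output_fingerprint(df: pd.DataFrame) -> str:
    # Serialize with a fixed layout (no index) and hash the bytes.
    csv_bytes = df.to_csv(index=False).encode("utf-8")
    return hashlib.sha256(csv_bytes).hexdigest()


raw = pd.DataFrame({"region": ["N", "S", "N"], "amount": [10.0, 20.0, 5.0]})

# Two runs on the same input must produce byte-identical output.
assert output_fingerprint(transform(raw)) == output_fingerprint(transform(raw.copy()))
```

A check like this can live in the test suite, so any accidental source of nondeterminism (unsorted groupings, dict ordering, timestamps in the output) fails loudly.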

Reports designed for decisions, not records

Redesigned what the reports showed. Old reports were raw numbers — totals, counts. Rebuilt to highlight trends and patterns: week-over-week movement, outliers, where performance was above or below baseline. That is what makes a report useful in meetings rather than just filed away.
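The trend logic behind that can be quite small: a week-over-week percentage change, a deviation flag for outliers, and a comparison against a running baseline. A sketch in Pandas — column names and the two-standard-deviation threshold are illustrative assumptions, not the production rules:

```python
import pandas as pd


def add_trends(weekly: pd.DataFrame) -> pd.DataFrame:
    out = weekly.copy()
    # Week-over-week movement as a fraction of the prior week.
    out["wow_change"] = out["sales"].pct_change()
    # Flag weeks more than two standard deviations from the mean.
    mean, std = out["sales"].mean(), out["sales"].std()
    out["outlier"] = (out["sales"] - mean).abs() > 2 * std
    # Above or below the running baseline (expanding mean to date).
    out["vs_baseline"] = out["sales"] - out["sales"].expanding().mean()
    return out


weekly = pd.DataFrame({"week": [1, 2, 3, 4], "sales": [100.0, 110.0, 90.0, 300.0]})
print(add_trends(weekly))
```

Columns like these are what turn a totals sheet into something a meeting can act on: the reader sees movement and anomalies directly instead of computing them mentally.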

Automated delivery — no human bottleneck

Report goes out on schedule without anyone having to remember to send it. If data quality issues are found, the pipeline flags them in a separate report rather than silently producing bad output.
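The "flag, don't silently ship" behavior can be sketched as a branch in the delivery step. File names and the specific checks below are illustrative assumptions:

```python
from pathlib import Path

import pandas as pd


def check_quality(df: pd.DataFrame) -> list[str]:
    """Return human-readable data-quality issues (empty list if clean)."""
    issues = []
    if df["amount"].isna().any():
        issues.append(f"{int(df['amount'].isna().sum())} rows missing amount")
    if (df["amount"] < 0).any():
        issues.append("negative amounts found")
    return issues


def deliver(df: pd.DataFrame, out_dir: Path) -> Path:
    issues = check_quality(df)
    if issues:
        # Ship an issues report instead of quietly publishing bad numbers.
        issue_path = out_dir / "data_quality_issues.txt"
        issue_path.write_text("\n".join(issues))
        return issue_path
    report_path = out_dir / "sales_report.csv"
    df.to_csv(report_path, index=False)
    return report_path
```

The design choice is that bad data changes *what* gets delivered, not *whether* something gets delivered: the schedule never depends on a human noticing a problem first.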

What Changed

Reporting became consistent and on-time regardless of who was available. Reports shifted from something people filed away to something people referenced in meetings.

  • Manual effort: from hours per cycle to zero (fully automated)
  • Report consistency: from varying by person to repeatable
  • Delivery: from whenever someone sent it to reliable, on schedule
"The shift from 'here are the numbers' to 'here is what the numbers mean' changed how reports were actually used."

Common Questions

How are data quality issues caught?

A validation step runs at the start of the pipeline: it checks for missing values, unexpected formats, and out-of-range numbers. If quality issues are found, the pipeline flags them in a separate report rather than silently producing bad output. Clean data in, clean reports out; when garbage comes in, you need to know about it.
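A minimal "fail loudly" validator along these lines — the required columns and bounds are illustrative assumptions, not the production schema:

```python
import pandas as pd


class ValidationError(Exception):
    pass


def validate(df: pd.DataFrame) -> None:
    """Raise immediately on bad input rather than producing a bad report."""
    required = {"date", "region", "amount"}
    missing = required - set(df.columns)
    if missing:
        raise ValidationError(f"missing columns: {sorted(missing)}")
    if df["amount"].isna().any():
        raise ValidationError("missing values in 'amount'")
    # Unexpected format: unparseable dates become NaT under errors="coerce".
    if pd.to_datetime(df["date"], errors="coerce").isna().any():
        raise ValidationError("unparseable dates in 'date'")
    # Out-of-range: negative sales amounts are assumed invalid here.
    if (df["amount"] < 0).any():
        raise ValidationError("negative values in 'amount'")
```

Raising a dedicated exception type makes the failure mode explicit: downstream steps never run on data that did not pass, and the error message says exactly which check failed.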
Couldn't a BI tool like Power BI or Tableau do this?

Yes, and for some cases that would be the right call. I chose a Python pipeline here for control and portability: the exact output format was already standardized and expected by stakeholders, and matching an existing format exactly is easier in code than fighting a BI tool's export settings. If the requirement were exploratory dashboards, I would reach for Power BI or Tableau.