Data integrity – Sarah Schlott

Avoiding Hidden Risks: Data Integrity Best Practices with Excel Power Query

Sarah Schlott — Sun, 01 Jun 2025 21:04:11 +0000

Here’s a dirty little secret of finance: the more polished the deck, the more likely there’s duct tape holding the data pipeline together.

I’ve seen it. Flashy dashboards. Perfectly aligned KPIs. Everyone nodding in the boardroom—until someone asks, “How was that calculated?” Cue the mad scramble: Slack threads, undocumented Excel formulas, a stale mapping file last touched two quarters ago.

Here’s the reality: once you start automating with tools like Power Query, your risk profile shifts. Manual errors may go down, but hidden risks go way up. Why? Because the human eye isn’t checking each step anymore—the pipeline is.

And if that pipeline isn’t built with integrity? It can quietly deliver wrong numbers straight into your decision-making.

That’s why I tell every CFO and FP&A lead I work with: fast is easy. Trusted is hard. But if you get this right, it’s your competitive edge.

This isn’t just about “avoiding errors.” This is about engineering pipelines you can trust at scale—through board meetings, audits, and funding rounds.

Here’s how to do it.

Why Data Integrity = Business Risk (Not Just Audit Risk)

Automating fast is easy. Automating with trust? That’s leadership.

CFOs who scale on shaky pipelines lose credibility the moment a board member or investor asks: “Where did this number come from?”

Without data integrity:

Operators lose faith in reporting → they run side spreadsheets.
Boards lose faith in Finance → CFO influence shrinks.
Auditors find gaps → risk skyrockets.

With strong data integrity:

You can trace every number, every time.
Operators trust the data to run the business.
Boards trust Finance to drive strategy.

Integrity = trust. Trust = influence. Influence = impact.

The 5 Best Practices for Trusted Automated Workflows

If you want to engineer pipelines that scale with trust, start here:

1. Architect for Transparency from Day One

Every number must have a clear, documented path back to its source.

How to do it:

Maintain a “Raw Data” query layer.
Build flow diagrams that show source → transformations → outputs.
Use a README tab to explain key logic.
Name every query clearly (“fx_rates_cleaned,” not “Query1”).

Why: Transparency prevents confusion—and protects you when leadership changes or auditors ask questions.
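Here is a minimal sketch of the raw-layer idea. It is written in Python rather than Power Query M so the logic is easy to run and test, and the names `raw_gl` and `gl_cleaned` plus the sample rows are hypothetical. In Power Query, the rough equivalent is one query that only loads the source and a second query that references it.

```python
from types import MappingProxyType

# Hypothetical raw GL export: loaded once, exposed read-only, never edited in place.
raw_gl = tuple(
    MappingProxyType(row)
    for row in (
        {"account": " 4000 ", "amount": "1200.50"},
        {"account": "4010", "amount": "-300"},
    )
)

def gl_cleaned(raw_rows):
    """Derived layer: builds new rows and leaves the raw layer untouched."""
    return [
        {"account": r["account"].strip(), "amount": float(r["amount"])}
        for r in raw_rows
    ]

cleaned = gl_cleaned(raw_gl)
```

The point of the read-only wrapper is that any step trying to "fix" the raw rows fails immediately instead of silently rewriting history.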

2. Separate Business Logic from Data Layers

Never hardcode business logic into transformation steps.

How to do it:

Store business rules (mappings, FX rates, classifications) in versioned external tables.
Reference these tables in Power Query.
Track when tables were last updated.

Why: Business logic changes—your pipeline should adapt without breaking.
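To make that concrete, here is a small Python sketch (not Power Query M) of a transformation that reads its rules from a versioned table instead of hardcoding them. `ACCOUNT_MAP`, its version label, and the account numbers are all hypothetical.

```python
# Hypothetical versioned mapping table, kept outside the transformation logic.
ACCOUNT_MAP = {
    "version": "2025-05",
    "updated_on": "2025-05-31",
    "rows": {"4000": "Revenue", "5000": "COGS"},
}

def classify(rows, mapping):
    """Attach a category from the mapping table and collect accounts with no mapping."""
    mapped, unmapped = [], set()
    for r in rows:
        category = mapping["rows"].get(r["account"])
        if category is None:
            unmapped.add(r["account"])
        # Stamp each row with the mapping version so the output is traceable.
        mapped.append({**r, "category": category, "map_version": mapping["version"]})
    return mapped, sorted(unmapped)

gl = [{"account": "4000", "amount": 100.0}, {"account": "6100", "amount": 25.0}]
mapped, unmapped = classify(gl, ACCOUNT_MAP)
```

Surfacing `unmapped` as its own output is a design choice: a new account that nobody classified shows up as a visible exception rather than quietly dropping out of a total.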

3. Build QC Checks Into the Pipeline (Not Outside It)

Trust is built on consistency—and QC checks are your frontline defense.

How to do it:

Build reconciliation queries:
- Does revenue match ERP?
- Are totals consistent with GL?
- Are there unexpected nulls, duplicates, or spikes?
Automate variance checks (“Why is this metric suddenly up 50%?”).

Why: QC inside the pipeline catches errors before they hit the board deck.
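The checks above can be sketched in a few lines. This is illustrative Python, not Power Query M, and the function names, tolerance, and 50% threshold are assumptions you would tune to your own materiality rules.

```python
from collections import Counter

def reconcile(pipeline_total, erp_total, tolerance=0.01):
    """True when the pipeline total matches the ERP control total within tolerance."""
    return abs(pipeline_total - erp_total) <= tolerance

def find_nulls(rows, field):
    """Indexes of rows where the field is missing or None."""
    return [i for i, r in enumerate(rows) if r.get(field) is None]

def find_duplicates(rows, key):
    """Key values that appear more than once."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)

def variance_flag(current, prior, threshold=0.5):
    """True when the metric moved by at least the threshold versus the prior period."""
    if prior == 0:
        return current != 0
    return abs(current - prior) / abs(prior) >= threshold

# Hypothetical sample rows with one null amount and one duplicate invoice.
rows = [
    {"invoice": "A1", "amount": 100.0},
    {"invoice": "A2", "amount": None},
    {"invoice": "A1", "amount": 50.0},
]
```

Each check returns data, not a printed warning, so the results can land on a QC tab that someone actually reviews before the deck goes out.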

4. Version and Monitor Everything

Version control isn’t optional—it’s survival.

How to do it:

Archive raw data by reporting period.
Version mapping and business rule tables.
Timestamp every report refresh.
Document changes in a change log (yes, even for Excel!).

Why: If you can’t reproduce a prior report exactly, you’ve lost audit and board confidence.
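One way to make "reproduce a prior report exactly" testable is to store a fingerprint of the raw data next to each archived period. The sketch below is Python, not Power Query M; `archive_record` and its field names are hypothetical, and SHA-256 is simply one reasonable choice of hash.

```python
import hashlib
import json
from datetime import datetime, timezone

def archive_record(period, raw_rows, refreshed_at=None):
    """Build an archive entry: period, refresh timestamp, and a fingerprint of the raw data."""
    payload = json.dumps(raw_rows, sort_keys=True).encode("utf-8")
    refreshed_at = refreshed_at or datetime.now(timezone.utc)
    return {
        "period": period,
        "refreshed_at": refreshed_at.isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "row_count": len(raw_rows),
    }

# Hypothetical one-row export for the May 2025 reporting period.
raw = [{"account": "4000", "amount": 100.0}]
entry = archive_record("2025-05", raw, datetime(2025, 6, 1, tzinfo=timezone.utc))
```

If the fingerprint of the archived file no longer matches the logged one, you know the "same" report was rebuilt on different data.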

5. Document Ownership and Change Management

Great pipelines outlast the original builder—but only if ownership is clear.

How to do it:

Assign an owner to each key query/report.
Maintain a change log: what changed, why, when, and by whom.
Review pipelines regularly—don’t let them rot.

Why: Ownership prevents “shadow IT” and ensures accountability.
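A change log does not need tooling to be useful; it needs a fixed shape. Here is a hedged Python sketch of that shape (what, why, when, by whom). The `ChangeLogEntry` fields and the sample entry are hypothetical, and the same columns work just as well as a plain worksheet table.

```python
from dataclasses import dataclass, asdict
from datetime import date

@dataclass(frozen=True)
class ChangeLogEntry:
    changed_on: date
    query: str   # which query or report changed
    owner: str   # who is accountable for it
    what: str
    why: str

def append_change(log, entry):
    """Return a new log with the entry added; earlier entries are never edited."""
    return log + [asdict(entry)]

# Hypothetical entry for illustration only.
log = append_change([], ChangeLogEntry(
    changed_on=date(2025, 6, 1),
    query="fx_rates_cleaned",
    owner="hypothetical.owner",
    what="Switched to month-end rates",
    why="Align with board reporting policy",
))
```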

Common Data Quality Pitfalls (and How to Spot Them Early)

Now that you know what “great” looks like—here’s what to watch for:

Pitfall	What to Watch For	How to Fix
Overwriting raw data	No raw layer preserved	Create a dedicated raw data query
Hardcoding business logic	Logic inside Power Query steps	External versioned mapping tables
Missing versioning	No archives, no refresh dates	Archive raw + track refresh dates
Lack of QC checks	No automated reconciliations	Build QC queries inside pipeline
Poor documentation	Query names unclear, no README	Clear names + README tab
Inconsistent data types	Errors in calculations, odd outputs	Explicit data type settings
Uncontrolled refresh timing	Pipelines break after source changes	Monitor schema + set refresh checks
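The last two rows of that table (data types and schema changes) lend themselves to an automated check. Below is a minimal Python sketch, not Power Query M; `EXPECTED_SCHEMA` and the column names are assumptions standing in for whatever your source export should look like.

```python
# Hypothetical expected schema for a GL export.
EXPECTED_SCHEMA = {"account": str, "amount": float, "period": str}

def schema_issues(rows, expected):
    """List missing columns, unexpected columns, and values of the wrong type."""
    issues = []
    for i, row in enumerate(rows):
        for col in sorted(expected.keys() - row.keys()):
            issues.append(f"row {i}: missing column '{col}'")
        for col in sorted(row.keys() - expected.keys()):
            issues.append(f"row {i}: unexpected column '{col}'")
        for col, typ in expected.items():
            if col in row and not isinstance(row[col], typ):
                found = type(row[col]).__name__
                issues.append(f"row {i}: '{col}' is {found}, expected {typ.__name__}")
    return issues

rows = [
    {"account": "4000", "amount": 100.0, "period": "2025-05"},
    {"account": "4010", "amount": "250", "region": "EU"},
]
issues = schema_issues(rows, EXPECTED_SCHEMA)
```

Running a check like this right after the raw load means a renamed column or a text-typed amount stops the refresh early instead of producing odd totals downstream.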

Real-World Example: CFO Saves a $100M Round

I once worked with a CFO prepping for a $100M Series C.

Their dashboard looked bulletproof—until investors asked, “How was ARR calculated last quarter?”

No version control. FX rates hardcoded. No audit trail.

We rebuilt:

Raw data archived monthly.
FX rates versioned.
ARR logic modular and documented.
QC checks automated.

Result? When diligence resumed, the CFO could walk investors through every number. Series C closed. Confidence preserved.

Lesson: Data integrity = deal confidence.

Why CFOs and Operators Should Care Now

This is no longer “just an audit issue.”

Boards are savvier. Diligence moves faster. Operators demand trusted data for real-time decisions.

If your pipeline can’t:

Trace every number to source
Reproduce prior reports exactly
Explain how key metrics are calculated

…you’re flying blind when the stakes are highest.

Trusted pipelines win boardrooms. Period.

Engineer for Trust, Not Just Speed

I wrote this because too many finance teams are racing to automate—without engineering for trust.

And when the board, auditors, or investors do ask hard questions, “We’ll clean it up” is no longer acceptable.

You don’t need a perfect pipeline. But you do need one that:

Preserves raw data
Documents business logic
Builds QC checks into the flow
Version-controls outputs
Has clear ownership

That’s how you scale trust with your reporting.

If this article gave you new ways to think about protecting your data integrity, please share it. I put real time into this because I want more CFOs and finance leaders building trusted pipelines—not just fast ones.

And here’s one last question to chew on:

If your pipeline broke tomorrow—could your team explain the last board number you reported?

If not—let’s fix that. Now.

The CFO’s Guide to Scaling Financial Data Prep: From Manual to Automated Workflows

Sarah Schlott — Sun, 01 Jun 2025 15:52:42 +0000

Let me give it to you straight: most finance teams are flying their planes while building the wings. And that’s fine—until you hit turbulence.

I’ve worked with scaling companies where the first $10M in revenue was built on ad hoc Excel reports, stitched together the night before the board meeting. And hey—it worked. Until it didn’t.

You can brute-force your way through early-stage reporting. But once the business grows—more entities, more SKUs, more currency conversions, more investors asking harder questions—manual processes start to eat your time and your credibility.

That’s when it’s time to level up. Not just with a shinier dashboard, but with a real data pipeline that turns your reporting from fire drill to strategic weapon.

In this guide, I’ll show you how to move from manual Excel workbooks to automated workflows using tools like Power Query. And I’ll show you how to do it without losing transparency, traceability, or trust.

Why Scaling Data Prep Matters More Than Ever

Here’s the problem: scaling businesses don’t grow in a straight line. Their complexity compounds with every new SKU, hire, and investor.

More SKUs → more revenue streams → more edge cases in revenue recognition.
More headcount → more cost centers → more variance analysis to explain.
More investors → more reporting deadlines → less room for error.

If your finance function can’t scale its data prep, your team ends up trapped:

Reacting instead of driving insight
Burning cycles on cleanup instead of analysis
Missing opportunities because the data can’t be trusted

The Roadmap: From Manual to Automated Workflows

Here’s how I think about the stages of financial data prep maturity:

Stage	Key Characteristics	Risks / Payoff
Manual / Ad Hoc	Copy/paste, VLOOKUP, email attachments	High error risk, zero traceability
Semi-Automated (Basic)	Linked Excel files, Power Query basics	Fragile links, version confusion
Automated & Documented	Central Power Query models, raw data references	Clear lineage, consistent outputs
Fully Integrated Pipeline	Connected to source systems, automated refresh	Minimal manual touch, full audit trail

Most companies live in Stage 1 or 2 way too long. Let’s break down how to move forward.

Stage 1 to Stage 2: Getting Out of Copy-Paste Hell

First, kill the biggest risks:

Stop copy/pasting GL dumps. Use Power Query to pull in raw exports.
Stop building pivot tables on ad hoc data. Build them on structured queries.
Archive raw data before transformation.

Your goal: create a repeatable process where the same inputs produce the same outputs every time.
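One common way repeatability breaks is a step that quietly depends on today's date or on the order rows arrived in. The Python sketch below (not Power Query M) shows the alternative: the reporting date is an explicit input and the output order is fixed. `invoices_due` and the sample invoices are hypothetical; in Power Query, the analogous move is passing the date as a parameter rather than calling `DateTime.LocalNow()` inside a step.

```python
from datetime import date

def invoices_due(rows, as_of):
    """Pure transformation: the reporting date is an explicit input, not today's clock."""
    return sorted(
        (r for r in rows if r["due"] <= as_of and not r["paid"]),
        key=lambda r: r["invoice"],
    )

rows = [
    {"invoice": "B2", "due": date(2025, 5, 15), "paid": False},
    {"invoice": "A1", "due": date(2025, 5, 1), "paid": False},
    {"invoice": "C3", "due": date(2025, 7, 1), "paid": False},
]

# Same inputs in a different arrival order, same reporting date.
first = invoices_due(rows, date(2025, 5, 31))
second = invoices_due(list(reversed(rows)), date(2025, 5, 31))
```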

Stage 2 to Stage 3: Build Documented, Modular Models

At this stage, you want to:

Split transformations into logical steps in Power Query.
Use mapping tables (with version control) for account groupings.
Document key assumptions in a README tab.
Use consistent file paths and folder structures.

Why? Because this is where auditability starts. If you can’t explain how a number moved from source to board deck, trust erodes fast.
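Here is what "logical steps" can look like when each step has a name and leaves a trace. This is a Python sketch of the pattern, not Power Query M, and the step names and sample rows are hypothetical.

```python
def strip_accounts(rows):
    """Trim stray whitespace from account codes."""
    return [{**r, "account": r["account"].strip()} for r in rows]

def drop_zero_amounts(rows):
    """Remove lines that carry no value."""
    return [r for r in rows if r["amount"] != 0]

STEPS = [("strip_accounts", strip_accounts), ("drop_zero_amounts", drop_zero_amounts)]

def run_steps(rows, steps):
    """Apply each named step in order and record row counts for lineage."""
    lineage = [("source", len(rows))]
    for name, step in steps:
        rows = step(rows)
        lineage.append((name, len(rows)))
    return rows, lineage

source = [{"account": " 4000", "amount": 100.0}, {"account": "4010 ", "amount": 0.0}]
result, lineage = run_steps(source, STEPS)
```

The `lineage` list answers the question "where did the rows go?" step by step, which is usually the first thing an auditor or a new team member asks.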

Stage 3 to Stage 4: Integrated Pipelines

Here’s where the magic happens:

Connect Power Query directly to ERP APIs or databases.
Automate refreshes on a schedule.
Use version-controlled output folders.
Build automated QC checks into the pipeline (balance checks, outlier flags).

Now you’re not just faster—you’re better. You can prove your numbers, reproduce past reports, and focus your time on insight, not cleanup.
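The balance checks and outlier flags mentioned above can be quite small. The sketch below is Python, not Power Query M. It assumes journal lines are signed (debits positive, credits negative) and uses a median-based outlier rule with a cutoff of 5, which is one reasonable choice among many, not a standard.

```python
from statistics import median

def is_balanced(lines, tolerance=0.005):
    """Balance check: signed journal lines (debits positive, credits negative) net to zero."""
    return abs(sum(line["amount"] for line in lines)) <= tolerance

def outliers(values, k=5.0):
    """Flag values further than k median-absolute-deviations from the median."""
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    if mad == 0:
        # No spread at all: anything different from the median stands out.
        return [v for v in values if v != mid]
    return [v for v in values if abs(v - mid) / mad > k]

# Hypothetical sample data.
journal = [{"amount": 500.0}, {"amount": -200.0}, {"amount": -300.0}]
monthly_spend = [100.0, 104.0, 98.0, 101.0, 450.0, 99.0]
```

With this sample data, `is_balanced(journal)` passes and `outliers(monthly_spend)` flags only the 450.0 month. A median-based rule is used here because a single spike distorts an average far more than it distorts a median.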

Avoiding Hidden Risks: Data Integrity Best Practices with Excel Power Query

Even a great Power Query pipeline can introduce risks if you’re not careful. Here are common pitfalls and how to avoid them:

1. Overwriting Raw Data

Always preserve raw imports.
Reference them with a “Raw” layer query.

2. Hardcoding Transformations

Use mapping tables, not hardcoded logic.
Document business rules clearly.

3. Uncontrolled Versioning

Store versioned outputs in a controlled location.
Archive each reporting cycle.

4. Lack of QC Checks

Build validation queries.
Reconcile totals to ERP.

5. Poor Documentation

Name queries clearly.
Annotate complex steps.
Maintain a pipeline diagram.

Real-World Example: A $50M SaaS Company

I worked with a $50M SaaS company that was burning 2+ weeks per month on board prep.

Problems:

GL exports manually cleaned every cycle
FX rates layered in after the fact
ARR waterfall rebuilt manually from CRM dumps
No version control on board deck metrics

We rebuilt the pipeline:

Power Query connected to raw GL, CRM, HRIS exports
FX rates table updated monthly, referenced automatically
ARR model built on top of structured CRM queries
Outputs versioned monthly, with refresh dates tracked

Result? Board prep went from 2 weeks to 2 days. And the CFO could answer “Where did this number come from?” without breaking a sweat.
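The "FX rates table updated monthly, referenced automatically" piece of that rebuild might look roughly like this. It is a Python sketch, not the client's actual Power Query model, and the `FX_RATES` table, the rates in it, and the `to_usd` name are made up for illustration.

```python
# Hypothetical monthly FX table: (period, currency) -> rate to USD. Rates are illustrative.
FX_RATES = {
    ("2025-04", "EUR"): 1.10,
    ("2025-05", "EUR"): 1.12,
    ("2025-05", "USD"): 1.00,
}

def to_usd(row, rates):
    """Convert using the rate for the row's own period; fail loudly if the rate is missing."""
    key = (row["period"], row["currency"])
    if key not in rates:
        raise KeyError(f"No FX rate for {key}")
    return round(row["amount"] * rates[key], 2)

may_invoice = {"period": "2025-05", "currency": "EUR", "amount": 100.0}
may_usd = to_usd(may_invoice, FX_RATES)
```

Keying the rate on the row's own period is what keeps a restated prior month from silently picking up the current month's rate, and the hard failure on a missing rate beats a blank cell in the board deck.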

Why This Matters to CFOs and Operators

When your finance team is stuck in manual prep:

You burn time that should go to strategic work.
You introduce risk with every manual step.
You can’t respond quickly to new questions.

When you build an automated pipeline:

You gain consistency and trust.
You reduce audit and compliance risk.
You free up your team to focus on what moves the business.

Build for Scale, Build for Trust

I wrote this because too many good finance teams are trapped in spreadsheet purgatory. And the business is moving faster than their data can.

You don’t need to “boil the ocean.” You just need to start moving up the maturity curve:

From manual to semi-automated.
From semi-automated to documented.
From documented to fully integrated.

And Power Query is one of the most powerful tools you can use to get there—if you use it right.

If this article gave you new ways to think about scaling your financial data prep, please share it. I put real time into this because I want more CFOs and finance leaders building trusted pipelines, not just prettier dashboards.

And if you want to go deeper—whether it’s building smarter financial models, scaling your Excel and Power Query game, mastering custom formulas, or sharpening your career strategy—I offer one-on-one consulting for finance professionals ready to level up. DM me if you want to talk.

And here’s an unconventional thought to leave you with: What if your finance team’s competitive edge wasn’t faster reporting—but reporting your operators and board actually trust?

Are you building pipelines that keep up with your business—or ones that keep your team stuck in cleanup mode?