Answer
Validate Strength Score in Power BI using real production submissions. The goal is not just to confirm that a score exists. The goal is to confirm that the score actually reflects workflow quality in a way that makes sense for your program.
A good validation process checks whether:
high-quality submissions score higher
rushed or low-effort submissions score lower
the score distribution feels reasonable
the results match what reviewers would expect
Because Strength Score does not appear in ANVL Insights, validation must be done in Power BI after the workflow has been live in production long enough to collect meaningful data.
Steps
Confirm the workflow is ready for validation
Make sure Strength Score is already configured on the workflow.
Make sure the workflow is active in a production site.
Make sure users are completing the workflow as part of normal work.
Do not try to validate Strength Score using only UAT or test submissions.
Allow real usage time
Let the workflow run in production for about 2 weeks.
Allow normal user behavior during this time.
Try to collect enough submissions to see variation in user behavior and completion quality.
Strength Score needs real behavior patterns to be meaningful.
Review the score distribution in Power BI
Open the Trending Metrics or Weekly Summary report in Power BI.
Filter to:
the specific workflow
the desired time range
the relevant site or sites
Review the overall distribution of scores.
Look for:
high scores
mid-range scores
low scores
You are looking for patterns, not perfection.
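If you can export the filtered scores from the Power BI report to a CSV file, the distribution check above can be sketched in a few lines of Python. This is only an illustration, not part of ANVL or Power BI. The column name StrengthScore, the 0 to 100 scale, and the band cut-offs of 50 and 80 are all assumptions, so replace them with whatever matches your export and your program.

```python
import csv
from collections import Counter
from statistics import mean, median, pstdev

# Assumed cut-offs on an assumed 0-100 scale; adjust to your program.
LOW_MAX = 50    # scores below this count as "low"
HIGH_MIN = 80   # scores at or above this count as "high"


def load_scores(path, column="StrengthScore"):
    """Read scores from an exported CSV. The column name is hypothetical."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [
            float(row[column])
            for row in csv.DictReader(handle)
            if row.get(column)  # skip blank or missing values
        ]


def band(score):
    """Place a single score into the low, mid, or high band."""
    if score >= HIGH_MIN:
        return "high"
    if score >= LOW_MAX:
        return "mid"
    return "low"


def summarize(scores):
    """Return basic distribution figures for a list of scores."""
    counts = Counter(band(s) for s in scores)
    return {
        "count": len(scores),
        "mean": round(mean(scores), 1),
        "median": median(scores),
        "spread": round(pstdev(scores), 1),  # population standard deviation
        "bands": {name: counts.get(name, 0) for name in ("low", "mid", "high")},
    }


# Example with made-up scores, for illustration only.
example_scores = [35, 48, 62, 70, 75, 83, 91, 96]
print(summarize(example_scores))
```

A summary with submissions in all three bands and a visible spread suggests the score is differentiating between submissions. A summary where nearly everything lands in one band is a prompt to look closer, not a verdict.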
Sample real workflows and compare quality
Select a sample of completed submissions from each score range:
about 5 high-scoring submissions
about 5 mid-range submissions
about 5 low-scoring submissions
Manually review those submissions.
Ask:
do the high scores match clearly strong submissions?
do the low scores match rushed or incomplete submissions?
do the mid-range scores feel reasonable?
is the score rewarding the behavior you intended to encourage?
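Picking the sample at random within each score range helps avoid reviewing only the submissions that happen to catch your eye. The sketch below shows one way to do that with exported data. The field names id and score, the band cut-offs, and the 0 to 100 scale are assumptions for illustration.

```python
import random


def sample_by_band(submissions, per_band=5, low_max=50, high_min=80, seed=None):
    """Pick up to `per_band` random submissions from each score band.

    `submissions` is a list of dicts with hypothetical keys "id" and "score".
    The cut-offs assume a 0-100 scale; change them to fit your program.
    """
    rng = random.Random(seed)  # pass a seed to make the sample repeatable
    groups = {"low": [], "mid": [], "high": []}
    for sub in submissions:
        score = sub["score"]
        if score >= high_min:
            groups["high"].append(sub)
        elif score >= low_max:
            groups["mid"].append(sub)
        else:
            groups["low"].append(sub)
    # A band with fewer than `per_band` submissions is returned in full.
    return {
        name: rng.sample(items, min(per_band, len(items)))
        for name, items in groups.items()
    }


# Example with made-up submissions, for illustration only.
example = [{"id": i, "score": s} for i, s in enumerate(range(10, 100, 3))]
for name, picked in sample_by_band(example, seed=1).items():
    print(name, [sub["id"] for sub in picked])
```

The returned IDs are the submissions to open and review by hand against the questions above.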
Look for misalignment
Identify whether the score is behaving in a way that does not match expectations.
Common warning signs include:
strong submissions scoring too low
weak submissions scoring too high
scores clustering too tightly with little variation
one behavior dominating the score too heavily
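Once reviewers have rated the sampled submissions, some of these warning signs can be checked mechanically. The sketch below is a rough aid, assuming each reviewed submission is recorded with a hypothetical id, a score on an assumed 0 to 100 scale, and a reviewer rating of "strong", "ok", or "weak". The cut-offs and the minimum spread of 10 points are placeholders. It does not detect one behavior dominating the score, which requires looking at the individual parameters.

```python
from statistics import pstdev


def find_warning_signs(reviewed, low_max=50, high_min=80, min_spread=10.0):
    """Return a list of plain-text warnings for reviewed submissions.

    `reviewed` is a list of dicts with hypothetical keys "id", "score",
    and "rating" ("strong", "ok", or "weak"). All thresholds are assumptions.
    """
    signs = []
    for item in reviewed:
        if item["rating"] == "strong" and item["score"] < low_max:
            signs.append(f"{item['id']}: rated strong but scored low ({item['score']})")
        elif item["rating"] == "weak" and item["score"] >= high_min:
            signs.append(f"{item['id']}: rated weak but scored high ({item['score']})")

    scores = [item["score"] for item in reviewed]
    # Population standard deviation as a simple measure of variation.
    if len(scores) > 1 and pstdev(scores) < min_spread:
        signs.append("scores cluster tightly with little variation")
    return signs


# Example with made-up review results, for illustration only.
example = [
    {"id": "A", "score": 92, "rating": "strong"},
    {"id": "B", "score": 45, "rating": "strong"},
    {"id": "C", "score": 88, "rating": "weak"},
    {"id": "D", "score": 30, "rating": "weak"},
]
for sign in find_warning_signs(example):
    print(sign)
```

An empty list does not prove the score is well calibrated. It only means none of these specific checks fired on the sample.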
Adjust and re-evaluate if needed
Adjust weights first if the score is overemphasizing the wrong behavior.
Adjust parameter thresholds if they are too strict or too lenient.
Avoid adding more parameters unless needed.
Republish the workflow if changes are made.
Allow more real usage time.
Review the updated results again in Power BI.
Iteration is normal.
What good looks like
Score ranges feel intuitive.
High-quality completion is consistently rewarded.
Low-effort submissions score meaningfully lower.
Reviewers and stakeholders trust the metric.
Strength Score does not need to be perfect to be useful, but it should feel fair and credible.
