Analysis Helpers¶

Overview¶

The analysis package provides five post-processing analyzers that extract deeper insights from a completed SimulationResults object. StatisticalAnalyzer computes descriptive statistics and confidence intervals; SensitivityAnalyzer ranks tasks by their Spearman rank correlation with total project duration; CriticalPathAnalyzer surfaces task criticality indices and the most frequent end-to-end critical sequences; StaffingAnalyzer recommends team sizes using a Brooks's Law–inspired communication-overhead model; and CostAnalyzer computes per-task cost sensitivity and the correlation between cost and schedule duration.

When to use this module: Use these analyzers when you need custom thresholds, deeper statistical breakdowns, or want to integrate simulation insights into your own tooling beyond what the CLI output provides.

Capability	Description
Descriptive statistics	Mean, median, std dev, CV, min/max on any duration array
Percentile calculation	Arbitrary percentile queries on simulation output
Confidence intervals	Bootstrap confidence interval at configurable confidence level
Sensitivity ranking	Spearman rank correlation of each task's duration against total project duration
Critical-path analysis	Per-task criticality index (0–1) and most frequent full path sequences
Staffing recommendations	Team-size vs. delivery-date trade-offs across senior/mixed/junior profiles
Cost sensitivity analysis	Per-task Spearman rank correlation of task cost against total project cost, and Pearson correlation between total cost and total duration

Background: Spearman rank correlation — SensitivityAnalyzer converts each task's sampled durations and the total project durations to ranks before computing the correlation coefficient, making it robust to non-normal distributions and outliers. A coefficient near 1.0 means that task's duration strongly drives the overall schedule.

Background: Communication overhead — StaffingAnalyzer models individual productivity as max(min_prod, 1 − c·(n−1)), inspired by Brooks's Law, so adding team members beyond an optimal point yields diminishing and eventually negative returns on elapsed calendar time.

Imports:

from mcprojsim.analysis.statistics import StatisticalAnalyzer
from mcprojsim.analysis.sensitivity import SensitivityAnalyzer
from mcprojsim.analysis.critical_path import CriticalPathAnalyzer
from mcprojsim.analysis.staffing import StaffingAnalyzer
from mcprojsim.analysis.cost import CostAnalyzer, CostAnalysis

These specialized analysis modules provide focused statistical, sensitivity, and staffing analysis capabilities beyond what SimulationResults provides directly.

`StatisticalAnalyzer`¶

Convenience helpers for descriptive statistics on duration arrays.

Import:

from mcprojsim.analysis.statistics import StatisticalAnalyzer

Methods:

calculate_statistics(durations: np.ndarray) -> dict[str, float] — Returns mean, median, std_dev, variance, min, max, range, coefficient_of_variation
calculate_percentiles(durations: np.ndarray, percentiles: list[int]) -> dict[int, float] — Compute specific percentiles
confidence_interval(durations: np.ndarray, confidence: float = 0.95) -> tuple[float, float] — Return lower/upper bounds for a confidence interval

`SensitivityAnalyzer`¶

Analyzes which tasks have the strongest correlation with total project duration (Spearman rank correlation).

Import:

from mcprojsim.analysis.sensitivity import SensitivityAnalyzer

Methods:

calculate_correlations(results: SimulationResults) -> dict[str, float] — Per-task correlation with total duration
get_top_contributors(results: SimulationResults, n: int = 10) -> list[tuple[str, float]] — Top N tasks by sensitivity

Example:

analyzer = SensitivityAnalyzer()
correlations = analyzer.calculate_correlations(results)

# Show top 5 sensitive tasks
top_5 = analyzer.get_top_contributors(results, n=5)
for task_id, correlation in top_5:
    print(f"{task_id}: correlation = {correlation:.3f}")

`CriticalPathAnalyzer`¶

Specialized analysis for critical paths and task criticality.

Import:

from mcprojsim.analysis.critical_path import CriticalPathAnalyzer

Methods:

get_criticality_index(results: SimulationResults) -> dict[str, float] — Same as results.get_critical_path()
get_most_critical_tasks(results: SimulationResults, threshold: float = 0.5) -> list[str] — Tasks appearing on critical path in >threshold of iterations
get_most_frequent_paths(results: SimulationResults, top_n: int | None = None) -> list[CriticalPathRecord] — Most common paths (wraps results.get_critical_path_sequences())

Example:

analyzer = CriticalPathAnalyzer()

# Tasks on critical path >80% of the time
critical_tasks = analyzer.get_most_critical_tasks(results, threshold=0.8)
print(f"Always-critical tasks: {critical_tasks}")

# Most common path
paths = analyzer.get_most_frequent_paths(results, top_n=1)
if paths:
    print(f"Most common path: {paths[0].format_path()}")

`StaffingAnalyzer`¶

Provides team-size recommendations and breaks down staffing by experience profile.

Import:

from mcprojsim.analysis.staffing import StaffingAnalyzer

Methods:

individual_productivity(team_size: int, communication_overhead: float, min_productivity: float) -> float — Per-person productivity for a given team size. Returns a value in [min_productivity, 1.0].
effective_capacity(team_size: int, communication_overhead: float, productivity_factor: float, min_productivity: float) -> float — Effective person-equivalents for a sized team, accounting for communication overhead and experience-level multiplier.
calendar_hours(total_effort: float, critical_path_hours: float, team_size: int, communication_overhead: float, productivity_factor: float, min_productivity: float, hours_per_day: float) -> float — Elapsed calendar hours for a team. Bounded below by critical_path_hours regardless of team size.
calculate_staffing_table(results: SimulationResults, config: Config) -> list[StaffingRow] — Full table of StaffingRow objects for every team size from 1 up to the observed peak parallelism, for each experience profile.
recommend_team_size(results: SimulationResults, config: Config) -> list[StaffingRecommendation] — One optimal StaffingRecommendation per experience profile (the team size with the shortest calendar duration).

StaffingRow fields:

Field	Type	Description
`profile`	`str`	Experience level name
`team_size`	`int`	Candidate team size
`individual_productivity`	`float`	Per-person productivity after communication overhead
`effective_capacity`	`float`	Total effective team capacity in person-equivalents
`calendar_hours`	`float`	Elapsed calendar hours needed
`calendar_working_days`	`int`	Calendar working days needed
`delivery_date`	`date \\| None`	Projected delivery date
`efficiency`	`float`	Calendar-time optimality: 1.0 means this team size achieves the minimum calendar duration for its profile

StaffingRecommendation fields:

Field	Type	Description
`profile`	`str`	Experience level name (e.g. `"senior"`, `"mixed"`, `"junior"`)
`recommended_team_size`	`int`	Optimal team size (minimum calendar duration)
`total_effort_hours`	`float`	Total effort basis in person-hours
`critical_path_hours`	`float`	Critical-path elapsed hours (lower bound on calendar time)
`calendar_working_days`	`int`	Calendar working days needed
`delivery_date`	`date \\| None`	Projected delivery date
`efficiency`	`float`	Always `1.0` — the recommended size is by definition the calendar-optimal team for its profile
`parallelism_ratio`	`float`	Ratio of total effort to critical-path duration (degree of parallelism)
`effort_basis`	`str`	Effort basis label (`"mean"` or a percentile label such as `"P80"`)

Example:

analyzer = StaffingAnalyzer()

# Get full table for all profiles
table = analyzer.calculate_staffing_table(results, config)
for row in table:
    print(f"{row.profile} team (size {row.team_size}): "
          f"{row.calendar_working_days} days, efficiency {row.efficiency*100:.1f}%")

# Get primary recommendations
recommendations = analyzer.recommend_team_size(results, config)
for rec in recommendations:
    print(f"{rec.profile}: recommend {rec.recommended_team_size} people "
          f"({rec.efficiency*100:.0f}% efficiency)")

`CostAnalyzer`¶

Analyzes cost drivers and the relationship between cost and schedule duration. Only useful when the simulation produced cost data (results.costs is not None).

Import:

from mcprojsim.analysis.cost import CostAnalyzer, CostAnalysis

Methods:

analyze(results: SimulationResults) -> CostAnalysis — Run cost sensitivity analysis and return a CostAnalysis data object. This is called automatically by the engine when cost data is present, and the result is stored in results.cost_analysis.

CostAnalysis fields (dataclass):

Field	Type	Description
`sensitivity`	`dict[str, float]`	Per-task Spearman rank correlation of that task's cost against total project cost. Higher absolute values indicate tasks whose cost variance most drives total cost variance.
`duration_correlation`	`float`	Pearson correlation between total project cost and total project duration. Values near 1.0 mean cost and schedule are tightly coupled; lower values suggest fixed costs or risks decouple them.

Example:

if results.cost_analysis:
    ca = results.cost_analysis
    print(f"Cost–duration correlation: {ca.duration_correlation:.3f}")

    # Top cost drivers
    top_drivers = sorted(ca.sensitivity.items(), key=lambda x: abs(x[1]), reverse=True)[:5]
    for task_id, corr in top_drivers:
        print(f"  {task_id}: cost sensitivity = {corr:.3f}")

Analysis Helpers¶

Overview¶

StatisticalAnalyzer¶

SensitivityAnalyzer¶

CriticalPathAnalyzer¶

StaffingAnalyzer¶

CostAnalyzer¶

`StatisticalAnalyzer`¶

`SensitivityAnalyzer`¶

`CriticalPathAnalyzer`¶

`StaffingAnalyzer`¶

`CostAnalyzer`¶